Richard S. Sutton: Learning to Predict by the Methods of Temporal Differences. Mach. Learn. 3: 9-44 (1988)