Learning to predict by the methods of temporal differences

Learning to predict by the methods of temporal differences