Memory for Reward in Probabilistic Choice: Markovian and Non-Markovian Properties

Journal: Behaviour

In two experiments, pigeons were rewarded with food for pecking keys in various forms of the two-armed bandit situation over an extended series of daily sessions. The average daily preference, S = R/(R+L), is very well fit by a Markovian linear model in which predicted preference today is a weighted average of predicted preference yesterday and reinforcement conditions today: s(N+1) = a·s(N) + (1-a)·A(N+1), where A(N+1) is set equal to 1 when all rewards are for the Right response and 0 when all are for the Left, and a is a long-term memory parameter. This linear model explains some apparent paradoxes in earlier reports of memory effects in two-armed bandit experiments. Nevertheless, closer examination of preference changes within each experimental session showed several kinds of non-Markovian effects. The most important was a regression at the beginning of each session towards a preference characteristic of earlier sessions (spontaneous recovery). This effect, unlike a smaller and less reliable non-Markovian reminiscence effect, is consistent with a very simple rule: the effect on preference of each individual reward for a Right or Left response is inversely related to how long ago the reward occurred. Thus, animals learn to prefer the rewarded side each day because these rewards are recent, but they regress to earlier preferences overnight because the most recent rewards become relatively less recent with the lapse of time.
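The two ideas in the abstract can be illustrated with a minimal Python sketch. The first function iterates the linear model s(N+1) = a·s(N) + (1-a)·A(N+1) exactly as stated. The second is only one hypothetical reading of the recency rule: it assumes each reward is weighted by 1/age, which the abstract does not specify beyond "inversely related". The function names, the value a = 0.8, and the reward schedule are all illustrative assumptions, not values from the article.

```python
def linear_model(a, s0, conditions):
    """Iterate s(N+1) = a*s(N) + (1-a)*A(N+1) over a list of daily A values."""
    s = s0
    history = []
    for A in conditions:
        s = a * s + (1 - a) * A
        history.append(s)
    return history


def recency_preference(rewards, now):
    """Preference for Right when each past reward is weighted by 1/age.

    rewards: list of (time, side) pairs, with side "R" or "L".
    The 1/age weighting is an assumption made for this sketch.
    """
    right = sum(1.0 / (now - t) for t, side in rewards if side == "R")
    left = sum(1.0 / (now - t) for t, side in rewards if side == "L")
    return right / (right + left)


# Illustrative values only (not taken from the article).
a = 0.8
# Three days with all rewards on the Right, then two days on the Left.
daily = linear_model(a, s0=0.5, conditions=[1, 1, 1, 0, 0])

# Times in hours: Left rewards on three earlier days, Right rewards on day 4.
rewards = [(24.0 * d + h, "L") for d in range(3) for h in (0.0, 0.5)]
rewards += [(72.0 + h, "R") for h in (0.0, 0.5)]

end_of_session = recency_preference(rewards, now=73.0)  # shortly after the session
next_morning = recency_preference(rewards, now=96.0)    # start of the next session

print(daily)
print(end_of_session, next_morning)
```

Under these assumptions, preference for Right is high just after the day-4 session, because the Right rewards are far more recent than the Left ones, and falls back towards the earlier Left preference by the next morning, once the difference in age has become proportionally small. This is the overnight regression described above, reproduced qualitatively; it is not a fit to the pigeons' data.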

Affiliations: 1: Department of Psychology, Duke University, Durham, N. Carolina, U.S.A. 27706

