Humans make decisions in part by drawing on related past experiences. But human memory isn’t exhaustive and complete: empirically, people tend to focus on particular snapshots from the past. One such notion, the peak-end rule, suggests that individuals often judge the pain or benefit of an event by a weighted average of its peak, or most intense, experience and its final moments. Among other things, this implies “duration neglect”: retrospective evaluations tend to ignore how long an experience lasted, resting instead on its extreme and final snapshots. Evidence for the peak-end rule has been found in people’s experience of medical procedures and in their satisfaction with material goods.
An important question is how the peak-end rule should influence future decisions. Some efforts have explored this issue with toy models in which an agent repeatedly takes one of two choices, with probabilities that are functions of the maximum utilities previously experienced. Studies of such models have found the possibility of “trapping” behaviour, in which an agent becomes locked into making one sub-optimal choice repeatedly, as if developing a habit. Similar trapping behaviour is known from models in the behavioural economics literature and from studies of opinion dynamics, where it links to the tendency of a population to segregate into different viewpoints. The generality of this effect, and the conditions under which it appears, remain unclear.
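The flavour of such a toy model can be sketched in a few lines. The following is an illustrative simulation, not the exact model studied in the literature: here it is assumed the agent remembers only the peak utility it has ever received from each option, chooses via a softmax over those peaks with a temperature playing the role of noise, and draws utilities from exponential distributions; the function name and all parameter values are assumptions for illustration.

```python
import math
import random

def peak_softmax_choices(mean_a, mean_b, noise, steps, rng):
    """Agent repeatedly picks option "A" or "B".

    Memory: only the peak (maximum) utility ever received from each option.
    Choice rule: softmax over the remembered peaks, with `noise` as temperature.
    Utilities: exponential with the given means (an illustrative assumption).
    Returns how many times each option was chosen.
    """
    peak = {"A": 0.0, "B": 0.0}      # best utility recalled per option
    mean = {"A": mean_a, "B": mean_b}
    counts = {"A": 0, "B": 0}
    for _ in range(steps):
        wa = math.exp(peak["A"] / noise)
        wb = math.exp(peak["B"] / noise)
        choice = "A" if rng.random() < wa / (wa + wb) else "B"
        counts[choice] += 1
        draw = rng.expovariate(1.0 / mean[choice])
        peak[choice] = max(peak[choice], draw)   # peak-style memory update
    return counts

# Example run: option "A" is objectively better (higher mean utility),
# but with low noise the agent may still lock onto "B".
rng = random.Random(42)
print(peak_softmax_choices(2.0, 1.0, 0.2, 4000, rng))
```

At low noise the remembered peaks quickly dominate the softmax, so the agent tends to lock onto whichever option first delivered a high draw, reproducing the trapping effect; at high noise the choices stay close to 50/50 regardless of the peaks.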
In a new paper, LML External Fellow Rosemary Harris and Evangelos Mitsokapas of Queen Mary University of London address this issue by extending previous work to the case in which the utility distributions for the two choices differ. Here the agent faces the risk of becoming trapped in the worse choice, thereby receiving lower expected long-term returns. Using this model, the authors investigate how different levels of noise affect the decision-making process. Among other results, they establish that, for exponential utility distributions, there is an optimal value of the noise at which the agent can always escape the trapping pitfall and make the choice that maximizes its returns. The analysis exploits a mapping between the discrete-time decision model and a random walker with a particular kind of memory, and it reveals a link between a simplified version of the model and the well-known elephant random walk; both are mathematically equivalent to Pólya urns, with the distinction that in the present case the dependence on previous steps is non-linear.
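For context, the elephant random walk mentioned above has a simple standard definition: at each step the walker recalls one of its earlier steps uniformly at random and repeats it with probability p, reversing it otherwise. A minimal sketch (parameter values here are illustrative, and this is the standard model rather than the non-linear variant in the paper):

```python
import random

def elephant_random_walk(p, steps, rng):
    """Elephant random walk on the integers.

    Each new step recalls a uniformly random earlier increment and repeats
    it with probability p, reversing it with probability 1 - p.
    Returns the final position after `steps` increments of +/-1.
    """
    increments = [1 if rng.random() < 0.5 else -1]   # first step: fair coin
    for _ in range(steps - 1):
        recalled = rng.choice(increments)            # uniform recall of the past
        increments.append(recalled if rng.random() < p else -recalled)
    return sum(increments)

# Example: a moderately persistent walker.
print(elephant_random_walk(0.6, 1000, random.Random(7)))
```

The uniform recall of past steps is what makes the walk equivalent to a Pólya urn; in the paper’s model the dependence on the remembered past is non-linear, which is what opens the door to trapping.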
A preprint of the paper is available at https://arxiv.org/pdf/2108.05918.pdf