# Maximization of expected utility, and bounded rationality

Models of bounded rationality are justified by the awareness that the assumption that agents take decisions guided by subjective probabilities equal to objective probabilities is unrealistic in many cases. In particular, 'when implemented numerically or econometrically, rational expectations models impute much more knowledge to the agents within the model ... than is possessed by an econometrician, who faces estimation and inference problems that the agents in the model have somehow solved' (Sargent, 1993, p. 3). In addition, when strategic rationality requires economic agents to be perfectly informed, they must have rational faculties that are even more unrealistic than those admitted by individual rationality.^{4} They must also foresee all the ideas that can be conceived by the other agents. Since it is quite difficult to admit that agents can conceive now what will become conceivable only in the future, it is more realistic to assume that subjective probabilities are different from objective probabilities, and that the objective probability of each possible event is unknown to the rational agent.

Sargent (1993) proposes to build models of boundedly rational agents by relaxing only the second requirement of the REH given by the mutual consistency of perceptions. The decision procedure is the MEU, but the condition in which it is applied changes since the agent's behaviour is modelled inductively (Arthur, 1994). Fully rational agents are replaced with boundedly rational agents who recognize the difficulty of knowing the distribution of objective probabilities and maximize their expected utility function by understanding the economic environment in a Bayesian context, where subjective probabilities - which may differ from individual to individual - are estimates of the objective probabilities. Under conditions of incomplete information, a rational Bayesian decision maker ascribes a subjective probability to each of the possible strategies adopted by other agents. She or he must establish subjective probabilities according to the best available information. In this way she or he maximizes expected utility and has good reasons to believe that the other agents behave in the same way.
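The Bayesian decision procedure described above can be given a minimal sketch. Everything below is hypothetical for illustration only: the two acts, the payoff table, and the subjective beliefs are invented numbers, not anything taken from the sources cited; the sketch shows only the mechanics of maximizing expected utility under subjective probabilities over the other agent's strategies.

```python
# Minimal sketch of Bayesian MEU: the decision maker attaches subjective
# probabilities to the other agent's possible strategies and picks the act
# with the highest expected utility. Payoffs and beliefs are hypothetical.

def expected_utility(act, beliefs, payoff):
    """Expected utility of `act` under subjective `beliefs`
    (a dict mapping the other agent's strategy to a probability)."""
    return sum(p * payoff[act][s] for s, p in beliefs.items())

def best_act(acts, beliefs, payoff):
    """The MEU choice: the act maximizing expected utility."""
    return max(acts, key=lambda a: expected_utility(a, beliefs, payoff))

# Hypothetical payoff table (rows: own act, columns: other's strategy).
payoff = {
    "cooperate": {"cooperate": 3, "defect": 0},
    "defect":    {"cooperate": 4, "defect": 1},
}
# Subjective beliefs, formed from the best available information.
beliefs = {"cooperate": 0.7, "defect": 0.3}

choice = best_act(payoff.keys(), beliefs, payoff)
```

With these invented numbers, the act "defect" yields the higher expected utility; changing the beliefs changes the MEU choice, which is the sense in which subjective probabilities may differ from individual to individual.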

Harsanyi, too, admits Bayesian rationality postulates, and in his RU model states that a moral action is chosen in two steps. First, all agents choose the *moral rule* or code that maximizes the expected social utility out of the set of all possible moral rules; in this step the ultimate criterion of morality is the consequentialist criterion of social welfare maximization. Second, each agent chooses a personal act consistent with the socially optimal code, and it is admitted that a code may evaluate individual acts by a non-consequentialist criterion, 'if such a moral code is judged to yield higher social utility' (Harsanyi, 1986, p. 59).

If we admit that agents can learn about probabilities, adaptive or learning models in a dynamic context deal with the formation of expectations when imperfectly informed agents are confronted with situations of which they do not know the objective probabilities. According to Pesaran (1987, p. 19), adaptation is 'a plausible "rule of thumb" for updating or revising expectations in the light of past observed expectations errors'. Agents' actions generate new information, and they adapt and learn; they can also learn from the experience of other agents.^{5} The basic idea is that, through a learning process, agents may correct subjective probabilities until they are equal to objective probabilities. In order to face the uncertainty that characterizes the economic situation considered, they try to behave like econometricians, whose task is to transform the sample information and the conceptual probability model associated with it 'into more specific knowledge about the unknown model components and parameters' (Mittelhammer, Judge and Miller, 2000, p. 6).
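The adaptive 'rule of thumb' for revising expectations in the light of past forecast errors can be sketched as follows. The revision coefficient and the numbers are hypothetical; the sketch only illustrates the standard adaptive-expectations scheme, in which each revision moves the expectation a fixed fraction of the way towards the last observation.

```python
# Sketch of the adaptive-expectations rule of thumb: the expectation is
# revised by a fraction `lam` (0 < lam <= 1) of the last observed
# forecast error. All numbers are hypothetical.

def adaptive_expectation(expectation, observed, lam=0.5):
    """Revise the expectation in the light of the last forecast error."""
    return expectation + lam * (observed - expectation)

# If the underlying value is constant, repeated revision converges to it,
# which is the sense in which subjective estimates may be corrected until
# they match the objective value.
exp_t = 0.0
for observed in [10.0] * 20:
    exp_t = adaptive_expectation(exp_t, observed, lam=0.5)
```

After twenty revisions the expectation is within a tiny fraction of the true constant value; with a changing environment, by contrast, the same rule produces persistent forecast errors.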

Learning rational agents must be able to solve models that are mathematically and econometrically more demanding than those based on substantive rationality. In order to present the logic of a learning process in a very simple way, consider two agents who interact: the policymaker and the private sector. The policymaker maximizes utility under the constraint given by the model of the economic system that represents the behaviour of the private sector. Three potentially different probabilities exist: the a priori subjective probability that the policymaker uses in order to compute its optimal policy; the a priori subjective probability that the private sector assigns to the model of the economic system; and the objective probability that governs the economic system. Assume that the private sector is well informed and therefore 'substantively rational' because its subjective probability distribution is equal to the objective probability distribution; while the policymaker behaves like an econometrician who is boundedly rational and attempts to learn about the objective probability distribution. In this situation, rationality is bounded by the lack of a commonly understood environment. The policymaker's learning process may concern: (a) given the structure of the model of the economic system, the true value of (some of) its parameters; or (b) the correct structure of the model. In particular, when the policymaker must learn about the true form of the model, we can imagine that, by using some regression technique for learning, at time *t* the policymaker myopically maximizes the social utility function under the constraint of the incorrectly estimated model, and behaves according to the myopic decision rule obtained from the model solution.
The policymaker observes the true reaction of the private sector to its policy, and at time *t* + 1, aware of its false perception of the economic system's functioning, updates its model estimate and behaves according to the estimated new decision rule (Sargent, 1993). By gaining experience, if this updating process eliminates the forecast errors, the policymaker reaches the REH equilibrium, and the learning process is fully rational. Nevertheless, when the decision maker keeps making forecasting errors, the learning process is not fully rational and the REH equilibrium is not attainable; it is therefore realistic to assume that subjective probabilities are different from objective probabilities (Ghosh and Masson, 1991, p. 466).
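The learning loop just described can be sketched in stylized form. The model below is entirely hypothetical: the private sector is assumed to respond linearly to the policy instrument with a true parameter the policymaker does not know, the policymaker sets its instrument myopically as if its current estimate were correct, and the estimate is updated by a standard recursive least-squares scheme (one of the 'regression techniques for learning' mentioned in the text, chosen here only as an example).

```python
import random

# Hypothetical set-up: the private sector truly reacts as
# y = A_TRUE * u + noise, but the boundedly rational policymaker starts
# from a wrong estimate a_hat. At each date it (i) sets policy u
# myopically from its current estimate, (ii) observes the private
# sector's true reaction y, and (iii) updates a_hat by recursive least
# squares. If forecast errors die out, the estimate converges towards
# the true parameter -- the REH equilibrium of the text.

random.seed(0)
A_TRUE, TARGET = 2.0, 1.0   # true response and the policymaker's target
a_hat, R = 0.5, 1.0         # initial (wrong) estimate and moment term

for t in range(1, 2001):
    u = TARGET / a_hat                       # myopic decision rule
    y = A_TRUE * u + random.gauss(0, 0.01)   # true reaction, with noise
    # Recursive least-squares update with decreasing gain 1/t.
    R += (u * u - R) / t
    a_hat += (u / (t * R)) * (y - a_hat * u)
```

Under these assumptions the estimate settles near the true parameter, so the forecast errors vanish on average; a misspecified model structure (case (b) in the text), by contrast, can leave persistent errors.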

A final question remains to be answered: how should we behave when it is difficult even to establish subjective probabilities? Harsanyi's (1997, p. 111) answer is that in practical decisions, if we do not know the objective probabilities, if we do not even have sufficient information for establishing the subjective probabilities, and if we have no reason for preferring one alternative to another, then it is reasonable to act as if their probabilities are equal. This means resorting to the Bernoullian principle of indifference, the application of which is analysed by Kyburg (s. 2.2.3), Levi (ss. 3.4 and 3.7), Fano (s. 4.5), Costantini and Garibaldi (s. 8.7), and Kregel and Nasica (s. 11.3.2) in this volume.
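The operational content of the principle of indifference is easily sketched. With no grounds for preferring any of the *n* possible states, each receives probability 1/*n*, so maximizing expected utility reduces to choosing the act with the highest average payoff; the payoff table below is hypothetical.

```python
# Sketch of the Bernoullian principle of indifference: with no reason to
# prefer any state, assign each of the n states probability 1/n, so MEU
# reduces to picking the act with the highest average payoff.
# The payoff table is hypothetical.

payoff = {
    "act_a": [4, 0, 2],   # payoffs of act_a in the three possible states
    "act_b": [1, 3, 1],   # payoffs of act_b in the same states
}
n = 3  # number of equiprobable states
expected = {act: sum(values) / n for act, values in payoff.items()}
choice = max(expected, key=expected.get)
```

With these invented numbers the first act has the higher average payoff and is therefore the choice under indifference; any other prior over the states could reverse it, which is why the principle's applicability is debated in the chapters cited above.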