
Documat


Abstract of Equilibrium Properties of Reinforcement Learning by Imitation

Antonio J. Morales Siles

  • We investigate the ability of pure strategy imitation rules to lead the decision maker towards optimal actions when repeatedly facing a decision problem.

    We study the performance of a given imitation rule in two different frameworks: an arbitrary environment, i.e. one in which the population behaviour is given and fixed, and an environment in which all other agents use this rule. In the first framework, we show that there are no approximately maximising rules, i.e. rules which lead the decision maker to play, in the long run, the expected payoff maximising action with probability arbitrarily close to one, independently of the true payoff distribution and of the given and fixed population behaviour. In the second framework, we find that there are no maximising rules, i.e. rules that lead the entire population to play the expected payoff maximising strategy.

    In the absence of a maximising rule, we investigate individual agents' incentives to adopt any particular rule. We define an appropriate payoff function and, for a given population and a fixed imitation rule, we define the set of imitation rules which are best responses to that fixed rule. An imitation rule is called an equilibrium rule if it belongs to the set of best responses to itself. We characterise the set of equilibrium rules in some restrictive settings.
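
    The equilibrium condition described above can be sketched in symbols. The notation is an assumption introduced here for illustration and is not taken from the thesis: \mathcal{F} stands for the set of admissible imitation rules, \Pi(F', F) for the payoff to an agent using rule F' when the rest of the population uses rule F, and BR(F) for the resulting set of best responses.

    ```latex
    % Hypothetical notation: \mathcal{F}, \Pi and BR are illustrative symbols,
    % not the author's own.
    \[
      BR(F) \;=\; \arg\max_{F' \in \mathcal{F}} \Pi(F', F),
      \qquad
      F^{*} \text{ is an equilibrium rule}
      \;\iff\;
      F^{*} \in BR(F^{*}).
    \]
    ```

    Read this way, an equilibrium rule is a fixed point of the best-response correspondence: no agent gains by switching away from the rule when everyone else uses it.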

