This research project explores the hypothesis that, given a bounded number of steps in an environment, agents that most efficiently optimize their model of the environment are more likely to induce emergent intelligent behavior in a reward-free scenario. We refer to this as the optimal explorer hypothesis. The project aims to formalize and analyze this hypothesis, investigating its theoretical implications and connections to related areas such as open-ended learning and active inference. Building on this foundation, we will develop a practical implementation of an approximate “optimal explorer” agent by formulating it as a combinatorial optimization problem and leveraging established methods from the field. Finally, we will conduct extensive experiments to evaluate whether the proposed agent induces emergent behaviors in diverse and challenging environments.