Publication: Aprendizaje automático en conjuntos de clasificadores heterogéneos y modelado de agentes
Identifiers
Publication date
2004
Defense date
2004-12-17
Abstract
One of the fastest-growing areas of machine learning in recent years is the one
in which the decisions of individual classifiers are combined, so that the final
decision about which class an example belongs to is made by an ensemble of
classifiers. There are various techniques for generating ensembles of
classifiers, ranging from manipulating the input data to using meta-learning.
One way to categorize these techniques is by the number of different learning
algorithms they use to generate the members of the ensemble. Techniques that
use a single algorithm to generate all the members are said to generate a
homogeneous ensemble. Techniques that use more than one algorithm to generate
the classifiers are considered to generate a heterogeneous ensemble. Among the
algorithms for generating heterogeneous ensembles is Stacking, which, besides
generating the classifiers of the ensemble from different learning algorithms,
uses two levels of learning. The first level of learning, or level-0, uses the
domain data directly, whereas the meta-level, or level-1, uses data generated
from the level-0 classifiers.
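The two learning levels described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the configuration used in the thesis: the two level-0 learners (`train_1nn`, `train_stump`), the table-based level-1 learner (`train_table`), the binary 0/1 labels, and the use of cross-validation to build the level-1 data are all choices made here for the example.

```python
from collections import Counter

def train_1nn(X, y):
    """Level-0 learner (assumed): 1-nearest-neighbour on squared distance."""
    def predict(x):
        dists = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in X]
        return y[dists.index(min(dists))]
    return predict

def train_stump(X, y):
    """Level-0 learner (assumed): best single-feature threshold, labels 0/1."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({xi[f] for xi in X}):
            for lo, hi in ((0, 1), (1, 0)):
                acc = sum((hi if xi[f] > t else lo) == yi for xi, yi in zip(X, y))
                if best is None or acc > best[0]:
                    best = (acc, f, t, lo, hi)
    _, f, t, lo, hi = best
    return lambda x: hi if x[f] > t else lo

def train_table(M, y):
    """Level-1 learner (assumed): majority label per tuple of level-0 outputs."""
    votes = {}
    for m, yi in zip(M, y):
        votes.setdefault(tuple(m), Counter())[yi] += 1
    default = Counter(y).most_common(1)[0][0]
    def predict(m):
        key = tuple(m)
        return votes[key].most_common(1)[0][0] if key in votes else default
    return predict

def train_stacking(X, y, level0_trainers, folds=3):
    # Level-1 data: every example is described by the predictions of
    # level-0 models trained on the folds that do not contain it.
    meta = [None] * len(X)
    for k in range(folds):
        train_idx = [i for i in range(len(X)) if i % folds != k]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        models = [train(X_tr, y_tr) for train in level0_trainers]
        for i in range(k, len(X), folds):
            meta[i] = [m(X[i]) for m in models]
    level1 = train_table(meta, y)
    # The final level-0 models are retrained on all the domain data.
    level0 = [train(X, y) for train in level0_trainers]
    return lambda x: level1([m(x) for m in level0])

# Toy one-dimensional domain: class 0 for small values, class 1 for large ones.
X = [(i,) for i in range(10)]
y = [0] * 5 + [1] * 5
predict = train_stacking(X, y, [train_1nn, train_stump])
print(predict((0,)), predict((9,)))
```

The level-0 learners see the domain attributes directly, while the level-1 learner only ever sees the vector of level-0 predictions, which is the distinction the abstract draws between the two levels.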
A problem inherent to Stacking is determining the configuration of the
algorithm's learning parameters, among them which and how many algorithms
should be used to generate the classifiers of the ensemble. Previous work has
determined that no exact number of algorithms is optimal for all domains. Nor
is it clearly defined which algorithms should be used, although some works use
algorithms representative of each type.
One of the goals of this doctoral thesis is to use genetic algorithms as an
optimization technique to determine the algorithms that should be used to
generate the ensemble of classifiers, as well as the configuration of their
learning parameters. In this way the proposed method is domain independent,
whereas the Stacking parameter configuration it finds depends on the domain.
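As a rough illustration of the idea, a genetic algorithm can search over bit masks in which each bit says whether a candidate learning algorithm takes part in the ensemble. Everything specific below is an assumption made for the sketch: the bit-mask encoding, the pool of algorithm names, the operators (tournament selection, one-point crossover, bit-flip mutation, elitism), and above all `toy_fitness`, a synthetic placeholder standing in for the estimated accuracy of a Stacking ensemble on a given domain. The abstract does not specify how the thesis encodes or evaluates individuals.

```python
import random

# Hypothetical pool of candidate level-0 algorithms (names are illustrative).
ALGORITHMS = ["naive_bayes", "decision_tree", "knn",
              "rule_learner", "svm", "neural_net"]

def toy_fitness(mask):
    """Synthetic placeholder fitness, NOT a real measurement.

    In practice this would be replaced by the estimated accuracy of the
    Stacking ensemble built from the algorithms selected by `mask`.
    """
    useful = {0, 2, 3}  # arbitrary choice for the toy problem
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in useful)
    extras = sum(1 for i, bit in enumerate(mask) if bit and i not in useful)
    return hits - 0.5 * extras

def evolve(fitness, n_bits, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Minimal generational GA over fixed-length bit masks."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: the best individual survives unchanged.
        next_pop = [max(pop, key=fitness)[:]]
        while len(next_pop) < pop_size:
            # Binary tournament selection for each parent.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randrange(1, n_bits)
            child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < p_mut else b for b in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    return best, fitness(best)

best, score = evolve(toy_fitness, len(ALGORITHMS))
selected = [name for name, bit in zip(ALGORITHMS, best) if bit]
print(selected, score)
```

The search procedure itself never looks at the domain; only the fitness function does. That separation is one way to read the claim that the method is domain independent while the configuration it finds is domain dependent.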
The growth of electronic commerce and of applications on the World-Wide-Web has
motivated an increase in environments in which agents take part. These
environments include competitive and/or collaborative situations in which
knowledge about the individuals involved in the environment provides a clear
advantage when deciding which action to carry out. There are various ways of
acquiring this knowledge. One of them is by modeling the behavior of the
agents.
In turn, there are various ways of building a model of an agent. Some
techniques use previously built models and aim to match the observed behavior
with an existing model. Other techniques assume that the agent to be modeled
behaves optimally in order to create a model of its behavior.
A second goal of this doctoral thesis is to create a general framework for
agent modeling based on observing the behavior of the agent to be modeled. To
that end, it proposes using machine learning techniques to carry out the
modeling task based on the relationship between the agent's input and output.
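A minimal sketch of modeling an agent from its input/output relationship follows. The agent is treated as a black box, its (input, action) pairs are logged, and a classifier is trained on that log. The agent `observed_agent`, its percept format, its action names, and the frequency-table learner are all hypothetical and chosen only to keep the example short. They are not taken from the thesis, and any classifier could replace the table.

```python
from collections import Counter, defaultdict

def observed_agent(percept):
    """Hypothetical agent to be modeled; the modeler only sees its outputs."""
    distance, has_ball = percept
    if has_ball:
        return "shoot" if distance < 3 else "advance"
    return "wait"

def learn_model(trace):
    """Learn a model of the agent from logged (input, action) pairs.

    The learner here is a frequency table: for each observed input it
    predicts the action seen most often, and for unseen inputs it falls
    back to the most frequent action overall.
    """
    table = defaultdict(Counter)
    for percept, action in trace:
        table[percept][action] += 1
    default = Counter(action for _, action in trace).most_common(1)[0][0]

    def model(percept):
        if percept in table:
            return table[percept].most_common(1)[0][0]
        return default
    return model

# Observe the agent on a set of inputs and record what it does.
percepts = [(d, b) for d in range(6) for b in (False, True)]
trace = [(p, observed_agent(p)) for p in percepts]

model = learn_model(trace)
print(model((1, True)), model((5, True)), model((2, False)))
```

Because the model is built only from observed behavior, it needs neither a library of predefined models nor the assumption that the agent acts optimally.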
____________________________________________
In recent years, one of the most active research areas in Machine Learning has been that of ensembles of classifiers. Their purpose is to combine the decisions of individual classifiers so that all classifiers in the ensemble are taken into account when classifying new instances. There are many techniques that generate such ensembles. Some manipulate the input data, while others use meta-learning. In general, ensembles can be homogeneous or heterogeneous. Homogeneous ensembles consist of several classifiers generated by the same learning technique, whereas heterogeneous ensembles contain classifiers generated by different algorithms. A well-known approach to generating heterogeneous ensembles is Stacking. Stacking uses two levels of learning. The first learning level, or level-0, uses data directly from the domain, whereas the meta-level, or level-1, uses data generated by the level-0 classifiers.
A problem inherent to Stacking is determining the right configuration of the learning parameters, such as how many classifiers, and which learning algorithms, must be used to generate the ensemble of classifiers. Previous work has shown that no single choice is optimal for all domains, although some works use representative algorithms of each type.
One goal of this thesis is to use Genetic Algorithms as an optimization technique to determine the type and number of algorithms to be used to generate the ensemble of classifiers, as well as the configuration of the learning parameters of these algorithms. The proposed method is domain independent, and the Genetic Algorithm is able to adapt to particular domains.
The growth of e-commerce and of applications over the World-Wide-Web has motivated an increase in environments where agents can interact. These environments include competitive and/or collaborative situations where knowledge about the other individuals involved in the environment provides a clear advantage when making decisions about which actions to perform. There are several ways to acquire this knowledge. One of them is by modeling the behavior of other agents.
There are several ways to construct an agent's model. Some techniques use previously constructed models, and their goal is to match the observed behavior with an existing model. Other techniques assume that the agent to be modeled carries out an optimal strategy in order to create a model of its behavior.
As a second goal, this thesis models agents by observing the behavior of other agents. To do this, it proposes a general framework that uses machine learning techniques for agent modeling.
Description
Keywords
Artificial intelligence, Learning, Genetic algorithms