Fairness and Robustness in Machine Learning

  • Author: Ashneet Khandpur Singh
  • Thesis supervisors: Josep Domingo Ferrer, Alberto Blanco Justicia
  • Defense: Universitat Rovira i Virgili (Spain), 2023
  • Language: English
  • Number of pages: 118
  • Thesis committee: Pino Caballero Gil (chair), Maria Bras Amorós (secretary), Javier Parra Arnau (member)
  • Links
    • Open access thesis at: TDX
  • Abstract
    • Spanish

      Machine learning models learn from data to model concrete environments and problems and to predict future events, but if the data are biased, they will yield biased predictions. It is therefore essential to ensure that their predictions are fair and not based on discrimination against specific groups or communities. Federated learning, a form of distributed machine learning, must be equipped with techniques to address this great interdisciplinary challenge. Although federated learning offers stronger privacy guarantees to the participating clients than centralized learning, it is vulnerable to attacks in which malicious clients submit bad updates to prevent the model from converging or, more subtly, to introduce artificial biases into its predictions or decisions (poisoning). A downside of anti-poisoning techniques is that they may lead to discrimination against minority groups whose data are significantly and legitimately different from those of the majority of clients. In this work, we strive to strike a balance between fighting poisoning and accommodating diversity, in order to contribute to fairer and less discriminatory learning of federated learning models. In this way, we avoid the exclusion of diverse clients while still ensuring the detection of poisoning attacks. On the other hand, in order to develop fair models and verify the fairness of these models in the area of ML, we propose a method, based on counterfactual examples, that detects any bias in the machine learning model, regardless of the type of data used in the model.

    • Catalan

      Machine learning models learn from these data to model concrete environments and problems and to predict future events, but if the data contain biases, they will yield biased predictions and conclusions. It is therefore essential to ensure that their predictions are fair and not based on discrimination against specific groups or communities. Federated learning, a form of distributed machine learning, needs to be equipped with techniques to face this great interdisciplinary challenge. Federated learning provides better privacy guarantees to the participating clients than centralized learning. Even so, federated learning is vulnerable to attacks in which malicious clients submit incorrect updates in order to prevent the model from converging or, more subtly, to introduce arbitrary biases into the models' predictions or decisions (poisoning). A downside of these anti-poisoning techniques is that they could lead to discrimination against minority groups whose data are significantly and legitimately different from those of the majority of clients. In this work, we strive to strike a balance between fighting poisoning attacks and accommodating diversity, all in order to help learn fairer and less discriminatory federated learning models. In this way, we avoid the exclusion of legitimate minority clients while still ensuring the detection of poisoning attacks. On the other hand, in order to develop fair models and verify their fairness in the area of machine learning, we propose a method based on counterfactual examples that detects any bias in the ML model, regardless of the type of data used in the model.

    • English

      The rise of the IoT and other distributed environments is causing an increase in the number of devices that constantly collect and exchange data. Machine learning models learn from these data to model concrete environments and problems and to predict future events but, if the data are biased, they may reach biased conclusions.

      Such models can be used to make essential and life-changing decisions in a variety of sensitive contexts.

      Therefore, it is critical to make sure their predictions are fair and not based on discrimination against specific groups or communities, like those of a particular race, gender, or sexual orientation.

      Federated learning (FL), a type of distributed machine learning, has become one of the foundations of next-generation AI in distributed settings and needs to be equipped with techniques to tackle this grand, interdisciplinary challenge.

      Even though FL provides stronger privacy guarantees to the participating clients than centralized learning, in which the clients' raw data are collected in a central server, it is vulnerable to attacks whereby malicious clients submit bad updates in order to prevent the model from converging or, more subtly, to introduce artificial biases into the model's predictions or decisions (poisoning).

      Poisoning detection techniques compute statistics on the updates sent by participants to identify malicious clients.
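
      As an illustration of such a statistic-based filter, the minimal sketch below flags clients whose update lies unusually far from the coordinate-wise median of all updates, using a MAD-based cut-off; the distance measure, the threshold rule and the toy data are assumptions for the example, not the detector studied in the thesis.

        import numpy as np

        def flag_suspicious_updates(updates, k=3.0):
            # Flag client updates whose distance to the coordinate-wise
            # median of all updates is unusually large (illustrative rule).
            updates = np.asarray(updates)           # shape: (n_clients, n_params)
            center = np.median(updates, axis=0)     # robust central update
            dists = np.linalg.norm(updates - center, axis=1)
            mad = np.median(np.abs(dists - np.median(dists))) + 1e-12
            threshold = np.median(dists) + k * mad  # assumed cut-off rule
            return dists > threshold                # True = suspected poisoner

        # Toy run: 9 honest clients plus one sign-flipping attacker.
        rng = np.random.default_rng(0)
        honest = rng.normal(0.1, 0.02, size=(9, 50))
        attacker = -10 * honest.mean(axis=0, keepdims=True)
        print(flag_suspicious_updates(np.vstack([honest, attacker])))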

      A downside of anti-poisoning techniques is that they might lead to discriminating against minority groups whose data are significantly and legitimately different from those of the majority of clients.

      This would not only be unfair but would yield poorer models that would fail to capture the knowledge in the training data, especially when data are not independent and identically distributed.

      In this work, we strive to strike a balance between fighting poisoning and accommodating diversity to help learn fairer and less discriminatory federated learning models.

      In this way, we forestall the exclusion of diverse clients while still ensuring the detection of poisoning attacks.

      Additionally, we explore the impact of our proposal on the performance of models trained on non-i.i.d. local data.

      On the other hand, in order to develop fair models and verify their fairness in the area of machine learning, we propose a method, based on counterfactual examples, that detects any bias in the ML model, regardless of the data type used in the model.

      Objectives: Our contributions are mechanisms to reconcile security with fairness in FL on non-i.i.d. data, and a method, based on counterfactual examples, that detects any bias in the ML model.

      - We propose three methods to distinguish members of minority groups from attackers: a first method based on microaggregation, a second one that uses a Gaussian mixture model (GMM), and a third one based on DBSCAN (see the sketch after this list).

      - We propose a method, based on counterfactual examples (CEs), that detects bias regardless of the data type, in particular for image and tabular data.
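
      A minimal sketch of the density-based (DBSCAN) idea, with illustrative parameters and toy updates rather than the configuration used in the thesis: clients falling in any dense cluster, including a small but legitimate minority cluster, are kept, while isolated updates are treated as suspect.

        import numpy as np
        from sklearn.cluster import DBSCAN

        def split_clients_by_density(updates, eps=0.5, min_samples=3):
            # Cluster flattened client updates; clients in any dense cluster
            # (majority or legitimate minority) are kept, while isolated updates
            # (DBSCAN label -1) are treated as suspect. Parameters are illustrative.
            labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(updates)
            return np.where(labels != -1)[0], np.where(labels == -1)[0]

        # Toy run: a majority group, a small legitimate minority group,
        # and two isolated (potentially poisoned) updates.
        rng = np.random.default_rng(1)
        majority = rng.normal(0.0, 0.05, size=(20, 10))
        minority = rng.normal(1.0, 0.05, size=(4, 10))
        poisoned = rng.normal(5.0, 2.0, size=(2, 10))
        kept, suspect = split_clients_by_density(np.vstack([majority, minority, poisoned]))
        print("kept:", kept, "suspect:", suspect)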

      Material and methods: For the realization of this work, Python has been used as the programming language, in two interactive environments: Jupyter Notebook and Google Colaboratory. Several Python libraries and modules, such as pandas, numpy, matplotlib and scikit-learn, have been used at different stages of this work. Furthermore, for the experimental part, some standard machine learning data sets have been used.

      We conducted experiments to examine the effectiveness of our proposed mechanisms in FL with minority groups and non-i.i.d. data. To that end, we chose three publicly available data sets, namely (i) the Adult Income data set, (ii) the Athletes data set, and (iii) the Bank Marketing data set.
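
      The experimental setup can be reproduced along these lines; the sketch below fetches the Adult Income data set from OpenML (one possible source) and trains a plain logistic-regression baseline, whereas the exact preprocessing and models used in the thesis may differ.

        from sklearn.compose import ColumnTransformer
        from sklearn.datasets import fetch_openml
        from sklearn.impute import SimpleImputer
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import train_test_split
        from sklearn.pipeline import Pipeline
        from sklearn.preprocessing import OneHotEncoder, StandardScaler

        # Fetch the Adult Income data set (assumed source: OpenML).
        X, y = fetch_openml("adult", version=2, as_frame=True, return_X_y=True)
        num_cols = list(X.select_dtypes("number").columns)
        cat_cols = [c for c in X.columns if c not in num_cols]

        model = Pipeline([
            ("prep", ColumnTransformer([
                ("num", Pipeline([("imp", SimpleImputer(strategy="median")),
                                  ("sc", StandardScaler())]), num_cols),
                ("cat", Pipeline([("imp", SimpleImputer(strategy="most_frequent")),
                                  ("oh", OneHotEncoder(handle_unknown="ignore"))]), cat_cols),
            ])),
            ("clf", LogisticRegression(max_iter=1000)),
        ])

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
        model.fit(X_tr, y_tr)
        print("baseline test accuracy:", model.score(X_te, y_te))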

      We evaluated the performance of the proposed approach on two ML tasks: tabular data classification (on the Adult data set) and image classification (on the CelebA data set). For each task, we trained a baseline model on the original data set and a biased model after introducing some alterations into the data set. For both data sets, the baseline and biased models had the same architecture.

      Conclusions: In this work we have developed methods for fair and robust ML, aiming to ensure that no minority in the data set is unfairly impacted by the model's predictions.

      We have dealt with the problem of distinguishing abnormal/malicious behaviors from legitimate ones in federated learning. We focus on scenarios with clients having legitimate minority data, whose updates are likely to be classified as outlying/malicious by the standard attack detection mechanisms proposed in the literature. To make progress towards fair attack detection, we propose three different methods, one based on microaggregation, another based on the Gaussian mixture model and the third one based on DBSCAN.
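
      A minimal sketch of the Gaussian-mixture idea, on 2-D per-client summaries and with an assumed likelihood cut-off (neither taken from the thesis): clients that fit some mixture component reasonably well are kept, while clients with very low likelihood under the fitted mixture are treated as suspect.

        import numpy as np
        from sklearn.mixture import GaussianMixture

        def gmm_filter(summaries, n_components=2, quantile=0.05):
            # Fit a GMM to per-client summaries; clients whose log-likelihood
            # under the fitted mixture is very low are suspect (assumed rule).
            gmm = GaussianMixture(n_components=n_components, random_state=0).fit(summaries)
            scores = gmm.score_samples(summaries)
            return scores >= np.quantile(scores, quantile)  # True = keep client

        # Toy run with 2-D summaries (e.g. update norm, distance to the mean update).
        rng = np.random.default_rng(2)
        majority = rng.normal([1.0, 0.2], 0.05, size=(20, 2))
        minority = rng.normal([2.0, 1.0], 0.05, size=(5, 2))
        attackers = rng.normal([8.0, 6.0], 3.0, size=(2, 2))
        print(gmm_filter(np.vstack([majority, minority, attackers])))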

      To measure fairness in ML models in general, we propose a method based on generating counterfactual examples. For tabular data, we create these counterfactual examples using adversarial examples: scenarios that resemble real-life situations but are perturbed just enough to force the model into an incorrect prediction. They are also useful for testing the robustness of our models. For image data, we generate counterfactual examples with GANs, which likewise contribute to model robustness. The results show that the biased models were indeed biased against the targeted individuals in the data sets.
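
      For tabular data, the adversarial-example idea can be illustrated with a linear model: take a small FGSM-style step against the model's current decision while holding the sensitive attribute fixed, and compare how often this flips the prediction across groups. The model, step size and synthetic data below are assumptions for the illustration, not the counterfactual generator developed in the thesis.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def fgsm_counterfactual(model, x, eps=0.5, frozen=(0,)):
            # FGSM-style step for a linear model: move every feature against the
            # current decision, keeping 'frozen' features (here the sensitive
            # attribute at index 0) unchanged. Purely illustrative.
            step = np.sign(model.coef_[0]) * (-1 if model.decision_function([x])[0] > 0 else 1)
            step[list(frozen)] = 0.0
            return x + eps * step

        # Toy data: a binary sensitive attribute followed by two numeric features.
        rng = np.random.default_rng(3)
        X = np.column_stack([rng.integers(0, 2, 500), rng.normal(size=(500, 2))])
        y = (X[:, 1] + 0.8 * X[:, 0] + 0.1 * rng.normal(size=500) > 0).astype(int)
        clf = LogisticRegression().fit(X, y)

        # Bias probe: large gaps in flip rates between groups hint at unequal treatment.
        for g in (0, 1):
            Xg = X[X[:, 0] == g]
            flips = np.mean([clf.predict([fgsm_counterfactual(clf, x)])[0] != clf.predict([x])[0]
                             for x in Xg])
            print(f"group {g}: counterfactual flip rate = {flips:.2f}")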

