Summary of Strategies for the generation and exploitation of synthetic training data in pixel-level computer vision

Javier Montalvo Rodrigo

In recent years, the adoption of machine learning in computer vision has increased the demand for training data. Labeling data can be expensive, particularly for tasks such as semantic segmentation, where every pixel of an image carries a class label. Synthetic data therefore plays a crucial role in the field: it offers data at scale, with automatic, perfect labels and without the costs of data collection and annotation. This lets researchers and developers create large-scale, precisely annotated datasets that cover a wide range of scenarios, including rare or dangerous situations that would be impractical to capture otherwise. Synthetic data also helps mitigate privacy concerns, as it does not involve real individuals or sensitive information. However, synthetic data has notable drawbacks: reduced variability compared to the real world, along with inherent differences between synthetic and real images, can substantially reduce the transferability and generalization of models trained exclusively on synthetic data. To address these challenges, this thesis explores different approaches to synthetic data generation and its applications in computer vision. It examines techniques for creating synthetic datasets, starting with hand-crafted algorithms that generate synthetic data for semantic segmentation. It then explores how to adapt and exploit simulation tools for several purposes: generating a novel semantic segmentation dataset, discussing alternative data generation methods that can produce more general models, and introducing a novel simulation tool designed from scratch to generate spacecraft imagery with navigation tasks in mind.
The work closes with a proposal on how to use generative artificial intelligence effectively to complete datasets for training computer vision models with novel, previously unseen classes, and investigates how these methods can be integrated into existing workflows to enhance model performance and generalization.

