Documat


Summary of Variational and deep learning methods in computer vision

Iván Ramírez Díaz

  • In this thesis we address Computer Vision problems in real scenarios from two perspectives: (1) Variational Methods and (2) Deep Learning techniques. The former is a powerful tool that gives extraordinary control over the expected outcomes, with very accurate results if the hyper-parameterization is carried out properly. However, this required (usually manual) hyper-parameterization is a major shortcoming in practice and limits wide use by non-experts. The latter relies mainly on data and solves, up to a certain point, a high-dimensional interpolation problem with astonishing results that, however, are sometimes unpredictable (and thus dangerous) when tested on unseen data from a different distribution (extrapolation).

    To this end:

    1. We start by using Variational Methods to solve a saliency detection problem that leads to a nearly binary image (segmentation) through a novel non-local, non-convex variational model. This method is applied to Magnetic Resonance images, where the goal is to detect and segment tumoral tissues.

    Then, the hyper-parameterization of this model is addressed using Deep Learning. To this end, the numerical resolution is re-interpreted as extra layers embedded in a global Neural Network architecture. This constitutes a first attempt to combine Variational Methods with Deep Learning.

    Consequently, the drawbacks of one technique can be circumvented by the strengths of the other.

    2. Then, a Deep Learning method is applied to an image classification task: automatically recognizing public dumpsters. The difficulty of such a task, outside a theoretical or laboratory setting, is the lack of data. Deep Learning is data hungry. Acquiring all these data is often an expensive investment out of reach for many companies or public entities, even more so when the data must be structured and labeled for supervised learning. We propose a semi-automatic method for selecting appropriate candidate images to support the manual labelling procedure and reach top performance results. We also show that prediction uncertainty can be addressed in order to improve the robustness of such Deep Neural Networks.

    3. With the experience acquired in Variational and Deep Learning methods, we derive both techniques from a more general framework known as Bayesian Inference, showing that the majority of novel techniques that succeed in one domain can be explained through this perspective. In fact, Variational Methods and Deep Learning (or Machine Learning) are two sides of the same coin. To test the power of such a general methodology, we consider a very ill-posed problem: 3D Human Pose Estimation from 2D Images. Using recent Deep Learning architectures such as Capsule Networks and novel approaches such as Bayesian Deep Learning, we propose a simple end-to-end Bayesian Capsule Network. This proposal makes use of Deep Learning techniques and Variational Inference to reach state-of-the-art results while keeping a general-purpose approach.

    Variational and Deep Learning methods are shown to be very powerful, high-performing tools. However, both have several drawbacks that limit their usage. In the case of Variational Methods, despite the very accurate results they provide, the need for an optimal hyper-parameterization to achieve that performance makes them impractical. On the other hand, Deep Learning methods avoid this inconvenience by relying on data, but sacrifice robustness. The general results show that, by combining both methods, it is possible to keep accurate predictions and robustness. Finally, as a consequence of our research and results, we conclude that in the future, Variational and Deep Learning Methods in Computer Vision are condemned to get along. The same applies to experts in both fields.
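
    As an illustration of the "two sides of the same coin" claim, the following is a minimal sketch of the standard textbook link between Bayesian inference and both families of methods. It is not taken from the thesis: the symbols (observed image f, unknown image u, regularizer R, noise level σ, weight λ, network f_θ with parameters θ, training pairs (x_i, y_i)) and the Gaussian noise and Gibbs prior assumptions are introduced here only for illustration.

```latex
% Assumed setting (illustrative, not the thesis's model):
% Gaussian likelihood and a Gibbs prior with regularizer R.
\begin{align}
  p(u \mid f) &\propto p(f \mid u)\, p(u), \\
  p(f \mid u) &\propto \exp\!\Big(-\tfrac{1}{2\sigma^{2}} \lVert u - f \rVert_2^{2}\Big),
  \qquad
  p(u) \propto \exp\!\big(-\lambda R(u)\big), \\
  \hat{u}_{\mathrm{MAP}}
    &= \arg\max_{u}\, p(u \mid f)
     = \arg\min_{u}\, \Big\{ \tfrac{1}{2\sigma^{2}} \lVert u - f \rVert_2^{2} + \lambda R(u) \Big\}.
\end{align}
% The last line is a variational energy: data fidelity plus regularization.
% The same argument applied to network parameters \theta, with a Gaussian
% likelihood on the outputs and a Gaussian prior on \theta, gives a
% squared-error training loss with weight decay:
\begin{equation}
  \hat{\theta}_{\mathrm{MAP}}
    = \arg\min_{\theta}\, \Big\{ \tfrac{1}{2\sigma^{2}} \sum_{i} \lVert f_{\theta}(x_i) - y_i \rVert_2^{2}
      + \tfrac{\lambda}{2} \lVert \theta \rVert_2^{2} \Big\}.
\end{equation}
```

    Under these assumptions, the hyper-parameters of the variational model (σ and λ) play the role of noise and prior parameters, which is one way to read why choosing them well matters so much.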