This thesis designs and implements efficient deep learning methods to solve classification and segmentation problems in two major areas of the health informatics domain, namely pervasive sensing and medical imaging. In the area of pervasive sensing, the thesis focuses on food and food-related scene classification for health and nutrition analysis. Recent studies show that knowing “where we eat?” and “what we eat?” is important for properly monitoring our health. To address these two questions, deep learning models are employed to classify food and food-related places. Moreover, the entire experimental environment (e.g. creating new datasets, selecting models and optimizing parameters) is prepared for this research.
1. To address the first question, “where we eat?”, a new dataset named ``FoodPlaces'' is built, consisting of 35 classes of food-related places gathered from different public datasets. Several state-of-the-art convolutional neural network (CNN) models are then evaluated on this dataset by fine-tuning their parameters with transfer learning. Inspired by the outcomes of this first analysis, a second dataset named ``EgoFoodPlaces'' is collected with a wearable camera; it covers 22 food-related places frequently visited by the camera wearer (referred to as the ``first person''). A new architecture based on multi-scale atrous convolutional networks, named ``MACNet'', is then designed and evaluated for image-level classification on this dataset, achieving comparable accuracy across all classes, where each class refers to a different food place, such as a bar, coffee shop or restaurant (a rough sketch of the multi-scale atrous idea is given after this list). To exploit the temporal information and the correlation between frames captured by the egocentric camera, the problem is redefined over temporal intervals (periods of stay), each of which is split into a set of events, i.e. sequences of correlated frames. A novel attention-based deep network, named ``MACNet+SA'', is then introduced, combining the previously defined ``MACNet'' model with a self-attention mechanism to improve the classification of food places. ``MACNet+SA'' sets a state-of-the-art result for event-level classification of egocentric photo-streams on the ``EgoFoodPlaces'' dataset.
2. To deal with the second question, “what we eat?”, another new dataset with food attributes, called ``Yummly48K'', is developed with the aim of analyzing food nutrition by classifying cuisine and food flavour. A multi-scale convolutional network named ``CuisineNet'' is then presented; it aggregates convolution layers with various kernel sizes, followed by residual and pyramid pooling modules and two fully connected pathways. This model is introduced to solve the multi-modal classification problem of cuisine and flavour jointly (a simplified two-headed sketch is given after this list).
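As a rough illustration of the multi-scale atrous idea underlying ``MACNet'' and of the self-attention step added in ``MACNet+SA'', the following PyTorch sketch convolves the same input at several dilation rates and then applies off-the-shelf multi-head attention over per-frame embeddings of one event. The class names, dilation rates, channel sizes and embedding dimension are illustrative assumptions, not the values used in the thesis.

```python
import torch
import torch.nn as nn

class MultiScaleAtrousBlock(nn.Module):
    """Minimal sketch: parallel atrous (dilated) 3x3 convolutions whose
    outputs are concatenated and fused, so the block sees several scales."""

    def __init__(self, in_ch, out_ch, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3,
                          padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])
        # 1x1 convolution to fuse the concatenated multi-scale features
        self.fuse = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1)

    def forward(self, x):
        feats = [branch(x) for branch in self.branches]
        return self.fuse(torch.cat(feats, dim=1))

x = torch.randn(1, 3, 224, 224)                 # one egocentric frame
print(MultiScaleAtrousBlock(3, 32)(x).shape)    # torch.Size([1, 32, 224, 224])

# Event-level idea behind ``MACNet+SA'': self-attention over the per-frame
# embeddings of one event; nn.MultiheadAttention is only a stand-in for the
# self-attention mechanism described in the thesis.
frames = torch.randn(1, 10, 128)                # (batch, frames per event, embedding)
attn = nn.MultiheadAttention(embed_dim=128, num_heads=4, batch_first=True)
context, _ = attn(frames, frames, frames)       # temporally contextualised features
```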
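Likewise, a minimal sketch of the two-headed design described for ``CuisineNet'': parallel convolution stems with different kernel sizes feed a shared pooled feature that is routed to two fully connected heads, one for cuisine and one for flavour. All layer sizes, class counts and the use of plain average pooling (instead of the residual and pyramid pooling modules) are simplifying assumptions.

```python
import torch
import torch.nn as nn

class TwoHeadFoodNet(nn.Module):
    """Illustrative multi-scale, two-headed classifier (not the thesis model)."""

    def __init__(self, n_cuisines=16, n_flavours=8):
        super().__init__()
        # Parallel convolution stems with different kernel sizes (multi-scale)
        self.stems = nn.ModuleList([
            nn.Sequential(nn.Conv2d(3, 16, k, padding=k // 2),
                          nn.BatchNorm2d(16), nn.ReLU(inplace=True))
            for k in (3, 5, 7)
        ])
        self.pool = nn.AdaptiveAvgPool2d(1)              # stand-in for pyramid pooling
        self.cuisine_head = nn.Linear(48, n_cuisines)    # fully connected pathway 1
        self.flavour_head = nn.Linear(48, n_flavours)    # fully connected pathway 2

    def forward(self, x):
        feats = torch.cat([stem(x) for stem in self.stems], dim=1)
        z = self.pool(feats).flatten(1)
        return self.cuisine_head(z), self.flavour_head(z)

model = TwoHeadFoodNet()
img = torch.randn(2, 3, 224, 224)
cuisine_logits, flavour_logits = model(img)
# Joint training signal: sum of the two classification losses (dummy labels here)
ce = nn.CrossEntropyLoss()
loss = (ce(cuisine_logits, torch.randint(0, 16, (2,)))
        + ce(flavour_logits, torch.randint(0, 8, (2,))))
```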
In the field of medical imaging, this thesis targets the skin lesion segmentation problem in dermoscopic images. Two novel deep learning models are introduced to accurately segment skin lesions.
1. Firstly, a robust deep learning model is designed as an encoder-decoder network, called ``SLSDeep''. The encoder is composed of dilated residual layers, while the decoder uses a pyramid pooling network followed by three convolution layers. Moreover, a new loss function is formed by fusing the Negative Log-Likelihood (NLL) and End Point Error (EPE) terms to accurately segment the melanoma regions (a sketch of such a fused loss is shown after this list).
2. Secondly, a lightweight and efficient model based on Generative Adversarial Networks (GANs), called ``MobileGAN'', is proposed for skin lesion segmentation. ``MobileGAN'' combines 1D non-bottleneck factorization networks with position and channel attention modules in a conditional GAN (cGAN) model. The proposed model has only 2.35 million parameters and is faster than the other state-of-the-art models (an illustrative factorized block is sketched after this list).
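A minimal sketch of how a loss fusing NLL and an EPE-style term could look, assuming the EPE term is computed as the Euclidean distance between the spatial gradient fields of the predicted lesion probability map and the ground-truth mask (which penalises disagreement around the lesion boundary); the exact formulation and weighting used in ``SLSDeep'' may differ.

```python
import torch
import torch.nn.functional as F

def _spatial_grads(m):
    """Finite-difference gradients of a (B, H, W) map, cropped to (B, H-1, W-1)."""
    dx = m[:, :-1, 1:] - m[:, :-1, :-1]
    dy = m[:, 1:, :-1] - m[:, :-1, :-1]
    return dx, dy

def fused_seg_loss(logits, target, w_nll=1.0, w_epe=1.0):
    # Pixel-wise negative log-likelihood over the two classes (background / lesion)
    nll = F.nll_loss(F.log_softmax(logits, dim=1), target)
    # EPE-style term between the gradient fields of prediction and ground truth
    prob = F.softmax(logits, dim=1)[:, 1]
    pdx, pdy = _spatial_grads(prob)
    tdx, tdy = _spatial_grads(target.float())
    epe = torch.sqrt((pdx - tdx) ** 2 + (pdy - tdy) ** 2 + 1e-8).mean()
    return w_nll * nll + w_epe * epe

logits = torch.randn(4, 2, 192, 192, requires_grad=True)   # raw scores
mask = torch.randint(0, 2, (4, 192, 192))                  # binary ground truth
fused_seg_loss(logits, mask).backward()
```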
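Finally, an illustrative sketch of a 1D non-bottleneck factorized residual block of the kind mentioned for ``MobileGAN'': each 3x3 convolution is replaced by a 3x1 followed by a 1x3 convolution, which reduces parameters while preserving the receptive field. Channel counts and the placement of normalization are assumptions; the attention modules and the cGAN training loop are omitted.

```python
import torch
import torch.nn as nn

class NonBottleneck1D(nn.Module):
    """Residual block built from factorized 3x1 and 1x3 convolutions."""

    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, (3, 1), padding=(1, 0), bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, (1, 3), padding=(0, 1), bias=False),
            nn.BatchNorm2d(ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, (3, 1), padding=(1, 0), bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, (1, 3), padding=(0, 1), bias=False),
            nn.BatchNorm2d(ch),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(x + self.body(x))    # residual connection

x = torch.randn(1, 64, 96, 96)
print(NonBottleneck1D(64)(x).shape)          # torch.Size([1, 64, 96, 96])
# A factorized 3x1 + 1x3 pair uses 2*3*ch*ch weights instead of 9*ch*ch for a
# plain 3x3 convolution, i.e. roughly one third fewer parameters per pair.
```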
The International Symposium on Biomedical Imaging (ISBI) 2016 and 2017 and the International Skin Imaging Collaboration (ISIC) 2018 benchmark datasets are used to evaluate the proposed models on the skin lesion segmentation task. Both models achieve segmentation accuracy comparable to or better than state-of-the-art skin lesion segmentation models.