Abstract of Segmentation and indexation of complex objects in comic book images

Rigaud Christophe Francis

  • Born in the 19th century, comics are a visual medium used to express ideas via images, often combined with text or other visual information. Considered a sequential art, comics initially spread worldwide through newspapers, books and magazines. Nowadays, the development of new technologies and the World Wide Web is giving birth to a new form of paperless comics that takes advantage of the freedom of the virtual world. However, traditional comics still represent an important cultural heritage in many countries. They have not yet received the same level of attention as music, cinema or literature regarding their adaptation to the digital format. Applying information technologies to classic comics would facilitate the exploration of digital libraries, accelerate translation, and enable augmented reading and speech playback for the visually impaired.

    Heritage museums such as the CIBDI (French acronym for International City of Comic books and Images), the Kyoto International Manga Museum and the digitalcomicmuseum.com have already digitized several thousand comic albums, some of which are now in the public domain. Despite the expanding marketplace for digital comics, little research has been carried out to take advantage of the added value provided by these new media. A particularity of documents is that their analysis depends on the type of document, which often requires specific processing. The challenge for document analysis systems is to propose generic solutions to specific problems. The design process of comics is so specific that their automated analysis may be seen as a niche research field within document analysis, at the intersection of complex background, semi-structured and mixed content documents.

    Being at the intersection of several fields combines their difficulties. In this thesis, we review, highlight and illustrate these challenges in order to give the reader a good overview of the latest research progress in this field and of the current issues. We propose three different approaches for comic book image analysis, relying on previous work and on novel contributions. The first approach is called "sequential" because the image content is described in an intuitive way, from simple to complex elements, using previously extracted elements to guide further processing. Simple elements such as panels, text and balloons are extracted first, followed by the balloon tail and then the comic character's position in the panel, derived from the direction pointed to by the tail. The second approach addresses independent information extraction to overcome the main drawback of the first approach: error propagation. This second method is called "independent" because it is composed of several specific extractors, one for each element of the image content. Those extractors can be used in parallel, without needing any previous extraction. Extra processing such as balloon type classification and text recognition is also covered. The third approach introduces a knowledge-driven system that combines low- and high-level processing to build a scalable system for comics image understanding. We built an expert system composed of an inference engine and two models, one for the comics domain and another for image processing, stored in an ontology. This expert system combines the benefits of the first two approaches and enables high-level semantic description such as the reading order, the semantics of the balloons, the relations between the speech balloons and their speakers, and the interactions between the comic characters.
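
    As a rough illustration of the last step of the "sequential" idea (locating a character from the direction pointed to by a balloon tail), the sketch below picks the nearest character whose center lies within an angular cone around the tail direction. It is not the method of the thesis: the `Box` type, the `pick_speaker` function, the `max_angle_deg` threshold and the nearest-within-cone rule are all hypothetical choices made only for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding box in image coordinates (hypothetical type)."""
    x: float
    y: float
    w: float
    h: float

    @property
    def center(self):
        return (self.x + self.w / 2, self.y + self.h / 2)


def pick_speaker(tail_tip, tail_direction, characters, max_angle_deg=45.0):
    """Return the index of the nearest character inside the tail's cone.

    tail_tip is the (x, y) position of the tail tip, tail_direction is the
    (dx, dy) vector the tail points along, and characters is a list of Box.
    Returns None when no character center falls inside the cone.
    The cone width (max_angle_deg) is an assumed, illustrative threshold.
    """
    dx, dy = tail_direction
    norm = math.hypot(dx, dy)
    if norm == 0:
        return None  # no usable direction

    best_index, best_distance = None, math.inf
    for i, box in enumerate(characters):
        cx, cy = box.center
        vx, vy = cx - tail_tip[0], cy - tail_tip[1]
        dist = math.hypot(vx, vy)
        if dist == 0:
            return i  # tail tip sits exactly on this character's center
        # Angle between the tail direction and the tip-to-character vector.
        cos_angle = (vx * dx + vy * dy) / (dist * norm)
        cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
        angle = math.degrees(math.acos(cos_angle))
        if angle <= max_angle_deg and dist < best_distance:
            best_index, best_distance = i, dist
    return best_index


# Two made-up character boxes and one made-up tail.
characters = [Box(0, 100, 40, 80), Box(200, 100, 40, 80)]
speaker = pick_speaker((180, 60), (1, 1), characters)
```

    Because this step consumes the output of earlier ones (balloon, then tail), an error in the tail direction changes the selected character, which is the error propagation that the "independent" approach is meant to avoid.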

    In addition, this thesis provides the community with the first public comics image dataset and ground truth, along with an overall experimental comparison of all the proposed methods and some state-of-the-art methods.

