
Documat


Summary of Building task-oriented machine translation systems

Germán Sanchis Trilles

  • The main goal of this thesis is to develop computer-assisted translation and machine translation systems that work in closer synergy with their potential users. The purpose is to make current state-of-the-art systems more ergonomic, intuitive and efficient, so that the human expert feels more comfortable using them. To this end, several techniques are presented that improve the adaptability and response time of the underlying statistical machine translation systems, together with a strategy for enhancing human-machine interaction within an interactive machine translation setup. The ultimate purpose is to bridge the gap between the state of the art in machine translation and the tools that are usually available to human translators.

    Concerning the response time of machine translation systems, a parameter pruning technique is presented whose intuition stems from the concept of bilingual segmentation, but which evolves into a full parameter re-estimation strategy. With this strategy, the experimental results reported show that the number of parameters required can be reduced by up to 97% without a significant loss in translation quality. Since these results are robust across different language pairs, they indicate that the pruning technique is effective in a traditional machine translation scenario and could be used, for instance, in a post-editing setup. Experiments carried out within a simulated interactive machine translation environment are slightly less convincing, however, since there a trade-off between response time and translation quality is needed.
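    As a rough illustration of the pruning idea described above, the following Python sketch keeps only those phrase pairs that appear in a bilingual segmentation of the training corpus and then re-estimates their probabilities by relative frequency. This is a minimal sketch under stated assumptions, not the procedure used in the thesis: the function name `prune_and_reestimate`, the toy phrase table and the toy segmentations are all hypothetical, and the summary does not specify how the segmentation or the re-estimation is actually carried out.

```python
from collections import Counter


def prune_and_reestimate(phrase_table, segmentations):
    """Keep phrase pairs seen in the segmentations and re-estimate p(tgt | src).

    phrase_table:  dict mapping (source_phrase, target_phrase) -> probability
    segmentations: one list of (source_phrase, target_phrase) pairs per
                   training sentence (assumed to be given, e.g. by an aligner)
    """
    # Count how often each known phrase pair is used in a segmentation.
    counts = Counter(
        pair for seg in segmentations for pair in seg if pair in phrase_table
    )
    # Total usage count per source phrase, needed for relative frequencies.
    source_totals = Counter()
    for (src, _tgt), c in counts.items():
        source_totals[src] += c
    # Unused pairs are dropped; surviving pairs get relative-frequency estimates.
    return {pair: c / source_totals[pair[0]] for pair, c in counts.items()}


# Hypothetical toy data, for illustration only.
phrase_table = {
    ("la casa", "the house"): 0.6,
    ("la casa", "the home"): 0.3,
    ("la casa", "house"): 0.1,
    ("verde", "green"): 0.9,
    ("verde", "verdant"): 0.1,
}
segmentations = [
    [("la casa", "the house"), ("verde", "green")],
    [("la casa", "the home")],
    [("la casa", "the house")],
]

pruned = prune_and_reestimate(phrase_table, segmentations)
print(len(phrase_table), "->", len(pruned), "phrase pairs")
```

    In this toy setting, phrase pairs that never take part in a segmentation are discarded, and the probability mass of each source phrase is redistributed over the pairs that remain.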

    Two orthogonal approaches are presented for increasing the adaptability of statistical machine translation systems. On the one hand, we investigate how to increase the adaptability of the language model by subdividing it into several smaller language models, which are then interpolated at translation time according to the source sentence to be translated. The specific sub-models are built either by taking advantage of supervised information present in certain bilingual corpora, or by performing unsupervised clustering on the training set, with the aim of uncovering the specific sub-topics or language styles it contains. On the other hand, Bayesian predictive adaptation is presented as an efficient strategy for adapting the translation models present in state-of-the-art machine translation systems.
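    The language model interpolation described above can be sketched in Python as follows. This is only an illustrative sketch: the summary does not say how the interpolation weights are derived from the source sentence, so the code assumes that each cluster has a source-side unigram model and that the weight of a cluster is proportional to the likelihood it assigns to the source sentence. The names `unigram_model`, `mixture_weights` and `interpolated_prob`, as well as the toy clusters, are hypothetical.

```python
import math
from collections import Counter


def unigram_model(sentences, vocab, alpha=1.0):
    """Add-alpha smoothed unigram model over a fixed vocabulary."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def mixture_weights(source_sentence, source_models):
    """Assumed weighting scheme: normalised likelihood of the source sentence
    under each cluster's source-side model (out-of-vocabulary words skipped)."""
    log_liks = [
        sum(math.log(m[w]) for w in source_sentence.split() if w in m)
        for m in source_models
    ]
    peak = max(log_liks)  # subtract the maximum for numerical stability
    unnorm = [math.exp(ll - peak) for ll in log_liks]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def interpolated_prob(word, target_models, weights):
    """Linear interpolation of the target-side sub-models."""
    return sum(lam * m.get(word, 0.0) for lam, m in zip(weights, target_models))


# Hypothetical toy clusters: (source sentences, target sentences).
clusters = [
    (["el tribunal dicta sentencia", "la ley del tribunal"],
     ["the court issues a ruling", "the law of the court"]),
    (["el equipo gana el partido", "el partido del equipo"],
     ["the team wins the match", "the match of the team"]),
]
src_vocab = {w for src, _ in clusters for s in src for w in s.split()}
tgt_vocab = {w for _, tgt in clusters for s in tgt for w in s.split()}
source_models = [unigram_model(src, src_vocab) for src, _ in clusters]
target_models = [unigram_model(tgt, tgt_vocab) for _, tgt in clusters]

# Weights are recomputed for every source sentence to be translated.
weights = mixture_weights("el tribunal", source_models)
print(weights, interpolated_prob("court", target_models, weights))
```

    A real system would use higher-order language models and could estimate the weights differently; the point of the sketch is only that the mixture is re-weighted for each source sentence rather than fixed once at training time.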

    Although the adaptation experiments are only performed within the traditional machine translation framework, the results obtained are compelling enough to justify implementing these techniques within an interactive setup, and such work will be done in the near future. It should nevertheless be noted that the techniques developed can be readily implemented within a computer-assisted translation scenario, in which a statistical machine translation system provides the translations that the user needs to modify and validate.

    Finally, special attention is devoted to increasing the synergy between the human expert and the interactive machine translation system. To this end, two different forms of weaker feedback are studied, both intended to increase the productivity of the human translator, and each is introduced as a change to the traditional interaction scheme. The first aims at anticipating the user's actions, and the second is targeted at increasing the flexibility of the system whenever the user signals that there is an error they want the system to correct.

