Abstract of Exploring the Use of Target-Language Information to Train the Part-of-Speech Tagger of Machine Translation Systems

When automatically translating between related languages, one of the main sources of machine translation errors is the incorrect resolution of part-of-speech (PoS) ambiguities. Hidden Markov models (HMMs) are the standard statistical approach to resolving such ambiguities. The usual training algorithms collect statistics from source-language texts to adjust the parameters of the HMM, but if the HMM is to be embedded in a machine translation system, target-language information may also prove valuable. We study how to use a target-language model (in addition to source-language texts) to improve the tagging and translation performance of the statistical PoS tagger of an otherwise rule-based, shallow-transfer machine translation engine, although other architectures may be considered as well. The method may also be used to customize the machine translation engine to a particular target language, text type, or subject, or to statistically "retune" it after introducing new transfer rules. © Springer-Verlag Berlin Heidelberg 2004.
