Integration of Machine Translation Paradigms
Summary & Objectives
Machine Translation (MT) is a highly interdisciplinary and multidisciplinary field since it is approached from the point of view of engineering, computer science, informatics, statistics and linguists. Unfortunately, the cooperation and interaction among these fields in relation to MT technologies is still very low. The goal of this research project is to approach the different profiles in the MT community by providing a new integrated MT paradigm which mainly includes linguistic technologies and statistical algorithms.
Basically, our research will be focused on the problem of dynamically integrating the two most popular MT paradigms: the rule-based and the statistical-based. We will use linguistic technologies developed either for the rule-based MT systems or other natural language processing tasks into statistical MT systems. Linguistic technologies include: bilingual dictionaries, transfer rules, statistical parsing, word sense disambiguation, morphological and syntactic analysis. The new paradigm will provide solutions to current MT challenges such as unknown words, reordering and semantic ambiguities.
The project will focus on the three most spoken languages in the world: Chinese, Spanish and English; and all translation combinations among them. These language pairs do not only involve many economic and cultural interests, but they also include some of the most relevant MT challenges such as morphological, syntactic and semantic variations.
Systems and Applications
Zh-Es Rule-based MT system based on Apertium