Recognition and Segmentation of English Long and Short Sentences Based on Machine Translation

Tiehu Zhang

Abstract


With the advent of the information age, the translation of long sentences, which contain many words and have more complex structures, has remained a focus of research in English-Chinese machine translation. In this study, 400 long sentences were randomly selected from the NTCIR-9 patent corpus to test the recognition and segmentation performance of the regular matching method and the error-driven method, and the accuracy of the resulting translations was compared on the Baidu Online Translation Platform. The results showed that the regular matching method was effective in recognizing and segmenting long sentences but had many defects, whereas the error-driven method was more effective. The former increased the BLEU value of the translated text on the Baidu Online Translation Platform by 4.8% and the latter by 12.1%, which showed that the error-driven method was more effective for machine translation.

Keywords


machine translation, long sentence, regular match, error-driven method



Copyright (c) 2020 Tiehu Zhang


International Journal of Emerging Technologies in Learning (iJET) – eISSN: 1863-0383
Creative Commons License