Editor: hyszqmzc | 2017-09-18
edu
Heng Ji
Computer Science Department, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
[email protected]

Abstract

We develop a system that lets people overcome language barriers by letting them speak a language they do not know. Our system accepts text entered by a user, translates the text, then converts the translation into a phonetic spelling in the user's own orthography. We trained the system on phonetic spellings in travel phrasebooks.

1 Introduction

Can people speak a language they don't know? Actually, it happens frequently. Travel phrasebooks contain phrases in the speaker's language (e.g., thank you) paired with foreign-language translations (e.g., спасибо). Since the speaker may not be able to pronounce the foreign-language orthography, phrasebooks additionally provide phonetic spellings that approximate the sounds of the foreign phrase. These spellings employ the familiar writing system and sounds of the speaker's language. Here is a sample entry from a French phrasebook for English speakers:

English: Leave me alone.
French: Laissez-moi tranquille.
Franglish: Less-ay mwah trahn-KEEL.

The user ignores the French and goes straight to the Franglish. If the Franglish is well designed, an English speaker can pronounce it and be understood by a French listener.

Figure
1 shows a sample entry from another book, an English phrasebook for Chinese speakers. If a Chinese speaker wants to say 非常感谢你这顿美餐, she need only read off the Chinglish 三可 油否热斯 弯德否 米欧, which approximates the sounds of Thank you for this wonderful meal using Chinese characters.

Figure 1: Snippet from phrasebook

Phrasebooks permit a form of accurate, personal, oral communication that speech-to-speech translation devices lack. However, the user is limited to a small set of fixed phrases. In this paper, we lift this restriction by designing and evaluating a software program with the following:

• Input: Text entered by the speaker, in her own language.
• Output: Phonetic rendering of a foreign-language translation of that text, which, when pronounced by the speaker, can be understood by the listener.

The main challenge is that different languages have different orthographies, different phoneme inventories, and different phonotactic constraints, so mismatches are inevitable. Despite this, the system's output should be both unambiguously pronounceable by the speaker and readily understood by the listener.

Our goal is to build an application that covers many language pairs and directions. The current paper describes a single system that lets a Chinese person speak English.

We take a statistical modeling approach to this problem, as is done in two lines of research that are most related. The first is machine transliteration (Knight and Graehl, 1998), in which names and technical terms are translated across languages with different sound systems. The other is respelling generation (Hauer and Kondrak, 2013), where an English speaker is given a phonetic hint about how to pronounce a rare or foreign word to another English speaker. By contrast, we aim to help people issue full utterances that cross language barriers.

Chinese    已经八点了
English    It's eight o'clock now
Chinglish  意思埃特额克劳克闹 (yi si ai te e ke lao ke nao)

Chinese    这件衬衫又时髦又便宜
English    this shirt is very stylish and not very expensive
Chinglish  迪思舍特意思危锐思掉利失安的闹特危锐伊克思班西五

Chinese    我们外送的最低金额是15美金
English    our minimum charge for delivery is fifteen dollars
Chinglish  奥儿米尼们差只佛低利沃锐意思发五听到乐思

Table 1: Examples of tuples from a phrasebook.
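
To make the shape of the Table 1 data concrete, the following is a minimal sketch of how such tuples might be held in memory, together with a toy romanizer that reproduces the pinyin printed for the first row. It is an illustration only, not the system described in this paper: the names PhrasebookEntry, PINYIN, and romanize are hypothetical, and the pinyin table covers only the characters whose readings appear in Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhrasebookEntry:
    """One phrasebook tuple (hypothetical container, not from the paper)."""
    chinese: str    # what the speaker wants to say
    english: str    # the foreign-language translation
    chinglish: str  # English sounds approximated with Chinese characters


# The three rows of Table 1.
ENTRIES = [
    PhrasebookEntry("已经八点了", "It's eight o'clock now", "意思埃特额克劳克闹"),
    PhrasebookEntry(
        "这件衬衫又时髦又便宜",
        "this shirt is very stylish and not very expensive",
        "迪思舍特意思危锐思掉利失安的闹特危锐伊克思班西五",
    ),
    PhrasebookEntry(
        "我们外送的最低金额是15美金",
        "our minimum charge for delivery is fifteen dollars",
        "奥儿米尼们差只佛低利沃锐意思发五听到乐思",
    ),
]

# Toneless pinyin for the characters of the first row only, copied from the
# romanization given in Table 1. A real table would need many more entries.
PINYIN = {
    "意": "yi", "思": "si", "埃": "ai", "特": "te",
    "额": "e", "克": "ke", "劳": "lao", "闹": "nao",
}


def romanize(chinglish: str) -> str:
    """Spell out a Chinglish string as space-separated pinyin syllables.

    Raises KeyError for any character missing from the toy PINYIN table.
    """
    return " ".join(PINYIN[ch] for ch in chinglish)


if __name__ == "__main__":
    first = ENTRIES[0]
    print(first.english, "->", first.chinglish, "->", romanize(first.chinglish))
```

Reading the romanized syllables aloud shows how each Chinese character stands in for roughly one English sound sequence, which is the property that makes the Chinglish pronounceable by the speaker.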