A Realistic Visual Speech Synthesis for Indonesian Using A Combination of Morphing Viseme and Syllable Concatenation Approach to Support Pronunciation Learning

Authors

  • Aripin, Informatics Department of Dian Nuswantoro University, Semarang, Indonesia
  • Hanny Haryanto, Informatics Department of Dian Nuswantoro University, Semarang, Indonesia

DOI:

https://doi.org/10.3991/ijet.v13i08.8084

Keywords:

morphing viseme, realistic, syllable concatenation, visual speech synthesis for Indonesian

Abstract


This study aims to build a realistic visual speech synthesis system for Indonesian that can be used to learn Indonesian pronunciation. We combine two methods: viseme morphing and syllable concatenation. Viseme morphing deforms one viseme into another and is used to create the animated transitions between visemes, so that the mouth shape moves more smoothly. Syllable concatenation assembles visemes according to certain syllable patterns. We built a syllable-based voice database as the basis for synchronizing syllables, speech, and viseme models. The proposed method consists of several stages: forming the Indonesian viseme models, designing the facial animation character, developing the speech database, synchronization, and subjective testing of the resulting application. In the subjective tests, 30 respondents assessed how well the mouth movement matched the speech and how natural it looked when uttering Indonesian texts. The Mean Opinion Score (MOS) method was used to average the respondents' scores. The MOS results for the synchronization and naturalness criteria are 4.283 and 4.107, respectively, on a scale of 1 to 5. These results indicate that the synthesized visual speech is well synchronized and natural, and in that sense realistic. The system can therefore visualize phoneme pronunciation to support learning Indonesian pronunciation.
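The two quantitative ideas in the abstract, the transition between visemes and the MOS average, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the morphing algorithm, so linear interpolation is assumed here. The representation of a viseme as a list of 2D mouth landmark points, the function names `morph_viseme` and `mean_opinion_score`, and all sample values are hypothetical.

```python
from statistics import mean


def morph_viseme(src, dst, steps):
    """Return steps + 1 frames that blend viseme `src` into viseme `dst`.

    Assumption: a viseme is a list of (x, y) mouth landmark points and the
    transition is a plain linear interpolation; the paper may use another
    deformation method.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    if len(src) != len(dst):
        raise ValueError("both visemes need the same number of points")
    frames = []
    for i in range(steps + 1):
        t = i / steps  # 0.0 at the source viseme, 1.0 at the target viseme
        frames.append([
            (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            for (x0, y0), (x1, y1) in zip(src, dst)
        ])
    return frames


def mean_opinion_score(ratings):
    """Average a list of respondent ratings given on the 1 to 5 scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    return mean(ratings)


# Hypothetical two-point mouth shapes: closed lips opening to a wider shape.
closed = [(0.0, 0.0), (2.0, 0.0)]
opened = [(0.0, 1.0), (2.0, 2.0)]
transition = morph_viseme(closed, opened, steps=4)

# Hypothetical ratings from four respondents (not data from the study).
score = mean_opinion_score([5, 4, 4, 5])
```

Each frame in `transition` is an intermediate mouth shape; more steps give a smoother animation at the cost of more frames to render between two syllable-aligned visemes.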

Published

2018-08-30

How to Cite

Aripin, & Haryanto, H. (2018). A Realistic Visual Speech Synthesis for Indonesian Using A Combination of Morphing Viseme and Syllable Concatenation Approach to Support Pronunciation Learning. International Journal of Emerging Technologies in Learning (iJET), 13(08), pp. 19–37. https://doi.org/10.3991/ijet.v13i08.8084

Section

Papers