ACM - Computers in Entertainment

Using Text-to-Speech to Prototype Game Dialog

By Henrik Engström, Per Anders Östblad
FINAL EDITION, [Vol. 16, No. 4]

DOI: 10.1145/3276321

Voice acting is common in computer games across many genres. Recording and processing voice acting is time-consuming and involves, for instance, voice actors, directors, audio engineers, and game writers. Changes to the script of a game after the voice acting has been recorded are expensive. At the same time, playtests of games without voice acting may give different results than playtests where it is present. This creates a situation where improvements identified through playtesting are either ignored or lead to extensive re-recording of voice acting. This article presents a design science research project in which text-to-speech (TTS) synthesis is used as a substitute for recorded voice acting in the early stages of game production. We propose a set of design principles that have been evaluated in a real-world game production. Our results indicate several benefits of using TTS as a prototyping tool: it can be a source of inspiration for game writers, it gives good estimates of the timing and pacing of the game, and it allows for early tests of how the dialog will be perceived by players. The quality and characteristics of the voices provided by the TTS system play an important role in this process. The rapid development of speech technology opens up many future possibilities.

