Today we will discover how to set up a simple but effective text-to-speech system in TouchDesigner using Python scripts and DAT operators. Let’s go!
Text-to-speech systems are now widespread. They are used for a plethora of applications and rely on artificial intelligence algorithms to (almost) perfectly recreate the human voice.
Hundreds, or maybe thousands, of software tools and apps are now available on the market, along with free and open-source alternatives.
Since our goal is to find new ideas to integrate into our beloved immersive installations, in this tutorial we will design a text-to-speech system in TouchDesigner to create an interactive framework for users.
Text-to-speech in TouchDesigner: the Python code
Since the TouchDesigner patch is heavily based on Python scripts, we first need to install the gTTS (Google Text-to-Speech) library, which handles the audio synthesis. Check how to install it on your operating system.
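Before building the patch, it can help to confirm that the interpreter you are using can actually find the library. TouchDesigner ships with its own Python, so a package installed elsewhere is not necessarily visible to it. The snippet below is a minimal sketch: module_available is a hypothetical helper name, not part of gTTS or TouchDesigner, and it only uses the standard library.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be found.

    Hypothetical helper for illustration; it does not import the module,
    it only asks the import system whether the module could be located.
    """
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # 'gtts' is the import name of the gTTS package
    if module_available("gtts"):
        print("gTTS is visible to this Python interpreter")
    else:
        print("gTTS was not found; check where the package was installed")
```

Calling module_available("gtts") from the TouchDesigner Textport should tell you whether the Script DAT will be able to import the library.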
Next, we create a Script DAT that will be the engine of our system.
Here is the full Python code:
from gtts import gTTS
import os

def onSetupParameters(scriptOp):
    page = scriptOp.appendCustomPage('Custom')
    p = page.appendFloat('Valuea', label='Value A')
    p = page.appendFloat('Valueb', label='Value B')
    return

# called whenever custom pulse parameter is pushed
def onPulse(par):
    return

def onCook(scriptOp):
    # text to synthesize, read from the Text DAT
    mytext = op('user_input_text_dat').text
    language = 'en'
    # slow=False reads the text at normal speed
    myobj = gTTS(text=mytext, lang=language, slow=False)
    myobj.save("welcome.mp3")
    # point the Audio File In CHOP to the new file and reload it
    op('audio_file_in').par.file = "welcome.mp3"
    op('audio_file_in').par.reloadpulse.pulse()
    scriptOp.clear()
    return
Let’s analyze it.
First of all we import the Google Text-to-Speech and OS libraries. In the onCook function we read the text we want to turn into speech from a Text DAT (see below) and set the language we want to use. Then we pass both to gTTS with slow=False, which tells the library to read the text at normal speed rather than slowly.
The myobj.save function writes the synthesized speech to an MP3 file, here welcome.mp3. In the next lines, we point the Audio File In CHOP to that file and reload it, so the audio plays directly in TouchDesigner.
So, as you can see, the system itself is quite simple and elegant. The gTTS library does not require authentication or API keys, although it does need an internet connection, so it is well suited to fast prototyping and experimentation.
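The Script DAT above always writes to welcome.mp3 and asks gTTS for new audio every time. If the same sentences come up repeatedly, one possible refinement is to name each file after its text and reuse it when it already exists. The following is only a sketch under that assumption: speech_path, default_synth and synthesize_cached are hypothetical names introduced here for illustration, and the only gTTS call used is the same gTTS(...).save(...) shown in the tutorial code.

```python
import hashlib
from pathlib import Path


def speech_path(text, lang="en", out_dir="."):
    """Build an MP3 path that is unique for a given text and language."""
    digest = hashlib.sha1(f"{lang}:{text}".encode("utf-8")).hexdigest()[:12]
    return Path(out_dir) / f"tts_{lang}_{digest}.mp3"


def default_synth(text, lang, path):
    """Synthesize with gTTS, mirroring the call used in the Script DAT."""
    from gtts import gTTS  # imported lazily so the helpers load without it
    gTTS(text=text, lang=lang, slow=False).save(str(path))


def synthesize_cached(text, lang="en", out_dir=".", synth=default_synth):
    """Return the MP3 path for `text`, synthesizing only if the file is missing."""
    path = speech_path(text, lang, out_dir)
    if not path.exists():
        synth(text, lang, path)
    return path
```

Inside onCook, the returned path (converted with str) could be assigned to op('audio_file_in').par.file in place of the fixed file name. The synth argument exists only so the synthesis step can be swapped out, for example when trying the logic without a network connection.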
The TouchDesigner patch
In order to make the system more efficient, we can integrate other features.
First of all, we can store the texts we want to turn into speech in a Table DAT, in order to easily access them. We can trigger the desired rows through a CHOP Execute DAT, which is linked to a Constant CHOP whose value selects the row.
Here is the CHOP Execute DAT Python code:
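def onValueChange(channel, sampleIndex, val, prev):
    # the Constant CHOP value selects the table row; cast it to an integer index
    textVariable = int(op('constant1').par.const0value)
    print(textVariable)
    a = op('user_input_text_dat')
    a.clear()
    # write the content of the selected cell into the Text DAT
    a.text = str(op('table1')[textVariable, 0])
    return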
Here we read the value of the Constant CHOP into a variable called textVariable and use it as the row index to update a Text DAT, user_input_text_dat, with the text of the selected row.
The Text DAT is then read by the Script DAT, and its content is sent to the audio synthesis algorithm.
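Because a Constant CHOP outputs floating-point values that can easily go beyond the number of rows in the table, it may be worth guarding the row lookup. The sketch below shows the idea in plain Python, with a list standing in for the first column of the Table DAT; select_row_text is a hypothetical helper, not a TouchDesigner function.

```python
def select_row_text(rows, value):
    """Return the text of the row selected by a (possibly float) CHOP value.

    `rows` is a plain list standing in for the first column of the table.
    The value is truncated to an integer and clamped to the valid range.
    """
    if not rows:
        return ""
    index = int(value)                          # 1.9 -> 1
    index = max(0, min(index, len(rows) - 1))   # keep the index inside the table
    return rows[index]


if __name__ == "__main__":
    phrases = ["Welcome", "Please step forward", "Goodbye"]
    print(select_row_text(phrases, 1.0))
    print(select_row_text(phrases, 7))
```

In the patch, the same clamping could be applied to textVariable before indexing table1, so that an out-of-range value falls back to the first or last row instead of raising an error.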
Last but not least, we create an Audio Device Out CHOP and connect it to the Audio File In CHOP.
We can now select the text we want and the patch will automatically turn it into speech.
Of course we can start from here to develop more complex environments, such as adding reactive visuals or integrating audio effects for voice manipulation.
Wrap Up
There are several ways to create a text-to-speech system in TouchDesigner. Robust, high-end solutions are available, as well as simpler and leaner ones. By relying on reliable libraries such as gTTS, we can easily integrate text-to-speech into our interactive and immersive environments without having to deal with authentication or API keys.
Feel free to experiment and, as usual, the sky is the limit!