Sharing Knowledge about ARM, Linux and Development Platforms such as the Beaglebone Black.

29 May 2014

Using Text to Speech (TTS)

Our Beaglebone Black is small, powerful and has a lot of things to say! In our last entry we configured Bluetooth and a microphone, so now it is time to give it the speech it needs!! :-)


Speech synthesis is a great and complex world. Nowadays we hear synthetic voices and often cannot say with 100% certainty whether they are human or synthesized. Unfortunately for us open source seekers, there are not many options out there for TTS on Linux, at least if we are looking for high-quality, offline, open source voices! If we drop some of those requirements, a lot of options appear!

We are going to focus on open source/free-to-use text to speech applications. For an expanded list check HERE.

- eSpeak: 50 languages, Offline, Robotic, Different Voices, Easy to use.
- Festival: 3 languages by default, Offline, Robotic, Different Voices, Uncomfortable but powerful.
- Mary: 8 languages and growing, Offline, Almost Natural, Different Voices.
- SVOX Pico: 5 languages, Offline, Not as bad as Robotic, 'Tricky' open source adapted from Android.
- Google TTS: approx. 50 languages, Online, Some languages almost Natural and others Robotic.

Let's start testing them. As usual we will begin by running a system upgrade:
 pacman -Syyu
After that we will install the eSpeak TTS:
 pacman -S espeak
To list all supported languages type this:
 espeak --voices
And this for a list of different voices for a language:
 espeak --voices=en
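If a script needs to know whether a language is available before trying to speak it, the table printed by espeak --voices can be filtered. This is only a minimal sketch: it assumes the language code is in the second column of that table, and has_voice is a hypothetical helper name, not an eSpeak command.

```shell
#!/bin/sh
# has_voice LANG: succeed if LANG appears in the "Language" column
# (assumed to be column 2) of the voice table read from standard input.
has_voice() {
    awk -v want="$1" 'NR > 1 && $2 == want { found = 1 } END { exit !found }'
}

# Usage (needs espeak installed):
#   espeak --voices | has_voice es && echo "Spanish is available"
```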
Finally let's say something!
 espeak -ven "Hello, world!"
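For a project that talks a lot, it can be handy to wrap that command in a small shell function. The sketch below uses only the -v (voice) and -s (speed in words per minute) options of espeak. The names say, TTS_BIN and TTS_SPEED are assumptions made up for this example and are not part of eSpeak.

```shell
#!/bin/sh
# say VOICE TEXT...: speak TEXT with the given eSpeak voice.
# TTS_BIN and TTS_SPEED are hypothetical overrides, not eSpeak features:
# TTS_BIN swaps the binary (handy for dry runs), TTS_SPEED sets the
# words-per-minute value passed to espeak's -s option.
say() {
    [ "$#" -ge 2 ] || { echo "usage: say VOICE TEXT..." >&2; return 2; }
    voice=$1
    shift
    "${TTS_BIN:-espeak}" -v "$voice" -s "${TTS_SPEED:-150}" "$*"
}

# Usage (needs espeak installed):
#   say en "Hello, world!"
#   TTS_SPEED=120 say en-us "A bit slower now."
```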
Now we are going to use the Google Translate TTS service! We will install mpg123, a player able to stream audio. Remember that in this case TTS won't work unless you have an Internet connection.
 pacman -S mpg123
In this case it is as simple as running this:
 mpg123 -q "http://translate.google.com/translate_tts?tl=en&q=Hello, world!"
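The text in that URL contains a space and punctuation, which strictly speaking should be percent-encoded in a query string. Here is a minimal POSIX shell sketch for that. The helper name urlencode is hypothetical, and for simplicity it encodes every byte, which is longer than necessary but always valid.

```shell
#!/bin/sh
# urlencode TEXT: print TEXT with every byte written as %xx.
# od -v dumps all bytes as hex (without collapsing repeated lines),
# tr strips the spacing, and sed puts a % in front of each hex pair.
urlencode() {
    printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n' | sed 's/../%&/g'
}

# Usage (needs mpg123 and an Internet connection):
#   text='Hello, world!'
#   mpg123 -q "http://translate.google.com/translate_tts?tl=en&q=$(urlencode "$text")"
```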
If you try to read more than 100 bytes you won't be able to do it unless you try THIS.
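One simple way around that limit is to cut the text into pieces of at most 100 bytes and play them one after another. This is just a sketch of the idea, not necessarily what the linked solution does. split_text and speak_long are hypothetical names, and each chunk is passed to the same URL form used above.

```shell
#!/bin/sh
# split_text TEXT: print TEXT as lines of at most 100 bytes,
# breaking after a space where possible (fold -s), counting bytes (-b).
split_text() {
    printf '%s\n' "$1" | fold -b -s -w 100
}

# speak_long TEXT: play each chunk in turn.
# Needs mpg123 and an Internet connection. stdin is redirected so that
# mpg123 cannot swallow the remaining chunks from the pipe.
speak_long() {
    split_text "$1" | while IFS= read -r chunk; do
        mpg123 -q "http://translate.google.com/translate_tts?tl=en&q=$chunk" </dev/null
    done
}

# Usage:
#   speak_long "A long paragraph that would not fit in a single request ..."
```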

For me, the two solutions above are the best right now, but if you want to try the SVOX Pico (Android) TTS you will have to download and compile the package from the AUR repository. Let's take a few minutes to explain how to do that. We will begin by installing the full developer package group for ArchLinux:
 pacman -S base-devel
Once the installation is finished let's create a build directory, download the tarball and extract it:
 mkdir -p ~/builds
 cd ~/builds
 curl -O https://aur.archlinux.org/packages/sv/svox-pico-git/svox-pico-git.tar.gz
 tar -xvzf svox-pico-git.tar.gz
 cd svox-pico-git
Now we will compile it! Relax because it will take a while:
 makepkg -Acs --asroot
Once it finishes it is time to install it, so check the exact package name inside the build folder:
 ls -l
 pacman -U svox-pico-git-android.4.4.2_r2.34.g47af76b-1-armv7h.pkg.tar.xz
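The version in that file name will change with every build, so instead of copying it by hand a shell glob can pick it up. A small sketch, where built_pkg is a hypothetical helper and the pattern simply follows the file name shown above:

```shell
#!/bin/sh
# built_pkg DIR: print the path of the first svox-pico-git package file
# found in DIR, or fail if there is none.
built_pkg() {
    for f in "$1"/svox-pico-git-*.pkg.tar.xz; do
        [ -e "$f" ] && { printf '%s\n' "$f"; return 0; }
    done
    return 1
}

# Usage, from inside the build directory:
#   pacman -U "$(built_pkg .)"
```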
And finally let's see how it works!
 pico2wave --lang=en-US --wave=/tmp/test.wav "Hello, world!" && paplay /tmp/test.wav
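Since pico2wave only writes a WAV file, speaking always takes two steps. The sketch below wraps them in one function that uses a temporary file and cleans it up afterwards. pico_say, PICO_BIN and PLAYER are names invented for this example. They are not options of pico2wave or paplay.

```shell
#!/bin/sh
# pico_say LANG TEXT...: synthesize TEXT to a temporary WAV and play it.
# PICO_BIN and PLAYER are hypothetical overrides so that another
# synthesizer or player (for example aplay) can be swapped in.
# The body runs in a subshell, so its variables do not leak out.
pico_say() (
    [ "$#" -ge 2 ] || { echo "usage: pico_say LANG TEXT..." >&2; exit 2; }
    lang=$1
    shift
    tmpdir=$(mktemp -d) || exit 1
    wav=$tmpdir/speech.wav
    status=0
    # Play only if the synthesis step succeeded; remember any failure.
    "${PICO_BIN:-pico2wave}" --lang="$lang" --wave="$wav" "$*" &&
        "${PLAYER:-paplay}" "$wav" || status=$?
    rm -rf "$tmpdir"
    exit "$status"
)

# Usage (needs pico2wave and a player installed):
#   pico_say en-US "Hello, world!"
#   PLAYER=aplay pico_say en-GB "Hello again!"
```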
As you can see, it is a matter of "What is my project like and what do I want for it?". There are certainly paid solutions with really awesome audio quality, although they are normally heavy! A great mid-term solution is MaryTTS: it sounds nice and is able to simulate feelings! Unfortunately it does not have many languages for now... :-(

So come on, start thinking about the possibilities that a Beaglebone with TTS opens up for you, and remember: the only limit is the one you set yourself!! Thanks for reading. :-)