Text-to-Speech software is a game-changer for language learning. I’m using TTS to learn several languages at once. The Holy Grail of language resources—sentences with audio—is now something I can mass-produce on my computer. Not every language is included in the TTS functionality of the Mac (no Tibetan or Mongolian, for instance), but Turkish, Indonesian, Cantonese, and many other languages are included and work quite well. The last update added varieties of Spanish, so I vary my Spanish audio with some Argentinian speech.
TTS is so powerful because you can turn text sources into audio sources quickly and automatically. I use the Lonely Planet phrasebooks for most of the languages I study. I can type the text myself, copy it if I have a PDF, or even use the Mac’s dictation function to input the text using my own voice. Using dictation is interesting, because it requires that the computer recognize my pronunciation. (For longer phrases, the computer recognizes most of what I am saying. For shorter phrases and single words, especially in Korean, it often fails to transcribe them.)
I put the phrases as text into an Excel spreadsheet, because Excel spreadsheets are easy to organize and can be converted into tab-delimited plain text files that can be imported into Anki. I started with Spanish and English, copying the Spanish via dictation and the English by typing. Then, it occurred to me I could add Korean in a third column. Then I thought, Why not put them all in one spreadsheet?
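For anyone who would rather skip Excel, here is a minimal Python sketch of the same idea: one row per phrase, one column per language, written out as tab-delimited UTF-8 text. The sample phrases, the function names, and the `phrases.txt` file name are all my own placeholders, not part of any tool mentioned here.

```python
import csv
import io


def rows_to_tsv(rows):
    """Return tab-delimited text: one note per line, one field per column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def save_for_anki(rows, path):
    """Write the rows to a UTF-8 text file (path is a hypothetical name)."""
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(rows_to_tsv(rows))


# Hypothetical sample rows: Spanish, English, Korean.
phrases = [
    ["Buenos días.", "Good morning.", "좋은 아침입니다."],
    ["¿Cuánto cuesta?", "How much is it?", "얼마예요?"],
]

tsv_text = rows_to_tsv(phrases)
print(tsv_text)
```

Calling `save_for_anki(phrases, "phrases.txt")` would write the file to disk. Keeping each language in its own column means each one can be mapped to its own field when the file is imported.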
All the languages in one place
In one of his videos, Michael Campbell of Glossika in Taiwan explains how he learned eighteen aboriginal Taiwanese languages at once. He has a spreadsheet with glossary items and phrases in each language; to review, he goes down the spreadsheet and reads the phrases aloud. He records himself reading and listens to the recordings repeatedly. Now, I think listening to my own (non-native) pronunciation to memorize phrases is less than ideal, but Mr. Campbell is a pro and is working with languages that are nearly extinct. And, as a memory aid within a greater regimen, this may be a handy technique. I once memorized a Japanese speech I had written in exactly this way: I read it aloud, sentence by sentence, over and over again, and listened to a recording of myself reading it on repeat. I recited it with perfect delivery after only two days. Listening to my pronunciation of Japanese wasn’t ideal, but I had heard so much Japanese that a little foreign accent didn’t throw me off. Similarly, TTS pronunciations are approximations of native/natural speech and thus should not be relied upon solely, but they are far better than silence, and an excellent supplement to memorization. Ideally, we would have native speakers on call to read whatever we need and produce high-quality audio, but this is not feasible. (Mr. Campbell also reminds us that “memorization” is a misnomer, because we don’t need to actually recall the sentences exactly, only build connections in the brain.)
Two practical examples of using Text-to-Speech
1. Spanish phrases from Lonely Planet
The first way I’m using TTS is for entering Spanish phrases from my Lonely Planet phrasebook into Anki. I type or dictate the sentences into cells in Microsoft Excel, and then I add the English in another column. Next, I export the file as a UTF-16 plain text file, convert it to a UTF-8 plain text file, and import it into Anki. Finally, I use the “Awesome TTS” plugin in Anki to mass-generate MP3 files for the sentences using the Mac’s built-in TTS software. (The plugin also works with Google’s software, which is available for free online.)
1. Input Spanish phrases into Excel
2. Input English translations
3. “Save As” UTF-16 plain text file
4. Open in TextEdit and “Save As” UTF-8 plain text file
5. Install & configure the Awesome TTS plugin
6. Make sure Spanish TTS voice is installed (Mac)
7. Make sure your Spanish card type has an Audio field
8. Import UTF-8 file in Anki using the CTRL+I import function
9. Select all new cards & use Awesome TTS mass generate MP3 function, using the Spanish field as the source and the audio field as destination
10. Make sure you put the audio field on the card, and you should have audio for every phrase
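The UTF-16 to UTF-8 conversion in the list above can also be scripted instead of going through TextEdit. This is only a rough sketch using Python's standard library; the function name and the file names in the usage note are placeholders I made up, and it assumes the exported file starts with a byte order mark.

```python
from pathlib import Path


def utf16_to_utf8(src, dst):
    """Re-encode a UTF-16 text file as UTF-8 and return the number of lines."""
    # The "utf-16" codec reads the byte order mark to detect endianness
    # and strips it. Without a mark it falls back to the machine's native
    # byte order, so this sketch assumes the mark is present.
    text = Path(src).read_text(encoding="utf-16")
    # Reading in text mode also normalizes old-style line endings to "\n".
    Path(dst).write_text(text, encoding="utf-8")
    return len(text.splitlines())
```

For example, `utf16_to_utf8("spanish_utf16.txt", "spanish_utf8.txt")` would produce a file ready for the import step, and the returned line count is a quick check that no phrases were lost along the way.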
NB: In a later post I’ll show screenshots of how I organize my many note types.
2. Korean phrases from TTMIK intermediate dialogues
I think everyone and their mother studying Korean is using the Talk to Me in Korean series, with good reason. If you are, absolutely download the Anki public decks for the sentences from the grammar lessons, and the sentences for the “Iyagi” intermediate dialogues. These premade decks have saved me thousands of hours of work, and I wish I could tip the people who made them. As I go through the sentences in these two decks, I add the phrases that I don’t understand or need to memorize to new Anki cards, and then use TTS to generate audio for them.
A workflow for Korean:
- Download the public decks and convert the card types to match your Korean cards
- Add unknown phrases or phrases you’d like to memorize from the public deck cards onto new cards
- Use Awesome TTS to generate audio for the new cards
- Make sure your new cards are in your main deck and not in the public deck. That way, you review them as new cards before you get new cards from the public deck you imported. Otherwise, you have to wait until you reach the very end of the public deck to see them, at which point they won’t be relevant to you anymore
NB: The TTMIK recordings and Korean transcripts are freely available online. Crowdsourced translations are being made at the Korean Wiki Project, but the official translations produced by TTMIK are for sale on their website. It’s hard to copy and paste from their PDFs, though, as I don’t think they were designed with Anki in mind.
So what’s the big deal about Text-to-Speech?
This method of adding cards with audio from text sources is now my default, go-to method for learning languages. If you hear the phrase or the word, you are 1000 times more likely to remember it. I just made up that number, but it’s probably pretty close.
Hear it. Speak it. Memorize it. – Carlos Douh
Being able to create audio from the text, rather than the other way around, gives you more control over what materials you study. I think this method combined with talking to native speakers after a few months of study is the most effective and sustainable method I’ve yet devised. It’s also scalable, so that you can simply add more cards when you want more material.