Using Text to Speech Programs for Editing

Today I researched, installed, and played with a text to speech (TTS) program to help with editing. This is what I was thinking when I made the decision:

  • A foreign voice will help me dissociate myself from my own work, thereby making it easier to look at it with fresh eyes.
  • The program will stumble over misspelled/wrong words (though not homonyms) and I can fix those.
  • The program will show where I need clarification or where I fall out of character.

Now, what I wasn’t expecting was the incredibly robotic voices some of these TTS programs have. Some are horrendous! Surprise, surprise. I did find a few with passable voices.

Anyway, I’m rather paranoid about where I get my files, and I tend to trust referrals more than anything else. In that light, through some bizarre chain of articles that I can’t retrace, I came across The Best Text to Speech Programs from How To Geek. I crossed most of them off the list, since I wanted a free program with the ability to export the file (just in case I wanted to keep a copy or something?). From there I played the demo clips and picked out the better voices.

I am currently using Balabolka and am pleased with it. The program obviously doesn’t have the voice-acting abilities of a human, but even some of the stranger names I put into it were pronounced to my satisfaction. Sometimes it hits a badly misspelled word and has to spell out the letters for me. Oops. Otherwise it does the best it can. “Reguards” came out pretty amusing, for example. The thing that gets me the most is what feels like long pauses at commas. There are options to fiddle with the speed of the voice, but I haven’t tried them yet.

So, let’s talk about the voice of the program. It’s mostly human. There’s a distinct tinny sound to it, and the program sometimes lingers between one consonant and the next. Despite this, I found that I could get lost in the story after a few minutes. Is it audiobook-worthy? Nuh-uh. Nope, don’t dream of it. Maybe I’ve been pampered by too many excellent podcasts from my favorite places (like PodCastle).

But, as an editorial tool? It’s pretty awesome. Right, so I told you earlier what I thought I would discover by using a TTS program. This is what I found:

  • Using, say, a foreign accent can trip me up. “Feral” was pronounced “FEE-ral”, and I’m used to soft sounds all the way through. I’m not sure if it was the accent (UK) or if the program wasn’t accustomed to that particular word. Next time I’ll try the US accent.
  • That said, the accent could be a powerful tool if you wanted to “cast” your narration in a different accent.
  • I was surprised how much I noticed things like a missing -s at the end of a word, and any tense shifts were blatant.
  • YES, typos stood out like a hammerhead on the thumb. But since words that sound the same and mean different things are pronounced the same way, this is not going to help you with homonyms! They’re, there, and their will come out of the program identically, but only one of them means what you intend in that particular context.
  • It was harder to tell dialogue from the rest because of tone monotony.
  • Probably one of the most amazing things was that the computer fell into a sort of melody. Rather like a song. And when I had written something that didn’t mesh right, that beat was disrupted. Now, this isn’t nearly as nice as a person would do it, but if there is anything which will be brutally honest with you, it’s a computer program.
  • It really is rather cool to copy-paste text into a window and listen to a narration. That’s just fun.
  • What is also fun is to put some well-known and often-recited piece of literature in the computer to see what happens. Sorry, Poe: ‘The Raven’ does not play well with Balabolka. It’s just missing that oomph.

The first time I played the sound, I wasn’t sitting attentively at the computer, and I missed some things I wish I had made a note of. In the future, I’m going to follow along with my work open in another window and make corrections as I hear the mistakes.

Is this a first-revision tool? No. It’s more like a third-draft tool. It’s something to help polish the work. Preferably, this is something you would do after you’ve run spellcheck and done your wrecking-ball revisions. This is for when you think the work is ready to go and you’ve put it aside for a while. My recommendation is to take a manuscript which has been sitting for a couple of weeks, load it up into a TTS program, and listen. Keep track of where you are, and make the changes that need to be made. You’ll hear them.

Just remember…homonyms don’t count in an oral reading. Also remember: There are three, yes, three ways to spell the word ‘weather’. Weather, whether, wether. “I often wondered whether the weather caused the wethers to shed their winter fur.” As for that last one, wether, it’s best to remember that these goats or sheep are missing two things. I had to bring up the poor, forgotten word ‘wether’ because so few people address it in homonym posts.
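Since the TTS voice can’t help here, one option is to let a small script point at the sound-alikes so you can check each one by eye. What follows is only a rough sketch, not anything Balabolka does: the word list is a tiny starter set I made up for illustration, and the name flag_homophones is hypothetical. It also only matches exact words, so a plural like “wethers” slips past it.

```python
import re

# Hypothetical starter list; extend it with the sound-alikes you personally mix up.
HOMOPHONE_SETS = [
    {"their", "there", "they're"},
    {"weather", "whether", "wether"},
    {"your", "you're"},
    {"its", "it's"},
]

# Map each word to the other members of its set for quick lookup.
LOOKUP = {
    word: sorted(group - {word})
    for group in HOMOPHONE_SETS
    for word in group
}

# A word is a run of letters, optionally followed by an apostrophe and more letters.
WORD_RE = re.compile(r"[A-Za-z]+(?:['’][A-Za-z]+)?")


def flag_homophones(text):
    """Return (line_number, word, alternatives) for each listed sound-alike found."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in WORD_RE.finditer(line):
            # Normalize case and curly apostrophes before looking the word up.
            word = match.group().lower().replace("’", "'")
            if word in LOOKUP:
                hits.append((line_number, word, LOOKUP[word]))
    return hits


if __name__ == "__main__":
    sample = (
        "I often wondered whether the weather\n"
        "caused the wethers to shed their winter fur."
    )
    for line_number, word, alternatives in flag_homophones(sample):
        print(f"line {line_number}: '{word}' (could it be {', '.join(alternatives)}?)")
```

The script doesn’t decide anything for you. It just lists every spot where one of these words appears, so you can look at each one in context while the TTS voice handles everything else.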

Well, that was today’s adventure.



(PS…while I’m on the subject, it isn’t billy goat. A male goat with his gonads intact is called a buck, and the female is called a doe. More on the proper ways to title animals later, in a separate post which will include a rant about geldings having foals.)



3 thoughts on “Using Text to Speech Programs for Editing”

    1. Probably not for everyone (my husband hates the computer voice, and I’m actually wondering if I might have a tool to chase him out of the house on occasion…) but I really like to hear a narration.
