Language Learning

The Audio in My First Books Is Wrong

A Portuguese speaker listened to my reader app for about a minute and caught a problem I never could have caught myself.

Yesterday I published a whole post about the reader app I built to teach myself Portuguese. Today someone who actually speaks Portuguese used it. It took them about a minute to find a problem.

The word “tá” is wrong in my first books. The audio says something closer to “teh.” Every page that has “tá” in it, which in this series is a lot of pages, has a robot voice confidently teaching the wrong sound.

Why this one hurts

“Tá” isn’t some rare word I can quietly fix later. It’s one of the most common words in the whole series. It’s the spoken Brazilian version of “está,” and I chose it on purpose because I wanted the books to sound like real spoken Portuguese instead of textbook Portuguese.

So the word I picked specifically to sound natural is the word the audio says unnaturally. That’s a pretty good joke at my expense.

It hurts for a second reason. The audio in these books is load-bearing. Each page is one sentence, and a beginner is supposed to be able to trust the voice completely. A beginner will copy whatever sound they hear, over and over, twelve books in a row. Wrong audio in a learning tool isn’t a typo. It’s teaching the mistake.

The part I can’t fix about myself

Here’s the uncomfortable lesson. I listened to those files plenty of times and never noticed. Of course I didn’t. I’m the beginner. My ear is exactly as untrained as the ears of the people the app is for.

I can check a lot of things myself. I can check that every word in a book was properly introduced. I can check the word counts, the grammar rules, the images. I wrote scripts that check most of it for me. But “does this actually sound like Brazilian Portuguese” is the one question where my own judgment is worth nothing. That check needs a human with the right ears, and I’m not that human yet.
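For contrast, here is roughly what the machine-checkable side can look like. This is a minimal sketch, not the actual scripts behind the app: the `books` layout (an ordered list of dicts with `title`, `introduces`, and `pages`) and the function name are assumptions made up for illustration.

```python
import re


def unintroduced_words(books):
    """Flag words that appear on a page before any book has introduced them.

    `books` is a hypothetical structure: an ordered list of dicts, each with
    'title' (str), 'introduces' (words this book teaches), and 'pages'
    (a list of one-sentence strings).
    Returns a list of (title, page_number, word) tuples.
    """
    known = set()
    problems = []
    for book in books:
        # Words a book introduces count as known from that book onward.
        known.update(word.casefold() for word in book["introduces"])
        for number, sentence in enumerate(book["pages"], start=1):
            # \w+ matches accented letters too, so "tá" stays one token.
            for token in re.findall(r"\w+", sentence.casefold()):
                if token not in known:
                    problems.append((book["title"], number, token))
    return problems
```

A check like this can prove a word was introduced. It cannot say a single thing about how the word sounds.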

The weird clue

Strangest part: the later books are mostly fine. Same word, same robot voice setup, better “tá.” I don’t know why yet. The audio was generated in batches over time, so my best guess is that something changed along the way, maybe the voice, maybe how the sentences were fed in. I’ll figure out where the good batches start before I regenerate anything.

The plan

  1. Find out which books have the bad “tá” and which don’t. That means actually listening, with a checklist.
  2. Regenerate the audio for the bad books, and this time compare the new files against the books that sound right.
  3. Add the missing step to my process: before audio counts as done, a person who speaks Portuguese hears it. Not the guy who built the app. Not the robot that made it.
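The checklist in step 1 is the one part of this that a script can prepare. Below is a minimal sketch, assuming each book is available as a list of one-sentence pages; the `books` dict and the `pages_with_word` name are hypothetical, not how the app actually stores anything. It only finds the pages to listen to. The listening still needs a person.

```python
import re
import unicodedata


def pages_with_word(books, word="tá"):
    """List every page whose sentence contains `word` as a whole word.

    `books` is a hypothetical mapping of book title -> list of page
    sentences. Returns (title, page_number, sentence) tuples in order.
    """
    # Normalize so "á" typed as one character or as "a" plus a combining
    # accent compares equal.
    target = unicodedata.normalize("NFC", word).casefold()
    hits = []
    for title, pages in books.items():
        for number, sentence in enumerate(pages, start=1):
            text = unicodedata.normalize("NFC", sentence).casefold()
            # Whole-word match: "está" must not count as "tá".
            if target in re.findall(r"\w+", text):
                hits.append((title, number, sentence))
    return hits
```

Printing the result one line per page gives a list a Portuguese speaker can tick off as "sounds right" or "sounds wrong", which should also show where the good batches start.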

The books are still up while I fix this. That’s a judgment call. Broken “tá” is a real flaw, but thirty books of understandable Portuguese with one bad vowel still beats zero books. If you’re using the app in the meantime, now you know more than the audio does on at least one word.

Field notes, not instructions. This one was a field note.

#portuguese #leitura #failure