The take was perfect. The audio wasn't.

It was a typical morning shooting for Where the Food Comes From. We had a place to be, we didn’t have enough time to get there, and we needed one last good shot of the host walking toward the camera, talking.

In the interest of speed, I slapped a lav mic on Chip, slapped the receiver on the audio recorder, and later slapped myself for not having actually listened to the audio while it was being recorded.

After all, my trusty Sennheiser G3s (now 15 years old) would never let me down, would they? And Chip was only 10 feet away delivering his line. It’ll be fiiiiiiiine. The headphones are just an extra step that’ll slow down the process.

We got the perfect take in one shot, jumped into the vehicles, and screamed away to the next location.

It wasn’t until later that I discovered the lav audio had some pretty nasty static bursts from radio interference, plus a couple of straight-up dropouts for good measure.

Reshooting wasn’t an option. Chip was back in Tampa. I was up in Georgia. What I had: the video of the take, the busted audio, and the craptacular on-camera audio of the same take.

Just as I was resigning myself to some Frankensteined ADR setup, I hit on an idea. Because there was a fourth thing I had: hours of footage containing Chip’s voice, recorded at full lav quality over multiple seasons of production.

I’d recently discovered ElevenLabs and subscribed at the lowest tier to play around. They’ve got excellent audio tools, including one called Voice Changer.

Here’s how it works. Voice Changer takes an audio recording and re-renders it through a voice clone you’ve trained. The input audio carries all the performance: the exact inflection, timing, and energy Chip had on that take. The clone carries the quality: the clean, consistent version of his voice trained on hours of good lav recordings.

I wasn’t replacing Chip’s voice with someone else’s. I was using his craptacular on-camera audio as a blueprint, and his best archived lav voice as the output. I fed ElevenLabs a few hours of his isolated speech from past shows, got a clone, then ran the on-camera take through Voice Changer with that clone as the target.

The result sounded like Chip. Because it was.

It matched close enough that I saw no need to inform the host of the incident. Chip, if you’re reading this: it was on the Shuman dock, when you were walking toward the camera.
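
If you’d rather script the Voice Changer step than click through it, here’s a rough Python sketch. It only builds the upload request: the on-camera audio goes in, and the clone’s voice ID picks the target. Treat the endpoint path, the `xi-api-key` header, the `audio` and `model_id` form fields, and the model name as assumptions based on ElevenLabs’ public speech-to-speech API docs, and check them against the current documentation before relying on them. The function name, file name, and voice ID are made up for illustration.

```python
import uuid
import urllib.request

# Assumed base URL for the ElevenLabs REST API; verify against their docs.
API_BASE = "https://api.elevenlabs.io/v1"


def build_voice_changer_request(voice_id, audio_bytes, api_key,
                                model_id="eleven_multilingual_sts_v2",
                                filename="on_camera_take.wav"):
    """Build (but do not send) a speech-to-speech upload request.

    voice_id    -- ID of the trained clone (the target voice)
    audio_bytes -- raw bytes of the rough take (the performance)
    model_id    -- assumed model name; confirm it in the current docs
    """
    boundary = uuid.uuid4().hex

    # Hand-rolled multipart/form-data body: one text field, one file field.
    body = b"".join([
        (f"--{boundary}\r\n"
         'Content-Disposition: form-data; name="model_id"\r\n\r\n'
         f"{model_id}\r\n").encode(),
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
         "Content-Type: audio/wav\r\n\r\n").encode(),
        audio_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])

    return urllib.request.Request(
        f"{API_BASE}/speech-to-speech/{voice_id}",
        data=body,
        method="POST",
        headers={
            "xi-api-key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To actually run the conversion, you’d read the take from disk, pass the request to `urllib.request.urlopen`, and write the response bytes to a new audio file. Keeping the build step separate from the send step means you can check what’s about to be uploaded before spending any credits.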

The tools have only gotten better since then. ElevenLabs has improved clone quality and lowered the bar for how much source audio you need. DaVinci Resolve Studio’s Fairlight has voice isolation that’s now aggressive enough to pull clean dialogue out of some pretty rough recordings, handling the interference problem at the source before you ever need a clone. Descript has a dialogue regeneration feature worth knowing about if you work in a text-based edit workflow.

None of these save a take where the performance is gone. You can’t add what wasn’t captured. But if you have the performance and you just lost the quality to RF interference, a dropout, or a cable bump, there’s more you can do about it now than there used to be.

The actual moral is: put the headphones on and listen to your audio while you’re recording. The G3s were 15 years old. Of course there was interference on the dock.

But sometimes you forget. And now you know what to try next.*

*You’ve got options for the gear problem too.
