While I was traveling through Atlantic Canada in July, this news story broke in Radio World:
“iHeartMedia’s plan to use Veritone’s voice-cloning technology for its podcast platform has some radio industry observers asking the obvious questions: How good does it sound and is broadcast radio far behind?
The largest radio company in the United States says that for now, the synthetic voice solution will only be used to translate podcasts from English to other languages for use on the iHeartPodcast Network, first for Spanish-speaking audiences. But Veritone officials confirm its technology could someday be used for advertising to reduce time-to-market and production costs for radio.”
You can read the complete article here: https://www.radioworld.com/news-and-business/headlines/veritone-synthetic-voice-gets-an-audition
It reminded me of an article I wrote on this very subject in December 2021. I thought readers might find my article of more interest now that the deployment of this technology is happening at warp speed.
Is it Live, or is it Memorex?
I remember when the audio quality of tape recorders improved so much that the question of the day was, “Is it live or is it Memorex?” Memorex was a company established in 1961 to sell magnetic computer tapes. In the 70s, Memorex moved into producing quality audio tape for recording music and voice.
TV commercials at that time featured Ella Fitzgerald singing a note that shattered a glass while simultaneously being recorded on an audio cassette. The recording would then be played back and would also shatter a glass, at which point the announcer would ask, “Is it live, or is it Memorex?”
Is AI Going to Replace Voicetracking?
Then Radio Ink published a story titled “Is AI going to replace voicetracking?” that got many of the people in my radio, podcasting and other social media groups talking.
Voicetracking technology has been used to replace live radio personalities for decades, but what AI presents the industry with is the possible ability to bring back the big-name radio personalities.
Dan Ingram, Larry Lujack, Robert W. Morgan, The Real Don Steele…
Imagine your radio market’s favorite radio personality returning to the airwaves. It’s not out of the realm of possibility.
A company called WellSaid Labs has created dozens of human voice avatars. All one needs to do to get them to talk is type text into a computer, and the voice will say it.
Imagine how it might sound if a creative person who has studied the style of an iconic personality wrote new, contemporary material to be delivered in that personality’s voice.
Now you might be wondering why anyone would want this type of technology. Well, Netflix now streams content worldwide and buys new content from producers all over the world. Much of that content is produced in its home country’s native language, so Netflix has to show it either with subtitles or with the dialog dubbed by voice actors speaking the language of the country where the material will air.
It might not surprise you to learn that when Netflix has offered viewers two ways of viewing a program, Americans, in particular, prefer voice-dubbing to subtitles. (I know I do.)
To speed up the process of voice-dubbing and to have voices that sound the same as the original actors, companies like WellSaid are developing artificial intelligence technology that can sample a voice and then re-create it automatically.
I already have conversations with Alexa and have wondered what she might sound like as a DJ on a radio station, haven’t you?
The afternoon DJ on KCSN, Andy Chanley, has been on the air there for over 32 years. Now, using a robot DJ named ANDY (Artificial Neural Disk-Jockey), Chanley’s voice will continue to be heard in many places throughout Southern California. During a demonstration for Reuters, reporters said that Chanley’s AI voice was hard to distinguish from his human voice.
You can listen for yourself to the computer-generated voices WellSaid has created by clicking on this link: https://wellsaidlabs.com/?#actors-preview-list
Is Your Favorite DJ Already a Robot?
WellSaid says its voice avatars are doing more than just DJ work; they are being used extensively in corporate training material and the creation of audio books.
Do I think I will live to see radio’s great personalities coming back to life? No, because I think there will be too many legal issues that might keep that from happening anytime soon.
But I do think that original voice avatars, teamed up with creative content developers, might just come into existence sooner than we imagine and provide us with an entirely new form of radio entertainment.
(This article was originally published on December 19, 2021)
6 responses to “Voice Cloning Technology”
Well, now we will have to copyright our voices (good luck on that) and get it into our contracts (the lawyers will like the extra income).
It certainly appears that’s going to be the case.
You already hear YouTube videos with fake voice narration, and I imagine radio conglomerates would love the idea of not even having to pay one person to voice track a station, but I’ll take my memories and not embrace the brave new world, thank you.
I think it was Mick Fleetwood who said automated drummers fail by being too perfect.
It’s the imperfection of a human drummer that makes it exciting for the listener. You never know what they’re going to do next, and that’s what makes a live performance so magical.
I think that’s what makes listening to a LIVE DJ more exciting.
No denying it, this is happening already. Radio has always tried “the next thing” (as has other tech), and the good ones (digital delivery, Orban/Omnia) have continued to thrive. The bad ideas will go away. If today’s radio is okay with sounding less personable than a 6th grade Coronet Film from the 60s, then so be it. If the world will adopt that, so be it. I’m beyond trying to guess at how these things will work or not. It boils down to personal taste. I’m already on a station that uses AI for its weather, and it’s as engaging as a bowl of prunes. There are great applications for this (vending machines for the vision impaired, instructions on utilizing an iPhone app, etc.) but nothing can match the emotional attachment of a real person. Could voice-synthesis replace Sammy Hagar’s rock and roll? I doubt it.
We agree, Dave.