AI Gives Back Voice to the One Who Lost It

VN (according to VnExpress) February 19, 2025 20:08

AI recreates surprisingly realistic voices, even creating expressive avatars, for patients who cannot communicate normally.

Jules Rodriguez performs comedy on stage. Photo: ElevenLabs

Jules Rodriguez, a 40-year-old Miami man, has been losing his ability to speak since he was diagnosed with amyotrophic lateral sclerosis (ALS) in 2020. In 2024, doctors feared Rodriguez would no longer be able to breathe on his own. So he had a small tube inserted into his windpipe to help him breathe. The tracheotomy extended his life, but also took away his voice.

Rodriguez and his wife, Maria Fernandez, thought they would never hear his voice again. However, artificial intelligence (AI) has worked a miracle, allowing Rodriguez to communicate using his old voice.

“Hearing my own voice again after a long time was really refreshing,” Rodriguez said. He now communicates by typing sentences using an eye-tracking device. These sentences are then spoken aloud by a replica of Rodriguez’s voice, enhancing his ability to interact and connect with others. He even uses it to perform comedy on stage.

Rodriguez is one of more than 1,000 people with speech difficulties who have used a voice-cloning tool developed by the US company ElevenLabs and offered free of charge to patients. Like many new technologies, these AI voice clones are not perfect, and some people find them impractical in everyday life.

Still, they are a huge improvement over older communication technology and are improving the lives of people with motor neurone disease, according to Richard Cave, a speech and language therapist at the Motor Neurone Disease Association in the UK. “This is really AI for good,” Cave said.

Rodriguez began experiencing symptoms of ALS in the summer of 2019. Like other ALS patients, he was advised to “archive” his voice by recording himself saying hundreds of phrases. These recordings were used to create an “archive voice” for communication devices. However, the resulting voice was choppy and robotic.

ElevenLabs was founded in 2022 and began developing AI voices for use in movies, TV shows, and podcasts. The initial goal was to improve the quality of voiceovers, making voices in other languages sound more natural, according to Sophia Noel, who oversees the company’s partnerships with nonprofits.

But then the technical lead for Bridging Voice, an organization that helps ALS patients communicate, told ElevenLabs that its voice clones were very helpful to the people the organization serves. In August 2024, ElevenLabs launched a program to provide the technology free of charge to people with speech difficulties.

The technology makes recreating a patient’s voice much quicker and easier. Instead of recording hundreds of phrases, users can upload audio from old voice messages or videos. “You need at least a minute of audio to create anything, but ideally about 30 minutes. You upload it to ElevenLabs. After about a week, the voice is created,” Noel said.

Where the archived voice sounded robotic, the voice clone sounds very natural. The speech is still a bit too fast and lacks some emotional quality, but it is a huge step forward.

Cave introduced the technology to people with motor neurone disease (MND) a few months ago. Of those, 130 have started using it, and the feedback is positive. The voice clones sound much more realistic than the archived voices. “They have pauses for breath, ums and uhs, sometimes stuttering. To me, that sounds very real because I want a stuttering synthetic voice too. That’s who I am,” says Cave, who has a mild stutter.

Joyce Esser and her husband on vacation in the Maldives. Photo: Joyce Esser

Voice cloning is not yet a perfect aid to speech. Before a voice clone can say anything, the words must be typed. There are devices that allow MND patients to type using their fingers, eyes or tongue movements. This works well for prepared text, but typing is not instantaneous and creates pauses in every face-to-face conversation.

Joyce Esser, one of the 130 people Cave introduced to the technology, was happy to be able to recreate her old voice. But she found the technology impractical. “It’s good for prepared speeches, but not for conversation,” she said. Esser also found that the voice clone’s volume was too low for people to understand, and that the voice was too fast and not expressive enough. She wished she could use emojis to express excitement or anger.

“The problem I had was that when I wrote something long, the AI voice seemed to get tired,” Rodriguez shared.

“We seem to have the authenticity of the voice. What we need now is the authenticity of the delivery,” Cave said.

The Scott-Morgan Foundation, a charity, is looking to pair ElevenLabs' voice clones with an additional technology: hyper-realistic avatars for MND patients. These digital clones have a human appearance and voice, and can speak on screen.

Creating the avatar wasn’t easy. Erin Taylor, who was diagnosed with ALS when she was 23, had to speak 500 sentences on camera and stand for five hours to create it. But the results were impressive. Taylor introduced her avatar at a tech conference in January with a pre-written speech.

“Facial expressions are such an important part of communication, so the idea of an avatar seemed really cool. The avatar doesn’t obscure the user’s face… you can still see into their eyes and soul,” Esser said.

The Scott-Morgan Foundation will continue to work with technology companies to develop more communication tools for those who need them. ElevenLabs also plans to partner with other organizations that support people with speech difficulties to help more patients access new technology. “Our goal is to give the power of voice to 1 million people,” Noel said.

Meanwhile, Cave, Esser and Rodriguez are eager to spread the word about voice cloning to others in the MND community. “This is really a game-changer for us. It doesn’t take away a lot of what we’re dealing with, but it really strengthens the bond we have as a family,” Fernandez said.
