Amazon uses a child’s dead grandma in a morbid Alexa audio deepfake demo

The fourth-generation Amazon Echo Dot smart speaker.


Amazon is figuring out how to get its Alexa voice assistant to deepfake the voice of anyone, dead or alive, from just a short recording. The company demoed the feature at its re:Mars conference in Las Vegas on Wednesday, using the emotional trauma of the ongoing pandemic and grief to spark interest.

Amazon’s re:Mars focuses on artificial intelligence, machine learning, robotics, and other emerging technologies, with technical experts and industry leaders taking the stage. During the keynote on day two, Rohit Prasad, senior vice president and head scientist of Alexa AI at Amazon, demonstrated a feature being developed for Alexa.

In the demo, a child asks Alexa, “Can granny finish reading to me The Wizard of Oz?” Alexa responds in her signature feminine robotic voice, “Okay.” But next, the child’s grandma’s voice comes out of the speaker to read the story by L. Frank Baum.

You can check out the demo below:

Amazon re:MARS 2022 – Day 2 – Keynote.

Prasad said only that Amazon is working on the Alexa skill. He didn’t specify what work remains or when, if ever, it will be available.

He did, however, offer a few brief technical details.

“This required an invention where we had to learn to produce a high quality voice in less than a minute of recording time as opposed to hours of recording in a studio,” he said. “The way we did it was that we framed the problem as a voice conversion task rather than a speech generation task.”

Prasad very briefly went into how the feature works.

Of course, deepfaking has acquired a controversial reputation. Still, there have been efforts to use the technology as a practical tool rather than a source of creepiness.

In particular, as noted by The Verge, audio deepfakes have been used in media to patch things up when, for example, a podcaster flubs a line or a project’s star suddenly dies, as was the case with the Anthony Bourdain documentary Roadrunner.

There are even cases of people using AI to create chatbots that communicate as if they were a lost loved one, the publication noted.

Alexa wouldn’t even be the first consumer product to use deepfake audio to stand in for a family member who can’t be there in person. Takara Tomy’s smart speaker, as highlighted by Gizmodo, uses AI to read children bedtime stories in a parent’s voice. Parents reportedly upload their voices, so to speak, by reading a script for about 15 minutes. This differs notably from Amazon’s demo, however: the product’s owner chooses to provide their own voice, rather than the product using the voice of someone who is likely unable to give permission.

Aside from concerns about deepfakes being used for scams, rip-offs, and other nefarious activities, there are already some troubling things about how Amazon is designing the feature, which doesn’t even have a release date yet.

Before showing the demo, Prasad talked about Alexa giving users a “companionship relationship.”

“In this companion role, human attributes like empathy and affect are key to building trust,” the executive said. “These qualities have become even more relevant during these times of the ongoing pandemic, when so many of us have lost someone we love. While AI can’t remove that pain of loss, it can definitely make their memories lasting.”

Prasad added that the feature “enables lasting personal relationships.”

It’s true that countless people are in serious search of human “empathy and affect” in response to the emotional distress brought on by the COVID-19 pandemic. However, Amazon’s AI voice assistant is not the place to meet those human needs. Nor can Alexa enable “lasting personal relationships” with people who are no longer with us.

It’s not hard to believe that there are good intentions behind this in-development feature, and that hearing the voice of someone you miss can be a great comfort. In theory, we could even imagine having fun with such a feature. Getting Alexa to make a friend sound like they said something stupid is harmless. And as we discussed above, other companies are using deepfake technology in ways similar to what Amazon demonstrated.

But an in-development Alexa skill for reconnecting with deceased family members is a giant, unrealistic, and problematic leap. Meanwhile, tugging at heartstrings by invoking pandemic-related grief and loneliness feels gratuitous. There are some places Amazon doesn’t belong, and grief counseling is one of them.
