It's already making the rounds in various online news outlets that Reddit banned deepfakes (AI-assisted fake pornographic videos), and naturally it's causing all manner of consternation as people on every side of the issue get all twisted up and yell at each other incoherently. What's slipping through the cracks, however, is that there is also technology out there to fake voices, and while it's not great, it's not as absolutely terrible as one might expect. Since I have no desire to see myself superimposed on the body of another, I figured I might as well see how good a computer was at faking my voice, since so many things take only a very brief conversation to authorize these days.
In order to prime the software you have to record yourself reading a bunch of sentences, enough material for at least 30 seconds according to the prompts. Once you have that corpus of material ready, you tell the service to go build your voice (I was imagining Bene Gesserit Voice training while it processed), and when it's done you can type in anything you want and the synthesized version of your voice spits out the phrase, for better or for worse.
Generated with Lyrebird.ai
Naturally there are some modulated sounds in the generated one; however, having reviewed recorded phone calls of myself, it sure could pass for me if the phone mic was bad. What is scary is that it correctly hit the emphasis that I naturally put on some words, enough that I suspect it might have done better had I not been sick when recording this and had I had a better soundproofed room to do it in. Of course, for about 30 minutes of screwing around on the site re-recording my various gaffes, I think it did an admirable job of spoofing my voice, and I suspect that given enough time to refine the software it could probably get pretty good. Fortunately I’m broke compared to the Hollywood folks who are turning coal into diamonds right now worrying about faking technology producing sex tapes that they never actually starred in.