Blog: Artificial Intelligence Helps Neuroscientist Transform Brain Waves into Physical Speech – TechNewsObserver
The medical establishment has long believed that people who lose the ability to speak, whether from a stroke or from another medical condition, may never fully regain it. A new study, however, suggests there may be new hope for fuller rehabilitation, one that goes beyond traditional therapy.
A team of researchers from the University of California, San Francisco (UCSF) says that technology might soon be able to harness the specific brain activity that produces speech. The scientists implanted electrodes into the brains of some volunteers and decoded signals transmitted within the brain's speech centers. Those signals then guided a computer-simulated version of the vocal tract (the lips, jaw, tongue, and larynx) to generate synthesized speech.
“For the first time, this study demonstrates that we can generate entire spoken sentences based on an individual’s brain activity,” explains senior study author Edward Chang in a press release.
More importantly, the synthesized speech was mostly intelligible, which suggests the method could eventually become a real alternative for people with speech loss. Sure, some passages may have been slurred, but the overall result is quite impressive.
The UCSF professor of neurological surgery goes on to say, “This is an exhilarating proof of principle that with technology that is already within reach, we should be able to build a device that is clinically viable in patients with speech loss.”
Lead researcher Gopala Anumanchipalli explains that the breakthrough links brain activity with physical movements in the mouth and throat during speech, instead of associating specific brain signals with acoustics.
The speech scientist notes, “We reasoned that if these speech centers in the brain are encoding movements rather than sounds, we should try to do the same in decoding those signals.”
Finally, UCSF bioengineering student Josh Chartier remarks that this is only the beginning. There is still much to learn, but the study is a very good place to start. For one, he says, “We’re quite good at synthesizing slower speech sounds like ‘sh’ and ‘z’ as well as maintaining the rhythms and intonations of speech and the speaker’s gender and identity, but some of the more abrupt sounds like ‘b’s and ‘p’s get a bit fuzzy. Still, the levels of accuracy we produced here would be an amazing improvement in real-time communication compared to what’s currently available.”
The results of this study have been published in the journal Nature.