Looking to Listen: Audio-Visual Speech Separation
Very interesting approach by Google’s researchers to the “cocktail party problem”. The team trained a deep audio-visual network that looks at the faces in a video to isolate the speech of a selected person from multiple overlapping speakers, amplifying that voice while suppressing other sounds. Applications include better automated subtitles and improved hearing aids. https://research.googleblog.com/2018/04/looking-to-listen-audio-visual-speech.html
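At its core, this kind of separation works by predicting a time–frequency mask for the target speaker and multiplying it with the mixture’s spectrogram. A minimal sketch of that final masking step, assuming NumPy and a hypothetical mask (in the real system the mask is the output of the audio-visual network; here it is just a placeholder):

```python
import numpy as np

def apply_speaker_mask(mixture_spec, mask):
    """Isolate one speaker by multiplying the mixture's complex
    spectrogram with a predicted time-frequency mask in [0, 1].
    The mask would come from the audio-visual network; here it
    is supplied directly as a placeholder."""
    assert mixture_spec.shape == mask.shape
    return mixture_spec * mask

# Toy example: 4 frequency bins x 3 time frames.
rng = np.random.default_rng(0)
mixture = rng.normal(size=(4, 3)) + 1j * rng.normal(size=(4, 3))
mask = rng.uniform(size=(4, 3))  # hypothetical network output
isolated = apply_speaker_mask(mixture, mask)
```

The masked spectrogram is then inverted back to a waveform (e.g. with an inverse STFT) to produce the cleaned-up audio for that speaker.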
Posted by Inbar Mosseri and Oran Lang, Software Engineers, Google Research