Notes from the field

Looking to Listen: Audio-Visual Speech Separation

Very interesting approach by Google’s researchers to the “cocktail party problem”: the team trained a CNN on combined audio and visual signals to determine which person in a video is speaking, amplify that person’s speech, and suppress the other overlapping voices and background noise. Applications include better automated subtitles and improved hearing aids. https://research.googleblog.com/2018/04/looking-to-listen-audio-visual-speech.html
Posted by Inbar Mosseri and Oran Lang, Software Engineers, Google Research People are remarkably good at focusing their attention on a par…
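The core mechanism behind this kind of system is time-frequency masking: a network predicts, per speaker, a mask that is multiplied with the mixture spectrogram to keep that speaker's energy and suppress the rest. The sketch below (a toy illustration, not the paper's model) skips the network entirely and uses the "ideal ratio mask" computed from clean reference signals, just to show the masking step itself; the signals, function names, and parameters are all ours, assumed for the demo.

```python
import numpy as np

def stft_mag(x, frame=256, hop=128):
    """Magnitude spectrogram via framed real FFT (no windowing, for brevity)."""
    frames = [x[i:i + frame] for i in range(0, len(x) - frame + 1, hop)]
    return np.abs(np.fft.rfft(np.stack(frames), axis=1))

# Two synthetic "speakers" at well-separated frequencies, then their mixture.
t = np.arange(8000) / 8000.0
speaker_a = np.sin(2 * np.pi * 440 * t)
speaker_b = 0.8 * np.sin(2 * np.pi * 1760 * t)
mixture = speaker_a + speaker_b

A, B, M = stft_mag(speaker_a), stft_mag(speaker_b), stft_mag(mixture)

# Ideal ratio mask for speaker A: A's fraction of the energy in each
# time-frequency bin. A trained model would predict this from the mixture
# (and, in Looking to Listen, from the speaker's face); here we cheat and
# compute it from the references to isolate the masking idea.
mask_a = A / (A + B + 1e-8)
est_a = mask_a * M  # estimated magnitude spectrogram of speaker A

# Masking should move the estimate much closer to the target than the raw
# mixture is.
err_with_mask = np.linalg.norm(est_a - A)
err_without = np.linalg.norm(M - A)
print(err_with_mask < err_without)
```

In the real system the mask is the network's output and is applied to a complex spectrogram before inverting back to a waveform; the comparison printed above just confirms that a good mask recovers the target speaker from the mixture.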

2018-05-07

linkedin cross-post


The standard disclaimer…

The views, thoughts, and opinions expressed in the text belong solely to me, and not necessarily to my employer, organization, committee, or any other group I belong to or am associated with.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
© 2023 Rob Aleck, licensed under CC BY-NC 4.0