SPS Chicago Chapter Seminar – Speech Recognition: What’s Left? – by Dr. Michael Picheny (09-05-2019)

SPS Chicago Chapter Seminar
Department of Electrical and Computer Engineering
Speaker: Dr. Michael Picheny, IBM T. J. Watson Research Center

Date: Thursday, September 5, 2019
Time: 11:00 AM – 12:00 PM
Location: 1000 SEO, 851 S. Morgan St, Chicago, IL 60607

Title: Speech Recognition: What’s Left?

Abstract: Recent results on the SWITCHBOARD corpus suggest that, thanks to advances in Deep Learning, speech recognition systems now achieve Word Error Rates comparable to those of human listeners. Does this mean the speech recognition problem is solved and the community can move on to a different set of problems? In this talk, we examine speech recognition issues that still plague the community and compare and contrast them with what is known about human perception. We specifically highlight issues in accented speech, noisy/reverberant speech, speaking style, rapid adaptation to new domains, and multilingual speech recognition. We aim to demonstrate that, compared to human perception, there is still much room for improvement, so significant work in speech recognition research is still required from the community.

Bio: Michael Picheny is a Distinguished Research Staff Member in Speech Technologies in IBM Research AI at the IBM T. J. Watson Research Center. Michael has worked in the Speech Recognition area since 1981, joining IBM after finishing his doctorate at MIT. He has been heavily involved in the development of almost all of IBM’s recognition systems, ranging from the world’s first real-time large vocabulary discrete system in 1984 through IBM’s product lines for telephony and embedded systems in the 1990s, and most recently was responsible for putting out a set of Speech Services for both Speech Recognition and Speech Synthesis during his tenure in IBM’s Watson Group. He has published numerous papers in both journals and conferences on almost all aspects of speech recognition. He has received several awards from IBM for his work, including a corporate award, three Outstanding Technical Achievement Awards, and two Research Division Awards. He is the co-holder of over 50 patents and was named a Master Inventor by IBM in 1995 and again in 2000. Michael served as an Associate Editor of the IEEE Transactions on Acoustics, Speech, and Signal Processing from 1986 to 1989, was the chairman of the Speech Technical Committee of the IEEE Signal Processing Society from 2002 to 2004, and is a Fellow of the IEEE. He served multiple times as an Adjunct Professor in the Electrical Engineering Department of Columbia University and co-taught a course in speech recognition. He was a member of the board of ISCA (International Speech Communication Association) from 2005 to 2013 and was named an ISCA Fellow in 2014. He was the co-general chair of the IEEE ASRU 2011 Workshop in Hawaii. He is currently a Distinguished Industry Speaker of the Signal Processing Society of the IEEE.

Michael recently stepped down from management to return to full-time research. Prior to this, he had been a manager for 35 years in the Speech area at IBM, and had led the Speech team in Yorktown Heights since 2007. Activities in the broader speech team currently cover a multitude of areas in speech and language processing, including core technology innovation in all aspects of speech recognition and synthesis, applications of speech recognition to customer care and media captioning, and, in conjunction with the MIT-IBM Watson AI Laboratory and MIT researchers, the exploration of unsupervised methods for joint modeling of speech and video.

More info on his work can be found at: https://researcher.watson.ibm.com/researcher/view.php?person=us-picheny

Hosts: Prof. Karen Livescu (klivescu@ttic.edu), Prof. Mojtaba Soltanalian (msol@uic.edu)
Chapter Chair: Rashid Ansari
Chapter Vice Chair: Mojtaba Soltanalian

***Refreshments will be provided***
