- Clement Onime
- James Uhomoibhi
- Solomon Tulu
- Kenneth Okereafor
- Anamaria Meshkurti
Description
This course teaches students the current state of the art in Automatic Speech Recognition (ASR) and Natural Language Processing (NLP). The first part reviews the background of ASR and introduces the necessary probability theory. Students will learn key models and algorithms, including Hidden Markov Models (HMMs), Deep Neural Networks (DNNs), hybrid HMM/DNN systems, and the Baum-Welch training algorithm. Students will also learn about representations of the acoustic signal such as Mel-Frequency Cepstral Coefficients (MFCCs), and the use of Gaussian Mixture Models (GMMs) and context-dependent triphones for acoustic modeling. The course then covers N-gram language modeling and perplexity. Students will work in detail with ASR system development tools such as HTK, Sphinx and ESPRESSO. Students will also gain an understanding of Text-to-Speech (TTS) synthesis, including grapheme-to-phoneme conversion and prosody (intonation, boundaries and duration). Finally, students will compare and contrast past ASR techniques with current approaches to developing ASR systems.