EE 627A: Speech Signal Processing

The lectures could not be conducted after the midsem break due to COVID-19. The e-resources (PDF notes with accompanying audio files) for these lectures follow.

Sl. no Lecture Lecture note (pdf) Audio (mp3)
1 Pattern Recognition pdf audio
2 Gaussian Mixture Modeling pdf audio
3 Hidden Markov Model for Speech Recognition. Part-1 pdf audio1, audio2, audio3
4 Hidden Markov Model for Speech Recognition. Part-2 pdf audio1, audio2

Instructions for using the e-resources:

  • Open/download the .pdf file.
  • Open/download the corresponding audio (in .mp3 format) file.
  • Listen to this audio while referring to the pdf file.

Announcements for uploading Assignments:

Announcements for Term Projects:

  • Be ready with your term paper (code, report, and presentations).
  • The evaluation date and submission process will be updated soon.

Instructor: Prof. R M Hegde, Dept. of EE, IIT Kanpur

Welcome to the course web page of EE 627A @ IIT Kanpur

This course covers both the theory and the practice of speech signal processing. It requires a basic background in digital signal processing and probability theory. Although the course includes math, the key idea is for participants to appreciate the math behind the practice without getting lost in the math itself. The assignments will include reading, math, and implementation work, and will remain challenging for participants who enjoy math. Although the course content lists several books, the quizzes and the finals will be based only on what is delivered in class, the assignments, and the class notes (which are essentially portions of specific textbooks). The projects will cover various topics in speech and audio processing in general and will use tools such as HTK, CMU Sphinx, deep learning, VoiceXML, and MATLAB. Each participant is expected to turn in a report, give a demo, and present the project assigned to them. The instructor will support the projects to the best extent possible.

Primary References


Practical References

  1. http://htk.eng.cam.ac.uk/
  2. http://cmusphinx.sourceforge.net/
  3. https://keras.io/

Other References on Speech Recognition

  1. Elsevier Speech Communication
  2. EURASIP JASP and JASMP
  3. Elsevier Signal Processing
  4. Wiley books on Speech Recognition
  5. Pearson books on Speech Recognition
  6. JASA

Notable Speech Conferences

  1. ICASSP
  2. INTERSPEECH
Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License