CBM366 Speech and Audio Signal Processing Syllabus – Anna University Regulation 2021
COURSE OBJECTIVES
The objective of this course is to enable the student to
Acquire basic knowledge of speech production and hearing.
Understand time-frequency analysis concepts.
Learn fundamentals of audio coding and transform coders.
Understand time and frequency domain methods for speech processing.
Study linear predictive analysis of speech.
UNIT I MECHANICS OF SPEECH AND AUDIO
Introduction – Review of Signal Processing Theory – Speech production mechanism – Nature of Speech signal – Discrete time modelling of Speech production – Classification of Speech sounds – Phones – Phonemes – Phonetic and Phonemic alphabets – Articulatory features. Absolute Threshold of Hearing – Critical Bands – Simultaneous Masking, Masking Asymmetry, and the Spread of Masking – Nonsimultaneous Masking – Perceptual Entropy – Basic measuring philosophy – Subjective versus objective perceptual testing – The perceptual audio quality measure (PAQM) – Cognitive effects in judging audio quality.
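Illustrative sketch for the Absolute Threshold of Hearing topic: the threshold in quiet is commonly approximated by Terhardt's closed-form expression, which the audio coding literature uses as the floor for masking calculations. The function name below is an assumption chosen for illustration, and the formula models an average young listener rather than any individual.

```python
import math

def absolute_threshold_db_spl(freq_hz):
    """Approximate threshold in quiet (dB SPL) at freq_hz, Terhardt's formula."""
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)

# Print the threshold at a few frequencies across the audible range.
for f in (100, 1000, 3300, 10000):
    print(f, "Hz:", round(absolute_threshold_db_spl(f), 2), "dB SPL")
```

The curve is lowest near 3 to 4 kHz, where hearing is most sensitive, and rises steeply toward both ends of the audible range.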
UNIT II TIME-FREQUENCY ANALYSIS: FILTER BANKS AND TRANSFORMS
Introduction – Analysis-Synthesis Framework for M-band Filter Banks – Filter Banks for Audio Coding: Design Considerations – Quadrature Mirror and Conjugate Quadrature Filters – Tree-Structured QMF and CQF M-band Banks – Cosine Modulated “Pseudo QMF” M-band Banks – Cosine Modulated Perfect Reconstruction (PR) M-band Banks and the Modified Discrete Cosine Transform (MDCT) – Discrete Fourier and Discrete Cosine Transform – Pre-echo Distortion – Pre-echo Control Strategies.
UNIT III AUDIO CODING AND TRANSFORM CODERS
Lossless Audio Coding – Lossy Audio Coding – ISO-MPEG-1A, 2A, 2A Advanced, 4A Audio Coding – Optimum Coding in the Frequency Domain – Perceptual Transform Coder – Brandenburg-Johnston Hybrid Coder – CNET Coders – Adaptive Spectral Entropy Coding – Differential Perceptual Audio Coder – DFT Noise Substitution – DCT with Vector Quantization – MDCT with Vector Quantization.
UNIT IV TIME AND FREQUENCY DOMAIN METHODS FOR SPEECH PROCESSING
Time domain parameters of Speech signal – Methods for extracting the parameters: Energy, Average Magnitude – Zero crossing Rate – Silence Discrimination using ZCR and energy – Short Time Fourier analysis – Formant extraction – Pitch Extraction using time and frequency domain methods.
HOMOMORPHIC SPEECH ANALYSIS: Cepstral analysis of Speech – Formant and Pitch Estimation – Homomorphic Vocoders.
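Illustrative sketch for the time domain parameters in this unit: short-time energy and zero crossing rate are computed per frame, and a simple threshold rule separates silence from speech. The function names, the frame sizes, and the two thresholds are assumptions chosen for illustration; practical detectors usually estimate the thresholds from a noise-only lead-in segment.

```python
import math

def frame_signal(x, frame_len, hop):
    """Split x into overlapping frames of frame_len samples, hop samples apart."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def is_silence(frame, energy_thresh=1e-4, zcr_thresh=0.1):
    """Hypothetical rule: low energy and low ZCR together indicate silence."""
    return (short_time_energy(frame) < energy_thresh
            and zero_crossing_rate(frame) < zcr_thresh)

# Example: 25 ms of silence followed by 25 ms of a 100 Hz tone at 8 kHz sampling.
fs = 8000
tone = [math.sin(2 * math.pi * 100 * n / fs) for n in range(200)]
signal = [0.0] * 200 + tone
for frame in frame_signal(signal, frame_len=200, hop=200):
    print(short_time_energy(frame), zero_crossing_rate(frame), is_silence(frame))
```

Voiced speech tends to show high energy with a low zero crossing rate, while unvoiced fricatives show lower energy with a high zero crossing rate, which is why the two measures are used together.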
UNIT V LINEAR PREDICTIVE ANALYSIS
Formulation of Linear Prediction problem in Time Domain – Basic Principle – Autocorrelation method – Covariance method – Solution of LPC equations – Cholesky method – Durbin’s Recursive algorithm – Lattice formulations and solutions – Comparison of different methods – Application of LPC parameters – Pitch detection using LPC parameters – Formant analysis – VELP – CELP.
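Illustrative sketch for the autocorrelation method and Durbin's recursive algorithm: given autocorrelation values R(0)..R(p), the recursion solves the LPC normal equations in O(p²) operations and yields the reflection coefficients used by the lattice formulation as a by-product. The function names are assumptions for illustration, and the sign convention assumes the predictor s[n] ≈ Σ a[k]·s[n−k].

```python
def autocorrelation(x, max_lag):
    """Biased autocorrelation estimates R(0)..R(max_lag) of a (windowed) frame."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return (predictor coefficients a[1..p], reflection coefficients, final error)."""
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]                       # prediction error energy of order 0
    reflection = []
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                # i-th reflection (PARCOR) coefficient
        reflection.append(k)
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)         # error shrinks whenever |k| < 1
    return a[1:], reflection, err

# Example: the autocorrelation of a first-order process with pole at 0.9.
r = [0.9 ** k for k in range(3)]
coeffs, refl, err = levinson_durbin(r, order=2)
print(coeffs, refl, err)
```

For this example the second-order fit recovers a first coefficient close to 0.9 and a second coefficient close to zero, since one pole already explains the data.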
TOTAL: 45 PERIODS
COURSE OUTCOMES
Upon successful completion of the course, students will be able to
CO1: Examine auditory models to design a perceptual audio quality measure.
CO2: Design an analysis-by-synthesis model for speech perception.
CO3: Analyze and design algorithms for speech and audio coding.
CO4: Analyze and design algorithms for extracting parameters from the speech signal.
CO5: Implement pitch detection and formant analysis in speech signals.
TEXT BOOKS
1. Rabiner, L. R. and Schafer, R. W., “Digital Processing of Speech Signals”, Prentice Hall, 1978.
2. Andreas Spanias, Ted Painter, Venkatraman Atti, “Audio Signal Processing and Coding”, John Wiley & Sons, 2007.
REFERENCES
1. Udo Zölzer, “Digital Audio Signal Processing”, John Wiley & Sons Ltd, Second Edition, 2008.
2. Mark Kahrs, Karlheinz Brandenburg, “Applications of Digital Signal Processing to Audio and Acoustics”, Kluwer Academic Publishers, 2002.
3. Blake, “Electronic Communication Systems”, Thomson Delmar Publications, 2002.
4. Martin S. Roden, “Analog and Digital Communication System”, Prentice Hall of India, 3rd Edition, 2002.
5. Sklar, B., “Digital Communications: Fundamentals and Applications”, Pearson Education, 2nd Edition, 2007.
