Improving Speech Recognition in Stuttering: A Technical Analysis

Stutter Speech Analysis for Speech Recognition

Abstract: Stuttering can be defined as speech with involuntary disruptions, especially on initial consonants. This paper focuses on Mel Frequency Cepstral Coefficients (MFCCs) and on complementary methods, such as spectrogram and speech waveform analysis, for analyzing stuttered speech. We use cepstral analysis to distinguish between the speech of a fluent speaker and that of a stuttering subject. The database is recorded without background noise to improve clarity and accuracy in determining the Mel Frequency Cepstral Coefficients. We also use spectrograms to show the differences in formant peak changes and how to estimate them for speech analysis and for applications involving disfluencies. These features can be used to enhance speech recognition applications such as security systems, call detection and automated identification for people who stutter.
Keywords: Cepstral Analysis, Mel Frequency Cepstral Coefficients, Spectrogram, Stuttering.

Introduction
Only 5% to 10% of the human population has a completely normal mode of verbal communication with respect to various speech features and a healthy voice; the remaining 90% to 95% suffer from one disorder or another, such as stuttering, cluttering, dysarthria, or apraxia of speech [1], [2]. Stuttering can be identified in childhood and, in some cases, persists throughout life. It affects the fluency of speech. Most people produce brief disfluencies from time to time: some words are repeated, and others are preceded by "um" or "uh." Disfluencies are not necessarily a problem; however, they can impede communication when a person produces too many of them. There are various types of stuttering, which can be classified as shown in the
