Speech Recognition


Speech recognition is the act of a computer listening to what you are saying and converting it to written text. This may seem like a simple task, given how fast and powerful computers are, but it is quite the contrary. Most recognition software can achieve between 98% and 99% accuracy if operated under optimal conditions. Optimal conditions assume that users have speech characteristics that match the training data, can achieve proper speaker adaptation, and work in a low-noise environment (e.g. a quiet office or laboratory space).

The two essential steps that a speech recognition system must accomplish are training and decoding. There are two classes of speech recognition: speaker independent, which has a small vocabulary of words and commands, and speaker dependent, which has a very large vocabulary but must be trained for each and every user. For a speaker dependent system, the training step might involve a user reading a book aloud to the computer while the system follows along with the words being enunciated. It can also involve feeding in prerecorded speech and transcribing the audio into the corresponding text. A speaker independent system's training involves collecting different commands and configuring them for different accents, for the differences between male and female voices, and for slang, acronyms, articulation of the words, and temporal non-uniformity.

An intriguing hurdle that speech recognition must overcome is homophones, which are words that sound the same but have different meanings. The common solution to this problem is to understand the context in which the possible words are used and pick the word that fits. This solution can also be used in all forms o... ... middle of paper ... ...of the voice. One recent application of voice recognition technology in entertainment is the horror movie Last Call. When viewers buy their tickets, they are asked to provide their cell phone number.
Before the movie starts, the database of phone numbers for that showing is sent to the company. At some point during the movie, an audience member's cell phone will ring, and it is up to this audience member to give the character on screen directions. Astonishingly, the movie is controlled by a random viewer's voice, and the software has to overcome the loud background noise of the movie as well. Voice recognition has even reached the video game market. The defining feature of these games is that the player controls the game entirely by using a microphone to speak commands to the on-screen characters. The commands are interpreted by the in-game voice recognition software.
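
The idea mentioned earlier, of using context to choose between words that sound the same, can be illustrated with a small sketch. This is not how any particular recognition product works. It assumes a toy training text (CORPUS), a hand-written list of same-sounding word sets (HOMOPHONES), and a simple count of neighbouring word pairs, and all of those names are hypothetical.

```python
from collections import Counter

# Toy training text (an assumption for illustration, not real training data).
CORPUS = (
    "i want to go there . "
    "they left their coats here . "
    "we went to the store . "
    "i have two dogs . "
    "she ate too much . "
    "their house is over there ."
).split()

# Count how often each pair of adjacent words appears in the toy text.
BIGRAMS = Counter(zip(CORPUS, CORPUS[1:]))

# Hypothetical sets of words that a recognizer cannot tell apart by sound.
HOMOPHONES = [{"to", "too", "two"}, {"there", "their"}]


def pick_word(prev_word, candidates, next_word):
    """Return the candidate seen most often next to its neighbours.

    Ties are broken alphabetically, because max() keeps the first
    best-scoring item of the sorted candidate list.
    """
    def score(word):
        return BIGRAMS[(prev_word, word)] + BIGRAMS[(word, next_word)]
    return max(sorted(candidates), key=score)


def disambiguate(words):
    """Replace each same-sounding word with the best fit for its context."""
    result = list(words)
    for i, word in enumerate(result):
        for group in HOMOPHONES:
            if word in group:
                # Use "." as a stand-in neighbour at the sentence edges.
                prev_word = result[i - 1] if i > 0 else "."
                next_word = result[i + 1] if i + 1 < len(result) else "."
                result[i] = pick_word(prev_word, group, next_word)
    return result


if __name__ == "__main__":
    print(" ".join(disambiguate("i have to dogs".split())))
    print(" ".join(disambiguate("they left there coats here".split())))
```

A real system would rely on far more training text and a richer language model, but the principle is the same: the surrounding words decide which spelling is picked.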
