A survey on Speech Recognition

International Journal of Computer Trends and Technology (IJCTT), Volume 4, Issue 9, Sep 2013. ISSN: 2231-2803. http://www.ijcttjournal.org. Page 3036.

V. Malarmathi, M.C.A. (1), Dr. E. Chandra, M.Sc., M.Phil., Ph.D. (2)
(1) Research Scholar, Dr. SNS Rajalakshmi College of Arts & Science, Coimbatore, India.
(2) Director, Department of Computer Science, Dr. SNS Rajalakshmi College of Arts & Science, Coimbatore-32, India.

Abstract: Speech is the most prominent and primary mode of communication among human beings. Communication between humans and computers takes place through what is called a human-computer interface. Speech processing is the study of speech signals and of the methods used to process them. The signals are usually processed in a digital representation. The field is closely tied to Natural Language Processing (NLP); one example is speech-to-text conversion. Since even before the time of Alexander Graham Bell's revolutionary invention, engineers and scientists have studied the phenomenon of speech communication with an eye on creating more efficient and effective systems of human-to-human and human-to-machine communication. Our goal is to provide a useful introduction to the wide range of important concepts that comprise the field of digital speech processing. Speech processing aims to provide a more efficient representation of the speech signal. It is divided into various categories such as speech recognition, speaker recognition, speech coding, voice analysis, speech synthesis, and speech enhancement. This paper mainly discusses speech recognition. Speech recognition, the process of speaking words into a computer and having text appear on the screen or having the computer perform various functions, is one of the most exciting and potential-filled technologies available for students with special needs.

Keywords: Speech Recognition, Automatic Speech Recognition (ASR), Voice Analysis, Speech Synthesis

I.   INTRODUCTION

Speech recognition is the process of recognizing speech uttered by a speaker, and it has long been an active field of research. Voice communication is the most effective mode of communication used by humans. The significance of speech recognition lies in its simplicity. It can be used in many applications such as security devices, household appliances, cellular phones, ATM machines, and computers.

In computer science, speech recognition (SR) is the translation of spoken words into text: through a speech recognition program or application, the computer processes the words you say and turns them into text on the screen. It is also known as automatic speech recognition (ASR), computer speech recognition, speech to text, or just STT.

Some SR systems use training, in which an individual speaker reads sections of text into the SR system. These systems analyze the person's specific voice and use it to fine-tune the recognition of that person's speech, resulting in more accurate transcription. Systems that use training are called speaker-dependent systems; systems that do not use training are called speaker-independent systems. Most speech recognition systems can be classified according to the following categories [1][2].

II.   SPEECH RECOGNITION

A.   Isolated-Word Recognition

Isolated-word recognition can be either speaker-trained or speaker-independent. This technology opened up a class of applications called 'command-and-control' applications, in which the system recognizes a single-word command (from a small vocabulary of single-word commands) and responds appropriately to the recognized command. One key problem with this technology is its sensitivity to background noise (which was often recognized as spurious spoken words) and to extraneous speech inadvertently spoken along with the command word.
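As a rough illustration of the command-and-control idea, the sketch below matches an utterance against one stored template per command word and rejects input that matches no command well, which is one simple way to avoid treating noise as a spoken word. The survey does not specify a matching algorithm. The use of dynamic time warping over one-dimensional feature sequences, the template values, and the names `dtw_distance`, `recognize_command`, `TEMPLATES`, and `reject_above` are all assumptions made for this example.

```python
from typing import Dict, List, Optional


def dtw_distance(a: List[float], b: List[float]) -> float:
    """Dynamic time warping distance between two 1-D feature sequences."""
    inf = float("inf")
    # cost[i][j] holds the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Allow stretching either sequence or advancing both together.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


def recognize_command(
    features: List[float],
    templates: Dict[str, List[float]],
    reject_above: float,
) -> Optional[str]:
    """Return the closest command word, or None if nothing matches well."""
    best_word: Optional[str] = None
    best_dist = float("inf")
    for word, template in templates.items():
        dist = dtw_distance(features, template)
        if dist < best_dist:
            best_word, best_dist = word, dist
    # Reject poor matches so background noise is not mapped to a command.
    return best_word if best_dist <= reject_above else None


# Hypothetical templates: one made-up feature sequence per command word.
TEMPLATES = {
    "start": [1.0, 3.0, 5.0, 3.0],
    "stop": [5.0, 4.0, 1.0, 1.0],
}

# A slower rendition of "start" still aligns with its template.
slow_start = [1.0, 1.0, 3.0, 5.0, 5.0, 3.0]
print(recognize_command(slow_start, TEMPLATES, reject_above=2.0))  # start

# Input unlike any template is rejected instead of being forced to a word.
noise = [9.0, 9.0, 9.0]
print(recognize_command(noise, TEMPLATES, reject_above=2.0))  # None
```

The rejection threshold is the design choice that matters here: without it, every sound would be mapped to the nearest command, which is exactly the spurious-word problem described above.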
Various keyword-spotting algorithms evolved to cope with background noise and extraneous speech spoken around the command word [1][2].

B.   Connected Word Recognition

Connected word recognition can be either speaker-trained or speaker-independent. This technology was built on top of isolated-word recognition: it exploits the word models that were successful in isolated-word recognition and extends the modeling to recognize a concatenated sequence (a string) of such word models as a word string. It opened up a class of applications based on recognizing digit strings and alphanumeric strings, and led to a variety of systems for voice dialing, credit card authorization, directory assistance lookups, and catalog ordering [1][2].

C.   Continuous or Fluent Speech Recognition

Continuous (fluent) speech recognition can be either speaker-trained or speaker-independent. This technology led to the first large-vocabulary recognition systems, which were used to access databases (the DARPA Resource Management Task), to provide constrained dialogue access to information, to handle very large vocabulary read speech for dictation (the DARPA NAB Task), and eventually for desktop dictation systems in PC environments [1][3].

D.   Speech Understanding Systems

Speech understanding systems (so-called unconstrained dialogue systems) are capable of determining the underlying message embedded within the speech, rather than just recognizing the spoken words. Such systems, which have only recently begun to appear, enable services like customer care and intelligent agent systems that provide access to information sources through voice dialogues (the AT&T Maxwell Task) [1][2].

E.   Spontaneous Conversation Systems

Spontaneous conversation systems would be able both to recognize the spoken material accurately and to understand its meaning.
Such systems, which are currently beyond the limits of existing technology, will enable new services such as 'Conversation Summarization', 'Business Meeting Notes', topic spotting in fluent speech (e.g., from radio or TV broadcasts), and ultimately even language translation services between any pair of existing languages [1][4].

F.   Applications

Speech recognition applications include voice user interfaces such as voice dialing (e.g., 'Call home'), call routing (e.g., 'I would like to make a collect call'), domotic appliance control, search (e.g., finding a podcast where particular words were spoken), simple data entry (e.g., entering a credit card number), preparation of structured documents (e.g., a radiology report), speech-to-text processing (e.g., word processors or emails), and aircraft control (usually termed Direct Voice Input) [1][2].

Applications also include the automation of complex operator-based tasks, e.g., customer care, dictation, form-filling applications, provisioning of new services, customer help lines, and e-commerce [3][1].

G.   Issues in Speech Recognition

The term voice recognition refers to finding the identity of who is speaking, rather than what they are saying. Recognizing the speaker can simplify the task of translating speech in systems that have been trained on a specific person's voice, or it can be used to authenticate or verify the identity of a speaker as part of a security process [6][2].

The central challenge is to convert a speech signal into a text message accurately and efficiently, independent of the device, the speaker, or the environment. The extracted speech features should be easy to measure and stable over time [3][1].

III.   AUTOMATIC SPEECH RECOGNITION (ASR) FEATURES

A.   Advantages

Speech input is easy to perform because it does not require a specialized skill, as typing or push-button operation does. Information can be input even when the user is moving or doing other activities involving the hands, legs, eyes, or ears. ASR is divided into two major categories.
These categories are speaker-dependent and speaker-independent. Speaker-dependent ASR requires speaker training or enrollment prior to use: the primary user trains the speech recognizer with samples of his or her own speech. Speaker-independent ASR does not require speaker training prior to use; the speech recognizer is pre-trained during system development with speech samples from a collection of speakers [5][3].

IV.   CONCLUSION

In this review, we have discussed the types of speech recognition systems. We have also presented the applications of speech recognition systems and the issues to consider in them. Speech recognition technology has evolved for more than 40 years, spurred on by advances in signal processing, algorithms, architectures, and hardware. Today, high-quality speech recognition technology is available in the form of inexpensive software-only desktop packages (IBM ViaVoice, Dragon NaturallySpeaking, Kurzweil, etc.) and technology engines that run on either the desktop or a workstation [1].

REFERENCES

[1]   B. H. Juang, Ed., "The past, present, and future of speech processing", IEEE Signal Processing Magazine, pp. 24-48, May 1998.
[2]   K. S. R. Murty and B. Yegnanarayana, "Epoch Extraction From Speech Signals", IEEE Transactions on Audio, Speech, and Language Processing, vol. 16, no. 8, pp. 1602-1613, Nov. 2008.
[3]   Kenneth Thomas Schutte, "Parts-based Models and Local Features for Automatic Speech Recognition". B.S., University of Illinois at Urbana-Champaign (2001); S.M., Massachusetts Institute of Technology (2003).
[4]   K. Bain and D. Paez, "Speech Recognition in Lecture Theatres", Proceedings of the Eighth Australian International Conference on Speech Science and Technology, Canberra, Australia (2000).
[5]   L. R. Rabiner and B. H. Juang, Fundamentals of Speech Recognition, Prentice Hall Inc., 1993.
[6]   H. A. Bourlard, Connectionist Speech Recognition: A Hybrid Approach, Kluwer Academic Publishers, 1994.