Voice Recognition

Simply speaking

"Commonly used applications like word that is essential though tedious. However, speech recognition technology can make life a lot easier as it does away with the chores of dictation and data entry"


            Thanks to the personal computer, tedious tasks such as maintaining records and the medical history of patients were automated, and such work was no longer considered time-consuming and monotonous. Back in the dark ages, records were maintained by painstakingly writing out each and every detail and then filing it alphabetically. With the PC, all the user does is enter the data in the requisite fields and maintain the details in digital format. But soon even this seemingly easy method will lose its shine as users grow more accustomed to automation and set out in search of simpler methods of doing things.

            The dawn of speech recognition technology has given users the option to dictate text instead of typing it, making the task of data entry even easier. With swift strides being made in this area, don't be surprised if very soon simply thinking initiates data entry!


Working of the speech recognition engine

Thanks to the immense demand for speech recognition packages from all over the world, there are at least a dozen packages available in the market today. Developers have gone a step further and begun customizing packages to recognize even the dialects within a country. Wonder why they would need to do something like that?

            Speech recognition technology works on the basis of a built-in dictionary, along the lines of a digital version of a standard dictionary like Webster's. The user needs to train the software so that the speech recognition engine can understand the user's accent and manner of pronunciation. For this process, the developer provides a standard body of text that is already mapped into the application. When the user reads this text out to the software, the voice recognition engine is able to map how the user has pronounced each word. The more the user trains the software, the better the level of accuracy yielded by the application. A package that has already been customized for a regional accent or dialect starts out closer to the user's speech and so needs less of this training.
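
The training process described above can be pictured with a toy sketch in Python. This is only an illustration under loud assumptions: the class `VoiceProfile`, the dictionary `BUILT_IN_DICTIONARY` and the spelled-out "sounds" are all hypothetical, and real engines work on audio signals with statistical models rather than on text labels. The sketch shows just the idea that reading a known text lets the engine pair each known word with the way this particular user says it.

```python
# Hypothetical built-in dictionary: word -> default pronunciation.
# The phonetic spellings are made up for illustration only.
BUILT_IN_DICTIONARY = {
    "data": "day-ta",
    "schedule": "sked-jool",
    "patient": "pay-shunt",
}


class VoiceProfile:
    """Toy model of one user's trained pronunciations."""

    def __init__(self, dictionary):
        # Start from the default pronunciations: sound -> word.
        self.sound_to_word = {sound: word for word, sound in dictionary.items()}

    def train(self, training_words, heard_sounds):
        """Pair each word of the known training text with what the user said."""
        for word, sound in zip(training_words, heard_sounds):
            self.sound_to_word[sound] = word

    def recognize(self, heard_sounds):
        """Map sounds back to words; unknown sounds come back as '?'."""
        return [self.sound_to_word.get(sound, "?") for sound in heard_sounds]


profile = VoiceProfile(BUILT_IN_DICTIONARY)

# Before training, this user's accent is not understood.
before = profile.recognize(["dah-ta", "shed-yool"])

# The user reads the standard training text aloud.
profile.train(["data", "schedule"], ["dah-ta", "shed-yool"])

# After training, the same sounds map to the right words.
after = profile.recognize(["dah-ta", "shed-yool"])

print(before)  # ['?', '?']
print(after)   # ['data', 'schedule']
```

The default pronunciations stay in place after training, so the more text the user reads, the more variants the profile covers, which mirrors why more training yields better accuracy.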


So which one should you buy?

This could be one tough decision, as there are many options available in the market today. However, there are a couple of aspects that one needs to keep in mind while selecting a speech recognition application. The first feature, as mentioned earlier, is customization of the package for Asian accent recognition. This will automatically provide you with a greater level of accuracy with a lower amount of training. The second feature is hardware related and concerns the input device, the microphone. Many speech recognition applications are marketed with their own specialized microphone, and these would be a better choice than the ones that do not include a microphone. Although the speech recognition package would work with any microphone, the one that is packaged with the software has been designed to cut out surrounding sound, thereby increasing the sensitivity and accuracy of the application. Does this mean that microphones are interchangeable amongst speech recognition packages? No, that may not be possible, as each microphone has been designed and tested to work with just that particular speech engine and therefore would not provide the desired result with other packages.

            Dragon NaturallySpeaking 3.0 has been one of the most popular packages in the market. Ever since its launch, this package has been able to attain an average accuracy of about 91 percent, and most users can expect to get 87 to 95 percent accuracy after completing the general training. Its New User Wizard is thorough and easy to follow. You can dictate directly into most applications, and the program integrates with Microsoft Word and Corel WordPerfect. It also includes some useful shortcuts that streamline dictation.

            The latest continuous-speech recognition offering from IBM, ViaVoice 98, includes a smarter speech recognition engine, modeless operation, and extensive command and control capabilities. However, it falls short on accuracy. Intelligently designed and easy to learn, ViaVoice features a 64,000 word base vocabulary that can be expanded to 128,000 words. Once you complete the setup, a wizard provides a short tour, helps you configure the microphone and speakers, and walks you through the Quick Training module. To enroll, you simply read from your choice of several texts and then let the software process your voice information. The system supports multiple users and multiple enrolments per user to accommodate different vocabularies or acoustical environments or both.

            Targeted at home and SOHO users, Philips FreeSpeech 98 is a limited speech recognition solution. The setup is straightforward. The program walks you through an audio tuning process and some initial training. To set up your voice profile, you must read into the microphone for about 15 minutes. The system spends another 15 minutes processing your input. A green light on FreeSpeech 98's toolbar lets you know when you're in a program that can accept voice input.

            Voice Direct Professional from IMSI gives the user the option to dictate directly into virtually any Windows application, including Excel, Word, WordPerfect, Lotus 1-2-3, PowerPoint, and even Internet Explorer. Use voice commands to dictate, edit and control application functionality, everything from data entry to formatting to printing, without even touching the keyboard. Voice Direct Professional's proprietary Mouse Grid technology enables hands-free cursor control with pinpoint accuracy. It even includes a high quality noise-canceling microphone that eliminates background noise so you can be sure your message gets through. Voice Direct includes a 120,000 word total vocabulary for maximum accuracy and minimum keyboarding time, and you can even build macros for repetitive tasks using a BASIC-like scripting language.
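
To put the accuracy percentages quoted for these packages in perspective, here is a small Python sketch of one common way to score dictation: count the word-level insertions, deletions and substitutions needed to turn the recognized text into what was actually said, then divide by the number of words spoken. This is an assumption for illustration; the article does not say how the vendors or reviewers measured accuracy, and the function name `word_accuracy` is hypothetical.

```python
def word_accuracy(spoken, recognized):
    """Return word-level accuracy as 1 - (word errors / words spoken)."""
    ref = spoken.split()
    hyp = recognized.split()
    if not ref:
        raise ValueError("the spoken text must contain at least one word")

    # Word-level edit distance, computed one row at a time.
    # prev[j] is the distance between the words seen so far in ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # a spoken word was dropped
                            curr[j - 1] + 1,      # an extra word was inserted
                            prev[j - 1] + cost))  # match or substitution
        prev = curr

    errors = prev[-1]
    # Many insertions could push the ratio below zero, so clamp at 0.
    return max(0.0, 1 - errors / len(ref))


spoken = "please enter the record for the new patient right now"
recognized = "please enter the record for the new patient write now"
print(round(word_accuracy(spoken, recognized), 2))  # 0.9
```

By this measure, an accuracy of about 91 percent means that roughly nine words in every hundred dictated still need to be corrected by hand.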


What can you use it for?

            There are various functions that can be performed using speech recognition software. In fact, the options range all the way from inputting data such as text and numbers to logging into your machine and shutting it down. Speech recognition software not only allows the user to input text and numeric data into documents and forms but also allows adding punctuation and formatting such as tabs, commas, paragraph changes and full stops. Basic Windows functions that are generally performed using the mouse or the keyboard can also be controlled. Functions like opening and closing applications and saving and deleting files can be done using voice commands. Most speech recognition packages have a read-back function built into the engine that allows the system to read dictated files back to the user. The user can use this feature to listen to documents, including those that have just been typed in. This feature even lets users have their e-mails read back to them.
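
The way spoken punctuation commands such as commas, full stops, tabs and paragraph changes turn into text can be sketched in a few lines of Python. The command names in `PUNCTUATION_COMMANDS` and the function `render_dictation` are hypothetical and chosen for illustration; each real package defines its own command vocabulary.

```python
# Hypothetical spoken commands and the characters they insert.
PUNCTUATION_COMMANDS = {
    "comma": ",",
    "full stop": ".",
    "new paragraph": "\n\n",
    "tab": "\t",
}


def render_dictation(phrases):
    """Turn recognized phrases into text, replacing spoken commands."""
    text = ""
    for phrase in phrases:
        symbol = PUNCTUATION_COMMANDS.get(phrase.lower())
        if symbol is not None:
            text += symbol              # punctuation attaches to the previous word
        elif text and not text[-1].isspace():
            text += " " + phrase        # ordinary words are separated by a space
        else:
            text += phrase              # start of the text or of a new paragraph
    return text


letter = render_dictation(
    ["Dear Sir", "comma", "new paragraph", "Thank you", "full stop"]
)
print(letter)
```

Here the phrases "comma", "new paragraph" and "full stop" are never typed out as words; they are looked up and replaced, while everything else is treated as dictated text.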

Copyright © 2002 Dr. Subrahmanyam Karuturi