PEABODY, Mass., April 11, 2003 – ScanSoft, Inc. (Nasdaq: SSFT), a leading supplier of imaging, speech and language solutions, today announced that Dragon NaturallySpeaking® 7 Preferred Edition has received a TECH Edge magazine Editors’ Choice Award for Speech Recognition Technology. TECH Edge editors, in a review of several dictation applications, commended the extraordinarily high accuracy rates of Dragon NaturallySpeaking 7 and noted that the product’s usability and abundance of new features were the deciding factors in bestowing the award.

Dragon NaturallySpeaking version 7, a new release of the world's best-selling speech recognition application for Microsoft Windows, delivers the accuracy, speed, ease-of-use and innovative features needed to make speech recognition a mainstream application, in the home and at the office. In its review, TECH Edge reported, “Dragon NaturallySpeaking improves on the 95 percent accuracy rate of version 6 by examining documents and e-mails on your hard drive to better understand how you write…Dragon NaturallySpeaking bests [other products] with its arsenal of features and smooth user interface.” The complete review by Bill Mann can be found in the May 2003 issue of TECH Edge.

“We are pleased with the positive response Dragon NaturallySpeaking 7 has received from our customers and industry experts like TECH Edge,” said Robert Weideman, chief marketing officer at ScanSoft. “We’re honored to receive this award, as it underscores the Company’s commitment to delivering value to our customers through innovative productivity solutions.”

Dragon NaturallySpeaking 7 is a significant improvement over the previous release, increasing accuracy by 15 percent and enabling users to achieve accuracy levels of up to 99 percent. Its speed has been improved by more than 50 percent for dictation initialization. Accuracy and performance are further enhanced by new features such as the Vocabulary Optimizer, which evaluates previously authored emails and documents to personalize itself to the writing style of the user, and the Performance Assistant, which optimizes the efficiency of speech recognition based on a user's workflow. In total, the accuracy and performance improvements within Dragon NaturallySpeaking 7 deliver an unmatched level of productivity and ease of use for speech recognition products.

About TECH Edge Magazine

TECH Edge spotlights technological innovations, reviewing both consumer electronics and software products to keep consumers up to date on changes in technology.

About ScanSoft, Inc.

ScanSoft, Inc. (Nasdaq: SSFT) is a leading supplier of imaging, speech and language solutions that are used to automate a wide range of manual processes – saving time, increasing worker productivity and improving customer service. For more information regarding ScanSoft products and technologies, please visit www.ScanSoft.com.

Trademark reference: ScanSoft, the ScanSoft logo, and Dragon NaturallySpeaking are registered trademarks or trademarks of ScanSoft, Inc. in the United States and other countries. All other company or product names may be the trademarks of their respective owners.

The statements in this press release that relate to future plans, events or performance are forward-looking statements that involve risks and uncertainties, including risks associated with market trends, competitive factors, and other risks identified in ScanSoft’s SEC filings. Actual results, events or performance may differ materially. Readers are cautioned not to put undue reliance on these forward-looking statements that speak only as of the date hereof.

The impact of these significant pronunciation divergences – in stress placement, varying numbers of syllables and in vowel length – on speech recognition is perhaps not the most obvious one. ASR providers should know these variants and load appropriately different grammars (with their associated pronunciation models) into the localized software used in the U.S., Canada or the UK. The real problem lies with physicians and medical technologists who have learned English (perhaps as a second or other language) outside North America or the British Isles, but who are resident in the U.S. or the UK. Linguistic speculation accounts for these varying pronunciations by assuming that (native) speakers of English draw different analogies according to their perception of the morphological origins of these neologisms, and by regularizing with the stress patterns preferred in their dialect. Speakers of Indian or Singaporean English will have learned primarily British English but they may practice in Chicago or Vancouver; similarly, Australian English doctors and dentists who studied in Hong Kong may have moved to London. Their accented varieties of English will be one impediment to reliable recognition built for other standard accents, and their learnt/preferred pronunciation of the terminology will add another layer of potential confusion or failure.
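
One way to picture the point about loading different pronunciation models is a lexicon keyed by locale, queried with both the locale where the software is deployed and the locale where the speaker learned English. The sketch below is purely illustrative: the LEXICON table, the locale codes and the variants_for helper are assumptions rather than any vendor's actual format, and the single entry (the letter Z, 'zee' versus 'zed') merely stands in for medical terminology.

```python
# Hypothetical locale-keyed pronunciation lexicon (illustrative only).
# Pronunciations are written as plain respellings, not a real phoneme set.
LEXICON = {
    "en-US": {"z": ["zee"]},
    "en-GB": {"z": ["zed"]},
    "en-CA": {"z": ["zed"]},
}


def variants_for(word, deployed_locale, learned_locale=None):
    """Collect pronunciation variants for a word.

    Variants from the deployed locale come first, followed by any extra
    variants from the locale in which the speaker learned English.
    Duplicates are dropped while preserving order.
    """
    result = []
    for locale in (deployed_locale, learned_locale):
        if locale is None:
            continue
        for pron in LEXICON.get(locale, {}).get(word, []):
            if pron not in result:
                result.append(pron)
    return result


# A speaker who learned British English but practices in the U.S.
print(variants_for("z", "en-US", "en-GB"))
```

Accepting both variants widens what the recognizer will match, which is the trade-off: broader coverage for relocated speakers at the cost of more candidates that can be confused with one another.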

UNSPEAKABLE NAMES
For legal purposes names and trademarks need to be spelled correctly. However, it is not possible to legally dictate how they are pronounced. This has important and varied repercussions when names are (re)produced using text-to-speech (TTS). In naming a new company or product, it is now de rigueur to combine upper and lower-case characters in one alphabetic string, with no white space, or to alter the spelling for eye appeal. This typographical rulebreaking also comes from company mergers, giving rise to such unwieldy strings as exemplified in the following list of some pharmaceutical giants and their product brand names. Boldface sequences show non-English spellings; the hash mark (#) shows a TTS normalized text string that breaks the normal spelling (phonotactic) rules of English, which may in turn cause the TTS system to produce an unpredictable or weird interpretation.

Some drug names are familiar enough to physicians and patients alike that they should not present pronunciation/recognition difficulties for an automated spoken system (e.g. aspirin, codeine, Valium™). For native speakers of English, however, other drug and/or compound names range from fairly unambiguous, to opaque/ambiguous, to those for which speakers have no idea of either pronunciation or stress placement. The three lists below illustrate these issues, in ascending order of difficulty for humans and, by deduction, for TTS systems:

In a vain attempt to help speakers with unpredictable stress placement and/or vowel quality in drug names, pharmaceutical companies and health maintenance organizations (HMOs) sometimes give pronunciation hints, in a random dictionary-style transcription. For example, the following are taken from product advertisements and prescription leaflets from an HMO:

This information is completely unsystematic: note the three different renditions of unstressed syllables, the use of either a post-positioned single quote or upper case to indicate stress, and the unjustified or inconsistent use of upper case in general. It is helpful neither to native nor to non-native speakers of English, nor to those confused by quasi-phonetic notation.

Problems with the unknowables (the great majority) remain unalleviated by drug manufacturers providing such pseudo-pronunciations. More often than not, we are left to our own (wobbly) intuitions about stress placement, short vowel /I/, long vowel /i/, or diphthong /aI/; ‘hard’ or ‘soft’ letter “c” i.e. /s/ or /k/, etc. Anyone who has listened to a radio doctor’s call-in show, where people question a physician about the drugs they have been prescribed, knows that lay people (us) stumble and hesitate over the pronunciation of the drugs they’re taking, and ultimately resort to spelling them for the doctor.

Given these many (socio)linguistic variables, it is impossible to attribute a degree of certainty to attempts to recognize many names of drugs. All commercial recognizers rely on certainty/confidence factors to supply a match. Recently Walter Rolandi (2003) supplied a useful, critical analogy for this recognition problem: “Imagine an English-only speaker being asked a question by someone speaking in French ... The English speaker instantly knows that what the other person said was not English, i.e. that the speakers’ utterance was not in the listener’s grammar ... having a recognizer capable of accurately determining whether ... an utterance is in its grammar would be a significant step toward more intelligent voice user interfaces.”

Having medical and healthcare-based systems capable of accurately determining whether diseases, procedures, and the names of drugs have been recognized accurately by speaking them back using TTS (to prompt checking and re-entry by hand if necessary) would be more than an intelligent and significant step. It is a vital, preventative step if these devices are to be used more widely by all medical practitioners. Computerized order entry systems typically offer physicians and medical institutions the ability to “streamline workflow, reduce error, save time, money and lives” (www.validus.com). With the many and varied linguistic and phonetic barriers given, it is not clear how errors can be avoided, let alone reduced, and how lives may be saved.
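
The talk-back step argued for here can be sketched as a small decision rule. This is a minimal illustration and not any product's behavior: the Hypothesis record, the next_step function and the 0.90 threshold are all hypothetical, and a real system would hand the read-back text to a TTS engine rather than return a string.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str          # what the recognizer believes was said
    confidence: float  # recognizer confidence score, from 0.0 to 1.0
    in_grammar: bool   # whether the utterance matched the active grammar


def next_step(hyp: Hypothesis, confirm_threshold: float = 0.90) -> str:
    """Decide what the system should do with a recognition result.

    Out-of-grammar or low-confidence results are sent to manual re-entry.
    Everything else is still read back to the speaker for confirmation
    instead of being accepted silently.
    """
    if not hyp.in_grammar or hyp.confidence < confirm_threshold:
        return "REENTER_BY_HAND"
    return f"READ_BACK: {hyp.text}"


print(next_step(Hypothesis("codeine", 0.97, True)))
print(next_step(Hypothesis("codeine", 0.55, True)))
```

Note that even a high-confidence match is spoken back for checking; under this assumption the confidence score only decides whether a read-back is worth attempting at all.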

RX FOR REMEDIES
There are still three hurdles to wider adoption of digital dictation devices to increase efficiency for health-care professionals. First, there are understandable concerns about confidentiality/security. Second is the fragility or fallibility of recognition accuracy. Third is the lack of immediate spoken guidance cum confirmation. What can we suggest to mitigate these factors? The first is the easiest: users need to be sensitized to the need to enter the data in a quiet, semiprivate location. Walking out from a consultation or from a patient’s room, or standing near the nurses’ station in the center of a bustling ward, does not provide an ideal environment in which to speak delicate, private facts about a patient’s prognosis or prescriptions. These are also very noisy places, which in turn will affect the accuracy of the recognizer adversely, leading to repeated attempts and giving rise to increased frustration rather than efficiency. The second problem will then be tolerated, if not solved. The last, and most important, improvement in these speech scenarios is for the user to have some guidance and immediate confirmation of what they have spoken.

Many early adopters in U.S. radiology departments have since abandoned spoken record keeping, because the need for repetition and the high failure rates were simply too frustrating. According to Philips Speech Recognition Systems, however, their product SpeechMagic™ (available in 22 languages) is now used in some European countries by more than 60 percent of radiologists (STM NewsBlast, December 10, 2003). The product has recently expanded into other specialized areas, such as cardiology, pathology and surgery. Clearly the speech recognition component has improved over the past 15 years. And perhaps the working conditions of these non-U.S. professionals are quieter and more private.

There remain skeptics in the U.S. medical profession who simply do not trust that doctor-patient confidentiality is not being violated, and who also do not trust the accuracy of the speech recognition. This may be because the ability to talk back is NOT there. None of the current instantiations includes TTS, which would allow the system to talk back. TTS can guide users to speak a personal or product name correctly (i.e. the way the name has been entered phonemically in the recognizer’s dictionary), and it can safely confirm entries that have been made using ASR and/or the graphical interface. Every doctor, specialist and pharmacist would welcome a system with such features, IF their HMO accountants or company paid for the installation, training and setup fees.

References
Henton, C. (2002) You say ‘zee’, and I say ‘zed’. Issues in localizing voice-driven applications. Speech Technology Magazine, May/June, 28-31.
Henton, C. (2003) The name game: pronunciation puzzles for TTS. Speech Technology Magazine, September/October, 32-35.
Rolandi, W. (2003) When you don’t know when you don’t know. Speech Technology Magazine, July/August, 28.

Dr. Caroline Henton is Founder and CTO of Talknowledgy.com. Dr. Henton can be reached at carolinehenton@hotmail.com or 831.457.0402.
