Imagine talking to your cousin in China. On her end, she hears Chinese.
On your end, you hear English. In the middle, a computer is providing real-time
translation. It's just like the "universal translator" in Star Trek.
This scenario is a real possibility within the next 10 or 20 years,
say experts in voice technology. In the meantime, voice technology is making
"It's such a dynamic technology," says Doug Alexander. He works with Speech
Technology Magazine. "It has applications in so many industries that it's
inevitable that it's going to continue to grow."
Voice technology promises to help consumers order products and services.
It could help international businesspeople communicate. And it could increase
Experts are needed in a variety of fields. The most sought-after will be those
with degrees in electrical and electronics engineering and linguistics.
Speech technology incorporates a variety of disciplines. Signal processing,
for instance, is the process of using digital computers to process analog
signals, such as music or speech.
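To make the idea of processing an analog signal on a digital computer concrete, here is a minimal sketch in Python. It samples a pure tone at fixed time steps and rounds each sample to a whole number, which is roughly what happens when sound is digitized. The function name, the 440 Hz tone and the 8,000-samples-per-second rate are illustrative assumptions, not details from the experts quoted here.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, duration_s, bits=16):
    """Sample a sine tone and round each sample to a signed integer.

    Hypothetical helper for illustration only.
    """
    max_amplitude = 2 ** (bits - 1) - 1  # 32767 for 16-bit audio
    n_samples = round(sample_rate_hz * duration_s)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                       # time of this sample in seconds
        value = math.sin(2 * math.pi * freq_hz * t)  # "analog" value between -1 and 1
        samples.append(round(value * max_amplitude)) # quantize to an integer
    return samples

# Assumed example values: a 440 Hz tone, 8,000 samples per second, 10 milliseconds.
tone = sample_and_quantize(freq_hz=440, sample_rate_hz=8000, duration_s=0.01)
print(len(tone), tone[:5])
```

Once speech is a list of numbers like this, recognition and synthesis software can work on it.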
One specialty area in signal processing is automatic speech recognition.
That's the process of using a computer to understand human speech.
Another discipline in speech technology is text-to-speech synthesis. That's
where you take text input and generate synthetic speech.
These technologies have been around since the 1960s. That's when computers
became powerful enough to do this sort of work. Technological changes within
the last 10 years have made it all come together.
Computer processing power has increased greatly while its cost has dropped. The
algorithms that do the processing on both the speech recognition and the speech
synthesis side have also improved drastically.
Plus, researchers can now build good speech models that capture the
phonetic qualities of a specific language. That takes a lot of data.
"Within the last decade, companies have been able to record a tremendous
amount of speech data," says Ed Bronson. He's in charge of speech engineering
for a voice laboratory.
"For North American English, you need to be able to get enough speech samples
from all over the country to pick up all the dialect variations, such as the
New York twang...and the southern drawl in Texas."
Another big development in the last three years is the advent of VoiceXML.
It's a web-based markup language for building interactive voice services.
It allows websites to take voice input.
Instead of entering data into fields, you can just speak. For example,
you could tell a courier company your tracking number. Then the system could
tell you the location of your package.
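The courier example can be sketched in a few lines of Python. This is not VoiceXML and not any real courier's system. It assumes a speech recognizer has already turned the caller's words into a text transcript, and the package table, tracking number and function names are all made up for illustration.

```python
# Map spoken digit words to digit characters ("oh" is a common way to say zero).
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}

# Hypothetical package records, invented for this sketch.
PACKAGES = {"48213": "Memphis sorting facility"}

def words_to_tracking_number(transcript):
    """Turn a transcript such as 'four eight two' into '482', or None on failure."""
    digits = []
    for word in transcript.lower().split():
        if word not in DIGIT_WORDS:
            return None  # the caller said something that is not a digit
        digits.append(DIGIT_WORDS[word])
    return "".join(digits) or None

def handle_call(transcript):
    """Return the sentence the system would speak back to the caller."""
    number = words_to_tracking_number(transcript)
    if number is None:
        return "Sorry, I did not catch that. Please say your tracking number."
    location = PACKAGES.get(number)
    if location is None:
        return f"I could not find package {number}."
    return f"Package {number} is at the {location}."

print(handle_call("four eight two one three"))
```

In a real service, a text-to-speech engine would read the returned sentence aloud to the caller.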
Companies want to reduce the cost of handling customer calls. This is a
major driving force in the development of voice technology. Calls can be
shortened and handled with less human intervention. That can mean big savings.
Advocates say the technology will also benefit consumers. They say it's
easier and quicker to enter and request information using your voice. And
some people, such as those on the road and those operating machinery, can't
easily punch numbers into their phone.
Daryle Gardner-Bonneau is the editor-in-chief of the International Journal
of Speech Technology. She says some of the technology seems like "technology
in search of an application." In other words, critics might say some voice
technology innovations have little practical use.
But there's no denying, she says, that voice technology has many valuable
A big area is corporate security. Voice verification ensures that only
authorized people can access sensitive information, such as financial data.
In addition to a PIN, a computer can match your voiceprint to confirm
Currently, voice recognition is good at understanding most people if the
pool of possible words is limited. In other words, if the software can predict
what you'll say, such as with airline reservations, then the accuracy is good.
But voice recognition gets trickier when it comes to understanding natural
conversation, with all its slang, varied tones, accents and so on.
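Here is a rough Python sketch of why a limited word pool helps. It assumes the recognizer has produced a slightly garbled guess at what the caller said, and it snaps that guess to the closest entry in a short list of expected answers using the standard library's `difflib.get_close_matches`. The city list and the function name are assumptions for illustration. Real recognizers work on sound rather than spelling, so treat this only as an analogy.

```python
import difflib

# A small pool of words the system expects, such as destination cities.
CITIES = ["boston", "austin", "chicago", "denver", "houston"]

def match_to_vocabulary(heard, vocabulary, cutoff=0.6):
    """Return the vocabulary word closest to what was heard, or None.

    Hypothetical helper: 'heard' stands in for a recognizer's rough guess.
    """
    matches = difflib.get_close_matches(heard.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(match_to_vocabulary("chicargo", CITIES))  # a garbled guess still lands on a city
print(match_to_vocabulary("xyzzy", CITIES))     # nothing close enough in the pool
```

With only five possible answers, a near miss is easy to correct. In open conversation the pool is every word in the language, so there is no short list to fall back on.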
Telecommunications and software companies are among the primary developers
of voice technology.
"I'd guess there are several hundred [companies] at any one time in the
U.S., many of them small players in particular niches," says Gardner-Bonneau.
She estimates there are at least a few thousand employees of speech technology
One company has applications that allow people to use speech to access
e-mail, voicemail and fax. You log in using secure voice authentication. You
can have your e-mail messages read to you. You can also respond to e-mails
The program will convert the message to text and send it off. It can also
be used to track courier packages or for room service in hotels, for example.
Speech technology companies tend to hire people with bachelor's and master's
degrees in electronic engineering and computer science. Since the technology
is relatively new, marketers are needed to explain its benefits to the
public. Linguists are also in demand.
"We're looking for speech scientists at the PhD or master's level in speech
recognition or something related to that," says Marie Ruzzo. She works with
a company that develops software for text-to-speech, speaker verification
and speech recognition.
Speech recognition software is their primary product. It's used by major
airlines and car rental agencies.
Ruzzo says voice recognition is a very cost-effective way to provide customer
service. If someone wants to find out when a train is arriving, they can find
out in seconds. They don't have to wait on hold for a live operator.
"I would say some of the hottest job opportunities...[with my employer]
right now would be people who are linguists," says Ruzzo. "Because we operate
around the world, we're creating speech recognition systems for languages
for other countries."
As for salary levels, estimates are hard to come by.
"It can go all over the board," says Carvill. "Entry level can go anywhere
from the $40,000 to $50,000 range right up to the sky's the limit. It really
depends on your experience level and what you bring to the equation.
"A couple of years ago, it was probably a little higher, but that's [the]
ballpark that you can expect for people coming out of school with the desired
skill set, if you come out with a master's in engineering."
Ruzzo says that salaries are about $50,000 and up. "It depends on a number
of factors," she says, "[including] any past experience working with others
in the industry."
Bronson predicts that with VoiceXML becoming more prevalent, programmers
are going to be in demand. "I see it as a really nice opportunity for people
to get in and work in the applications area," he says.
So if you have an ear for language and a good technical education, speech
technology could be your field. Maybe you can help make that universal translator
Midwest Speech Technology Association
Members include engineers, marketing and sales professionals,
end users, consultants and other professionals
Event listings, news, an explanation of VoiceXML and more