ZDNet UK



Intel talks up lip-reading software

Michael Kanellos, CNET News.com

Published: 29 Apr 2003 08:51 BST


Intel has released software that lets computers read lips, a step forward that could lead to better voice recognition applications.

The Audio Visual Speech Recognition (AVSR) software tracks a speaker's face and mouth movements. By matching these movements with speech, the application can provide a computer with enough data to respond to voice recognition commands, even when these are given in noisy environments. The AVSR program is part of the OpenCV computer vision library, a collection of open-source applications and tools that help computers interpret visual data.
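The article does not say how AVSR combines its two signals. As a rough illustration only, the sketch below merges per-word probabilities from a hypothetical audio recogniser and a hypothetical lip-reading recogniser with a weighted log-linear rule. The function name, the weighting parameter and the example numbers are all assumptions and are not taken from Intel's software.

```python
import math


def fuse_scores(audio_probs, visual_probs, audio_weight=0.5):
    """Combine two per-word probability tables into one (illustrative only).

    Both tables map candidate words to probabilities greater than zero.
    audio_weight (0..1) says how much to trust the audio stream; in a noisy
    room it could be lowered so that the visual stream counts for more.
    """
    words = audio_probs.keys() & visual_probs.keys()
    log_scores = {
        w: audio_weight * math.log(audio_probs[w])
        + (1.0 - audio_weight) * math.log(visual_probs[w])
        for w in words
    }
    # Subtract the maximum before exponentiating for numerical stability,
    # then normalise so the fused scores sum to 1.
    top = max(log_scores.values())
    unnormalised = {w: math.exp(s - top) for w, s in log_scores.items()}
    total = sum(unnormalised.values())
    return {w: v / total for w, v in unnormalised.items()}


# Hypothetical example: noise leaves the audio stream almost undecided,
# while the mouth shape clearly favours one word.
noisy_audio = {"open": 0.34, "close": 0.33, "stop": 0.33}
lip_reading = {"open": 0.70, "close": 0.20, "stop": 0.10}
fused = fuse_scores(noisy_audio, lip_reading)
best_word = max(fused, key=fused.get)
```

In this made-up case the audio stream alone barely separates the three words, and the visual stream tips the decision.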

Computer companies have tried to popularise voice recognition applications for years, but have been stymied by a shortfall in processing power in most computers and by the restricted performance of their software.

Both of these factors are changing rapidly. Average processors now run at over 1.5GHz, while top-of-the-line chips run at 3GHz. Additionally, researchers are getting a better handle on how to write applications that will work with voice commands.

One way to improve such applications is to incorporate a visual signal into the voice recognition scheme, as Intel has done. Microsoft Research, for example, has developed a prototype application called GWindows, with which a person can scroll through files or move windows through a combination of voice commands and hand gestures, said Andy Wilson, the project's designer.

With GWindows, a video camera mounted on a television monitor follows moving objects, such as a hand or pointer, that come within 20 inches of the screen. The application interprets any hand movements (or pointer gestures) as computer commands: placing a finger over a window and then moving the finger left will move the window left, for example. If a voice command such as "scroll" is given, the computer will combine the finger and voice commands and scroll down. No special gloves are needed.

Microsoft's prototype application works better than a simple voice recognition system because the gestures improve accuracy, according to Wilson, who has demonstrated that the computer can follow voice commands in a crowded room filled with multiple conversations and lots of interference.

Such visual signal software relies in part on Bayesian mathematics, which is influencing other interface and artificial intelligence projects at Microsoft. In Bayesian maths, computers essentially rely on statistics, updating the probability of each possible interpretation as evidence accumulates. If a computer "sees" a sweeping hand gesture toward the left a number of times, it will consistently interpret that gesture as a command to move a file toward the left.
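To make the statistical idea concrete, here is a minimal sketch of Bayes' rule applied to gesture interpretation. It is not Microsoft's implementation: the class, the gesture and command labels, and the add-one smoothing are all assumptions chosen for illustration. The model counts how often each gesture has accompanied each command, then reports the probability of each command given a newly seen gesture.

```python
from collections import Counter, defaultdict


class GestureModel:
    """Toy Bayesian classifier: P(command | gesture) from observed counts."""

    def __init__(self):
        self.command_counts = Counter()          # times each command was seen
        self.pair_counts = defaultdict(Counter)  # command -> gesture -> count
        self.gestures = set()                    # all gesture labels seen

    def observe(self, gesture, command):
        """Record one example of a gesture that meant a given command."""
        self.command_counts[command] += 1
        self.pair_counts[command][gesture] += 1
        self.gestures.add(gesture)

    def posterior(self, gesture):
        """Return P(command | gesture) for every command seen so far."""
        total = sum(self.command_counts.values())
        vocab = len(self.gestures | {gesture})
        unnormalised = {}
        for command, n in self.command_counts.items():
            prior = n / total
            # Add-one smoothing keeps an unseen pairing from scoring zero.
            likelihood = (self.pair_counts[command][gesture] + 1) / (n + vocab)
            unnormalised[command] = prior * likelihood
        evidence = sum(unnormalised.values())
        return {c: p / evidence for c, p in unnormalised.items()}


# Hypothetical training data: the left sweep usually means "move_left".
model = GestureModel()
for _ in range(5):
    model.observe("sweep_left", "move_left")
    model.observe("sweep_right", "move_right")
model.observe("sweep_left", "move_right")  # one ambiguous example

belief = model.posterior("sweep_left")
likely_command = max(belief, key=belief.get)
```

The more often the left sweep is seen alongside the "move left" command, the more strongly the model favours that interpretation, which is the behaviour the article describes.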

Intel has visual applications other than AVSR in the works. The tech giant is looking into an application that uses cameras to monitor hospital patients for risk of strokes and into software that uses a security camera feed to detect potential criminals in a parking lot. The underlying principles of these programs are the same: the computer sends an alert when it sees something unusual -- a slowing in a patient's gait or a person going from car to car instead of into the mall -- in its video stream.
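The "alert on something unusual" principle can be sketched with a simple statistical test. The example below assumes that a vision system has already reduced the video to one number per time step (here, a hypothetical walking speed in metres per second) and flags any reading that strays far from the recent baseline. The function name, window size and threshold are illustrative assumptions; the article does not describe Intel's actual method.

```python
from statistics import mean, stdev


def find_unusual(values, window=5, threshold=3.0):
    """Return indices of readings that deviate sharply from recent history.

    Each reading is compared with the mean of the previous `window` readings
    (window must be at least 2) and flagged when it lies more than
    `threshold` standard deviations away.
    """
    alerts = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # A perfectly flat baseline: any change at all counts as unusual.
            if values[i] != mu:
                alerts.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            alerts.append(i)
    return alerts


# Hypothetical gait speeds: steady walking, then a sudden slowdown.
gait_speeds = [1.20, 1.22, 1.19, 1.21, 1.20, 1.21, 0.80]
alert_indices = find_unusual(gait_speeds)
```

Here only the final reading, the sudden slowdown, is flagged. A real system would need a far richer model of what counts as normal behaviour.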

The work on these applications and the development of AVSR is taking place at Intel's China Research Center in Beijing.

In other Intel software research news, the company has released a test version of a technical library for building Bayesian networks, said Gary Bradski, a senior researcher in Intel's Microprocessor Research Labs who helped create the OpenCV library. A final version of the technical library, called the Probability Network Library, will come out by the end of the year, he said.


