ZDNet UK



Intel talks up lip-reading software

Michael Kanellos, CNET News.com

Published: 29 Apr 2003 08:51 BST


Intel has released software that lets computers read lips, a step forward that could lead to better voice recognition applications.

The Audio Visual Speech Recognition (AVSR) software tracks a speaker's face and mouth movements. By matching these movements with speech, the application can provide a computer with enough data to respond to voice recognition commands, even when these are given in noisy environments. The AVSR program is part of the OpenCV computer vision library, a collection of open-source applications and tools that help computers interpret visual data.

Computer companies have tried to popularise voice recognition applications for years, but have been stymied by a shortfall in processing power in most computers and by the restricted performance of their software.

Both of these factors are changing rapidly. Average processors now run at over 1.5GHz, while top-of-the-line chips run at 3GHz. Additionally, researchers are getting a better handle on how to write applications that will work with voice commands.

One way to improve such applications is to incorporate a visual signal into the voice recognition scheme, as Intel has done. Microsoft Research, for example, has developed a prototype application called GWindows, with which a person can scroll through files or move windows through a combination of voice commands and hand gestures, said Andy Wilson, the project's designer.

With GWindows, a video camera mounted on a television monitor follows moving objects, such as a hand or pointer, that come within 20 inches of the screen. The application interprets any hand movements (or pointer gestures) as computer commands: placing a finger over a window and then moving the finger left will move the window left, for example. If a voice command such as "scroll" is given, the computer will combine the finger and voice commands and scroll down. No special gloves are needed.

Microsoft's prototype application works better than a simple voice recognition system because the gestures improve accuracy, according to Wilson, who has demonstrated that the computer can follow voice commands in a crowded room filled with multiple conversations and lots of interference.

Such visual signal software relies in part on Bayesian mathematics, which is influencing other interface and artificial intelligence projects at Microsoft. In Bayesian maths, computers essentially rely on statistics. If a computer "sees" a sweeping hand gesture toward the left a number of times, it will consistently interpret that gesture as a command to move a file toward the left.
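The statistical idea in the paragraph above can be illustrated with a small counting model. This is only a sketch under stated assumptions, not the method used in GWindows: the class name `GestureModel`, the gesture and command labels, and the uniform smoothing prior are all hypothetical. Each time a gesture is seen together with a command, the count is recorded, and the estimated probability of each command given that gesture is updated.

```python
from collections import Counter, defaultdict


class GestureModel:
    """Hypothetical gesture-to-command estimator (illustration only)."""

    def __init__(self, commands, smoothing=1.0):
        self.commands = list(commands)
        # Pseudo-count per command: a uniform prior before any data is seen.
        self.smoothing = smoothing
        # gesture -> Counter of commands observed with that gesture
        self.counts = defaultdict(Counter)

    def observe(self, gesture, command):
        """Record one sighting of a gesture paired with a command."""
        self.counts[gesture][command] += 1

    def posterior(self, gesture):
        """Estimate P(command | gesture) from counts plus the prior."""
        seen = self.counts[gesture]
        total = sum(seen.values()) + self.smoothing * len(self.commands)
        return {c: (seen[c] + self.smoothing) / total for c in self.commands}

    def interpret(self, gesture):
        """Return the most probable command for a gesture."""
        post = self.posterior(gesture)
        return max(post, key=post.get)


# Assumed labels, chosen only for this example.
model = GestureModel(["move_left", "move_right", "scroll"])
for _ in range(5):
    model.observe("sweep_left", "move_left")

print(model.posterior("sweep_left"))
print(model.interpret("sweep_left"))
```

With more sightings of the same pairing, the estimate for that command moves closer to 1, which is one way to read the article's point that repeated observation leads to a consistent interpretation.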

Intel has visual applications other than AVSR in the works. The tech giant is looking into an application that uses cameras to monitor hospital patients for risk of strokes and into software that uses a security camera feed to detect potential criminals in a parking lot. The underlying principles of these programs are the same: the computer sends an alert when it sees something unusual in its video stream, such as a slowing in a patient's gait or a person going from car to car instead of into the mall.
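The alert-on-the-unusual principle can be sketched with a simple statistical check. The article does not say how Intel's software decides what counts as unusual, so the following is an assumption for illustration only: the function name `is_unusual`, the walking-speed figures, and the three-standard-deviation threshold are all hypothetical. A new measurement is flagged when it lies far from the average of earlier measurements.

```python
from statistics import mean, stdev


def is_unusual(history, value, threshold=3.0):
    """Flag a value that lies more than `threshold` sample standard
    deviations from the mean of earlier measurements (hypothetical rule)."""
    if len(history) < 2:
        # Too little data to estimate spread, so never raise an alert.
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All earlier values identical: any different value is unusual.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Made-up walking speeds in metres per second, as might be derived from video.
gait_history = [1.20, 1.18, 1.22, 1.19, 1.21]

print(is_unusual(gait_history, 1.21))  # a typical reading
print(is_unusual(gait_history, 0.90))  # a marked slowing
```

A production system would need a far richer model of normal behaviour, but the structure is the same: learn what is typical, then alert on large departures from it.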

The work on these applications and the development of AVSR is taking place at Intel's China Research Center in Beijing.

In other Intel software research news, the company has released a test version of a technical library for building Bayesian networks, said Gary Bradski, a senior researcher in Intel's Microprocessor Research Labs who helped create the OpenCV library. A final version of the technical library, called the Probability Network Library, will come out by the end of the year, he said.

