Advertisement
Promo

Server platforms Toolkit

The magic that makes Google tick

Matt Loney ZDNet.co.uk

Published: 01 Dec 2004 12:15 GMT

  • Email
  • Trackback
  • Clip Link
  • Print friendly
  • Post Comment

The process
Obviously it would be impractical to run the algorithm once every page for every query, so Google splits the problem down.

When a query comes in to the system it is sent off to index servers, which contain an index of the Web. This index is a mapping of each word to each page that contains that word. For instance, the word 'Imperial' will point to a list of documents containing that word, and similarly for 'College'. For a search on 'Imperial College' Google does a Boolean 'AND' operation on the two words to get a list of what Hölzle calls 'word pages'.

"We also consider additional data, such as where in the page does the word occur: in the title, the footnote, is it in bold or not, and so on.

Each index server indexes only part of the Web, as the whole Web will not fit on a single machine - certainly not the type of machines that Google uses. Google's index of the Web is distributed across many machines, and the query gets sent to many of them - Google calls each on a shard (of the Web). Each one works on its part of the problem.

Google computes the top 1000 or so results, and those come back as document IDs rather than text. The next step is to use document servers, which contain a copy of the Web as crawled by Google's spiders. Again the Web is essentially chopped up so that each machine contains one part of the Web. When a match is found, it is sent to the ad server which matches the ads and produces the familiar results page.

Google's business model works because all this is done on cheap hardware, which allows it to run the service free-of-charge to users, and charge only for advertising.

  • Email
  • Trackback
  • Clip Link
  • Print friendlyPrint with EPSON

Did you find this article useful?
439 out of 942 people found this useful


Company/Topic Alerts

Create a new alert from the list below:





Video icon

Video

Microsoft Futures

Windows 7: Mixed reviews from PDC attendees

As developers received their copies of Windows 7 on Tuesday, they offered varied reactions to the Microsoft operating system update More

Microsoft floats clouds on Windows Azure

At the Professional Developers Conference, Microsoft announced the Azure Services Platform, the company's cloud-computing platform More

Ozzie: Success of Azure comes down to trust

In an interview, Ray Ozzie says businesses will be taking a risk by placing core operations in Microsoft's datacentre, but that the software giant has more to lose if things go bad More


Skip Sub Navigation Links to CNET Brand Links

Help

Become part of the ZDNet community.

Newsletters