The magic that makes Google tick
Published: 01 Dec 2004 12:15 GMT
Other technical challenges
Running thousands of cheap servers with relatively high failure rates is not an easy job. Standard tools don't work at this scale, so Google has had to develop them in-house. Some of the other challenges the company continues to face include:
Debugging: "You see things on the real site you never saw in testing because of some special set of circumstances that creates a bug," said Hölzle. "This can create non-trivial but fun problems to work on."
Data errors: A regular IDE hard disk will have an error rate in the order of 10^-15 -- that is, about one millionth of one billionth of the data written to it may get corrupted without the hard disk's own error checking picking it up. "But when you have a petabyte of data you need to start worrying about these failures," said Hölzle. "You must expect that you will have undetected bit errors on your disk several times a month, even with hardware checking built-in, so GFS does have an extra level of checksumming. Again this is something we didn’t expect, but things happen."
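To make the scale of the problem concrete, here is a minimal sketch in Python of the arithmetic and of application-level checksumming in general. It is not GFS's actual implementation: the function names, the 64 KiB block size and the use of CRC32 are all illustrative assumptions, as is reading the 10^-15 figure as a per-bit rate.

```python
import zlib

# Hypothetical block size chosen for illustration; the article does not
# say how GFS divides data for checksumming.
BLOCK_SIZE = 64 * 1024


def expected_undetected_errors(bits_written, error_rate=1e-15):
    """Expected number of silently corrupted bits, assuming the quoted
    10^-15 figure is a per-bit rate (an assumption, not from the article)."""
    return bits_written * error_rate


def checksum_blocks(data, block_size=BLOCK_SIZE):
    """Compute one CRC32 checksum per fixed-size block of data."""
    return [
        zlib.crc32(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


def find_corrupt_blocks(data, stored_checksums, block_size=BLOCK_SIZE):
    """Return the indexes of blocks whose checksum no longer matches."""
    current = checksum_blocks(data, block_size)
    return [
        index
        for index, (now, stored) in enumerate(zip(current, stored_checksums))
        if now != stored
    ]


if __name__ == "__main__":
    # One decimal petabyte is 8 * 10^15 bits, so roughly 8 bad bits are
    # expected per petabyte written under the assumption above.
    print(expected_undetected_errors(8 * 10**15))

    original = bytes(range(256)) * 1024      # 256 KiB of sample data
    checksums = checksum_blocks(original)    # stored alongside the data

    damaged = bytearray(original)
    damaged[70000] ^= 0x01                   # simulate a single flipped bit
    print(find_corrupt_blocks(bytes(damaged), checksums))
```

Keeping a checksum per block rather than per file means a mismatch points to a small region that can be re-read from another copy, which is one plausible reason to add a layer like this on top of the disk's own error checking.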
Spelling: Google wrote its own spell checker, and maintains that nobody knows as many spelling errors as it does. The amount of computing power available at the company means it can afford to begin teaching the system which words are related - for instance "Imperial", "College" and "London". It's a job that takes many CPU years, and which would not have been possible without these thousands of machines. "When you have tons of data and tons of computation you can make things work that don’t work on smaller systems," said Hölzle. One goal of the company now is to develop a better conceptual understanding of text, to get from the text string to a concept.
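The article does not say how Google learns which words are related. One simple way to approximate the idea, shown here purely as an assumed illustration, is to count how often pairs of words appear in the same document. The function names and sample documents below are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count how many documents each unordered pair of words shares."""
    counts = Counter()
    for text in documents:
        # Sorting the distinct words gives each pair one canonical order.
        words = sorted(set(text.lower().split()))
        counts.update(combinations(words, 2))
    return counts


def related_words(word, counts, top_n=3):
    """Return the words that most often co-occur with the given word."""
    word = word.lower()
    scores = Counter()
    for (first, second), n in counts.items():
        if first == word:
            scores[second] += n
        elif second == word:
            scores[first] += n
    return [other for other, _ in scores.most_common(top_n)]


if __name__ == "__main__":
    docs = [
        "Imperial College London",
        "Imperial College London admissions",
        "London weather",
    ]
    counts = cooccurrence_counts(docs)
    print(related_words("imperial", counts, top_n=2))
```

Even this naive version grows with the square of the number of distinct words per document, which hints at why doing anything similar over the whole Web would take many CPU years.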
Power density: "There is an interesting problem when you use PCs," said Hölzle. "If you go to a commercial data centre and look at what they can support, you'll see a typical design allowing for 50W to 100W per square foot. At 200W per square foot you notice the sales person still wants to sell it but their international tech guy starts sweating. At 300W per square foot they cry out in pain."
Eighty mid-range PCs in a rack, of which you will find many dozens in a Google data centre, produce over 500W per square foot. "So we're not going to blade technology," said Hölzle. "We're already too dense. Finally Intel has realised this is a problem and is now focusing more on power efficiency, but it took some time to get the message across."
Quality of search results: One big area of complaints for Google is connected to the growing prominence of commercial search results -- in particular price comparison engines and e-commerce sites. Hölzle is quick to defend Google's performance "on every metric", but admits there is a problem with the Web getting, as he puts it, "more commercial". Even three years ago, he said, the Web had much more of a grass roots feeling to it. "We have thought of having a button saying 'give me less commercial results'," he said, but the company has so far shied away from implementing it.