British Library plans to archive whole UK Web

Ingrid Marson ZDNet.co.uk

Published: 24 Jun 2004 15:00 BST

A trial project to archive 6,000 UK Web sites was announced on Tuesday by the UK Web Archiving Consortium. The consortium, led by the British Library, includes the Wellcome Trust, the National Archives and the Scottish and Welsh national libraries.

Each member of the consortium will choose content relevant to its subject. All types of Web content will be included, from government documents to blogs.

Richard Boulderstone, director of e-strategy at the British Library, said that all types of material will be collected, including "informal material" such as discussion forums. "Letters and other informal works tell us how society is actually operating," he said.

The British Library will not censor the material because it does not want to restrict what people can find out about in the future.

"We would like to take a snapshot of every year, as a sample of what the Web looked like," said Boulderstone, suggesting that in the future people could look back to 2004 and see the swear words that Web users were using.

Only a limited number of Web sites will be archived initially but "ultimately, we would like to archive the whole UK Web," said Boulderstone.

One of the problems faced by the consortium is that, under UK copyright law, permission is needed before a site can be archived. The British Library is working with the government to extend the law to allow it blanket access to all Web sites because "there are 4 million sites that we would like to capture -- we cannot ask everyone for permission," said Boulderstone.

The UK Web Archiving Consortium is not the first to archive the Web. The Wayback Machine, run by US-based Internet Archive, is a service that allows people to visit archived versions of Web sites.

According to Boulderstone, the British Library's approach differs from that of the Internet Archive because his organisation seeks permission from Web sites. In the future, the British Library hopes to improve on the Wayback Machine by archiving more frequently and in more depth, and by providing metadata so that information can be found more easily.
