RealTime IT News

1.5 Million Pages Added To Web Each Day, Says Research Company

Publishers are adding one-and-a-half million pages to the World Wide Web every day, according to analysts at Alexa Internet, of San Francisco, California.

Alexa said it archived 12 terabytes of data, an amount equivalent to half the contents of the Library of Congress, to find this and other facts about the growth and scope of the Internet.

Among the other findings announced by Alexa Internet is the current size of public content on the World Wide Web--said to be three terabytes, or three million megabytes. The Web doubles in size every eight months and has spawned 20 million content areas. However, Web traffic is far from evenly divided between sites, the report found, with 90 per cent of the traffic going to 100,000 different host machines and 50 per cent going to just 900 top sites.
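To put the doubling figure in perspective, the short Python sketch below works through the arithmetic. It takes only the two numbers reported above (three terabytes today, doubling every eight months) and assumes, purely for illustration, that growth continues at a steady exponential rate. The function names are invented for this example and are not part of any Alexa tool.

```python
# Figures quoted in the article.
CURRENT_SIZE_TB = 3.0   # public Web content today, in terabytes
DOUBLING_MONTHS = 8     # reported doubling period, in months


def projected_size_tb(months_ahead: float) -> float:
    """Projected size in terabytes, ASSUMING steady exponential growth."""
    return CURRENT_SIZE_TB * 2 ** (months_ahead / DOUBLING_MONTHS)


def annual_growth_factor() -> float:
    """How many times larger the Web would be after 12 months."""
    return 2 ** (12 / DOUBLING_MONTHS)


if __name__ == "__main__":
    for months in (8, 12, 24):
        print(f"{months:>2} months ahead: {projected_size_tb(months):.1f} TB")
    print(f"Implied yearly growth factor: {annual_growth_factor():.2f}x")
```

On these assumptions the Web would be nearly three times its present size within a year. The projection is only as good as the assumption that the reported doubling rate holds.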

"We have within the Web the largest library of information ever available to humankind," said Brewster Kahle, president and CEO of Alexa Internet. "There are millions of unique ideas and perspectives represented on the Web with few clear modes for access. Alexa's efforts are focused on finding the most helpful information and making it available to as many Web users as possible."

Alexa said it continually gathers Web content and uses it to provide site statistics and related links to users of its free service. It donates a copy of each "snapshot" of the World Wide Web to the non-profit Internet Archive, which preserves it as study material for future generations.

"Alexa's archival efforts mean they've got more to say about the Web in general than any other Web data providers," Chris Shipley, industry analyst and editor of DEMOLetter. "This means businesses and organisations using Alexa's statistics and trend data are tapping a vast data resource pulled from the most comprehensive archive of documents 'born digital'--that is, electronic at conception and through publication--than any currently available source."

Alexa's navigation aid appears as a toolbar at the bottom of the user's screen. It provides information about each site, including its popularity, the number of links to it and its affiliations. It also gives users a list of 10 related links for each site they visit.

Finally, a by-product of Alexa's archiving technique is the virtual abolition of "404 Not Found" messages: when a page is unavailable, the service serves the most recently archived version instead.
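The fallback behaviour described above can be illustrated with a minimal Python sketch. This is not Alexa's implementation, which the company has not detailed here. The function name, the in-memory dictionaries standing in for the live Web and the archive, and the timestamp format are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-ins: a live Web of url -> content, and an archive of
# url -> list of (timestamp, content) snapshots.
LiveWeb = Dict[str, str]
Archive = Dict[str, List[Tuple[int, str]]]


def serve_page(url: str, live_web: LiveWeb, archive: Archive) -> Tuple[str, Optional[str]]:
    """Return (source, content); fall back to the newest snapshot if the page is gone."""
    if url in live_web:
        return ("live", live_web[url])

    snapshots = archive.get(url, [])
    if snapshots:
        # Pick the snapshot with the largest timestamp, i.e. the most recent one.
        _, content = max(snapshots, key=lambda snap: snap[0])
        return ("archive", content)

    # Never archived: the visitor would still see "404 Not Found".
    return ("missing", None)


if __name__ == "__main__":
    live = {"http://example.com/": "current home page"}
    stored = {"http://example.com/old": [(19980101, "first copy"), (19980601, "later copy")]}
    print(serve_page("http://example.com/", live, stored))
    print(serve_page("http://example.com/old", live, stored))
    print(serve_page("http://example.com/never", live, stored))
```

The sketch also shows why the abolition is only "virtual": a page that was never captured in any snapshot has nothing to fall back on.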