Google is now reporting that it has seen more than 1 trillion unique URLs on the Web. That’s a big number and a massive jump from the 26 million URLs it saw in 1998. But even at a trillion pages, Google admits that there are many duplicates (in terms of content), and that there may well be an infinite number of pages overall. In Google’s words:
Many pages have multiple URLs with exactly the same content or URLs
that are auto-generated copies of each other. Even after removing those
exact duplicates, we saw a trillion unique URLs, and the number of
individual Web pages out there is growing by several billion pages per
day. So how many unique pages does the web really contain? We
don’t know; we don’t have time to look at them all! 🙂 Strictly
speaking, the number of pages out there is infinite.
Thanks to the magic of dynamically generated page content (with or without session IDs) and the fact that Google, despite its best efforts, has never indexed Flash content effectively, I personally think the 1 trillion number is on the low side.
Now that doesn’t mean there are more than a trillion Web sites out there: the latest Netcraft study reports just over 173 million sites.
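
To make the session ID point concrete, here is a minimal Python sketch of how one dynamically generated page can show up under many distinct URLs, and how a crawler might collapse them. This is only an illustration, not a description of how Google does it: the parameter names in `SESSION_PARAMS`, the helper `strip_session_params`, and the example.com URLs are all made up for the example.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of session-style parameter names; real sites vary widely.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}

def strip_session_params(url):
    """Return the URL with session-style query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    # Rebuild the URL without the session parameters and without a fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

# Three distinct URLs that (by assumption) all serve the same page.
urls = [
    "http://example.com/page?id=7&sid=abc123",
    "http://example.com/page?id=7&sid=zzz999",
    "http://example.com/page?id=7",
]

unique_pages = {strip_session_params(u) for u in urls}
print(len(urls), "URLs,", len(unique_pages), "unique page(s)")
```

Every new visitor gets a fresh session ID, so the number of distinct URLs for that single page has no upper bound. A fixed list of parameter names is a crude heuristic; it misses session IDs embedded in the path and may strip parameters that actually change the content.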