Thursday, September 21, 2006

Google aims to index 100 billion pages

Google aims to give access to even more information and has been devoting time and resources to figuring out how to reach that goal. A new patent entitled 'Multiple Index Based Information Retrieval System', filed by Google employee Anna Patterson, might be the answer. The patent, filed back in January 2005 and published just a few months ago, suggests that Google might be aiming to expand its index to as many as 100 billion web pages or even more.

According to the patent’s abstract, conventional information retrieval systems, known as search engines, are able to index only a small part of the documents available on the Internet, and on the web in particular. According to estimates, there were around 200 billion web pages on the Internet as of last year; however, Patterson claimed that even the best search engine (that is, Google) was able to index only 6 to 8 billion of them.

The huge gap between the number of indexed pages and the number of existing documents clearly signaled the need for a new breed of information retrieval system. Conventional systems simply could not index enough web pages to give users access to a large enough share of the information available on the web.

It is interesting to note that around the time Patterson, who developed the technology, filed the patent in 2005, Google stopped showing the number of indexed pages on its home page.

One could speculate that the new system might be able to index hundreds of billions of pages, and that continuing to display a count would either have undermined Google’s position at the time, with only 8 to 9 billion pages indexed, or given competitors a clue that such scale was possible.

With the new system in place, we can wait and see how fast Google reaches its goal of 100 billion web pages in its index.