30 Jul 2014

How Google Crawls The Web [VIDEO]

How Google works is a very strange thing indeed. No one quite knows the detailed inner workings of the ever mysterious algorithms, servers, data, and queries that flow through Google every day. But we do know the basics. With words like ‘Googlebot,’ ‘crawling,’ and ‘index,’ it can seem like a world out of a science fiction novel. But we’re in 2014 now, and even though it may sound strange, how Google crawls and indexes the web affects our daily lives, whether we’re looking for information before making a purchase or trying to reach other searchers like us by creating useful content that earns their business. Watch the video here to learn a little more about how Google makes its way through the depths of cyberspace.

Transcription:

Before a search query is ever entered, the internet and its vast landscape of information must be harvested so that it can be used to answer such queries.

As thousands of websites are created and updated daily, these pages must be discovered. To do that, the internet must be crawled. Crawling is the exciting discovery of new and updated web content that becomes a part of Google’s collection of indexed pages. This is all made possible by multiple computers that are linked together through Google’s mastermind program, “Googlebot.”

“Googlebot” is an algorithmic process of highly intelligent, spider-like computer programs that decide which sites to crawl, how often, and what pages to collect from each site. At the beginning of its crawl process, a list of web pages is created from a previous crawl session and augmented with Sitemap data. When Googlebot’s mysterious crawlers visit one of these pages, they scan it for any new links and then add them to their list of pages to crawl. Googlebot is both resourceful and cunning, and nothing escapes its grasp: information about new sites, site changes, and bad links is all collected in order to update the mother ship of online collections, otherwise known as Google’s “index.”
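For readers who like to see ideas as code, the crawl loop described above can be sketched in a few lines of Python. This is only a toy illustration and not Google's actual implementation: the `crawl` function, the `fetch_links` callback, and the `TOY_WEB` dictionary are all made-up names, and the "web" here is a small in-memory dictionary rather than real pages fetched over the network.

```python
from collections import deque


def crawl(seed_urls, sitemap_urls, fetch_links, max_pages=100):
    """Toy crawl loop (hypothetical sketch, not Google's real algorithm).

    seed_urls    -- pages remembered from a previous crawl session
    sitemap_urls -- extra pages listed in Sitemap data
    fetch_links  -- function returning the links found on a page,
                    or None if the page cannot be fetched
    """
    frontier = deque()  # the list of pages still waiting to be crawled
    seen = set()        # every URL ever queued, so nothing is queued twice

    # Start from the previous crawl's list, augmented with Sitemap data.
    for url in list(seed_urls) + list(sitemap_urls):
        if url not in seen:
            seen.add(url)
            frontier.append(url)

    index = {}       # pages collected, mapped to the links found on them
    dead_links = []  # pages that could not be fetched

    while frontier and len(index) + len(dead_links) < max_pages:
        url = frontier.popleft()
        links = fetch_links(url)
        if links is None:
            dead_links.append(url)  # note the bad link and move on
            continue
        index[url] = links
        # Scan the page for new links and add them to the crawl list.
        for link in links:
            if link not in seen:
                seen.add(link)
                frontier.append(link)

    return index, dead_links


# A made-up, in-memory "web": each page maps to the links it contains.
TOY_WEB = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/"],
    "b.example/": ["b.example/missing"],
    "c.example/": [],
}

index, dead = crawl(
    seed_urls=["a.example/"],
    sitemap_urls=["c.example/"],
    fetch_links=TOY_WEB.get,  # returns None for pages that do not exist
)

print(sorted(index))  # pages that made it into the toy index
print(dead)           # links that led nowhere
```

In this sketch, `b.example/` is never listed up front. It is discovered only because `a.example/` links to it, which is the same link-following behavior the transcript describes. A real crawler would also have to decide how often to revisit each site, which this example leaves out.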

From this point, the crawler has done its job, gathering what information it can to better assist you in your future searches.