
Thursday, September 8, 2011

Get to Know Google’s Crawling Machine

Google is still considered the number one search engine. Besides having a very simple site design, Google returns accurate search results. Its automated indexing system makes Google nearly impartial and fair: because there is no human intervention, all sites and blogs, large or small, new or old, have almost the same chance.

Google runs on a distributed network of thousands of inexpensive computers that can process work quickly in parallel. Parallel processing is a method of computation in which many calculations are performed simultaneously, significantly speeding up data processing.
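To make the idea of parallel processing concrete, here is a minimal sketch that splits a per-page task across several workers. It is only an illustration: the page texts and the `count_words` and `count_all` helpers are invented for this example, and threads on one machine merely stand in for the many separate computers described above.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical page contents, invented for illustration only.
PAGES = [
    "google crawls the web",
    "spiders follow links between pages",
    "the index maps words to pages",
]

def count_words(page_text):
    """Count the words in one page (a stand-in for any per-page work)."""
    return len(page_text.split())

def count_all(pages, workers=3):
    """Hand each page to a worker; map() returns results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_words, pages))

print(count_all(PAGES))
```

Each page is processed independently, which is what makes this kind of work easy to spread over many machines.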

One reason the search engines that existed before Google declined in popularity and usefulness was the emergence of paid listings, in which search engines hungry for revenue sold positions in their search results to advertisers. This poisoned the objectivity of the search results and undermined the principle that a website earns its position through popularity. The difference between a search engine, which should display search results, and a browsing channel, which steers users toward affiliate businesses, became blurred. Although many search engines refused to sell positions in their results, doubt and distrust had already spread among users.

Google’s integrity shows in its pages, which are clean of all kinds of clutter and highlight just one thing: the word “Search”. Google does accept advertising, but the ads it accepts are kept separate from the search results. Not everyone may agree with how Google ranks search results, but no one thinks that the top ranking in Google’s search results can be bought.

How a Basic Search Engine Works

All search engines work in the same basic way: they “crawl” web pages with automated software robots called spiders or crawlers, which build an index (a list) of web content that users can search. Each search engine lets users search its index for a keyword or a set of keywords. Search results are displayed in many forms, but most show a little information about each website in the list and a link that leads to it.
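The crawling step can be sketched as follows. This is a toy model, not how any real spider is written: the `WEB` dictionary is a made-up, in-memory stand-in for the Internet, and the `crawl` function is a hypothetical helper that simply follows links breadth-first from a starting page.

```python
from collections import deque

# Hypothetical miniature "web": each URL maps to (page text, outgoing links).
WEB = {
    "a.example/": ("welcome to site a", ["a.example/about", "b.example/"]),
    "a.example/about": ("about site a", ["a.example/"]),
    "b.example/": ("site b home", ["a.example/"]),
    "c.example/": ("nobody links here", []),
}

def crawl(start, web):
    """Follow links breadth-first from start; return {url: text} for pages reached."""
    seen = {start}
    queue = deque([start])
    fetched = {}
    while queue:
        url = queue.popleft()
        text, links = web[url]
        fetched[url] = text
        for link in links:
            # Skip links that lead outside the toy web or were already queued.
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched

print(sorted(crawl("a.example/", WEB)))
```

Note that `c.example/` is never fetched, because no crawled page links to it. A spider can only discover pages it can reach through links.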

The way each search engine builds its list is unique, because each engine’s spider is programmed differently. The main element in a spider’s programming is the search engine’s algorithm, which determines the ranking of each listed web page. The ranking in turn determines how search results are displayed.

How Google Works

Google’s major technology asset is its algorithm, a complicated ranking formula that gives searchers good results so often that it can seem as if Google is able to read the mind of every person who uses the search engine giant. The results of the algorithm are summarized in a single ranking statistic called PageRank. Google is very secretive about its PageRank formula, but the company promotes the importance of PageRank and offers webmasters general guidelines for improving it. Google shows each site’s PageRank score (on a scale of 0-10) in the Google Toolbar. Although the exact formula is secret, the basic ingredients of PageRank are publicly known.
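As a rough illustration of those publicly known ingredients, here is a minimal sketch of the simplified PageRank idea from the original published description: a page’s rank is shared out among the pages it links to, over and over, until the numbers settle. This is not Google’s actual formula, which is secret. The link graph, the damping value of 0.85, and the iteration count are assumptions chosen for the example, and the scores are raw probabilities rather than the 0-10 toolbar scale.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: repeatedly redistribute rank along links."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical three-page site, invented for illustration.
LINKS = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
}
ranks = pagerank(LINKS)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph "home" ends up with the highest score because both other pages link to it, which matches the general advice that incoming links matter.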

When Does Google Crawl and Index?

Google crawls sites on the Internet at different depths and on more than one schedule. The so-called Deep Crawl is done at least once a month. Given the complexity of the cataloging process and the need to build an extensive list of web content, the crawl takes more than a week to complete. That is why it can take up to six weeks for a new website or blog to get listed in Google.

Fresh Crawl & Deep Crawl

Google relies mainly on the deep crawl, but the results of a deep crawl can quickly go stale given how rapidly the Internet changes. Google therefore launched the fresh crawl, which briefly visits sites more frequently than the deep crawl does. The fresh crawl’s results do not change Google’s overall index, but they update the content of some sites and blogs. Google does not announce the fresh crawl’s schedule or which sites and blogs it targets, but a webmaster can work out the schedule through careful investigation. Google has no obligation to visit any particular URL with a fresh crawl. Sites and blogs can increase their chances of more frequent visits from Google by updating their content regularly. Keep in mind how shallow the fresh crawl is: Google may visit the front page of your site or blog but not any other page.

The deep crawl is more automated, less selective, and more thorough than the fresh crawl. To give new pages a good chance, make sure that links to them are listed on your main page when the deep crawl is scheduled, so that the deep crawl will index the new pages as well. Not all pages of a site will be included in Google’s index; how pages are selected is a company secret. Therefore, if you feel that an important page or article of yours has not been indexed by Google, what you can do is maximize its promotion.
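The difference in depth between the two crawls can be pictured with a small sketch. Everything here is an assumption made for illustration: the `SITE` link map is invented, the `crawl` helper is hypothetical, and treating a fresh crawl as "front page only" (depth 0) is just the simplest way to model the shallowness described above.

```python
# Hypothetical blog: each page maps to the pages it links to.
SITE = {
    "/": ["/post-1", "/post-2"],
    "/post-1": ["/post-1/comments"],
    "/post-2": [],
    "/post-1/comments": [],
    "/orphan": [],  # no page links here
}

def crawl(site, start="/", max_depth=None):
    """Return pages reachable from start, following links at most max_depth hops.

    max_depth=None means no limit.
    """
    reached = [start]
    frontier = [start]
    depth = 0
    while frontier and (max_depth is None or depth < max_depth):
        next_frontier = []
        for page in frontier:
            for link in site.get(page, []):
                if link not in reached:
                    reached.append(link)
                    next_frontier.append(link)
        frontier = next_frontier
        depth += 1
    return reached

fresh = crawl(SITE, max_depth=0)  # shallow visit: front page only
deep = crawl(SITE)                # follows every link it can find
print("fresh:", fresh)
print("deep: ", deep)
```

Even the unlimited crawl never reaches `/orphan`, which is why a new page needs a link from a page the crawler already visits, such as the main page.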

One thing Google is proud of about the sophistication of its system is that the index is created automatically, with no human interference at all, including from Google’s own technicians (of course they control the spider robot, but they do not intervene in the results). So it is futile to expect Google to respond to complaints about the indexing of your blog or website.


Crawl: The process by which a search engine’s software robots explore the sites and blogs that exist on the Internet.

Spider: The name for the software robots that search engines use for indexing. This software may also be called a crawler or robot.

Index: The list each search engine keeps of the content of the sites and blogs on the Internet. This list may consist of millions of categories and words. Each time we run a search, the search engine consults its index to find the sites and blogs that contain the information we want.
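To show what consulting an index can look like, here is a minimal sketch of an inverted index, a common data structure that maps each word to the pages containing it. It is a simplified illustration and not a description of Google’s index: the `PAGES` data and the `build_index` and `search` helpers are invented for this example.

```python
from collections import defaultdict

# Hypothetical crawled pages, invented for illustration.
PAGES = {
    "a.example/": "Google crawls the web",
    "b.example/": "Spiders crawl the web for Google",
    "c.example/": "Cooking recipes",
}

def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word in the query, sorted."""
    words = query.lower().split()
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)

index = build_index(PAGES)
print(search(index, "google web"))
print(search(index, "spiders"))
```

The search never rereads the pages themselves. It only looks words up in the index, which is why building the index ahead of time makes searching fast.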


neovanatica is a well-known professional blogger and Information Technology teacher. He holds an Information System degree in Computer Science.
