Wednesday, August 1, 2007

Google removes labeling of supplemental results

Google claims that being in the supplemental index will pose less of a problem in the future.

To simplify: Google operates with two main indexes of web pages: (1) the regular (main) index and (2) the supplemental index.

The supplemental index consists of what Google considers to be less important pages: pages with no or very few inbound links, duplicates, and so on.

The Valley of Death

Many webmasters regularly wake up screaming in the night, having dreamed that their whole site has gone supplemental.

The supplemental index has been considered the valley of death by search engine marketers, as Google normally only presents supplemental results if it cannot find a reasonable number of relevant hits in the regular index.

Furthermore, pages in the supplemental index are revisited less often by Google.

Well, from now on webmasters may sleep a little sounder, not because supplemental will go away, but because Google is removing the label that identifies supplemental search results.

You can still search for supplemental

The company has also removed the most common way of finding supplemental results by searching Google, but as Danny Sullivan over at Search Engine Land points out, the following technique still works:

Do a search for: site:yourdomainnamehere.com/&

This option will probably be removed soon.
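If you want to try it while it lasts, here is a minimal Python sketch that opens both the ordinary site: query and the supplemental variant in your browser, so you can compare the hit counts side by side. It uses only the standard library, and example.com is a placeholder for your own domain:

    import webbrowser
    from urllib.parse import quote

    domain = "example.com"  # placeholder: replace with your own domain

    # The plain site: query returns pages from both indexes; the
    # site:domain/& trick (per Danny Sullivan) has been reported to
    # surface the supplemental results, for as long as it keeps working.
    queries = [f"site:{domain}", f"site:{domain}/&"]

    for query in queries:
        webbrowser.open("https://www.google.com/search?q=" + quote(query))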

The difference between the main and supplemental index is narrowing

The supplemental index remains. In a post on the Google Webmaster Central blog, Matt Cutts & Co. say:

Since 2006, we’ve completely overhauled the system that crawls and indexes supplemental results. The current system provides deeper and more continuous indexing.

Additionally, we are indexing URLs with more parameters and are continuing to place fewer restrictions on the sites we crawl. As a result, Supplemental Results are fresher and more comprehensive than ever.

We’re also working towards showing more Supplemental Results by ensuring that every query is able to search the supplemental index, and expect to roll this out over the course of the summer.

Supplemental results will apparently turn up more frequently in search results, and the results will be fresher.

Is this really a problem?

It helps to keep in mind that Google needs an algorithm that makes it possible to sort out the best and most relevant results for a query. This is, for instance, what PageRank is all about.
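To make the idea concrete, here is a toy sketch of the power iteration behind PageRank. The four-page link graph and the damping factor of 0.85 are illustrative assumptions, not Google's actual data or implementation:

    # Toy PageRank on a four-page link graph (a sketch of the idea,
    # not Google's production system).
    damping = 0.85
    links = {            # page -> pages it links to
        "a": ["b", "c"],
        "b": ["c"],
        "c": ["a"],
        "d": ["c"],
    }
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}

    for _ in range(50):  # power iteration until the ranks settle
        new_rank = {p: (1 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            share = damping * rank[page] / len(outlinks)
            for target in outlinks:
                new_rank[target] += share
        rank = new_rank

    # Pages with more (and better-ranked) inbound links score higher;
    # page "d" has no inbound links at all and ends up at the bottom.
    print(sorted(rank.items(), key=lambda kv: -kv[1]))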

Even if Google decided to merge the two databases, an enormous number of pages would, for most queries, seldom or never turn up in the first 10 pages of search results.

Pages with no inbound links or little content, or pages from websites with very little authority (spammy scraper sites, for instance), will normally only turn up when people do very specialized searches, hitting the "long tail" of search. This is the way it should be.

This Valley of Death only becomes a problem if Google starts to fill it with well-written, informative, and original articles and blog posts, while at the same time serving junk pages on the first page of results.

This has been known to happen, and by labeling the valley as "supplemental results", Google has made it easier for webmasters to call the company out when such mistakes occur.

So, if you are cynical, you may say that Google is now trying to hide its mistakes.

We do not think this is the case. Even without the label, it is easy to document cases of high-quality pages being buried and junk floating to the top. All you have to do is analyze search results for selected queries.

Google may introduce a search tool for supplemental pages

According to Danny Sullivan, Matt Cutts of Google feels that a search syntax for identifying supplemental results causes site owners to fixate needlessly on such results, in the same way as they often obsess over PageRank:

“Still, he said that Google would probably come up with a way for people to perform a query within Google Webmaster Central or some other method to find out if a page or pages are in the supplemental index.”

Sullivan asks for a tool within Google Webmaster Central that provides a list of your own supplemental pages or, as a health check, the percentage of a site's pages that are deemed supplemental.

But why stop at supplemental? Why not introduce a search tool that helps webmasters identify different types of "unhealthy pages"? Google Webmaster Central could then give a diagnosis of individual pages or sets of pages, listing likely causes of poor rankings (see the sketch after this list), for instance:

* Lack of backlinks
* Lack of content
* Duplicate content
* Robots.txt and meta tag problems
* Server downtime
* Bad coding
* etc.
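As a purely hypothetical sketch, such a report might be little more than a list of issues per URL. The names below (PageDiagnosis, the example URLs, the issue strings) are invented for illustration and are not a real Google API:

    from dataclasses import dataclass, field

    # Hypothetical shape of a per-page "health report"; not a real API.
    @dataclass
    class PageDiagnosis:
        url: str
        issues: list = field(default_factory=list)

    report = [
        PageDiagnosis("http://example.com/old-post",
                      ["Lack of backlinks", "Duplicate content"]),
        PageDiagnosis("http://example.com/tag/misc", ["Lack of content"]),
        PageDiagnosis("http://example.com/", []),
    ]

    for page in report:
        print(page.url, "->", "; ".join(page.issues) or "no issues found")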

We realize that there are limits to how far Google can go in this direction, as any input of this kind will be used by webmasters trying to reverse engineer the Google algorithm. But in general, Google benefits from webmasters cleaning up their act, and helping them understand what is wrong with their pages will contribute to this.

Indeed, Google Webmaster Central already provides information on:

* HTTP errors
* Not found
* URLs not followed
* URLs restricted by robots.txt
* URLs timed out
* Unreachable URLs
* Links

Extending this would mostly be a matter of systematizing the relevant information on a per-page basis.
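One of these checks is easy to reproduce on your own: Python's standard library can tell you whether a given URL is restricted by a site's robots.txt (example.com below is, again, a placeholder):

    from urllib import robotparser

    # example.com is a placeholder; point this at a real site to test.
    rp = robotparser.RobotFileParser()
    rp.set_url("http://example.com/robots.txt")
    rp.read()

    for url in ("http://example.com/", "http://example.com/private/page"):
        allowed = rp.can_fetch("Googlebot", url)
        print(url, "is", "allowed" if allowed else "restricted by robots.txt")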

Source: www.pandia.com
