Since Google launched the sitemap protocol in June 2005, webmasters and search engine optimizers have had to rethink how they deal with search engine crawlers. Assuming Google will remove the sitemap protocol's 'beta' label faster than from its search engine, this article tries to give web site owners an idea of where to place Google sitemaps in their toolset. To define the playground, this article starts as a tutorial on supporting and steering search engine crawlers.



Basic Search Engine Crawler Support

Supporting crawlers in indexing a web site

Identifying and Tracking SE Crawling

Detecting search engine spiders, tracking and analyzing their behavior

The Gatekeeper: robots.txt

Preventing search engine crawlers from fetching particular files and directories

URL Specific Control: the Robots META Tag

Telling search engine spiders how to index and cache a particular page

Link Specific Regulation: REL=NOFOLLOW

Preventing search engines from interpreting a link as a vote for the link target

Tagging irrelevant page areas: class=robots-nocontent

How to make cluttered page areas, such as blocks with ads, unsearchable. The class name robots-nocontent can be applied to everything not related to the page's main content.

User and Crawler Friendly Navigation

Leading search engine bots to the content they should index

Search Engine Friendly Query Strings

If you can't avoid query strings in URLs, keep them short

What Google's Sitemap Protocol May Change

Educating Googlebot and (hopefully, in the future) other crawlers too

Recap: Methods to Support Search Engines in Crawling and Ranking

Webmaster's toolset to support and control search engine spiders
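
To make the gatekeeper entry in the outline above a little more concrete, here is a minimal sketch of how robots.txt rules decide which URLs a crawler may fetch. It uses Python's standard urllib.robotparser module. The robots.txt content, the directory names, the bot name SomeBot and the example.com URLs are assumptions made up for illustration; they are not taken from this site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/

User-agent: Googlebot
Disallow: /print/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# A crawler with its own section follows only that section.
googlebot_print = is_allowed(parser, "Googlebot", "http://www.example.com/print/page.html")
googlebot_private = is_allowed(parser, "Googlebot", "http://www.example.com/private/page.html")

# Any other crawler falls back to the "*" section.
other_private = is_allowed(parser, "SomeBot", "http://www.example.com/private/page.html")
other_home = is_allowed(parser, "SomeBot", "http://www.example.com/index.html")
```

Note that a crawler matching a specific User-agent section ignores the generic section entirely, so in this sketch Googlebot may fetch /private/ unless that directory is repeated in its own section. Keep in mind that robots.txt only asks well-behaved crawlers to stay away; it is not an access control mechanism.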

Author: Sebastian
Last Update: Monday, June 20, 2005

Copyright © 2004, 2005 by Smart IT Consulting · Reprinting except quotes along with a link to this site is prohibited