No, Google Sitemaps is a robots inclusion protocol lacking any syntax for deletions. Remove deleted URLs from the XML file, and ensure your server responds with 404 or 410 to Googlebot.
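Removing deleted URLs from the XML file can be scripted. The following is a minimal sketch in Python, not part of the original FAQ; the function names `prune_sitemap` and `sitemap_locs`, the sample URLs, and the namespace in `SAMPLE` are assumptions chosen for illustration. It reads the namespace from the file itself, so it should work with whichever sitemap schema version your file declares.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip an XML namespace, e.g. '{http://...}url' -> 'url'."""
    return tag.rsplit("}", 1)[-1]


def sitemap_locs(xml_text):
    """Return the <loc> values of a sitemap, in document order."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter()
            if _local(el.tag) == "loc" and el.text]


def prune_sitemap(xml_text, deleted_urls):
    """Return the sitemap XML without <url> entries listed in deleted_urls."""
    root = ET.fromstring(xml_text)
    if root.tag.startswith("{"):
        # Keep the file's own namespace as the default one on output.
        ET.register_namespace("", root.tag[1:].split("}")[0])
    deleted = set(deleted_urls)
    for url_el in list(root):
        if _local(url_el.tag) != "url":
            continue
        locs = [c.text.strip() for c in url_el
                if _local(c.tag) == "loc" and c.text]
        if locs and locs[0] in deleted:
            root.remove(url_el)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample sitemap; the namespace is an assumption.
SAMPLE = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://www.example.com/kept-page.htm</loc></url>"
    "<url><loc>http://www.example.com/deleted-page.htm</loc></url>"
    "</urlset>"
)
```

Calling `prune_sitemap(SAMPLE, ["http://www.example.com/deleted-page.htm"])` returns a sitemap that lists only the remaining page. The server still has to answer 404 or 410 for the removed URL; editing the file alone does not tell Googlebot to stop requesting it.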
Deleted and renamed pages must be removed from your Google Sitemap. Invalid or redirecting URLs in a sitemap burn crawler resources and blow up the crawler problem reports. Once Google knows a URL, Googlebot will try to fetch it more or less until the server dies.

HTTP return code 404 - Not Found

The 404 return code is a generic error code, used if a resource is not available and the server does not know, or does not want to reveal, whether the resource is permanently gone or only temporarily unavailable or blocked. Usually the Web server is configured to send the user agent a custom error page, which carries the 404 status code in the response header and provides the visitor with information (e.g. error messages) and options (e.g. links to related resources). On Apache Web servers this can be done in the .htaccess file located in the root directory:

ErrorDocument 404 /error.htm?errno=404

Because the user agent does not know whether the resource has vanished for good, it might request it again. Therefore the 404 code is not suitable for telling a search engine crawler that it should forget a resource.

HTTP return code 410 - Gone

The 410 return code tells the user agent that a resource has been removed permanently. Search engine crawlers usually mark resources responding with a 410 code as delisted, and do not request them again. That's not always the case with Google's supplemental index, where dead resources can still appear in search results, even years after their deletion. A 404/410 return code may move a cached resource from the current search index to the supplemental index. However, if a page was deleted and there is no forwarding address (e.g. a new page with similar content), the Web server should send a 410 header. It's good style to make use of a custom error page for human visitors.
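To illustrate the difference between the two codes, here is a minimal sketch of a Python WSGI application that answers 410 for pages known to be deleted and 404 for everything else it does not know, each with a small custom error page. It is an illustration under stated assumptions, not the setup described above: the paths in `LIVE_PAGES` and `GONE_PATHS`, the page markup, and the helper `status_for` are all hypothetical.

```python
# Hypothetical site data; a real site would load this from its own records.
LIVE_PAGES = {"/": b"<html><body>Home</body></html>"}
GONE_PATHS = {"/deleted-page.htm"}


def _error_page(message):
    """Build a small custom error page with a link back to the home page."""
    html = ('<html><body><h1>{0}</h1>'
            '<p><a href="/">Back to the home page</a></p></body></html>')
    return html.format(message).encode("utf-8")


def app(environ, start_response):
    """WSGI app: 200 for live pages, 410 for deleted ones, 404 otherwise."""
    path = environ.get("PATH_INFO", "/")
    if path in LIVE_PAGES:
        status, body = "200 OK", LIVE_PAGES[path]
    elif path in GONE_PATHS:
        # Permanently removed, no forwarding address: say so explicitly.
        status, body = "410 Gone", _error_page("This page has been removed permanently.")
    else:
        # Unknown resource: the generic error code.
        status, body = "404 Not Found", _error_page("Page not found.")
    start_response(status, [("Content-Type", "text/html; charset=utf-8"),
                            ("Content-Length", str(len(body)))])
    return [body]


def status_for(path):
    """Call the app directly and return (status line, body) for a path."""
    seen = []

    def start_response(status, headers):
        seen.append(status)

    body = b"".join(app({"PATH_INFO": path}, start_response))
    return seen[0], body
```

The status line and the human-readable page are independent: the visitor sees a helpful page either way, while the crawler reads only the code. For a local try-out, the app can be served with `wsgiref.simple_server.make_server` from the Python standard library.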
HTTP return code 302 - Found (Elsewhere)

The 302 return code tells the user agent that the requested URL is temporarily unavailable, but the content is available from another address. In the 302 header the server gives the user agent a new location (URL), and the user agent will then request this resource. A Webmaster should avoid 302 redirects, because they lead to all sorts of trouble with search engines. The most common cause of 302 responses is an invalid URL used in internal links and link submissions, e.g. a missing trailing slash (see valid URLs). Unfortunately, 302 is the default return code for most redirects, for example Response.Redirect(location) in ASP, header("Location: $location") in PHP, and RewriteRule as well as ErrorDocument 404 http://www.example.com/page (!!) in Apache's .htaccess directives. All server-side programming languages provide methods to set the redirect response code to 301.

HTTP return code 301 - Moved Permanently

The 301 redirect code tells the user agent that a resource has been moved and will never be available at the old address again. All intentional redirects (e.g. renamed URLs, moved URLs ...) must send the requesting user agent a 301 header with the new permanent address. Many scripts make use of redirects to 'link' to external resources, usually because this is a simple way to track outgoing traffic. That's a lazy and wacky hack, but if it cannot be avoided, the script should at least do a permanent redirect.

Examples of 301 redirects

To ensure your redirects send a 301 response code to the user agent, you can copy and paste the code examples below. The first examples are for Apache's .htaccess files:

#1 301-redirects a page:

Please note that with both Redirect (#2) and RedirectPermanent (#1) the first location parameter (source) is a URL relative to the Web server's root, and the second location parameter (target) is a fully qualified absolute URL.
The third .htaccess example makes use of the mod_rewrite module and redirects all requests for URLs on example.com to the corresponding URL on www.example.com.

In ASP (VBScript), the page script must be terminated after sending the 301 header, and you must not output any content, not even a single space, before the header. Everything after Response.End will not be executed.

$newLocation = "http://www.example.com/other-directory/other-page.php";
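The host redirect from example.com to www.example.com can also be expressed outside Apache. The sketch below, in Python, only builds the status line and Location header that such a redirect should send; it is an illustration, not the code from the examples above. The function name `canonical_redirect` and the constants `BARE_HOST` and `CANONICAL_HOST` are assumptions, and any port number in the requested URL is dropped for simplicity.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed host names, taken from the example.com convention used in the text.
BARE_HOST = "example.com"
CANONICAL_HOST = "www.example.com"


def canonical_redirect(url):
    """Return (status, headers) for a permanent redirect, or None.

    A request for a URL on BARE_HOST is answered with an explicit 301
    pointing at the same path and query string on CANONICAL_HOST.
    Requests for any other host are left alone (None).
    """
    parts = urlsplit(url)
    if parts.hostname != BARE_HOST:
        return None
    target = urlunsplit((parts.scheme, CANONICAL_HOST,
                         parts.path or "/", parts.query, ""))
    # State the 301 explicitly; many redirect helpers default to 302.
    return "301 Moved Permanently", [("Location", target)]
```

For example, `canonical_redirect("http://example.com/dir/page.htm?x=1")` yields a 301 with the Location `http://www.example.com/dir/page.htm?x=1`, while a URL already on www.example.com yields `None`. The target is always a fully qualified absolute URL.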
Do not use client-side techniques such as meta refresh: <META HTTP-EQUIV=Refresh CONTENT="0; URL=http://www.example.com/other-directory/other-page.htm"> or JavaScript: window.location = "http://www.example.com/other-directory/other-page.htm/"; and intrinsic event handlers: <body onLoad="setTimeout (location.href='http://www.example.com/other-directory/other-page.htm', '0')"> to redirect.

Again, do not use any client-side redirects if you're keen on search engine traffic, especially not the sneaky methods from the examples above. Google automatically discovers sneaky redirects and deletes all offending pages or even complete domains from its search index, mostly without a warning.

Checklist "Delete a page"
Saturday, October 29, 2005 Will a Google Sitemap increase my PageRank?
Google Sitemaps - The How-To What-Is FAQ
Copyright © 2004, 2005 by Smart IT Consulting · Reprinting except quotes along with a link to this site is prohibited