If you can't avoid query strings in URLs, keep them short


A search engine friendly URL doesn't contain a question mark followed by a list of variables and their values. It is short and contains the keywords that best describe the page's content, separated by hyphens. This helps not only with rankings but also with visitors, especially those who bookmark pages.
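As a rough illustration of "keywords separated by hyphens", the following Python sketch derives a short URL segment from a page title. The function name `slugify` and the word limit are assumptions made for this example, not something the article prescribes.

```python
import re

def slugify(title: str, max_words: int = 5) -> str:
    """Build a short, hyphen-separated URL segment from a page title.

    Hypothetical helper: lower-cases the title, keeps only runs of
    ASCII letters and digits, and joins the first max_words of them.
    """
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words[:max_words])

print(slugify("Vitamins & Minerals in Milk"))  # vitamins-minerals-in-milk
```

A real site would probably also drop filler words such as "in" and handle non-ASCII characters; both are left out here to keep the sketch short.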

However, it's not always possible to avoid query strings. All the major search engines have learned to crawl dynamic pages, but there are limits:

Search engine spiders dislike long and ugly URLs. Such URLs do get indexed on very popular sites, but on small web sites spiders usually don't bother fetching the page.
Links from dynamic pages seem to count less than links from static pages when it comes to ranking based on link popularity. Also, some crawlers don't follow links from dynamic pages more than one level deep.
To reduce server load, search engine spiders crawl dynamic content more slowly than static pages. On large sites, it's pretty common that a huge number of dynamic pages buried in the third linking level and below never get indexed.
Most search engine crawlers ignore URLs with session IDs and similar stuff in the query string, to prevent the spiders from fetching the same content over and over in infinite loops. Search engine robots do not provide referrers and do not accept cookies, so every request is assigned a new session ID, and each variant of a query string creates a new unique URL.
Keywords in variables and their values are pretty useless for ranking purposes, if they count at all. If a page shows up on the SERPs for a search term that appears in its query string, in most cases the term is also present on the page as visible or even invisible text, or it was used as anchor text of inbound links.
There are still search engine crawlers out there which refuse to eat dynamic spider food.


Some rules of thumb on search engine friendly query strings:

Keep them short. Fewer variables gain more visibility.
Keep your variable names short, but do not use 'ID' or composites of entities and 'ID'.
Hide user tracking from search engine crawlers in all URLs appearing in (internal) links. That's tolerated cloaking, because it helps search engines. Make sure the page outputs useful default values when it is requested without a session ID and the client does not accept cookies.
Keep the values short. If you can, go for integers. Don't use UUIDs/GUIDs and similar randomly generated stuff in query strings if you want the page indexed by search engines. Exception: in forms that let users update your database, use GUIDs/UUIDs only, because integers invite users to play with them in the address bar, which leads to unwanted updates and other nasty effects.
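The rule about hiding user tracking in internal links could be sketched like this in Python. The parameter names in `TRACKING_PARAMS`, the function name and the sample URL are assumptions chosen for illustration; substitute whatever your application actually appends to its links.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical session and tracking parameter names (assumption).
TRACKING_PARAMS = {"sid", "sessionid", "phpsessid", "ref"}

def crawler_friendly(url: str) -> str:
    """Return the URL with session and tracking parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in TRACKING_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )

print(crawler_friendly("http://www.yourDomain.com/shop.php?cat=7&sid=a1b2c3&p=42"))
# http://www.yourDomain.com/shop.php?cat=7&p=42
```

A link generator would call such a function only when the requesting client is a known crawler, and the page behind the link still has to render sensible defaults without the session ID.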


Consider providing static-looking URLs. For example, on Apache use mod_rewrite to translate static URLs to script URLs plus query string, and make sure the server rewrites the request internally instead of sending a redirect response (301/302). Alternatively, when inserting tuples into a 'pages' database table, you can store a persistent file for each dynamic URL that calls a script on request. For example, a static URL like http://www.yourDomain.com/nutrition/vitamins-minerals-milk-4711.htm can include a script that parses the file name to extract the parameter(s) needed to call the output script. In this example the keywords were extracted from the page's title, and the pageID '4711' makes the URL unique within the domain's namespace.
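The file name parsing described above might look roughly like the following Python sketch. The URL pattern, the script name `/page.php` and the parameter name `p` are hypothetical; the only thing taken from the example URL is that the trailing number before `.htm` is the page ID.

```python
import re
from typing import Optional
from urllib.parse import urlencode

# Assumed layout: /<category>/<keywords>-<pageID>.htm
STATIC_URL = re.compile(
    r"^/(?P<category>[a-z0-9-]+)/(?P<keywords>[a-z0-9-]+)-(?P<page_id>\d+)\.htm$"
)

def to_script_url(path: str, script: str = "/page.php") -> Optional[str]:
    """Map a static-looking path to the script URL that renders it.

    Returns None when the path does not follow the assumed layout.
    Only the page ID is passed on; the keywords are for readers and
    search engines, not for the script.
    """
    match = STATIC_URL.match(path)
    if match is None:
        return None
    return script + "?" + urlencode({"p": match.group("page_id")})

print(to_script_url("/nutrition/vitamins-minerals-milk-4711.htm"))
# /page.php?p=4711
```

The mapping has to happen inside the server, whether through a rewrite rule or an include in the stored file, so that the visitor and the crawler only ever see the static URL.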





Author: Sebastian
Last Update: Monday, June 20, 2005


Copyright © 2004, 2005 by Smart IT Consulting · Reprinting except quotes along with a link to this site is prohibited