Breaking down the anatomy of the Google Sitemaps submission process leads to a timetable and answers a lot of related questions.
How long does it take to get indexed by Google?

Creating the Google Sitemap

It makes sense to choose the XML format and to develop a suitable procedure that automates sitemap updates whenever pages change. It is absolutely necessary to double-check the sitemap's contents to avoid unintended submissions, e.g. different URLs pointing to the same page, like http://www.example.com/ (valid URL) and http://www.example.com/index.html (invalid URL, because it duplicates the first).

Tips: If you're keen on image and video search traffic, add your images and movie clips to the sitemap. If you provide alternative formats, for example RSS feeds, add those URLs to the sitemap too. Don't add printer-friendly HTML pages or pages optimized for different browsers; those should be non-indexable, that is, they should have a robots META tag with a "NOINDEX,FOLLOW" value in the head section.

Validating the Google Sitemap

Open the sitemap in your browser; most browsers detect coding errors. Click on view source to check for proper UTF-8 encoding and entity escaping: for example, dynamic URLs must not contain bare ampersands (&); replace those with the entity &amp;. If the contents look fine, check the XML structure with an XML validator and correct any errors.

Submitting the Google Sitemap

Log in to Google and add the URL of your new sitemap to your account. When you follow the link in the 'Sitemaps' column of your account's Site Overview, your sitemap should be listed as "pending". If not, for example because the URL is invalid, correct the errors and resubmit.

Tips: Try to submit your XML sitemap to MSN Search too, and if you have a plain URL list or RSS feed, submit it to Yahoo! Search. Resubmit on changes, because neither search engine promises to revisit periodically. Although Google downloads accepted XML sitemaps periodically, resubmitting after content changes may be a good idea, at least for Web sites that aren't updated very frequently.
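The creation and validation steps above can be sketched in a few lines of Python. This is only an illustration, not the procedure the article's author used: the function names are made up, and the sitemaps.org 0.9 namespace is an assumption, since the article does not name a schema. The sketch folds /index.html URLs into their directory URL, drops the resulting duplicates, escapes ampersands, and then parses the result to confirm it is well-formed XML.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Assumption: the sitemaps.org 0.9 namespace; the article names no schema.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def canonicalize(url):
    """Map .../index.html to .../ so both forms count as one page."""
    suffix = "/index.html"
    if url.endswith(suffix):
        return url[: -len(suffix)] + "/"
    return url


def build_sitemap(urls):
    """Return sitemap XML with duplicate URLs dropped and '&' escaped."""
    seen = set()
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<urlset xmlns="%s">' % SITEMAP_NS]
    for url in urls:
        url = canonicalize(url)
        if url in seen:
            continue
        seen.add(url)
        # escape() turns & into &amp; (and < > into &lt; &gt;).
        lines.append("  <url><loc>%s</loc></url>" % escape(url))
    lines.append("</urlset>")
    return "\n".join(lines)


def listed_urls(xml_text):
    """Parse the sitemap; raises ET.ParseError if it is not well formed."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [loc.text for loc in root.iter("{%s}loc" % SITEMAP_NS)]


# Hypothetical input list, including the duplicate from the article's example.
sitemap = build_sitemap([
    "http://www.example.com/",
    "http://www.example.com/index.html",
    "http://www.example.com/page.php?id=1&lang=en",
])
print(sitemap)
print(listed_urls(sitemap))
```

A well-formedness check like this does not replace a schema-aware XML validator; it only catches the coding errors a browser would also report.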
Waiting for the submission receipt

While you're waiting for Google's first sitemap check, you should verify your account. Just click the verify link on the overview page and follow the instructions. Ensure that your Web server can store case-sensitive file names and returns a 404 error code for requests of unknown resources. If you're a Web designer, you can add your clients' sites to your account and to the site owner's account as well.

Waiting for Google's crawler Googlebot

Google's crawler schedule is pretty much ruled by PageRank™. That means if the average PageRank™ of your Web pages is very low, Googlebot visits only every once in a while, if ever, and doesn't crawl everything. If your overall PageRank™ is medium to high, Googlebot is the busiest visitor of your site.

Monitoring Google's crawling process

You have two instruments to track Google's crawling: your server logs (or a database-driven spider tracker) and Google's crawler statistics. Your server logs tell you which URIs Googlebot has fetched and what your server has delivered. To track down errors, you need the contents of your log files, because tracking software triggered by page views (= crawler fetches), e.g. via SSI or PHP includes, cannot log requests that weren't successful, e.g. requests for missing files, and it usually fails when it comes to images or movies. If you've verified your Google Sitemap, you get a random list of HTTP errors in Google's crawler stats.

Waiting for the results of Google's indexing process

If your popular and well-ranked Web site is crawled daily, you can expect Google's index to reflect updates within a few hours, and new pages should be visible within two days at the latest. Otherwise, wait a few weeks before you get nervous.

Monitoring the results with Google's search query engine

The query engine is the most visible part of a search engine.
It receives the submitted search query, tries to figure out what the heck a user is searching for, and delivers what it thinks are the most relevant results for the given search term. The query engine makes use of attributes stored by the indexer, for example keywords extracted from links pointing to a page, ordered word/phrase lists, assigned PageRank™, trust status, topic relevancy with regard to the search query's identified or guessed context, and so on. The query engine also performs a lot of filtering, e.g. omitting near duplicates and suppressing results because of penalties for cheating or because similar results are hosted on related servers, and it sorts the results by a lot of different criteria. Actually, it does way more neat things, and one can't say which parts of Google's ranking algorithms are run by the indexer, the query engine, or both.
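Coming back to the crawl-monitoring step: the server-log approach described earlier can be sketched as a small Python script. It is a minimal illustration under stated assumptions, not a finished tracker. It assumes the Apache "combined" log format, the sample log lines are invented, and the function name is hypothetical. It picks out requests whose user agent names Googlebot, counts the status codes your server delivered, and lists the URIs that failed.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format; adjust the pattern for other servers.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<uri>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)

GOOGLEBOT_UA = ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                "+http://www.google.com/bot.html)")


def googlebot_fetches(lines):
    """Yield (uri, status) for each request whose user agent names Googlebot."""
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and "Googlebot" in match.group("agent"):
            yield match.group("uri"), int(match.group("status"))


# Hypothetical sample lines, made up for illustration only.
sample_log = [
    '192.0.2.10 - - [05/Dec/2005:10:00:00 +0000] "GET / HTTP/1.1" '
    '200 5120 "-" "%s"' % GOOGLEBOT_UA,
    '192.0.2.10 - - [05/Dec/2005:10:00:05 +0000] "GET /missing.html HTTP/1.1" '
    '404 312 "-" "%s"' % GOOGLEBOT_UA,
    '192.0.2.77 - - [05/Dec/2005:10:01:00 +0000] "GET /about.html HTTP/1.1" '
    '200 2048 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
    '192.0.2.10 - - [05/Dec/2005:10:02:00 +0000] "GET /images/logo.gif HTTP/1.1" '
    '200 900 "-" "%s"' % GOOGLEBOT_UA,
]

fetches = list(googlebot_fetches(sample_log))
status_counts = Counter(status for _, status in fetches)
failed_uris = [uri for uri, status in fetches if status >= 400]
print(status_counts)
print(failed_uris)
```

Because the script reads the raw log, it also sees failed requests and image fetches, which page-view-triggered trackers miss. Keep in mind that the user agent string can be spoofed, so matching on it alone does not prove a request came from Google.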
To work out the timetable for your Web site, decide honestly whether your site meets the prerequisites for each step, then take the appropriate processing time estimated above for each phase, and add up all phases.

Monday, December 05, 2005
Copyright © 2004, 2005 by Smart IT Consulting · Reprinting except quotes along with a link to this site is prohibited