A guide to clever linking for geeks and savvy programmers

This tutorial explains hyperlinks to Web developers. After defining the class Link and its implementations, natural link and artificial link, it dissects the anatomy of a link with various syntax examples. The primary focus is a description of the power of links, and how to get the most out of natural linking. Clever and prominent linking makes a Web site user friendly. Highly sophisticated linkage extends the value of natural links with regard to search engine placements. Many aspects of internal and outgoing linkage are covered, including the amplification of internal linking by external inbound links, and tips on building an authority status with outbound links.


Index

Defining Links, Natural Linking and Artificial Linkage

The definition of Link (hyperlink, Web link) and its most important implementations as Natural Link and Artificial Link.

The Components of a Link [HTML Element: A]

A commented link compendium explaining proper linking on syntax examples. Each attribute of the A element is described along with usage tips and lists of valid values.

The Components of a Link [HTML Element: LINK]

A syntax compendium of the LINK element, used in HEAD to define relationships, assign stylesheets, enhance navigation etc.

Web Site Structuring

Discussion of poor and geeky structures which confuse the user, and introduction of universal nodes and topical connectors, which solve a lot of weaknesses when it comes to topical interlinking of related pages.

A Universal Node's Anchors and their Link Attributes

About the universal node's primary URI, anchor text and tooltip, and tips on identifying and using alternate anchors, titles, descriptions etc. in various inbound and outbound links.

Linking is All About Popularity and Authority

Learn more about the background of natural linkage. Linking is all about traffic, so try to build stable traffic streams by creating outstanding authority sites which will become popular by word of mouth. The search engines will intuitively follow their users' votes.

Optimizing Web Site Navigation

Laying out navigational links to lead users straight to the content they're searching for allows some fair search engine optimizing techniques.

Search Engine Friendly Click Tracking

Counting outgoing traffic per link works fine without a redirect script, which causes all kinds of trouble. Here is a ready-to-use solution.



Defining Links, Natural Linking and Artificial Linkage


Think of a link as a class. To follow this article we need a universal definition of Link and its two most important implementations. We'll take care of technically driven sub-classes like the LINK and A elements later on.



Ask any Web developer 'What is a link?'. Most probably the answer is something like 'A link is a connection between hypertext pages', followed by a detailed explanation of link attributes and their values. This answer may be technically correct, but it is incredibly incomplete, because the human usage and understanding of real life links isn't explained. Developing and deploying links requires parallel, context sensitive thinking:

For your visitor a link is a path to more information/content of his interest, and a call for action. For search engines a link is a vote for another page, and a path to more spider food.



To apply natural linking, keep the surfer's interest in mind. Act as a webmaster without commercial goals who tries to give the very best to the visitor. On a per link basis decide whether the destination adds value to the source (sentence, paragraph, page, or site). If in doubt, don't link. Users and search engines do honor natural links.



If the linked page (target, destination) is not relevant (to the source), the average surfer will go away, hopefully hitting the back button, but more likely lost forever. Search engine optimizers make use of artificial links to increase a page's link popularity, and to improve a page's search engine placement for a particular term used in the artificial link's anchor text. Search engine crawlers, designed to find valuable content for their users, may ignore or penalize an artificial link.



The Components of a Link [HTML Element: A]


The most important components of a link are described below. For a complete syntax compendium of the A and LINK elements please refer to the W3C on HTML and XHTML.


<a

href = "URI" | href = "#fragment-identifier" | href = "#ID" | href = "#"

id = "anchor-name" name = "anchor-name"

target = "_self | _parent | _top | _blank | frame/window-name"

title = "tooltip"

class = "CSS reference"

style = "CSS property/value pair(s)"

rel = "type of forward relationship"

rev = "type of reverse relationship"

hreflang = "language code"

type = "MIME content type"

charset = "character set"

onclick = "JavaScript code;"
onmousedown = "JavaScript code;"
onmouseup = "JavaScript code;"
onmouseover = "JavaScript code;"
onmouseout = "JavaScript code;"

>anchor text | image element</a>


General notes on link syntax:

  • Element and attribute names (even in HTML 4.01 or lower) must be in lower case for XHTML compatibility (XML is case sensitive).
  • Attribute values must always be quoted, even numeric values. Use either double quotes (standard) or single quotes (sometimes nice to enhance code readability, but not that handy in combination with SQL).
  • Attribute minimization is forbidden. Put name="name", not name.
  • Start tag and end tag are required (A element: <a ...>...</a>) or the start tag must be closed (LINK element: <link ... /> Note: always put a space before '/>').
  • In attributes with pre-defined value sets (e.g. _blank, _parent, _self, _top) the values are case sensitive and defined in lower case (XHTML compatibility).
  • HTML entities (&amp; ...) are case sensitive. Hex references to characters are case sensitive too and defined in lower case (&#xnn; instead of &#Xnn;) (XHTML compatibility). You must not use a raw '&' character in literals, URIs etc. If you're dealing with text pulled from databases or with dynamic URIs, replace each '&' by '&amp;'.

The attribute descriptions below should cover everything one needs to code a hyperlink. However, it's a good idea to follow the W3C links to learn more about the standardized anatomy of links.
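
The quoting and escaping rules from the general notes can be sketched in JavaScript. This is only a minimal sketch: the helper names escapeAttribute and buildAnchor are hypothetical, and it assumes that attribute values and anchor text arrive as plain, not yet escaped text.

```javascript
// Hypothetical helper: escape characters that must not appear raw
// inside a double-quoted attribute value or in text content.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;") // must run first, so the entities added below stay intact
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: build an A element with lowercase attribute
// names, quoted values, and both a start tag and an end tag.
function buildAnchor(attributes, anchorText) {
  const pairs = Object.keys(attributes).map(
    (name) => `${name.toLowerCase()}="${escapeAttribute(attributes[name])}"`
  );
  return `<a ${pairs.join(" ")}>${escapeAttribute(anchorText)}</a>`;
}
```

Calling buildAnchor({ HREF: "http://www.mydomain.com/buy-now/basket.php?sku=20&cust=10" }, "Buy Widget Now!") yields a link whose href contains '&amp;' instead of the raw '&'.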




href = "URI"

URI should always be a fully qualified location, that is scheme [+ user [+ password]] + host [+ port] + path [+ query string] [+ fragment identifier]. Although you can omit everything before 'path' when linking within a site, you should not do it, for various reasons. Users save your pages to their local disk, where all relative links become invalid. Scrapers capturing your content for duplication may overlook some absolute URIs and donate inbound links involuntarily. With absolute URIs you don't need to convert links in content you make available in RSS feeds. Web robots, especially search engine spiders, can get confused by relative links, but process absolute links dependably. Examples:


<a href="http://www.mydomain.com/">My Company</a>

<a href="http://www.mydomain.com/widgets/">Widgets Index</a>

<a href="http://www.mydomain.com/widgets/tutorial-on-widgets.html#chapter7">Widgets Tutorial - Chapter #7</a>

<a href="https://www.mydomain.com/buy-now/basket.php?sku=20&amp;cust=10">Buy Widget Now!</a>

<a href="http://www.otherdomain.com/widgets/">My Friend's Widgets</a>
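
Relative links in existing content can be converted to fully qualified URIs mechanically. The following is a minimal sketch relying on the standard URL class available in Web browsers and Node.js. The function name toAbsolute is hypothetical, and the page URI passed in is assumed to be the location of the document that contains the link.

```javascript
// Resolve a (possibly relative) href against the URI of the page
// that contains the link. Absolute hrefs are returned unchanged.
function toAbsolute(href, pageUri) {
  return new URL(href, pageUri).href;
}

// Example: links found on the shopping basket page.
const pageUri = "http://www.mydomain.com/buy-now/basket.php";
const absoluteLinks = ["../widgets/", "/", "http://www.otherdomain.com/widgets/"].map(
  (href) => toAbsolute(href, pageUri)
);
```

Here absoluteLinks holds the widgets index and the home page of www.mydomain.com as fully qualified URIs, followed by the untouched external link.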


A few rules of thumb on URIs:

  • Do not omit the trailing slash when you link to the default document (for example index.html) of a directory. Although you don't see a difference in your Web browser, your Web server must redirect a request for http://www.mydomain.com/widgets to http://www.mydomain.com/widgets/ (usually HTTP response 301 or 302, depending on the server). Avoidable redirects are a bad thing for various reasons.
  • Always make use of short but self-explanatory names for directories and pages/scripts. Separate keywords in phrases by hyphens, not underscores or even spaces: /keyword-phrase/tutorial-keyword-phrase.htm is fine, /kwdphr/tkp.htm is poor, /keyword phrase/tutorial on keyword phrase.htm is invalid and results in all kinds of trouble. Bookmarking users and search engines honor good naming conventions, and if users drop your links in forums etc., you even gain a raised keyword relevancy in terms of search engine rankings.
  • URIs in general are case-sensitive. There may be URIs, or parts of URIs, where case doesn't matter (for example domain or machine names), but identifying these may not be easy. Web developers should always consider that URIs are case-sensitive to be on the safe side. To avoid any confusion, use lowercase only (UNIX conventions): page.htm, Page.htm and PAGE.HTM are three different documents!
  • Keep query strings short. That means shorten variable names and try to make use of short integer values. Do not use 'ID' as part of composite variable names (product=20 instead of productID=20). Replace spaces in alphanumeric values by '+'. Omit useless variables, try to stick with no more than two or three variable/value pairs.
  • Each entity should have two unique keys, represented by an integer and a UUID. In public areas of your site use the integers in query strings, because search engine crawlers hate ugly URIs. In forms and protected areas use the UUIDs, because bored users and hackers tend to play with values and you don't want them updating data they shouldn't have access to.
  • Avoid tracking footprints in URIs, thus use session cookies instead of query string parameters to pass session IDs and the like. If for some weird reason you can't avoid session IDs, omit them if the user agent is a Web robot. More information on search engine friendly URIs and query strings.

This article uses the term URI as defined here (see also RFC 1630). Note that URIs include URLs as defined in RFC 1738 and RFC 1808. To develop well formatted links on Web pages, you don't need to study these specifications. Just read URI (Universal Resource Identifier) as URL (Uniform Resource Locator), and follow the simple rules outlined above.
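
Two of the rules of thumb above, hyphenated lowercase names and URIs without session IDs, can be sketched as small helpers. Both function names are hypothetical, and the query string parameter name 'sid' in the usage note is an assumption made for illustration only.

```javascript
// Turn a keyword phrase into a lowercase, hyphen-separated path segment.
function toPathSegment(phrase) {
  return phrase
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // spaces, underscores, punctuation become one hyphen
    .replace(/^-+|-+$/g, "");    // no leading or trailing hyphens
}

// Remove a session ID parameter from a fully qualified URI,
// for example before serving a link to a Web robot.
function dropSessionId(uri, parameterName) {
  const url = new URL(uri);
  url.searchParams.delete(parameterName);
  return url.href;
}
```

For example, toPathSegment("Tutorial on Keyword_Phrase") returns "tutorial-on-keyword-phrase", and dropSessionId("http://www.mydomain.com/shop.php?product=20&sid=abc123", "sid") returns the URI with only product=20 left in the query string.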




href = "#fragment-identifier"

Fragment identifiers must match the pattern [A-Za-z][A-Za-z0-9:_.-]*. They are case sensitive and must be unique within the scope of a document. Although the name attribute disappears after HTML 4.01, it should be used for backward compatibility: always put id="widget" and name="widget". Don't make use of the outdated syntax <a name="fragment-identifier"></a> to define anchors. It is formally deprecated in XHTML 1.0 and will be removed in subsequent versions of XHTML. Instead link to an element with an id attribute.
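
The pattern and the uniqueness rule can be checked with a few lines of JavaScript. This is a minimal sketch; the names isValidFragmentId and findDuplicateIds are hypothetical, and the list of IDs is assumed to be collected from the document by other means.

```javascript
// The pattern quoted above, anchored to the whole string.
const FRAGMENT_PATTERN = /^[A-Za-z][A-Za-z0-9:_.-]*$/;

function isValidFragmentId(id) {
  return FRAGMENT_PATTERN.test(id);
}

// Return every ID that occurs more than once. The comparison is
// case sensitive, so "Index" and "index" are different identifiers.
function findDuplicateIds(ids) {
  const seen = new Set();
  const duplicates = new Set();
  for (const id of ids) {
    if (seen.has(id)) duplicates.add(id);
    seen.add(id);
  }
  return [...duplicates];
}
```

With these helpers, isValidFragmentId("chapter7") is true, isValidFragmentId("7chapter") is false, and findDuplicateIds(["index", "chapter7", "index"]) reports "index".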




href = "#ID"

Within the page link to elements with an id attribute like <h1 id="widget" name="widget">. From other pages link with the URI plus #ID.

On page links:


...
<h2 id="index" name="index">Index</h2>
...
<a href="#chapter7">Chapter #7</a>
<a href="#chapter8">Chapter #8</a>
...
<h3 id="chapter7" name="chapter7">Chapter #7</h3>
...
<a href="#index">Index</a>
<h3 id="chapter8" name="chapter8">Chapter #8</h3>
...


Off page links:


<a href="http://www.domain.com/directory/page.html#chapter7">Chapter #7</a>
<a href="http://www.domain.com/scripts/page.php?article=20&amp;page=10#chapter8">Chapter #8</a>




href = "#"

The '#' value equals an empty location and defaults to the topmost position in the current document. You can use an empty link to have an 'anchor text' string look like a link, but not behave like a link. For example you can use the title attribute to display a tooltip on mouseover, and onclick and/or other events to execute JavaScript code. Examples of empty links on this page: hreflang, type, charset. On mouseover the window's status bar displays a message instead of the link's href value. You must disable the default onClick behavior to prevent the page from scrolling to the top when a user clicks on an empty link. Syntax example:


<a href="#" title="tooltip text" onmouseover="window.status='Empty link, read the tooltip'; return true;" onmouseout="window.status=''; return true;" onclick="return false;">anchor text</a>




id = "anchor-name" + name = "anchor-name"

As expressed above, id and name share the same name space as well as naming conventions, and should be used in conjunction. If you make use of both id and name, the values must be identical. The values must be unique within the scope of the document.

Assigning IDs to links allows neat effects. For example you can change links with JavaScript code: assigning a new value to a 'help' link's href attribute depending on value changes of radio buttons or combo boxes can enhance the usability of forms (note: search engine spiders will see only the default value!).




target = "target-name"

target controls in which frame or window the linked document is loaded. Hint: frames are evil, use CSS instead.

The following target names are reserved and have special meanings:
_blank The user agent (Web browser) should load the designated document in a new, unnamed window.
_self The user agent should load the document in the same window/frame as the element that refers to this target. That's the default if you omit target and there is no BASE target attribute defined.
_parent The user agent should load the document into the immediate FRAMESET parent of the current frame. This value is equivalent to _self if the current frame has no parent. Since _top has issues with some browsers, better use _parent to break out of foreign frames.
_top The user agent should load the document into the full, original window (thus canceling all other frames). This value is equivalent to _self if the current frame has no parent.
_new is a myth and definitely NOT a valid target name. Most Web browsers will interpret _new as user defined window name, but with the leading underscore _new is an invalid window name.
_blur refers to (with the leading '_' invalid but executable) JavaScript syntax and has nothing to do with HTML or XHTML, hence it's not a valid target name. For nasty tricks see blur(), onFocus, onBlur, window.open('stealth-console.htm','_blur'), window.focus() and alike at Sun's client-side JavaScript reference.

To target a named window, use any clean ASCII string beginning with an alphabetic character [a-zA-Z] as window name (you really should use lowercase, although uppercase characters are allowed). The browser will open a new named window, if there is no window with that name assigned, or open the document in a previously opened named window where the window name matches. Named windows are handy for help messages etc. in conjunction with absolute screen positioning.
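
The rules for target names can be sketched as a small validator. The function name isValidTarget is hypothetical, and reading 'clean ASCII string' as letters, digits, hyphens and underscores is an assumption of this sketch, not a rule taken from the specification.

```javascript
// The four reserved target names; they are case sensitive.
const RESERVED_TARGETS = new Set(["_blank", "_self", "_parent", "_top"]);

// Assumption: a "clean" window name starts with a letter and continues
// with letters, digits, hyphens or underscores.
const WINDOW_NAME_PATTERN = /^[A-Za-z][A-Za-z0-9_-]*$/;

function isValidTarget(target) {
  return RESERVED_TARGETS.has(target) || WINDOW_NAME_PATTERN.test(target);
}
```

Under these assumptions isValidTarget("_blank") and isValidTarget("help") are true, while isValidTarget("_new") and isValidTarget("_blur") are false.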




title = "tooltip"

Web browsers show the value of the link's title attribute as a tooltip when a user moves the mouse over the link without clicking it. You should make use of tooltips whenever you can, because it's good style to tell users where you are sending them. User friendly Web sites give a tooltip to every link, and by the way to other HTML elements like headings, row and column labels, footnotes etc. too. Nearly every HTML element you can use in the BODY section supports the title attribute.

Since the tooltip gets displayed for a few seconds only, don't overload it. Preview your pages and check whether a visitor can read the text before it disappears. For better readability you can insert line breaks: a new line (LF or CR+LF) in the HTML code works, <br /> doesn't:


...<a href="URI" title="Tooltip Title
Very short and to the point description of the linked document.
Additional information in a longer uninterrupted line of text.">anchor text</a>...


If you pull the tooltip's text from a database, you should strip all HTML tags out, as a precaution. HTML tags accidentally left in tooltips look plain weird.
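
Stripping tags from database text before it goes into a title attribute could look like the sketch below. The function name toTooltip is hypothetical, the tag removal is deliberately naive (a regular expression, not an HTML parser), and the input is assumed to contain no character entities yet.

```javascript
// Prepare plain or lightly marked-up text for use in title="...".
function toTooltip(text) {
  return text
    .replace(/<[^>]*>/g, "") // naive removal of anything that looks like a tag
    .replace(/&/g, "&amp;")  // escape '&' first, then the other special characters
    .replace(/</g, "&lt;")
    .replace(/"/g, "&quot;")
    .trim();
}
```

New lines in the input are left untouched, so they still produce line breaks in the tooltip as described above.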

Although there is no evidence we're aware of, search engines may use the content of the title attribute in their ranking algorithms. Hence go ahead and put in decent keywords, but don't stuff the tooltips with keyword phrases. If the tooltip has impact on rankings, most probably its importance is less than surrounding body text.




class = "Reference to a CSS class"

You'll have different sub-classes of links on each page, thus you should encapsulate their layout and behavior. For example a link in a menu bar is different from a link within the body text. Define CSS classes for each link class:


a.menu        {font-weight: normal; font-size: 10pt; color: navy; }
a.menu:hover {font-weight: bolder; font-size: 12pt; color: red; }
a.body        { }
a.body:hover {font-weight: bold;                    color: blue; }


Then in your menu bar code links as follows:


<a class="menu" href="URI">anchor text</a>


On mouseover the link's anchor text will become bigger, bolder, and red.
Within the body text put:


<a class="body" href="URI">anchor text</a>


On mouseover the link's anchor text will become bold and blue.




style = "CSS property/value pair(s)"

Everything you can do with the style attribute can better be done with a CSS class referenced by the link's class attribute. Using style on the element level has several disadvantages. For example converters like html2pdf and even some user agents can't handle CSS attributes in HTML elements. 'Copy and modify' is forbidden by rule, a developer breaking this rule loses all brownie points. Once you put the first style attribute in an HTML element, you'll keep doing it, and chances are you will copy and modify. However, here is a syntax example:


<a href="URI" style="text-decoration: none; color: black;">anchor text</a>


If the page's text color is black, 'anchor text' becomes a hidden link. Not underlined and displayed in the same color as other body text, a user won't spot the link before s/he does a mouseover. Although search engines don't penalize this kind of 'hidden links', you should not deceive your visitors, so don't scatter invisible links within text content. Use this technique in menu bars and other areas where the link is obvious and underlining screws the aesthetic sense of your layout.




rel = "Type of forward relationship"

This attribute describes the relationship from the current document to the linked document specified by the href attribute. The value of rel must be a space-separated list of link types. Link types are case-insensitive, "External nofollow" has the same meaning as "External NOFOLLOW". Please refer to the LINK element section for a list of a document's META link types, in the following we'll discuss link specific relationships.
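
Because rel holds a space-separated, case-insensitive list, code that inspects it should normalize the value first. This is a minimal sketch; the function names parseRel and hasLinkType are hypothetical.

```javascript
// Split a rel (or rev) value into lowercase link types.
function parseRel(rel) {
  return rel
    .trim()
    .split(/\s+/)
    .filter(Boolean) // an empty attribute value yields an empty list
    .map((type) => type.toLowerCase());
}

// Case-insensitive test for a single link type.
function hasLinkType(rel, type) {
  return parseRel(rel).includes(type.toLowerCase());
}
```

Both "External nofollow" and "External NOFOLLOW" are parsed to the same two link types, 'external' and 'nofollow'.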

rel="nofollow" rel's 'nofollow' value was introduced by Google in January 2005 to fight comment spam. Search engines don't count nofollow-links for ranking purposes. A nofollow-link is not a negative vote, it just means that the source cannot vouch for the target. Use the 'nofollow' value where users can insert links, for example in blog comments, guestbooks and the like. For detailed information on rel="nofollow" links read our tutorial on steering and supporting search engine crawling.

rel="tag" rel's 'tag' value was introduced by Technorati and is defined here. By adding rel="tag" to a hyperlink, a page indicates that the destination of that hyperlink is an author-designated "tag" (or keyword/subject) of the current page. Note that a tag may just refer to a major portion of the current page (i.e. a blog post). By placing these links on a page


<a href="http://en.wikipedia.org/wiki/seo" rel="tag">seo</a>
<a href="http://technorati.com/tag/search+engine+optimization" rel="tag">Search Engine Optimization</a>


the author indicates that the page (or some portion of the page) has the tags "seo" and "search engine optimization". The linked page should exist, and it is the linked page, rather than the link text that defines the tag. That is the last segment (everything after the last slash in the path) of the URI defines the tag's value, not the anchor text.
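
Deriving the tag value from the last path segment can be sketched with the standard URL class. The function name tagFromUri is hypothetical, and treating '+' as an encoded space is an assumption based on the second example above.

```javascript
// Extract the tag value from a rel="tag" link: everything after the
// last slash in the path, with '+' and percent-escapes decoded.
function tagFromUri(uri) {
  const path = new URL(uri).pathname;
  const lastSegment = path.slice(path.lastIndexOf("/") + 1);
  return decodeURIComponent(lastSegment.replace(/\+/g, " "));
}
```

For the two example links this returns "seo" and "search engine optimization", regardless of the anchor text.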

Tags belong to the blogosphere, social bookmarking services etc., they are rarely used on old fashioned Web pages. Most bloggers tag their posts, because lots of users subscribe to Web feeds (RSS, Atom, XML) delivering search results on particular tags of their interest. Here is an example of a typical bottom line of a blog post:


Tags: ()



rel="value1 [value2 . . . valuen]" Webmasters make use of several non-standardized values in rel, for example 'advertising' to label affiliate links (in conjunction with rev="sponsor"), 'charity' to label off-topic links leading to non-profit organizations like UgandaCAN, 'external' to label links leaving a network etc.

Basically you can put everything you find useful into rel, as long as the usage of a particular value doesn't conflict with a standardized microformat. For example bloggers qualify human relationships in blogrolls using XFN values like 'sweetheart date met' or 'met acquaintance'.




rev = "Type of reverse relationship"

This attribute is used to describe the relationship of a reverse link from the anchor specified by the href attribute to the current document. The value of this attribute is a space-separated list of link types. For example, if you link to a sponsor, you could put rev="Sponsor" and rel="Propaganda".

rev="vote-for | vote-abstain | vote-against" Indexing and tracking applications like search engine ranking algorithms treat all links as endorsements, or expressions of support. This is a problem, as there is a need to link to documents one disagrees with as well, to discuss why. The votelinks microformat is used to qualify a link, in case of 'vote-against' possibly in conjunction with rel="nofollow".

At the time of writing, Yahoo! could make (experimental) use of votelinks and XFN values for community based searching and the like. Probably Google does its own research too, but there is no evidence that votelinks found its way into the PageRank™ formula. As long as the whole votelinks thing lives in the blogosphere alone, it's unlikely that major search engines make use of it with regard to ranking of Web pages. However, it may be a good idea to start using it in other areas too.




Intrinsic event handlers

The A element's valid events are onfocus, onblur, onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onkeypress, onkeydown, and onkeyup (all events are linked to JavaScript syntax descriptions, but can be used with other scripting languages as well). In the A element lowercase is obligatory for attribute names, so you must code onclick="...;" instead of the camelCased onClick="...;" notation used in JavaScript manuals!

When you make creative use of event handlers, bear in mind that not all visitors surf with JavaScript enabled. If the functionality is essential, you need a server-side solution. If you're just producing neat convenience stuff, don't worry. Just ensure nothing really important relies on JavaScript code.

For example you can track outgoing traffic using the onClick event handler. This is a search engine friendly way to track exit clicks, because the commonly used method of linking to a redirect script brings all kinds of trouble. Or you can hide ugly affiliate links:


<a href="http://adtracking.your-sponsor.com/landing-page.php?affid=...countless-cryptic-parameters" title="yada yada yada" onmouseover="window.status='http://www.sponsor.com/'; return true;" onmouseout="window.status=''; return true;">call for action</a>


When a visitor moves the mouse over this link, the browser window's status bar displays a pretty clean URI, not the ugly value of the href attribute.




Anchor Text

The linked text, usually designated as anchor text (or anchortext), should describe the linked resource briefly and to the point.

The anchor text should contain the most distinct keyword(s) applicable to the target document. Ensure the visitor gets an idea of the linked page's content even if s/he reads only the highlighted anchor text. If the link is a call for action, link complete sentences. That's very important, because most surfers are scanning nowadays, that is they often don't read the words before and after the text link. However, always try to define the link's context with the surrounding text. A well formatted text link looks like:


"Hierarchical site maps are a must for every Web site. Especially on large sites some users rely on this navigation element, and search engine crawlers make use of sitemaps to crawl even pages deeply buried in the link hierarchy. To make use of site maps download the free SEO tool Simple Sitemaps now and generate your HTML site map in no time. Simple Sitemaps even adds a RSS site feed and a Google XML sitemap to your Web site!"


... <a href="http://www.smart-it-consulting.com/article.htm?node=154" title="Get THREE Free Site Maps for Your Web Site!
Simple SiteMaps generates HTML + XML + RSS Sitemaps!"><b>download</b> the free SEO tool <strong>Simple Sitemaps</strong> <b>now</b></a> ...


When it comes to search engine rankings, the anchor text and the surrounding text as well have great impact on both the source page's authority status and the target page's keyword relevancy. Always make use of the title attribute to describe the link's target.




Image Element

If you link images, do not omit the image element's required alt and deprecated border attributes. In the alt attribute goes a descriptive text, telling the visitor something about both the image and the link's target as well. Do not stuff keywords in the alt attribute! border="0" suppresses the ugly rectangle indicating the link. If that is covered by your style sheet, omit border, or use it for reasons of backward compatibility only.


<a class="imagelink" href="URI" title="descriptive text"><img id="image-id" class="linkedimage" src="URI" alt="descriptive text" border="0" height="integer" width="integer" /></a>


Because image links have disadvantages when it comes to search engine optimization, you should provide an additional text link for each image link.



The Components of a Link [HTML Element: LINK]


The LINK element can be used in the HEAD section only. It defines document relationships (META data) and has no content. For example LINK assigns style sheets to a document, defines the position of the current document in a series, provides information about printable or translated versions of the current document, refers to RSS/ATOM feeds etc.

The LINK syntax overview below omits some attributes, for a complete syntax compendium please refer to the W3C and the A element section of this tutorial.


Since LINK has no content, you must not use an end tag. Close LINK with a space followed by a slash: ' />'. LINK may be empty, it has no required attributes. For more detailed information please read the 'general notes on link syntax'.




href specifies the fully qualified location of a related Web resource, for example an external style sheet or a feed. Examples:


<link href="http://www.domain.com/css/standard.css" rel="stylesheet" type="text/css" />

<link href="http://www.domain.com/css/800x600.css" title="Small Monitor" rel="stylesheet" type="text/css" />

<link href="http://www.domain.com/css/black-on-white.css" title="Black on white background" rel="alternate stylesheet" type="text/css" />

<link rel="alternate" type="application/rss+xml" title="Document title equals title of the RSS feed" href="http://www.domain.com/feeds/document-name.rss" />

<link rel="alternate" type="application/atom+xml" title="Document title equals title of the ATOM feed" href="http://www.domain.com/feeds/document-name.xml" />


For the persistent (default, 1st example) style sheet do not use the title attribute. For the preferred (2nd example) style sheet use title and put 'stylesheet' in rel. For alternate (3rd example) style sheets use title and put 'alternate stylesheet' in rel.

The 4th example tells user agents and search engines where the page's RSS feed is located. Some sites use type="application/rss+xml" for both RSS and ATOM feeds if they provide only one of the two formats. It seems to work due to feed autodiscovery applied at most places, but we don't recommend it. It is better to specify the format actually used in type (5th example). At the time of writing, neither 'application/rss+xml' nor 'application/atom+xml' is a registered media type, thus it's not guaranteed that all user agents will handle feed-URIs as expected.
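
Generating the feed LINK element with the type that matches the feed format can be sketched as follows. The function name feedLink and the format keys 'rss' and 'atom' are hypothetical; the two type values are the ones used in the examples above.

```javascript
// Content types as used in the 4th and 5th example.
const FEED_TYPES = {
  rss: "application/rss+xml",
  atom: "application/atom+xml",
};

// Escape text for use inside a double-quoted attribute value.
function escapeValue(value) {
  return String(value).replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Build a LINK element announcing a feed in the given format.
function feedLink(format, title, href) {
  const type = FEED_TYPES[format];
  if (!type) throw new Error(`Unknown feed format: ${format}`);
  return `<link rel="alternate" type="${type}" title="${escapeValue(title)}" href="${escapeValue(href)}" />`;
}
```

Refusing unknown formats, rather than falling back to the RSS type, keeps the type attribute honest about what the feed URI actually delivers.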




The title attribute in LINK usually gets used in conjunction with other attributes like href and rel to form an expression in valid syntax.

  • title is used together with the link type 'Alternate' or 'Alternate Stylesheet' to name user-selectable alternate style sheets.
  • In a link of the type 'Bookmark' to a key entry point within an extended document, the title attribute is used to label the bookmark.
  • In a link of the type 'Alternate' pointing to a translated version of the current document, or a version made up for another destination media (devices like 'screen', 'tty', 'tv', 'handheld', 'print' etc.), or a version delivered in another content type ('text/html', 'video/mpeg' etc.), title is used to put a human readable representation of the contents of hreflang|lang, media or type. Examples: 'This article in French', 'WAP version' or 'Watch the video'.




The rel attribute describes the relationship from the current document to the resource specified by the href attribute. The value of this attribute is a space-separated list of link types. User agents, search engines, etc. may interpret these link types in a variety of ways. For example, user agents may provide access to linked documents through a navigation bar, or print a series of related documents in their logical order. Search engines may link to alternate formats on their SERPs, or crawl and index resources which are not linked using an A element in the current document's BODY section. Here is a list of all link types defined by the W3C:

rel="Alternate" designates substitute versions for the document in which the link occurs. When used together with the lang attribute, it implies a translated version of the document. When used together with the type attribute, it implies a version delivered in another content format, for example RSS or ATOM. When used together with the media attribute, it implies a version designed for a different medium (devices like 'screen', 'tty', 'tv', 'handheld', 'print' etc.).

rel="Stylesheet"
Refers to an external style sheet. See the examples above and the manual on external style sheets for details. 'Stylesheet' is used together with the link type 'Alternate' for user-selectable alternate style sheets.

rel="Appendix" refers to a document serving as an appendix in a collection of documents.

rel="Bookmark" refers to a bookmark. A bookmark is a link to a key entry point within an extended document. The title attribute should be used to label the bookmark. Note that several bookmarks may be defined in each document.

rel="Chapter" refers to a document serving as a chapter in a collection of documents.

rel="Contents" refers to a document serving as a table of contents. Some user agents also support the synonym 'ToC' (from 'Table of Contents').

rel="Copyright" refers to a copyright statement for the current document.

rel="Glossary" refers to a document providing a glossary of terms that pertain to the current document.

rel="Help" refers to a document offering help (more information, links to other sources of information, etc.)

rel="Index" refers to a document providing an index for the current document.

rel="Next" refers to the next document in a linear sequence of documents. User agents may choose to preload the 'next' document, to reduce the perceived load time.

rel="Prev" refers to the previous document in an ordered series of documents. Some user agents also support the synonym 'Previous'.

rel="Section" refers to a document serving as a section in a collection of documents.

rel="Start" refers to the first document in a collection of documents. This link type tells search engines which document is considered by the author to be the starting point of the collection.

rel="Subsection" refers to a document serving as a subsection in a collection of documents.
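As a rough illustration of how a user agent or spider might read these link types, here is a minimal Python sketch using only the standard library. It relies on two facts from the description above: rel holds a space-separated list, and link types are compared case-insensitively. The HEAD fragment and all URIs are made-up placeholders, and the class name LinkRelParser is hypothetical.

```python
from html.parser import HTMLParser


class LinkRelParser(HTMLParser):
    """Collects (href, link types) pairs from LINK elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel holds a space-separated list; link types are case-insensitive.
        link_types = (attributes.get("rel") or "").lower().split()
        self.links.append((attributes.get("href"), link_types))


# Hypothetical HEAD fragment; all URIs are placeholders.
sample_head = """
<link rel="Alternate Stylesheet" href="/css/print.css" media="print" title="Print">
<link rel="Next" href="/guide/page3.html">
<link rel="Prev" href="/guide/page1.html">
"""

parser = LinkRelParser()
parser.feed(sample_head)

# Group the hrefs by link type, the way a navigation bar might need them.
by_type = {}
for href, link_types in parser.links:
    for link_type in link_types:
        by_type.setdefault(link_type, []).append(href)
```

After running the sketch, by_type maps each lowercased link type to the URIs declared for it, so the 'next' and 'prev' entries could feed a browser's navigation bar.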




The rel and rev attributes play complementary roles and may be specified simultaneously. Although usually rev is not used in the LINK element, you could express the hierarchical position of a node as follows:


<link href="parent-node-URI" rev="child" />
<link href="child-node1-URI" rel="child" />
<link href="child-node2-URI" rel="child" />
...


This construct would allow a spider to build a recursively ordered picture of a complex, hierarchically organized Web site. Because the nodes are interlinked both topically and hierarchically, such a structure is not that easy to derive by following clickable links alone.
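To make the spider's side of this a bit more tangible, here is a small Python sketch. It assumes the crawler has already collected, for every page, the hrefs of its rel="child" LINK elements. Note that 'child' is the link type used in the example above, not one of the W3C types listed earlier. The URIs and the function name build_tree are hypothetical.

```python
# Hypothetical crawl result: page URI -> hrefs of its rel="child" LINK elements.
declared_children = {
    "/": ["/widgets/", "/gadgets/"],
    "/widgets/": ["/widgets/green.html", "/widgets/red.html"],
    "/gadgets/": [],
}


def build_tree(uri, declared, seen=None):
    """Recursively nest each page under the parent that declares it."""
    seen = set() if seen is None else seen
    if uri in seen:  # guard against cyclic declarations
        return {}
    seen.add(uri)
    return {child: build_tree(child, declared, seen)
            for child in declared.get(uri, [])}


site_tree = {"/": build_tree("/", declared_children)}
```

The result is a nested dictionary mirroring the declared hierarchy, obtained without following a single clickable link in the page bodies.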



Web Site Structuring


Before you create pages, navigation elements etc., and before you develop the very first link, go to the whiteboard and think about Web site structures. Developers tend to think geeky: they create logical, non-redundant structures. Bear in mind that Web site visitors usually have no idea of abstraction, encapsulation, inheritance and normalization.


BAD: spreading related information over too many pages
GOOD: providing all necessary information on the landing page

They go to a search engine and look for a particular piece of information or a product, for example green widgets. On your landing page they expect to find the whole story. To get general information on (colored) widgets, they most likely won't trail your path to the root from a subsidiary page that expresses a widget variant's attributes and behavior but omits a description of widgets. Instead they hit the back button and click on the next search result. You must provide all related information on the landing page for the search term 'green widgets', that is, you must include a short paragraph of text on widgets and colored widgets as well. Sounds pretty easy, but it's a dilemma, because fulfilling the visitors' expectations leads to content duplication, which is a bad thing with regard to SEO (solvable with clever linkage and snippet management - read on, also check out this method).


The point I want to bring home is that there should be a few linkage layers extending the (technically conditioned) hierarchical structure of a Web site with topic-oriented navigation elements and themed interlinking within the body text as well. This requirement often results in a geeky structure (black lines), obscured by different topical connections (green and red lines):

BAD: technically driven hierarchical structure with topical layers

Looking at the picture it becomes obvious that technically driven structures cannot fit the confused user's needs, nor will they ever be suitable to represent any site's theme (especially not on multi-themed sites). Why? Because the first hierarchy level is superfluous and prohibits theming. There is no valid reason to put all news under news, all tutorials under tutorials, all products under shop etc. etc. - it's done in so many places because of laziness, geeky thinking, and villainous planning. Even 'unexpected growth' of formerly tiny sites to large resources is no excuse for a poor or shortsighted design. By the way, geeky structures confuse the hell out of search engines and dilute a site's authority status.

Is there a way to do all themed interlinking by changing the established Web site structure? No. At least not yet. But we can improve the old-fashioned hierarchical structure with universal nodes and topical connectivity. Neither concept is that new, but many webmasters (especially technically unschooled former publishers or salesmen) coming from the static side of Web development can't adopt them completely. At work they fall back a paradigm or two, and develop dynamic sites emulating restrictions which belong to the stone age of the Internet. Another weakness is the usage of various 3rd party scripts for particular tasks (shop, publishing, directory, web log, link exchanges ...), because those will seldom interact as seamlessly as necessary.


Universal Nodes

Define node: A node is an information unit containing information about a single topic, linked to at most one parent node and to zero or more child nodes. Nodes without children are called leaf nodes. The topmost node of a hierarchy is the root node; it has no parent node. All nodes in a structure under a root node, including the root node itself, form a tree.


Define Universal Node: A universal node is sort of an instance of the superclass nodes. It has an unextended set of attributes (ID, parentID, name, title, description, sort order, anchors, type). That is, a universal node has no more attributes than necessary to build and represent a tree structure. The attribute anchors contains a set of link attributes. The attribute type is a pointer to an instance of a node-subclass.

Unlike well-known nodes such as folders in a directory tree, universal nodes don't act as a shelter containing objects whose attributes and behaviour aren't (necessarily) transparent to the container. Universal nodes transform the contained object into a node, that is, they enfold the objects and make them a part of the tree structure. Looking at a hierarchy of universal nodes, one can see only nodes of different appearances, but not a single container loaded with cargo.
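The two definitions above translate almost directly into a data structure. The following Python sketch is one possible reading, not a reference implementation: the field names mirror the listed attributes, while the helper functions, the use of None for 'no parent', and the sample tree are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class UniversalNode:
    node_id: int
    parent_id: Optional[int]          # None marks the root node (assumption)
    name: str
    title: str
    description: str
    sort_order: int = 0
    anchors: list = field(default_factory=list)  # sets of link attributes
    node_type: Optional[object] = None  # pointer to a node-subclass instance


def children_of(nodes, node_id):
    """Child nodes of node_id, in sort order."""
    kids = [n for n in nodes.values() if n.parent_id == node_id]
    return sorted(kids, key=lambda n: n.sort_order)


def is_leaf(nodes, node_id):
    """A leaf node is a node without children."""
    return not children_of(nodes, node_id)


def path_to_root(nodes, node_id):
    """Names from the root down to node_id."""
    path = []
    while node_id is not None:
        node = nodes[node_id]
        path.append(node.name)
        node_id = node.parent_id
    return path[::-1]


# Hypothetical sample tree, keyed by node ID.
tree = {
    1: UniversalNode(1, None, "home", "Home", "Start page"),
    2: UniversalNode(2, 1, "widgets", "Widgets", "All about widgets", 1),
    3: UniversalNode(3, 2, "green", "Green Widgets", "The green variant", 1),
}
```

Nothing in the node itself knows what kind of content it carries; that lives behind node_type, which is what keeps the tree logic this small.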


Getting technical:
Logically, a universal node is an instance of a subclass, which inherits the attributes and methods necessary to build and represent a tree structure from its parent class.

Physically, a universal node is a conglomerate of at least three objects (node, anchors, and type specific data and behavior) for two major reasons.

First, it makes sense to change a node's type dynamically, for example a one-pager can mutate to a complex structured document during its life cycle. That can't be done by changing data in a given structure, it must be done by replacing the outdated object by the new object, where the new object is an instance of a different subclass, while other values of the universal node's instance variables are not necessarily subject to change. Since those changes occur frequently in production systems running 24/7/365, changing the type pointer is the best maintenance procedure.

Second, the majority of accesses to persistent node data is caused by navigational requests. To create menus, listings on (category) index pages and the like, only a few values are needed. Dealing with the overhead of complex database tables to fetch 3-4 values per tuple is a no-no on systems where high performance is almost everything.

Summarizing, with regard to system transparency and usability, the best solution is the segmentation of complex subclasses into separate entities. Publishers, editors and webmasters handling SQL queries will not understand atomized database structures created by object2table wrappers or persistence frameworks.
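A sketch of that physical segmentation, using Python's built-in sqlite3 module, might look as follows. The table and column names are invented for this example, and a real system would of course carry more attributes. The point is only that navigation reads touch the narrow nodes table, and that a type change is a pointer update rather than a restructuring of the node.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE nodes (
        node_id     INTEGER PRIMARY KEY,
        parent_id   INTEGER,
        anchor_text TEXT,
        uri         TEXT,
        sort_order  INTEGER,
        node_type   TEXT,    -- which type specific table holds the content
        type_ref    INTEGER  -- key of the row in that table
    );
    CREATE TABLE one_pagers (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE documents  (id INTEGER PRIMARY KEY, toc TEXT);
    INSERT INTO nodes VALUES (1, NULL, 'Home', '/', 0, 'one_pager', 1);
    INSERT INTO nodes VALUES (2, 1, 'Widgets', '/widgets/', 1, 'one_pager', 2);
    INSERT INTO one_pagers VALUES (1, 'Welcome'), (2, 'All about widgets');
""")


def menu_items(parent_id):
    """Navigation needs only a few values, all from the narrow nodes table."""
    rows = db.execute(
        "SELECT anchor_text, uri FROM nodes "
        "WHERE parent_id = ? ORDER BY sort_order",
        (parent_id,),
    )
    return rows.fetchall()


def change_type(node_id, node_type, type_ref):
    """Swap the type pointer; all other node attributes stay untouched."""
    with db:
        db.execute(
            "UPDATE nodes SET node_type = ?, type_ref = ? WHERE node_id = ?",
            (node_type, type_ref, node_id),
        )


# The one-pager about widgets grows into a structured document.
db.execute("INSERT INTO documents VALUES (1, 'Chapter 1; Chapter 2')")
change_type(2, "document", 1)
```

The menu query returns the same rows before and after the type change, which is exactly why the navigation never notices that a page has mutated.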


In a Web site's structure a node-subclass can represent a single Web page (primitive node), or a set of homogeneous data belonging together (complex node), which are visualized on multiple Web pages of the same micro structure (and which may be indexed in a node specific table of contents, and which may have complex attributes like glossaries, appendixes etc.). Alias nodes are 'empty' primitive nodes, pointing to a primitive or complex node, thereby overwriting one or more attributes of the wrapped node, for example title and description.

Examples of primitive nodes: home page, category index page (one-pager), articles (one-pager), feed previews. Important (outgoing) links can be handled as primitive nodes, regardless whether they are described on a separate page, or just appear on their parent node's (index) page.

Examples of complex nodes: image or movie galleries, multi-page category indexes, structured product information, structured documents like huge articles, news pages aggregating related feeds in variable orders.

Example of a tree structure built with universal nodes:

GOOD: topic oriented structure built with universal nodes

In a tree of universal nodes every node type can be parent or child of any other node type. A universal node (and a node's structure with all children, grandchildren...) can be moved under another parent node by simply changing the parentID attribute. Because the dynamic navigation 'knows' only nodeIDs, all links stay valid, the navigation elements simply regard the new parent(s) throughout all levels.
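A minimal Python sketch of such a move, assuming nodes are stored by ID with a parent pointer. The sample data and the function names are hypothetical. The check against moving a node below itself is an addition the text does not mention, but which any real implementation will want.

```python
# node_id -> [parent_id, anchor text]; hypothetical sample data.
nodes = {
    1: [None, "Home"],
    2: [1, "Tutorials"],
    3: [1, "Widgets"],
    4: [2, "Painting widgets green"],
    5: [4, "Choosing a brush"],
}


def breadcrumbs(node_id):
    """Walk up the parent pointers and return the path from the root."""
    trail = []
    while node_id is not None:
        parent_id, text = nodes[node_id]
        trail.append(text)
        node_id = parent_id
    return " > ".join(reversed(trail))


def move(node_id, new_parent_id):
    """Re-parent a node; its whole subtree follows automatically."""
    ancestor = new_parent_id
    while ancestor is not None:
        if ancestor == node_id:
            raise ValueError("cannot move a node below itself")
        ancestor = nodes[ancestor][0]
    nodes[node_id][0] = new_parent_id


before = breadcrumbs(5)
move(4, 3)  # the tutorial moves from 'Tutorials' to the topical 'Widgets' hub
after = breadcrumbs(5)
```

Node 5 was never touched, yet its bread crumbs now run through 'Widgets', because every navigation element is computed from node IDs at output time.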


WHITE: page area controlled by the universal node's generic part
GRAY: page area controlled by the node's type specific part

This picture shows a simplified example of a partitioned Web page created by a universal node. All peripheral areas and CI components (on white background) are controlled by the generic functionality of the universal node. The navigation links represent a part of the tree from the node's perspective, probably extended by or intermeshed with topical connectors. In the header the path to the root is put as linked bread crumbs, using the data gathered in the left-hand navigation element's recursive loop. Other topical connectors and type specific or generic site wide links, for example a 'most popular' or 'featured products' box, are placed below the navigation. The block of further leading options like 'related products' or 'related articles' above the footer's generic first level navigation bar is created from topical connectors of the type 'related links', which may or may not appear in navigation elements too.

The page's body area (on gray background) is controlled by the universal node's type specific functionality. That includes inner navigation and advertising within the content.

Deeper segmentation and other layouts are possible and easy to accomplish, for example the type specific functionality can control segments of the peripheral area (e.g. inner navigation blocks, targeted ads or even the page title etc.).


Topical Connectivity

Define Topical Connector: A topical connector is a themed link from one universal node to another universal node, where the nodes are not connected in the tree structure. A topical link has (possibly complex) properties and behaviour, for example purpose, qualification, conditional in-/exclusions, and node-like attributes like title, description and anchors, which allow the intermixed use of universal nodes and topical connectors in structured Web page elements like menus.


Topical connectors extend the tree to a webbed structure, which is a (possibly on multiple layers) intermeshed network of nodes. In comparison with the geeky structure, the topical layers consist of way fewer non-hierarchical connections, because most (if not all) topical affiliations are (implemented as structural links between universal nodes) part of the tree structure.

Alias nodes are a comprehensive form of topical connectors.
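One way to picture a topical connector in code is as a small record that borrows its link attributes from the destination node unless it overrides them. The Python sketch below assumes exactly that. The class layout, the 'purpose' values and the sample data are illustrative and not prescribed by the definition above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    node_id: int
    parent_id: Optional[int]
    title: str
    description: str


@dataclass
class TopicalConnector:
    source_id: int
    target_id: int
    purpose: str                       # e.g. "related links"
    title: Optional[str] = None        # overrides the target's title if set
    description: Optional[str] = None  # overrides the target's description


def link_attributes(connector, nodes):
    """Use the connector's own values, else fall back to the target node."""
    target = nodes[connector.target_id]
    return {
        "title": connector.title or target.title,
        "description": connector.description or target.description,
    }


def related_links(node_id, connectors, nodes, purpose="related links"):
    """All themed links leaving node_id for one purpose."""
    return [link_attributes(c, nodes) for c in connectors
            if c.source_id == node_id and c.purpose == purpose]


# Hypothetical data: two nodes in different branches of the tree.
nodes = {
    10: Node(10, 1, "Green Widgets", "The green variant"),
    20: Node(20, 2, "Painting Tutorial", "How to paint widgets"),
}
connectors = [TopicalConnector(10, 20, "related links", title="Paint it yourself")]
```

Because the connector yields the same title and description keys a node would, menus can mix structural and topical links without caring which is which.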


Getting technical:
Example of a node specific flat helper used to output a menu bar

Although it's feasible to handle complex structures as described above dynamically, it may be a good idea to implement denormalized flat helpers for performance reasons. That is, editorial users (authors, publishers, webmasters ...) work with the normalized structures, while Web site visitors get pages created from merged and sorted node sequences extracted from (parts of) the tree plus all related topical connectors. Such a flat helper is a persistent sequence of nodeIDs plus level numbers in a particular order, which gets joined with the nodes table on nodeID to fetch attributes like anchor text, tooltip, and URI.

Different outputs need different denormalized extracts, which may overlap with other flat helpers. On structure changes a node's update trigger should regenerate all flat helpers belonging to the changed nodeID, or containing the changed nodeID or its parent, as long as the flat helpers are needed to create important pages.

To generate hierarchical site maps, article or white paper indexes and the like (on the fly), it makes sense to have some large flat helpers, including flat helpers covering the whole site's structure. The batch jobs that build them should be started by cron in low-traffic maintenance windows, because heavy batch processes can slow down a server, and the freshness of site maps etc. is not that crucial.
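To illustrate the flat helper idea, the following Python sketch flattens a small tree into a sequence of (nodeID, level) pairs in menu order and then 'joins' that sequence with the node attributes at output time. It keeps everything in memory for brevity. In production the sequence would be persisted as described above, and all names and sample data here are assumptions.

```python
# node_id: (parent_id, sort_order, anchor_text, uri); hypothetical sample data.
nodes = {
    1: (None, 0, "Home", "/"),
    2: (1, 2, "Gadgets", "/gadgets/"),
    3: (1, 1, "Widgets", "/widgets/"),
    4: (3, 1, "Green widgets", "/widgets/green.html"),
}


def build_flat_helper(root_id):
    """Return [(node_id, level), ...] in depth-first menu order."""
    children = {}
    for node_id, (parent_id, sort_order, _, _) in nodes.items():
        children.setdefault(parent_id, []).append((sort_order, node_id))

    flat = []

    def walk(node_id, level):
        flat.append((node_id, level))
        for _, child_id in sorted(children.get(node_id, [])):
            walk(child_id, level + 1)

    walk(root_id, 0)
    return flat


def render_menu(flat):
    """Join the flat sequence with the node attributes, one line per item."""
    lines = []
    for node_id, level in flat:
        _, _, anchor_text, uri = nodes[node_id]
        lines.append("  " * level + f"{anchor_text} ({uri})")
    return lines


flat_helper = build_flat_helper(1)  # regenerate whenever the structure changes
menu = render_menu(flat_helper)
```

The expensive recursion happens once per structure change in build_flat_helper, while render_menu, which runs on every page view, is a plain loop over a prepared sequence.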


Amplification of Internal Linkage by Inbound Links

From a search engine's point of view, a well structured, user friendly (and content rich) Web site is more likely considered an authority on its overall theme. The organization of related content under one universal node per topic is considered a network of topical hubs. Each internal hub becomes an authority on its topic, and all hubs together form an authority on an overall theme.

Larger internal hubs attract way more deep unidirectional inbound links than a couple of pages on the same topic which are (structurally) scattered all over a site. In addition, each internal hub should be the subject of its own link building campaign. Provided the internal interlinking is cleverly themed, the power of incoming links gets perfectly distributed to all related nodes (content pages). That's not so much an issue of link popularity or PageRank™, but of topical PageRank™. Good inbound links (pay attention to anchor text variations!) contribute to the hub's (=index node's) topic (~keyword) relevancy and authority. The hub has only few upward connections (navigational links to upper levels), so the inbound link's authority contribution gets funneled to the right content pages. All on-the-page optimization being equal, pages loaded with authority do rank higher on the SERPs.

A side note on the hub vs. authority myth:
Not every page loaded with more links than textual content is a hub. Not every page loaded with useful content and few outgoing links becomes an authority when many related pages link to it. Consider a link and its surrounding text content. The assignment of a (weighted) hub or authority status to a Web page by search engines is not mutually exclusive across the board. A Web page classified as hub can be considered a topic authority and vice versa. An authority hub is a Web page providing on-topic content and valuable connections to related content, regardless of where the content is hosted. A Web page's cluster (that is a group of closely related documents providing content and related connections) can form an authority hub as well.



A Universal Node's Anchors and their Link Attributes


A universal node contains one or many anchors suitable for in- and outgoing natural linkage. The previously introduced anchors attribute of the universal node is (technically) located in different entities.

Each node has a primary anchor, stored in the nodes table. This anchor is the page representing a primitive node, or an index page or table of contents page of a complex node (called the node's primary [landing] page). In order to form a valid link, the universal node provides at least these attributes:

Static URL is an absolute URI, whereby scheme and server/domain are not necessarily persistent. That is, internal URIs get stored as relative URIs, then the method fetching the tuple from the database adds scheme and server for portability reasons (development systems -> production systems and vice versa). Load balancing may require on the fly server names too (www1 ... wwwn). Scheme may be given in a separate attribute (http vs. https ...).
- XOR -
Dynamic URL is a transient attribute populated at runtime dependent on the node's type. The method fetching the tuple from the database knows which script is assigned to which node type and how it has to compose the absolute URL. If you make use of keywords in file names, ensure that the file name cannot get changed after the first page view. You don't want bookmarkers, search engine users and spiders reading your 404 page, do you?

Page Title is the content of the primary page's title tag and the default anchor text. It appears in the page header in a h1 tag. On index pages (internal hubs) and in other long internal links it makes sense to use the page title as anchor text.

Short Anchor Text is a keyword or catchword (phrase), which is preferably unique within the scope of your site, and short enough to fit in the limited space of menu items.

Long Description is the rich text snippet appearing on index pages and alike.

Short Description is an ODP-like one-sentence summary of the page's content, used in 'related stuff' link blocks and the like.

Tooltip is the title text displayed on mouseover. If Long/Short Description is used as tooltip (component) too, you must strip out the HTML tags. If tooltip is a persistent attribute, don't allow tags.
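Two of the attributes above involve a little processing when the tuple is fetched: the URL must be completed with scheme and server, and a description reused as tooltip must lose its HTML tags. The Python sketch below shows one way to do both with the standard library. The script names, the 'node' query parameter and the server names are invented for illustration.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical mapping of node types to the scripts that render them.
SCRIPT_BY_TYPE = {"article": "article.php", "gallery": "gallery.php"}


def static_url(relative_uri, server, scheme="http"):
    """Add scheme and server to a relative URI stored in the nodes table."""
    return urljoin(f"{scheme}://{server}/", relative_uri)


def dynamic_url(node_id, node_type, server, scheme="http"):
    """Compose the URL at runtime from the script assigned to the node type."""
    return static_url(f"{SCRIPT_BY_TYPE[node_type]}?node={node_id}",
                      server, scheme)


class _TextOnly(HTMLParser):
    """Keeps character data and drops every tag."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def tooltip_from(description_html):
    """Strip tags from a description and escape it for a title attribute."""
    parser = _TextOnly()
    parser.feed(description_html)
    parser.close()
    text = " ".join("".join(parser.parts).split())  # collapse whitespace
    return escape(text, quote=True)
```

Since the server name is a parameter, the same stored URI works on a development box, on production, and behind a load balancer handing out www1 ... wwwn.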

With the few components above you can create a lot of different links, some examples:


<h3><a href="URL" title="Tooltip">Page Title</a></h3><p>Long Description</p> [Index pages]

<h5><a href="URL" title="Tooltip">Page Title</a></h5><p>Short Description</p> ['Related stuff' link boxes]

<li><a href="URL" title="Page Title
Tooltip">Short Anchor Text</a></li> [Menus, most browsers recognize and render the NewLine between 'Page Title' and 'Tooltip']

<li><strong><a href="URL" title="Page Title">Short Anchor Text</a></strong>&middot;<a href="URL" title="Tooltip" style="text-decoration:none;">Short Description</a></li> [Right-hand, news-like navigation; the description is linked without visible link styling. (Don't make use of inline style attributes, reference a CSS class instead!)]
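The first three link patterns above can be generated from one set of node attributes. Here is a hedged Python sketch of such a generator. The dictionary keys and function names are my own choice, and the Long Description is passed through unescaped on the assumption that it is trusted rich text.

```python
from html import escape


def _a(url, tooltip, text):
    """Build an A element with escaped attribute values and link text."""
    return f'<a href="{escape(url)}" title="{escape(tooltip)}">{escape(text)}</a>'


def index_entry(n):
    """Pattern for index pages; the long description is rich text."""
    link = _a(n["url"], n["tooltip"], n["page_title"])
    return f"<h3>{link}</h3><p>{n['long_description']}</p>"


def related_entry(n):
    """Pattern for 'related stuff' link boxes."""
    link = _a(n["url"], n["tooltip"], n["page_title"])
    return f"<h5>{link}</h5><p>{escape(n['short_description'])}</p>"


def menu_item(n):
    """Pattern for menus: page title and tooltip separated by a newline."""
    title = n["page_title"] + "\n" + n["tooltip"]
    return f"<li>{_a(n['url'], title, n['short_anchor_text'])}</li>"


# Hypothetical node attributes.
node = {
    "url": "/widgets/green.html",
    "page_title": "Green Widgets",
    "short_anchor_text": "Green",
    "tooltip": "Everything about green widgets",
    "short_description": "Facts & figures on green widgets.",
    "long_description": "<em>Green</em> widgets explained in depth.",
}
```

Generating the markup in one place also makes it cheap to change the link syntax site wide later on.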


You can use more or less the same design for topical connectors, which either receive their link attribute values from the destination node dynamically, or overwrite specific values like title, description, and tooltip.

Basically, the design patterns discussed so far should cover enough to create a site navigation and more. But wait, what about the haute école of internal and external linking?


Identifying (Alternate) Anchors

Many pages provide enough content to have alternate anchors in the body section, which should be used in (new) in- and outgoing links. Valuable interlinking with related internal and external resources helps the visitor gather more good content in less time, and it helps the Web site itself, because search engines honor legitimate interlinking too, which leads to an increase of incoming targeted traffic. The common problem is the lack of a suitable system to manage those alternate anchors. Let's see what we can automate, but first we need to discuss the basics and requirements.

Since anchor means more than starting- or endpoint of a hyperlink, and anchor is also defined as 'mechanical device that prevents a vessel from moving', we need a suitable definition of the anchor object to continue:


Define Anchor: An anchor is a fragment of a hypertext document serving as point of entry or point of exit, technically represented as one end of a hyperlink.


Besides the anchor's technical attributes an anchor has content and a context, both defining the anchor's topic.


Define Topical Anchor: A topical anchor is the whole picture of an anchor in its context, that is the topical anchor extends the anchor's technical properties and behavior with usage information. The topical anchor's extended attributes are theme tags, topic tags, catchwords, context, rating, realm. A topical anchor has one or more sets of anchor text, title text (tooltip) and description, but only one URI.


The implementation of topical anchors is a highly individual task, their usage and appearance depends on the site's theme and (commercial) goals. Thus I'll leave it at a few general comments on the topical anchor's attributes:    

Theme Tags designate the overall theme of a topical anchor's context, for example 'Linkage'.

Topic Tags designate the particular topic of a topical anchor's context, for example 'Linking to fragment identifiers'.

Catchwords are more detailed and more targeted keyword phrases expressing the topical anchor's content fragment and nearby content like surrounding text (subject, purpose), for example 'How to link to a link' or 'Scrolling to a particular table cell'. The catchwords attribute should not get abused by populating it with keyword lists.

Context is a descriptive text snippet featuring the topical anchor's primary information or message in its context. For example

If in Web development potential references to an external resource occur multiple times on one HTML page, encapsulating the point of exit is a search engine friendly way to expose an outbound link and strengthen its topic authority. Deployment of multiple optimized on page links with varying anchor text pointing to a named outbound link satisfies the visitor and helps maximize the authority status of the page with regard to search engine rankings, without decreasing the page's linking power.

describing a paragraph in an article on boosting a page's authority status with outbound links covers:

Context: User- and search engine friendly Web development    
Theme: Deployment of optimized hyperlinks
Topic: On page link pointing to a named outbound link
Subject: Multiple references to related external information on a HTML page
Purposes: Encapsulating points of exit. Minimizing the risk of diluting the ranking power of outgoing links.
Information [implicit]: On page links to a named outgoing link strengthen the page's topic authority and minimize the potential risk assumed in devaluation of other outbound links.

Assembling those text snippets is a lot of work, but worth it - think of non-textual content too and read on.

Rating is a subjective valuation of the usefulness of the topical anchor's page fragment, measured by quality and quantity of its information with regard to the topical anchor's labels (theme, topic, subject ...). In some environments it makes sound sense to handle user ratings too.

Realm tells where the topical anchor belongs. In case of 'nearby' or 'own' the destination page referenced by the topical anchor's URI resides within a network, or site. Other possible values would be 'related' and 'foreign' standing for friends and external resources.
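Put together, the attributes discussed above could be modelled as follows. This Python sketch is only one of many possible shapes: the numeric rating, the list types and the class names are assumptions, while the realm values and the rule 'one or more text sets, but only one URI' come from the definitions above.

```python
from dataclasses import dataclass, field

REALMS = {"own", "nearby", "related", "foreign"}


@dataclass
class LinkText:
    anchor_text: str
    title_text: str   # tooltip
    description: str


@dataclass
class TopicalAnchor:
    uri: str                      # exactly one URI per topical anchor
    texts: list                   # one or more LinkText sets
    theme_tags: list = field(default_factory=list)
    topic_tags: list = field(default_factory=list)
    catchwords: list = field(default_factory=list)
    context: str = ""
    rating: int = 0               # assumed scale, e.g. 0 (poor) to 5 (great)
    realm: str = "own"

    def __post_init__(self):
        if not self.texts:
            raise ValueError("a topical anchor needs at least one text set")
        if self.realm not in REALMS:
            raise ValueError(f"unknown realm: {self.realm}")

    def within_own_network(self):
        """True if the destination resides within the own site or network."""
        return self.realm in {"own", "nearby"}


# Hypothetical anchor using the example values from the text.
anchor = TopicalAnchor(
    uri="/linking.html#fragment-links",
    texts=[LinkText("How to link to a link",
                    "Linking to fragment identifiers",
                    "Explains links pointing to named anchors.")],
    theme_tags=["Linkage"],
    topic_tags=["Linking to fragment identifiers"],
    catchwords=["How to link to a link", "Scrolling to a particular table cell"],
    rating=4,
)
```

Keeping the text sets in a list makes anchor text variations a matter of appending another LinkText instead of duplicating the whole anchor.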

So much for the topical anchor's properties. Identifying topical anchors is a manual process like editorial indexing of content. In most cases this task will be part of a post-release business process, because it needs a more comprehensive view on the content's scope and environment than the author possesses (workflow example: [author/editor]:tentative release -> [editor]:topical preparation -> [publisher/webmaster]:review and final release). Alternate topical anchors are independent pointers, thus they can be created when the news is out. Scanning related content and inserting links in its body text is part of this task.

Obvious topical anchors are (an attribute of) universal nodes and topical connectors, and linked off site resources. Reusable content may have many topical anchors assigned. Especially complex nodes will contain fragments which are worth a topical anchor of their own, for example the chapters and sections of a large document, troubleshooting FAQ entries of manuals, paragraphs and headings of step by step guides and tutorials, blueprints, video sequences, image series, podcasts ...

While creating topical anchors, don't think too much in terms of keywords. Dealing with tags and topics without SEO and keywords in mind will result in better and more natural wording, because writing anchor text targeting a topic instead of a keyword phrase makes you invent natural keyword phrases, which are very likely to be used to search for your content. Semantic algorithms like word stemming and contextual indexing invented by the major search engines make assignments of artificial keywords (misspellings, obscure synonyms ...) obsolete.


Making Use of Topical Anchors

EXAMPLE: Inserting a link made up from a topical anchor into the current body text at the cursor position

Having a searchable database of topical anchors is a great tool to maintain proper internal linking, especially when the tool stores usage data in the background. Usage data means link direction plus source or destination, and other attributes as well, if a topical anchor is source and/or destination in an off site link trade. Also, it's a good idea to capture the anchor text for each placement. Storing anchor text variations with the topical anchor allows regular fine tuning of the tagging, it enhances search functionality and more.

On small sites the webmaster probably hosts this database in the brain. When it comes to large sites providing mostly static content, that's another story. It makes no sense to create alternate topical anchors for time limited content like news articles or blog posts, but it makes a lot of sense to create them for any kind of static content (fragments). Structured storage of topical anchors enables the content management system to reuse the data, it can provide a lot of functionality based on persistent anchors:

  • Inserting links into body text stored in a database (see picture above) allows the CMS to draw a picture of all linkage (link maps are a great design and SEO tool), it makes it easy to locate and change internal links if the referenced content changes, it allows general changes of link syntax, because the anchor elements are generated dynamically, etc. etc.
  • Accessing topical anchors besides the page contents will enhance the search functionality, because it allows delivery of more targeted results than a simple keyword oriented relevancy algorithm.
  • Based on topical anchors one can provide a page-independent, directory-like navigation element. Due to the fine granularity of topics (in comparison with pages or complex documents), users can navigate way faster to the content of their interest. As a side effect, these navigation pages add unique content to a site, and they provide a kind of shortcut for crawlers to pages buried deeply in a site's structure.
  • Storing internal and external resources as topical anchors enhances the management of link trades with related sites. Since all URIs are stored in database tables instead of within the body text, (reciprocal) link checking becomes an easy task.
  • The creation of (more) themed feeds based on stored queries will attract additional (recurring and very targeted) traffic.
  • Creating topical connectors and alias nodes based on topical anchors eliminates points of failure and makes structural link maintenance an easy task.
  • Proper normalization is attractive in general, because one can make multiple use of normalized data. Think of glossaries, easily achieved close-meshed internal interlinking, reduced costs of maintenance and operation...
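As a rough sketch of what 'structured storage' can mean here, the following Python example keeps topical anchors and their placements in two SQLite tables and answers two of the questions from the list above: which pages must be revisited if the referenced content changes, and which anchors match a topic. Table names, columns and sample rows are invented for this illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE topical_anchors (
        anchor_id INTEGER PRIMARY KEY,
        uri       TEXT,
        theme     TEXT,
        topic     TEXT
    );
    CREATE TABLE placements (
        anchor_id   INTEGER REFERENCES topical_anchors(anchor_id),
        source_uri  TEXT,  -- page whose body text contains the link
        anchor_text TEXT   -- wording used for this placement
    );
    INSERT INTO topical_anchors VALUES
        (1, '/linking.html#fragment-links', 'Linkage',
         'Linking to fragment identifiers'),
        (2, '/linking.html#rel', 'Linkage', 'Link types');
    INSERT INTO placements VALUES
        (1, '/faq.html', 'How to link to a link'),
        (1, '/blog/anchors.html', 'Scrolling to a particular table cell'),
        (2, '/faq.html', 'rel attribute values');
""")


def pages_linking_to(uri):
    """Placements to review when the content behind uri changes."""
    rows = db.execute(
        "SELECT p.source_uri, p.anchor_text FROM placements p "
        "JOIN topical_anchors a ON a.anchor_id = p.anchor_id "
        "WHERE a.uri = ? ORDER BY p.source_uri",
        (uri,),
    )
    return rows.fetchall()


def anchors_on_topic(term):
    """Simple topic search across all stored anchors."""
    rows = db.execute(
        "SELECT uri FROM topical_anchors WHERE topic LIKE ? ORDER BY uri",
        (f"%{term}%",),
    )
    return rows.fetchall()
```

The placements table doubles as the store for anchor text variations, so the wording actually used on each page is available for later fine tuning of the tags.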

All sites are different, thus an architecture making sense for one site may not work for another site, or the architecture must get altered and customized. I've tried to explain the principles of link normalization in general, that is my examples and design patterns should be taken as flexible elements meant to provide ideas for your design, not to declare anything as set in stone. As always: Every rule is meant to be broken. Knowing when to do so is one of the things that separates the general from the major.



Linking is All About Popularity and Authority


The World Wide Web is based on hyperlinks. Your tiny site providing a few thousand pages or more is just a molecule of the Web, so apply the principles of linking to your site as part of the Web. Don't buy the myths telling you that internal linkage is much different from environmental linkage. Based on your linkage your site interacts with the Web or not. Your internal links are more or less a continuation of 3rd party Web links. If you don't link out, nobody links in, it's that easy. To amplify your internal linkage, you need foreign inbound links. To make a page part of the theme/topic authority spread on the Web, you need to put in closely related outbound links.

A link is a transportation medium, its load is primarily a traffic stream, that is human visitors are landing on the destination page. To convert this traffic into sales, donations, attention or whatever your site's goal is, design each page as a landing page. Once the browser has rendered your page, you've five seconds or less to hold the visitor. The first heading and the first sentence(s) (or image(s) ...) make up the user's mind. If you can't catch the visitor with your content, you've one more try with your most prominent links. Look at the traffic stats of a popular Web site:


Number of visits: 7,492,987
Average visits duration: 449 seconds

Visits duration    Number of visits    Percent
0s-30s             5,846,110           78 %
30s-2mn            171,738             2.2 %
2mn-5mn            150,763             2 %
5mn-15mn           230,672             3 %
15mn-30mn          202,073             2.6 %
30mn-1h            383,915             5.1 %
1h+                496,288             6.6 %
Unknown            11,428              0.1 %

Lay out your prominent links in a way that the majority of visitors staying only a few seconds can view them, and expect that search engines interpret link prominence and placement. Link formatting and placement primarily belongs to Web design and marketing. However, link deployment is based on a site's (technical) architecture and it has a huge impact on its search engine optimization, thus 'talk traffic' in your team (for example don't accept fancy client-side links and other search engine unfriendly stuff like that).

EXAMPLE: The power of links depends on their placement on the page

The picture on the left indicates that the power of a link is (at least partly) dependent on its placement on the page. Please note that this is an overstressed example, you cannot apply it to every page. However, prominently placed links near the top of a page, and links embedded in the body text count more in search engine ranking algorithms. There are a few methods to code pages in a way that some content gets placed high in the HTML code, although the visitor's browser renders the content elements in a different order. This may or may not help with search engine rankings, but it comes with decreased performance and most probably issues with some 'intolerant' browsers. Better optimize your pages for loading speed and user friendliness, and avoid tricky stuff meant to outsmart search engines, which are usually way smarter than you can imagine.

Search engine ranking algorithms try to emulate a human user's behavior. To rank Web pages for their users, they simulate zillions of virtual trails through the Web trying to measure popularity and authority. In a search query's context, there are three major factors determining a Web page's placement on the SERP (simplified):

1. Relevancy of the renderable content. Relevancy is determined by keyword matches and more sophisticated technologies.

2. Authority on the topic which the engine has identified by analyzing the search term. Authority is determined by analyzing anchor text and topical relations of linked pages (and their neighborhood). The ranking factor authority is a bridge between relevancy and popularity, it takes both incoming and outgoing links into account.

3. Popularity. A page's popularity is determined by counting and weighting on-topic inbound links, and (to a low degree currently) bookmarks, tracked visits duration and clicks on the SERPs. The topical popularity as used in a search query's page ranking does not equal link popularity or PageRank™. LinkPop/PageRank™ stands for the total number of incoming links and their universal weighting based on the source page's score. When it comes to placements on the SERPs, related natural links (topical PageRank™) count more than the mixture of artificial and natural inbound links (LinkPop/PageRank™).

A Link's load: Traffic, authority, and popularity

Link and get linked for traffic management, user convenience, commercial purposes, you name it, but always bear in mind that your link carries more than traffic, its underlying cargo is authority and popularity. If you sell link spots, you should put 'nofollow' in the sold link's rel attribute, because Google decreases or even takes away your ability to pass reputation with your links, if you don't devalue your unrelated ads and other purchased links yourself.
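If links are generated by code anyway, devaluing sold links becomes a one-line decision. A minimal Python sketch, assuming each stored outbound link carries a flag telling whether the spot was sold. The flag and the function name are hypothetical.

```python
from html import escape


def render_outbound_link(url, anchor_text, sold=False):
    """Render an A element; sold link spots get rel="nofollow"."""
    rel = ' rel="nofollow"' if sold else ""
    return f'<a href="{escape(url)}"{rel}>{escape(anchor_text)}</a>'


editorial = render_outbound_link("http://example.org/", "Related resource")
advert = render_outbound_link("http://example.com/", "Sponsor", sold=True)
```

Editorial links keep passing reputation, while purchased spots are marked at the source instead of being patched by hand in the body text.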

You'll find a lot of 'content is king vs. only linkage counts' discussions at webmaster boards and other internet marketing resources. Probably 90% of everything you can read on this topic is utter nonsense, although there is some substance in both positions. Absolutely wrong is the assumption that content and linkage are mutually exclusive properties; concentrating on only one of them leads into dead ends beyond the first SERPs.

You can't go wrong following the 'properly linked content is king' SEO advice. Content alone is nothing; you need popular content and lots of it, cleverly interlinked within the site and with the rest of the world. Linkage alone is nothing, because what do you link to if you've got no content? Search engines aren't that cheatable, and only few users click on the ads out of desperation.


Properly Linked Great Content Survives on the SERPs

Here are a few more reasons [1] why properly linked outstanding content brings in more, and more targeted, search engine traffic - not to speak of all the valuable traffic streams created by word of mouth.

When many visitors bookmark pages, this is a sign of a great site. Although at the time of writing all major search engines providing toolbars disclaim the capturing of bookmarks, you can bet that they are well aware of Joe Surfer's bookmarking behavior. If it's really not done by capturing or spying out bookmarks (note that those search engines hold (pending) patents on this technology), it's done by toolbars counting the number and duration of recurring visits per URI.

Sensible and prominent on-topic internal interlinking gives visitors good options. The result of well placed and trustworthy options is an increase in page views, since visitors explore a site more completely. All page views by search engine users with toolbars are logged. The engines make use of statistical data harvested from toolbars and cookie based tracking from the SERPs to determine quality. Those signs of quality lead to a kind of trust bonus in the engine's index and have, or will have in the near future, an impact on rankings.

Providing the visitor with top notch external resources not only makes the page linking out a part of the topical authority hub or network spread all over the Web, it also brings visitors back. Surfers tend to remember places where they've got good recommendations. Next time they go out searching for something, most likely they click on a link leading to a previously visited site where they've found something useful, and chances are this site gets bookmarked during the second or third visit.

Inbound links directly boost search engine rankings. An often overlooked source of inbound links is social bookmarking services like delicious or blinklist. Because tagging is mostly precise, their links pages are on-topic, and as a bonus those pages often come with a high PageRank™ [2]. When many users bookmark a page, the traffic volume produced by page one listings at Delicious & Co. often exceeds the number of hits coming from all major search engines together.

Satisfied visitors blogging a commented link is a sign of quality too, and another inbound link. Don't underestimate the blogosphere; good links spread like wildfire. The same goes for users dropping links in forums, newsgroups and blog comments. Because URL dropping is very common in forums and usenet groups, having keyword rich URLs ensures that those links pass not only PageRank™ but also topic relevancy.

Well marketed content rich sites offer interactive components like forums or blogs. Surfers become loyal recurring visitors when they get a chance to contribute. It pleases their ego, it makes them nosy for follow-ups on their posts, and if they are treated with respect, they start to actively acquire new community members. As a side effect, the mostly unique content contributed by users attracts more search engine traffic on new but related search terms and topics.

Prominently linked site feeds which are easy to subscribe to, as well as themed feeds on particular topics, can function as 'pulling bookmarks'. Users get alerted on content changes, updates, news or whatever by their RSS reader, which makes it easy to click the link leading to the source. This works best with headlines and descriptive snippets. Full content feeds do get abused by scrapers and spammers, and they produce less Web site traffic because the reader gets the content remotely. Search engines are greedily spidering RSS feeds and they are working on technologies to count subscriptions like page views and bookmarks. By the way, promoted feeds get listed in all kinds of catalogues and directories; that's even more inbound links to the home page and the feed URIs.

Summary: Bookmarks and other tracks left by recurring visitors earn a ranking boost on the SERPs - only outstanding content along with a user friendly content presentation gets bookmarked. Before the search engine traffic floods in, ensure the title tags and description META tags of each page are somewhat sexy. Catchy titles have a high CTR. High ranking pages with gibberish titles and crappy snippets on the other hand are pretty much useless (text previews on the SERPs often get extracted from directory submissions and META tags, if they reflect the on page content).




[1] Thanks to EGOL for a nice short-list of often forgotten principles of Web site development, which I've used as kind of storyline in the next couple of paragraphs.

[2] Even if the bookmarking service doesn't allow search engines to index the links pages (like delicious), those links get spread via feeds to many places on the Web, and some of them insert those links server sided.




Optimizing Web Site Navigation


While outlining universal nodes and topical anchors, discussing link placement and supporting search engine crawls, I've dropped a few snippets of information on internal linking in a Web site's (outer) navigation. Now let's draw the whole picture by looking at the impact navigational links have on search engine placements.

Web site navigation obviously must be user friendly. User friendliness plus a few tweaks and shortcuts implemented for search engine spiders makes up a search engine friendly navigation. Laying out navigational links to lead users straight to the content they're searching for allows some fair search engine optimizing. What you should never do is tweak the linkage for the engines when this results in a loss of usability and visitor support.

Technically, outer navigation elements are a part of the page's template (see page partitioning and link placement). Search engines can distinguish templated page areas from the body's (unique) content. As a matter of fact, they weight text and links differently depending on the page area. How much power navigational links have with regard to ranking depends on the site's architecture. On sites where the outer navigation is very repetitive, that is the menus get duplicated over and over with very few page specific items, those navigational links are treated like artificial links and their power gets downgraded to next to zero. You will find this kind of flawed design at many eCommerce sites, where the static outer navigation (i.e. links to product lines and home page) is identical on most pages. The in-depth linkage is represented by the dynamically generated inner navigation, that is links near or within the page body.

Since it makes no sense to deal with impotent page areas, clever developers balance the linking power by placing as much in-depth navigation as possible in the outer navigation areas. The goal is to drill down the outer navigation to the last node (e.g. product), while restricting the inner navigation to within-the-node linkage (e.g. product sizes and colors). Sometimes it's even possible to integrate a complex node's internal navigation with the outer navigation. This approach enables powerful linking in the peripheral areas, because every node comes with a different menu, which means less repetition (link duplication).

Another advantage of node specific outer navigation is that it supports internal authority hubs. With less than a handful of links leading to upper levels, most of the linking power gets used to strengthen on-topic (navigational) links. Additionally, a node specific outer navigation, e.g. a menu on the left hand side, bread crumbs near the top and horizontal links at the bottom, develops enough linking power (mostly received from deep inbound links) to support the root and the main sections as well. Thus having a search engine unfriendly DHTML menu or Flash based navigation at the top or right side of the page doesn't do harm anymore; it may even help to establish topic authority hubs.
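To get a rough feeling for how repetitive a site's outer navigation is before restructuring it, one could compare the menus of two pages. The sketch below computes the share of links two menus have in common (Jaccard overlap). The metric, the function name and the sample URIs are assumptions chosen for illustration; this is a self-check for the developer and says nothing about how a search engine actually detects templated areas.

```javascript
// Share of links that two menus have in common: 1 means identical menus,
// 0 means no shared link at all.
function menuOverlap(menuA, menuB) {
  const a = new Set(menuA);
  const b = new Set(menuB);
  const shared = [...a].filter((href) => b.has(href)).length;
  const total = new Set([...a, ...b]).size;
  return total === 0 ? 0 : shared / total;
}

// Static outer navigation: the same menu on every page.
const staticMenu = ["/", "/shoes/", "/hats/", "/contact.htm"];
console.log(menuOverlap(staticMenu, staticMenu)); // 1

// Node specific outer navigation: only the link to the root is shared.
const bootsMenu = ["/", "/shoes/", "/shoes/boots/", "/shoes/sandals/"];
const capsMenu = ["/", "/hats/", "/hats/caps/", "/hats/beanies/"];
console.log(menuOverlap(bootsMenu, capsMenu).toFixed(2)); // "0.14"
```

A value near 1 across most page pairs hints at the repetitive design described above.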

To demonstrate the impact keywords placed in a page's templated area can have on rankings, here is an example of a node specific outer navigation where a search engine assigns a lot of weight to navigational anchor text. The search term is "Internet Google", which is by the way a totally useless #4 spot, because nobody searches for it. I've picked it because it pulls 68.5 million results at Google, "Internet" is not closely related to the on page content (the word "Internet" appears only once in a navigational link and the URL), and at least it looks like a popular search. Here is the SERP:

[Screenshot: Google's first SERP for 'Internet Google']

Look at the snippet and the screenshot of the page at the time of indexing. The sequence of keywords in the bread crumbs' anchor text makes (most of) the placement on the SERP.

[Screenshot: This page is about 'Google Sitemap Validation' and has not so much to do with 'Internet']

It works fine with a few other utterly useless keyword phrases taken from the page's bread crumbs too:

  • smart internet business google
  • it consulting internet google
  • smart internet google
  • consulting internet google

... but don't expect that it's that easy to achieve top rankings in competitive markets. However, keywords within the navigation can help to define themes and topics, so you should use your most important keywords in prominent navigational anchor text.

Bread crumbs are a great way to break down a site's theme to topics and sub-topics. This 'You are here' path to the root index page, placed near the top of each page, can act as an authority hub's sole connection to its upper hierarchy levels. It's not even necessary to repeat the complete path to the root in the menu on the left hand side.

By the way, consistent linking of the current node is neither lazy nor useless, because in complex nodes the current page is often different from the node's point of entry.
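A bread crumb trail like this is easy to generate from a node's position in the hierarchy. The following sketch links every level including the current node; the trail structure (`href`, `text`) and the sample paths are assumptions of this example, and the values are expected to be HTML-safe already.

```javascript
// Each trail entry is { href, text }; the last entry is the current node,
// which gets linked too (its entry point may differ from the current page).
function renderBreadcrumbs(trail) {
  return trail
    .map((node) => '<a href="' + node.href + '">' + node.text + "</a>")
    .join(" &gt; ");
}

const trail = [
  { href: "/", text: "Home" },
  { href: "/internet/", text: "Internet" },
  { href: "/internet/google/", text: "Google" },
];

console.log(renderBreadcrumbs(trail));
```

The anchor texts are the place for the keywords that define the theme and its sub-topics.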

Other important navigation elements are top level links, stored popular searches and horizontal views. The number of top level links, leading to the home page and main sections, must be kept as low as possible to avoid dilution of the topical authority built around the rich nodes in deeper levels. There is nothing to say against nicely formatted top level links which aren't spiderable, e.g. in Java based drop down menus, if they improve the surfing experience. From an SEO point of view they are (in most cases) pretty much useless.

Horizontal views are for example indexes of all image galleries, all tutorials related to a product line, or all articles related to a broader theme. These indexes may or may not reflect a part of the site's hierarchy, but mostly they are used as layers oriented by content type rather than by theme. Like site maps, horizontal views should not contain more than 100 links per page; 15-25 links plus descriptions and/or previews are a proven limit. The content linked on a site map page or on a horizontal index should be describable with a short catchword (phrase) in a menu item. If that doesn't work, the collection of links is probably useless.
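Keeping a horizontal view within such a limit is a matter of splitting the link collection into several index pages. Here is a minimal sketch; the function name, the gallery URIs and the choice of 25 links per page are assumptions for illustration.

```javascript
// Split a list of links into index pages of at most `perPage` links each.
function splitIntoPages(links, perPage) {
  if (!Number.isInteger(perPage) || perPage < 1) {
    throw new RangeError("perPage must be a positive integer");
  }
  const pages = [];
  for (let i = 0; i < links.length; i += perPage) {
    pages.push(links.slice(i, i + perPage));
  }
  return pages;
}

// 60 hypothetical gallery pages, 25 links per index page.
const links = Array.from({ length: 60 }, (_, i) => "/gallery/" + (i + 1) + ".htm");
const pages = splitIntoPages(links, 25);
console.log(pages.length);    // 3
console.log(pages[2].length); // 10
```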



Search Engine Friendly Click Tracking


For many site owners it is very important to know what kind of traffic, and how much of it, they send to which locations. Tracking internal links allows profiling of user behavior, which helps the webmaster to fine tune a site's navigation and look and feel. Tracking affiliate links is essential if upselling foreign products is the site's primary source of revenue. There are only so many ways to track a user's clicks on links, and most of them are sub-optimal with regard to search engine optimization.

Most content management systems (CMS), ad management systems, links list scripts etc. which offer tracking of outgoing traffic make use of redirects. Instead of using the target page's URI in the link's href attribute, each link points to a script, which counts the click and then redirects the user to the target page. This method of tracking outbound clicks has disadvantages:

  • Depending on the underlying technology, loading the target page in the user's browser can get delayed by up to a few seconds. That's not a big issue if the visitor surfs on cable or DSL, but with dial-up connections, which are still used by the majority of Internet users, the time to load is crucial. Users tend to hit the back button or close the window if a page loads slowly, because they expect the target site to be slow. Especially with affiliate links, redirecting outgoing traffic decreases a site's revenue.
  • Search engines can't follow all redirects with regard to rankings; they may not count the redirected link as a vote for the linked page. That means that many masked links do not increase the target page's link popularity or PageRank™. Most webmasters refuse link trade offers if the other site makes use of redirect scripts. Incoming links are the most important factor when it comes to search engine placements. A site without a reasonable amount of incoming links will not rank high on the SERPs.
  • Search engines consider redirecting fishy. There are lots of legitimate uses for redirects, but why take the risk of red-flagging? If you are not an experienced search engine optimizer, avoid redirects. Redirect moved pages to their new location and ensure your server responds with a '301' code, but never ever let your Web server do '302' redirects for any reason. Unfortunately '302' is the default response code, used if a script doesn't explicitly force a '301', which most scripts don't do.
  • Unskillful redirects can result in unintended page hijacking on search engine result pages. Under some circumstances search engines may index foreign content under your redirect-URIs. Not only is this unfair with regard to the other site, it can easily dilute your site's theming, and your search engine traffic becomes less targeted or even decreases.
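
For the one legitimate case named in the list, a moved page, the point is to state the '301' explicitly instead of relying on a default. The sketch below shows the idea for a server written in JavaScript (Node), which is an assumption of this example; the table of moved pages and the function name are hypothetical.

```javascript
// Hypothetical table of moved pages: old path -> new location.
const movedPages = {
  "/old-article.htm": "/articles/new-article.htm",
};

// Return status and headers for a moved page, or null if the page did not move.
function permanentRedirect(path, moved) {
  if (!Object.prototype.hasOwnProperty.call(moved, path)) {
    return null;
  }
  // Explicit 301 (moved permanently), never an implicit 302.
  return { status: 301, headers: { Location: moved[path] } };
}

// Possible usage with Node's built-in http module (not started here):
// const http = require("http");
// http.createServer((req, res) => {
//   const r = permanentRedirect(req.url, movedPages);
//   if (r) { res.writeHead(r.status, r.headers); res.end(); return; }
//   res.writeHead(404); res.end();
// }).listen(8080);

console.log(permanentRedirect("/old-article.htm", movedPages));
```
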

Below I explain a smart method of search engine friendly click tracking. It makes use of the onClick event handler and is suitable if you can live with results that are not 100% accurate, because a minority of Internet users surf with JavaScript disabled.

You may want to download the code examples discussed below, before you read on. Download and unzip 'track-clicks.zip' (3kb), then upload all files to an empty directory on your server. Do a CHMOD 666 on 'clicks.txt', then execute 'page.htm' with your browser. The sample program requires PHP 4.3.1 or later installed on your Web server.


In order to track clicks on (exit) links, you must include a JavaScript function in the HEAD section of each HTML page. Also, you must put at least one image on the page. The required image can be the page's background image, or an invisible 1x1 pixel gif-image if you have text-only pages.


...
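<script type="text/javascript">
// Requests the tracking script via an image object, so no redirect is needed.
function trackclick(url, aid, atxt) {
    if (document.images) {
        // Encode each value: the target URI may itself contain '?' or '&'.
        (new Image()).src = "trackclick.php?url=" + encodeURIComponent(url) +
            "&aid=" + encodeURIComponent(aid) +
            "&atxt=" + encodeURIComponent(atxt) +
            "&loc=" + encodeURIComponent(document.location.href);
    }
    return true; // let the browser follow the link as usual
}
</script>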
...


The function 'trackclick' is called by the onClick event handler when a user clicks on a link or tabs to a link and presses the enter key. The event handler passes three input parameters to 'trackclick':

  • url is the URI of the linked page, taken from the link's href attribute.
  • aid is the value of the link's id|name attribute. On dynamic sites, this would be the primary key of a links table or so. In order to track clicks, you must assign a unique ID to every link by populating the id and the name attribute.
  • atxt is a hard coded literal. The string can contain the link's anchor text or another useful value.

When calling the tracking script, the function 'trackclick' passes these variables and adds the URI of the current page. Here is the link syntax:


<a href="http://www.smart-it-consulting.com/"
    id="link-1" name="link-1"
    title="Smart stuff that matters ;-]"
    onclick="return trackclick(this.href, this.name, 'left navigation:Smart IT Consulting');">
    Smart IT Consulting</a>


This search engine friendly link will perfectly pass both PageRank™ and keyword relevancy to the target. As long as the user has JavaScript enabled, each click gets logged on the server. Here is a simple tracking script template:


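<?php
    // trackclick.php?url=$url&aid=$aid&atxt=$atxt&loc=$loc
    // url = target (clicked link)
    // aid = id (name) of the clicked link
    // atxt = anchor text of the clicked link or another literal
    // loc = source (document where the user clicked a link)

// Read a request parameter explicitly, so the script does not depend on
// register_globals, and drop line breaks to keep one click per log line.
function param($name) {
    $value = isset($_GET[$name]) ? $_GET[$name] : "";
    return str_replace(array("\r", "\n"), " ", $value);
}
$url = param("url");
$aid = param("aid");
$atxt = param("atxt");
$loc = param("loc");
$ip = getenv("REMOTE_ADDR");
$userAgent = getenv("HTTP_USER_AGENT");
// Build one ';' delimited line per click.
$string = date("Y-m-d G:i:s T") .";"
        .$url .";"
        .$aid .";"
        .$atxt .";"
        .$loc .";"
        .$ip .";"
        .$userAgent .";\n";
// Append the line to the log file; errors are silently ignored.
$fp = @fopen("clicks.txt", "a");
if ($fp) {
    @fputs($fp, $string, strlen($string));
    @fclose($fp);
}
?>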


This PHP code gets the user's IP address and user agent name, then it appends a ';' delimited line to a plain text file ('clicks.txt'):


Date/time;Clicked URI;Link ID;Tracking text;Source page;IP address;User agent name;
--------- ----------- ------- ------------- ----------- ---------- ---------------
2005-08-20 9:03:20 EDT; http://www.smart-it-consulting.com/article.htm?node=155; documentation; Article on SE friendly click tracking; http://www.your-domain.com/test/clicktrack/page.htm; 66.115.133.131; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1);

2005-08-20 9:03:33 EDT; http://www.your-domain.com/; link1; link1 tracking string:root index page; http://www.your-domain.com/test/clicktrack/page.htm; 66.115.133.131; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1);

2005-08-20 9:04:04 EDT; http://www.smart-it-consulting.com/; link2; link2 anchor text:Smart IT Consulting; http://www.your-domain.com/test/clicktrack/page.htm; 66.115.133.131; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1);
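
To get a first impression of where the traffic goes, a log like this can be summarized with a few lines of code. The sketch below is written in JavaScript (for example for Node), which is an assumption of this example, as are the function names. It counts clicks per target URI and assumes that no field except the user agent contains a ';'.

```javascript
// Split one log line into its seven fields. The user agent is the last field
// and may itself contain ';', so only the first six separators are significant.
function parseClickLine(line) {
  const parts = line.trim().replace(/;$/, "").split(";");
  if (parts.length < 7) {
    return null; // empty or incomplete line
  }
  const [time, target, linkId, text, source, ip] =
    parts.slice(0, 6).map((s) => s.trim());
  const userAgent = parts.slice(6).join(";").trim();
  return { time, target, linkId, text, source, ip, userAgent };
}

// Count the logged clicks per target URI.
function countClicksByTarget(logText) {
  const counts = {};
  for (const line of logText.split("\n")) {
    const click = parseClickLine(line);
    if (click) {
      counts[click.target] = (counts[click.target] || 0) + 1;
    }
  }
  return counts;
}

const ua = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)";
const sampleLog = [
  "2005-08-20 9:03:33 EDT;http://www.your-domain.com/;link1;root index page;http://www.your-domain.com/page.htm;66.115.133.131;" + ua + ";",
  "2005-08-20 9:04:04 EDT;http://www.smart-it-consulting.com/;link2;Smart IT Consulting;http://www.your-domain.com/page.htm;66.115.133.131;" + ua + ";",
  "2005-08-20 9:05:10 EDT;http://www.smart-it-consulting.com/;link2;Smart IT Consulting;http://www.your-domain.com/page.htm;66.115.133.131;" + ua + ";",
].join("\n");

console.log(countClicksByTarget(sampleLog));
```

If anchor texts or URIs on your site can contain ';', a different delimiter or a database table is the safer choice.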


In real life you'd most probably track more stuff, and you'd store the values in a database. The server-side component can be developed in any other programming language, for example Perl, ASP ...

The click tracking procedure outlined above is very much simplified to bring the point home, but you'll get the idea. Go ahead and implement search engine friendly, redirect-free traffic monitoring and analysis on your Web site.



Author: Sebastian
Last Update: September/7/2005 [1st DRAFT]

Copyright © 2004, 2005 by Smart IT Consulting · Reprinting except quotes along with a link to this site is prohibited