March 22-25, 2006
Albuquerque, New Mexico

Papers: Search Engine Optimization (SEO) Essentials
For Cultural Web Sites

Thierry Arsenault, Department of Canadian Heritage / Culture.ca, Culture-mondo.org and Culturescope.ca and Erik Rask, Canadian Heritage Information Network / Virtual Museum of Canada, Canada

Abstract

An on-line exhibit or a Web site is worthless if your targeted audience cannot find it. This mini-workshop highlights the benefits and the importance that Search Engine Optimization (SEO) should have within an Internet marketing strategy. Participants will learn techniques that can be applied to any Web site or on-line exhibit. After this session, participants will better understand:

  • How to implement professional SEO: What to do and not do to avoid being penalized by the major search engines
  • How to achieve good SEO results
  • How to better target audiences using SEO tactics
  • How to continually improve results
  • How to keep up-to-date with the SEO industry

In 2005, Culture.ca had an in-depth SEO review of its gateway done by Ian McAnerin, a renowned professional SEO authority and speaker at international SEO conferences. Following this analysis, Culture.ca established an action plan to optimize its ranking within the major search engines. Culture.ca will share this plan and will explain the rationale behind each SEO tactic it contains.

Keywords: search engine optimization, SEO, target audiences, Internet marketing strategy

Introduction

If you build an on-line exhibit or a Web site but your targeted audience cannot find it, then it is, in a word, worthless. You can spend a lot of money and effort on a Web site and still not get traffic from search engines. Why? Simply put, some of your Web pages are not listed on search engines at all, and the pages that are listed just don't rank well. No matter how nicely laid out, content-rich and useful your Web site may be, if it has not been designed with search engine optimization (SEO) in mind, it won't get sufficient traffic from search engines. Culture.ca and the VMC, like many sites out there, had been making only minimal SEO efforts. In 2005, we took a sharp turn toward SEO for greater visibility and success. This document provides useful SEO techniques and practices that you can apply to improve the visibility of your on-line exhibit or Web site.

Basics Of Search Engine Optimization (SEO)

Before we delve into SEO techniques, we need to understand what SEO is. Here are a few basic elements that will better explain SEO:

  • Web sites vs. Web pages
    Search engine result pages do not list Web sites. They do list and/or rank Web pages. Each of the pages on your site is ranked individually. You have to choose those pages you wish to optimize.
  • Search engines vs. crawlers
    A search engine is a service designed to help users to find information/listings on the Web, based on their specific query/search term. More than 80% of Web surfers are visiting Web sites through search engines. The major* search engines are:
    46.3% Google
    23.4% Yahoo
    11.4% MSN
    6.9% AOL
    2.5% MyWay
    9.4% Other
    *Share of searches done by U.S. Web surfers in November 2005, from the January 2006 SearchEngineWatch Nielsen//NetRatings report, available at http://searchenginewatch.com/reports/article.php/2156451.
    Search engines depend on humans (paid listings) and crawlers/spiders (organic listings) for the creation of their listings. A crawler/spider is a program that surfs the Web to find, scan and store a maximum number of Web pages in its index, ranking them by a specific set of criteria called an algorithm. A crawler will follow links within your page (both on and off your site). There are many hundreds of crawlers; each major search engine operates its own.
  • Organic vs. paid listings
    Organic listings are unpaid/natural listings on a search engine. Paid listings are guaranteed positions purchased through advertising programs (keyword buys) offered by search engines. Organic listings can usually be differentiated easily from paid listings.
  • Search engine marketing
    Search engine marketing is the ability to achieve a better rank on search engines for specific/strategic keywords using a combination of SEO techniques and paid listing programs.
  • SEO
    SEO is the ability to improve your organic ranking on search engines by changing your Web site using various actions/techniques based on the search engines’ crawler algorithm (ranking criteria).

Your goal is to get your pages indexed on major search engines for specific and popular keywords that your targeted audience will look for, and to make sure your pages will be at the top of organic search results for these specific keywords. This is what SEO is all about!

Who Should Handle SEO In Your Organization?

SEO is part of search engine marketing, an on-line marketing activity similar to an e-mail campaign or on-line advertising. Of the four Ps of marketing - promotion, place, product and price - search engine marketing comes under promotion. If SEO is not optimal, the Marketing team needs to compensate with other activities, such as keyword buys, viral marketing and/or on-line advertising, to generate traffic from the targeted audience. Supervising the activities and monitoring the results of SEO efforts will allow the Marketing team to adjust the SEO strategy as well as rebalance the marketing mix in order to achieve an annual objective, which could be:

  • Brand positioning
  • Increase in number of visitors to the exhibition facilities
  • Increase in number of visitors to the on-line exhibition
  • Conversion of 5% of on-line visitors to exhibition walk-ins
  • Better audience mix (reaching other audiences, like youth, for example)

The Marketing team needs to supervise SEO as well as many other marketing activities in order to achieve corporate objectives. However, the Marketing team cannot achieve SEO without important contributions from the Information Technology and Content teams.

An Information Technology (IT) team is essential to ensure that your site is search engine-friendly, which means:

  • No down time or interruption
  • Proper redirection (301)
  • Right instruction (robots.txt file)
  • Proper database structure
  • Proper coding (smart coding)
  • Proper file and URL name and structure/nomenclature
  • Proper design (layout, information prominence)
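As an illustration of the redirection item above, a 301 (permanent) redirect can be configured at the server level so that search engines transfer an old page's ranking to its new location. This is a minimal sketch for the Apache Web server; the domain and file names are hypothetical examples:

```apache
# .htaccess - permanently (301) redirect a moved page to its new URL
Redirect 301 /old-exhibit.html http://www.example.org/exhibits/new-exhibit.html

# Redirect the non-www host to the www host so that only one
# version of each page is indexed by the search engines.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.org$ [NC]
RewriteRule ^(.*)$ http://www.example.org/$1 [R=301,L]
```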

A Content team is essential to ensure that the right topics have been selected, written or acquired, revised and adapted to include the right words (smart writing), and translated and revised before the content goes live on your Web site.

SEO is a joint effort: it should be led by the Marketing team, with contributions from IT and Content representatives.

Many small exhibitions do not have Content, Marketing and/or IT teams in place. The project manager has to handle and/or subcontract some of the tasks. This obviously limits SEO and makes it harder to optimize your ranking. Even with these conditions, you can still perform well if you focus your efforts and select the right SEO mix.

SEO Activities

For better SEO when designing a Web site, make sure each team carries out the following activities:

Information Technology (IT) Team

Keep it simple

Search engine spiders read basic body text 100% of the time. Providing search engine spiders with content in the easiest format for them to read is the most important aspect of SEO. While some search engines, like Google, can strip text and link content from Flash files (SWF files), nothing beats basic body text when it comes to providing information to the spiders. Talented SEO consultants can almost always find a way to work basic body text into a site without compromising the designer’s intended look, feel and functionality.

Use well-formatted HTML code

Search engines will have a much easier time indexing your pages by crawling through well-formatted HTML code that adheres to HTML coding standards. Crawlers can be confused by unstructured code, poorly coded CSS templates, browser-specific code and other problems, such as tags that are not closed. The World Wide Web Consortium (W3C) is the best source for standards and also provides an HTML as well as a CSS validator.
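A minimal sketch of a well-formed page, assuming HTML 4.01 Strict (the title and content are hypothetical examples): every tag is closed and the document declares its type, so the W3C validator can check it.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
  "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <title>Inuit Sculpture - Virtual Exhibit</title>
  <link rel="stylesheet" type="text/css" href="styles.css">
</head>
<body>
  <h1>Inuit Sculpture</h1>
  <p>Basic body text that crawlers can read directly.</p>
</body>
</html>
```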

Limit page size

Limit the size of the HTML code to less than 100K, excluding graphics. Some search engines, such as Google, index the first 100K only. As a general rule, you should also limit the number of links to about 100 per page.

Have a crawler-friendly design

Provide a clear path to all parts of your site by using simple text links that allow spiders to easily navigate your site. Spiders cannot follow JavaScript links or drop-down menus. They cannot interpret graphics and they have difficulties with Flash. These can still be used providing you have alternative text-based links. Ensure there are links to and from every page by adding links from top pages to lower-level pages and links back to the top-level pages.
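For example, a JavaScript menu can be kept for visitors as long as the same destinations are also reachable through plain text links in the page body. A minimal sketch; the file names and labels are hypothetical:

```html
<!-- JavaScript drop-down navigation for visitors -->
<script type="text/javascript" src="menu.js"></script>

<!-- Equivalent plain text links that spiders can follow -->
<p>
  <a href="exhibits.html">Exhibits</a> |
  <a href="collections.html">Collections</a> |
  <a href="sitemap.html">Site Map</a>
</p>
```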

Build a custom error page (404)

A standard 404 error page will make crawlers leave the site. A custom 404 page with a site map, navigation and search will be helpful to users who land on a missing page, and will keep crawlers from leaving the site.
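On the Apache Web server, for example, a custom error page can be configured with a single directive (the file name is a hypothetical example):

```apache
# .htaccess - serve our own page, with site map, navigation and search,
# instead of the server's default 404 error page.
ErrorDocument 404 /notfound.html
```

Note that the custom page should be served with the 404 status code itself, not as a redirect to another URL, so that crawlers still recognize the page as missing.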

Be careful with your Robots.txt

Robots.txt files are used to tell spiders which pages not to index. There may be certain sections of the site that you do not want indexed, such as login pages. Directories containing scripts, stylesheets and JavaScript code should not be indexed. Make sure you have clear procedures for site content updates and Web site migrations so that your robots.txt file is not overwritten during these processes. A badly coded robots.txt file could remove all your indexed pages from search engines.
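A sketch of a robots.txt covering the cases above; the directory names are hypothetical examples:

```
# robots.txt - must sit at the site root, e.g. http://www.example.org/robots.txt

# Applies to all crawlers
User-agent: *

# Keep scripts, stylesheets, JavaScript and login pages out of the index
Disallow: /cgi-bin/
Disallow: /css/
Disallow: /js/
Disallow: /login/

# Caution: "Disallow: /" would block the entire site - the kind of
# accidental overwrite the procedures above are meant to prevent.
```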

Have a site map accessible from every page of your site

Text-based site maps are another way to help spiders navigate your site. Breadcrumb navigation, used on many sites, serves a similar purpose.

Add a navigation footer

A simple text navigation footer provides another set of links for the crawler to follow as it continues navigating the site.

Frequently update your crawler sitemap

Google and Yahoo recently introduced crawler-oriented sitemaps. These services allow a larger number of pages to be indexed, and indexed faster - and they are free!
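A sitemap file in the format introduced by Google looks like this sketch (the URL and dates are hypothetical examples; check the current protocol documentation for the exact schema before submitting):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- sitemap.xml, submitted through the search engine's sitemap service -->
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
  <url>
    <loc>http://www.example.org/exhibits/inuit-sculpture.html</loc>
    <lastmod>2006-01-15</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
```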

Content Team

Have a focus

The content itself should be thematically focused. In other words, keep it simple. Some sites cover multiple topics on each page, which is confusing for spiders and users alike. The basic SEO rule here is that if you need to express more than one topic on a page, you need more pages. Fortunately, creating new pages with unique topic-focused content is one of the most basic SEO techniques, and makes a site simpler for both users and spiders.

Select the right keywords

Keyword selection should be based on:

  • Your targeted audience: which keywords are they using?
  • Popularity of the keyword: how often is this keyword searched?
  • The competition: how many search results are there for this keyword?

You are looking for a popular, targeted keyword with minimal competition. Your log files and some search engine tracking tools can help with your selection.

Find thematic/contextual keywords

Search engines will give more weight to your page if your main keyword is surrounded by the right contextual or thematic keywords. A thematic keyword research tool (see References and Tools) can help you find those keywords.

Convey your message clearly

Consider keyword frequency, density, proximity and prominence when writing your text. Place several keywords throughout the body or visible text on the page. Important keywords should be placed at or near the top and close to each other. The page should have between 200 and 300 words. Keywords placed within style tags, such as bold or italic, may be given more importance. Keep in mind that the most important factor in the SEO process is to provide well-written text that clearly conveys your message to the visitor. The visitor will soon discard the page and move on to the next site if the information provided is neither relevant nor easily understood.
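The frequency, density and prominence measures above can be estimated with a short script. This is a rough sketch only - the keyword_stats helper and the sample text are hypothetical, and commercial SEO tools compute these metrics in more sophisticated ways:

```python
def keyword_stats(text, keyword):
    """Estimate keyword count, density and prominence in body text.

    Density: the keyword's share of all words, as a percentage.
    Prominence: 100% when the first occurrence is the very first word,
    falling toward 0% as it moves down the page.
    """
    words = text.lower().split()
    target = keyword.lower()
    count = words.count(target)
    if not words:
        return 0, 0.0, 0.0
    density = 100.0 * count / len(words)
    prominence = 0.0
    if count:
        first = words.index(target)
        prominence = 100.0 * (len(words) - first) / len(words)
    return count, density, prominence

body = ("Inuit sculpture has a long history. This exhibit presents "
        "sculpture from communities across the Canadian Arctic.")
count, density, prominence = keyword_stats(body, "sculpture")
```

Running this over draft copy for each strategic keyword gives a quick check that the keyword appears often enough, and early enough, before the page goes live.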

Have unique title tags on every page

Incorporate the main keywords in the HTML page's title tag. Search engines view the title tag as one of the most important aspects of the Web page and often use its content as the text for the link in search results pages. Each page should have a unique title, limited to between seven and ten words.
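For instance (the exhibit title below is a hypothetical example):

```html
<head>
  <!-- Unique, keyword-bearing title; search engines often display it
       as the link text in their results -->
  <title>Inuit Sculpture - Contemporary Carvings from the Canadian Arctic</title>
</head>
```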

Use header tags

Use keywords in the header tags. The <h1> tag is the most important, then comes the <h2> tag, and so on.
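A sketch of the heading hierarchy, with the page's main keyword in the top-level tag (the headings are hypothetical examples):

```html
<h1>Inuit Sculpture</h1>
<h2>Carving Materials and Techniques</h2>
<p>Body text expanding on the topic of the sub-heading.</p>
```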

Use contextual keywords in your anchor text

Include keywords in the anchor text for internal links. These keywords should relate to the page being linked to. Even more important is encouraging the use of relevant keywords in the links to this page from other sites. This applies to image ALT attributes, too, especially when the graphic is used as the anchor in a link.
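A sketch of both cases; the page and file names are hypothetical examples:

```html
<!-- Keyword-rich anchor text, rather than generic "click here" -->
<a href="inuit-sculpture.html">Inuit sculpture gallery</a>

<!-- When a graphic is the anchor, the ALT text carries the keywords -->
<a href="inuit-sculpture.html"><img src="thumb.jpg"
   alt="Inuit sculpture gallery" width="100" height="75"></a>
```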

Increase your internal links

When it comes to ranking your pages, some search engines give weight to internal links. Internal links also help the spiders find and index other important pages of your Web site.

Don’t spend too much time on META tags

There is some controversy regarding the usefulness of META tags. It is generally agreed that META keywords are of little importance. However, the META description could be occasionally displayed in some search results. The META description should contain two or three sentences, each with an important keyword, and be limited to 250 characters.
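A sketch of a META description; the content is a hypothetical example, kept under 250 characters:

```html
<meta name="description" content="Explore Inuit sculpture from the
  Canadian Arctic. This virtual exhibit presents contemporary carvings
  in stone, bone and antler.">
```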

Have catchy and contextual domain and file names and navigation

If you have a brand name, or intend to brand a name, you should choose a short, catchy domain name. Otherwise, choose a domain name that incorporates keywords separated by hyphens. Keywords in the domain name not only help your ranking directly; other sites linking to yours tend to use those same keywords in their link text, which helps even more. This is also the case with file names, though to a lesser extent.

Marketing Team

Increase your inbound links with a link-building strategy

Develop a link-building strategy. Obtaining high-quality incoming links from important sites is the single most important task in any SEO effort. Incoming links not only generate traffic to your site, but are also evaluated by search engines as a measure of quality and trustworthiness. Inbound links from important pages rank higher, as do links from pages with similarly themed content (a link community). The more quality links to your page, the higher the ranking it achieves. Inbound links from low-rated sites should be avoided, as they can actually reduce your ranking.

Reciprocal linking or trading of links between two sites was, at one time, one of the most important ways to achieve high ranking. Trading links with other sites is decreasing in importance and should be used with caution, especially with sites that have little or no similar content. One-way links to pages with similarly themed content tend to achieve much higher ranking. Search engines tend to favour natural links or links added from sites with similarly themed contents. Reciprocal links should be limited to pages on sites with similar topics. Avoid link farms and sites that sell links.

The Google toolbar provides PageRank as a feature to measure the importance of a page. While only approximate and not always reliable, PageRank does give you an indication of the importance Google assigns to a page, making it a useful tool in deciding which pages to seek inbound links from. To facilitate this activity, we recommend using proper link management software; we use Arelis for this important SEO activity.

Activities to Avoid

SEO consultants or on-line services that guarantee high rankings and tremendous traffic in a short period of time should not be trusted. Be cautious with them, as they probably use search engine spamming techniques. If search engines find out that you are spamming, they will blacklist your site. As a result, your listings won't appear on those search engines, and your traffic will go down the drain for months until you regain your credibility with them. The following activities are considered spamming; be aware of them.

Links-only page

Avoid pages that contain a list of links only.

Keyword stuffing or spamming

This was one of the most common tricks used in the past to increase rankings. A keyword or phrase, sometimes irrelevant to the page contents, is repeated numerous times in the body text as well as in other keyword locations. Indiscriminate use of keywords may cause your pages to be penalized or rejected entirely.

Hidden or invisible text

This is another attempt to trick search engines: adding keyword-heavy text that crawlers can see but that is not visible to the visitor, typically by using the same font colour for the text as the background colour of the page. Again, search engines may penalize your site when they detect this. Text whose colour is merely close to the background colour is harder to detect, but it is the same trick.

Hidden links

This is the same concept as above and can be used to make links visible to crawlers but not visitors.

Cloaking, doorway and customized ranking pages

These are pages that are strategically optimized and served to selected search engines for targeted keywords to attain a higher ranking. These pages are usually linked to the main or top pages of your site and are generally not available to visitors. Search engines have implemented filters to remove these pages.

IP delivery

This is another form of cloaking in which highly optimized content is served to known search engine crawlers only. IP addresses known to be search engine-based are served this set of information while visitors are served another.

Mini-site networks

This refers to a network of sites designed to exploit Google’s PageRank algorithm by creating several topic- or product-related sites that all link back to a central site. Each mini-site has its own keyword-enriched URL and is designed to meet the specific requirements of a major search engine. An artificial link-density, which could heavily influence Google’s perception of the importance of the main site, is created by linking between mini-sites.

Link farms

These are series of low-quality sites set up to provide external sites for inbound linking, usually by selling links. They were used extensively when the sheer number of incoming links influenced Google rankings. Google, in turn, quickly devalued and eventually eliminated the PR value it assigned to pages with an inordinate number of links.

Redirect pages

These are pages that automatically load another page. Although there are legitimate uses for redirects, such as directing visitors to a new section of your site, search engines tend to disapprove of them because redirect pages can be used as highly optimized doorway pages (a form of cloaking). 302 redirects in particular should be used cautiously.

Extraneous code

Poorly designed content management systems (CMS) and other automatic code generators may produce substandard code that contains numerous errors or extraneous code with repetitive content. Such code is poorly optimized for search engines. Another example of this is MS Word documents saved as HTML files, which include vast amounts of extraneous code.

Flash and graphics

Although Flash and graphical sites are very attractive for engaging visitors, search engines are very poor at indexing them. Some search engines are starting to index some of the text and links found within Flash files.

Session IDs

Session IDs are unique identifiers added to URLs to track visitors. The problem is that each time a spider comes in, a new session ID is added as it crawls the page. This can create a very large number of links for the spiders to crawl, and they could index the same content many times (duplicate content is considered spamming).

Cookies and JavaScript

Spiders have JavaScript and cookies turned off when crawling a site. Sites using JavaScript and cookies to determine navigation will stop the crawler from crawling further.

Database or hidden contents

Crawlers cannot fill out forms or search your site. If you have information stored in a database, add a series of canned searches to expose the most important content. Avoid using dynamically generated links as these often contain a number of parameters that are difficult for the robots to follow.

SEO Best Practices

We’ve read many SEO books, attended SEO conferences and participated in chat rooms and discussion boards. Guess what we’ve learned? There are no magic solutions. You have to work hard and you have to work smart in order to rank higher.

From our experience, here is the best way to proceed:

  1. Review your site from an SEO perspective
    Ian McAnerin, an SEO specialist in Canada, carried out an SEO review of the Culture.ca site. The review revealed IT, Content and Marketing shortcomings, which we are addressing for the next generation of Culture.ca, to be unveiled in fall 2006. Having a professional SEO consultant from outside review your site is not only important, it is a good investment: the consultant will look at all the elements that could affect your ranking within search engines.
  2. Create an SEO committee (IT + Content + Marketing)
  3. Correct your main architecture problems
  4. Buy essential SEO tools (keyword, theme, ranking, links, etc.)
  5. Select strategic keywords
  6. Identify strategic Web pages to optimize for those strategic keywords
  7. Get your current metrics to establish a baseline to measure against
    Indexed pages, visits, visitors
    Keyword/Web page ranking
    Conversion
  8. Optimize your pages (all SEO activities)
  9. Leverage the Google and Yahoo Sitemap submission systems
  10. Build SEO content and technical policy/guidelines
  11. Improve your architecture and design
  12. Initiate your link solicitation campaign
  13. Track and improve your strategic pages

Conclusion

Marketing activity effectiveness should be measured based on cost per visit and cost per conversion. Without a doubt, SEO provides you with the lowest cost per visit. If you follow the step-by-step method outlined in the best practices section, learn from your experience and stay away from spamming, you should significantly increase the number of visits from organic searches. The more visibility you have, the more visibility you will give to culture in general - and we will all benefit in the end.

References and Tools

These are the SEO references and tools that we are currently using. This is not, however, a complete list of available SEO references and tools. Many other great SEO references and tools are out there.

SEO news, discussion, forum
SearchEngineWatch
www.searchenginewatch.com
SEO news, tools…
McAnerin Networks Inc
http://www.mcanerin.com/
SEO news, discussion, forum
StepForth Placement Inc.
http://www.stepforth.com
SEO marketing and guide (subscription based)
Planet Ocean Search Engine News
http://www.searchenginenews.com
Search engine relationship chart
Bruce Clay Inc.
http://www.bruceclay.com/searchenginerelationshipchart.htm
Broken links and general SEO analysis (spider simulation)
OptiSpider
http://www.optispider.com/
Indexed pages on search engines
Marketleap
http://www.marketleap.com/publinkpop/
HTML validation tool
W3C
http://validator.w3.org/
Web Site Analyzer
SEO Browser
http://www.seo-browser.com/
Keyword popularity search
Wordtracker
http://www.wordtracker.com
Overture - Keyword Selector Tool
http://inventory.overture.com/d/searchinventory/suggestion/
Google Adwords
https://adwords.google.com/select/
Keyword research and analysis software
Site Content Analyzer
http://www.sitecontentanalyzer.com/
Link campaign manager
Arelis
http://www.axandra-link-popularity-tool.com/download.htm
Thematic/contextual keyword search
OptiRanker (previously ThemeMaster)
http://www.optiranker.com/index.html
Metric Web page ranking
Ranking-Manager
http://www.websitemanagementtools.com/
PageRank (Google)
Google Toolbar
http://toolbar.google.com/

Cite as:

Arsenault T. and Rask E., Search Engine Optimization (SEO) Essentials For Cultural Web Sites, in J. Trant and D. Bearman (eds.). Museums and the Web 2006: Proceedings, Toronto: Archives & Museum Informatics, published March 1, 2006 at http://www.archimuse.com/mw2006/papers/arsenault/arsenault.html