Rewriting Dynamic URLs for SEO
Introduction:
For the last few years, most developers and site designers have been aware of the increased risk of using complex URLs. A complex URL tends to look something like:
http://www.somewebsite.com/pages/search.asp?id=21f&c=9clr&y=true
This URL is a problem for both search engine spiders and, in some cases, the users of the site. Complex, non-user-friendly URLs, also known as dirty URLs, are very common throughout the internet and pose an array of problems. The most common are:
- they are hard to remember
- they are not SEO friendly
- they can be a security risk
- they make the site harder to use
Changes in search engine and developer practices aim to correct the problem through URL rewriting. URL rewriting simplifies the URL and makes the page look and feel static, unlike the dynamic dirty URL in the example above. User-friendly, simplified URLs are easier for users as well as for search engines. Depending on the server environment, however, URL rewriting requires some forward thinking and planning, and often a sophisticated rule-based rewriter. Some developers have written off the process simply because of its complexity. Rest assured, the benefits far outweigh the time and effort spent learning it.
Basic SEO URL Writing Tips
Some basic areas to consider before jumping straight into URL rewriting:
- Avoid punctuation other than hyphens in directory and file names. Other punctuation can needlessly complicate site structures and maps.
- Avoid capital letters. Mixed case is confusing for users and, depending on the operating system, can cause real errors. Take the following example: http://www.somedomain.com/Page.html and http://www.somedomain.com/page.html. While these two URLs may point to the same file on a Windows server, on a UNIX machine they refer to two separate files. The confusion is compounded by the fact that domain names are not case sensitive while paths can be. The best solution is to ensure that all files and directories are lower case only.
- Be descriptive and include proper keywords in your directory and file names. For example, if you are developing a site to sell fishing gear, instead of “/pages/tackle/pbaits” use “/fishing-tackle/plastic-baits”.
- To hyphen or not to hyphen. Google and the other top search engines are smart enough to parse out keywords in the URL, meaning they can read a directory or file name such as “/plasticbaits” as two keywords. Still, it doesn’t hurt to add a separator such as a hyphen where a space would naturally occur, as in “/plastic-baits”. The search engines treat hyphens as spaces when looking for keywords. This type of parsing ends at the URL, however: most search engines, including Google, have not perfected parsing out combined words in the body of the content.
- Avoid leaving in common directory names. When you are using common server-side technology, technology-specific directories can not only complicate the URL but also cause a security issue for the site. Avoid directory names like “/cgi-bin”. You can and should rewrite the entire URL, including directories.
- Avoid the use of query strings in URLs, for example /pages/carsearch?id=3. The whole purpose of URL rewriting is to make the URL more user friendly and search engine friendly. We can’t emphasize the point enough: if you are going to put in the effort, make the whole URL search engine friendly, for example /cars/bmw/financing.jsp.
- File extensions such as .php, .asp, or .jsp are optional. Search engines will read the page with or without them. Keeping an extension can make the website look more natural, as if the URL had not been crafted for SEO, and as a general rule of thumb anything that makes your website look natural is better for SEO. If the search engines ever decide to make this a factor, you will be perfectly safe. Some SEO consultants may tell you that these file extensions hurt your SEO score. That is not true. The claim confuses the extensions with the dynamic query strings that these server-side languages typically produce, as opposed to static HTML. With URL rewriting, dynamic database-driven websites no longer need to have a lower SEO score by default.
- Most importantly, do the keyword research. Working with your in-house search engine marketing manager or external SEO consulting firm, find the keywords that will drive the most traffic to your website. The cost and effort of the technical implementation is the same either way, so why not maximize your business impact with some well-spent up-front planning? Thorough business and marketing planning should come before the technical team gets involved. Remember that this URL structure may last you for many years to come.
SEO URL Rewriting in Content Management Systems (CMS)
Most off-the-shelf CMS applications, such as WordPress, Joomla, and Drupal, have SEO URL rewriting built in as a feature. However, some applications may require installing plug-ins or additional customization. Custom-built applications will most likely require custom code in order to perform URL rewriting.
SEO URL Rewriting with IIS and Windows Server
For IIS, Microsoft provides the URL Rewrite Module, an extension for IIS 7 built specifically for creating rules-based friendly and clean URLs. It is available as a download from Microsoft. The module enables URL rewriting using rules that are integrated into IIS Manager. These rules can be defined based on HTTP request and response headers, IIS server variables, and complex programmatic rules.
Microsoft has also published an easy-to-follow guide, with screenshots, on creating rewrite rules for the URL Rewrite Module in IIS 7.
SEO URL Rewriting with Apache and .htaccess
Apache URL rewriting is fairly complicated. Even seasoned developers often find parts of mod_rewrite that they never knew existed. Mod_rewrite can be a developer's best friend as well as their worst enemy. The first step is to create an .htaccess file and place it in the root directory of the website. Once the file is created, open it with a text editor and insert the following code:
RewriteEngine on
RewriteRule ^article/([^-]+)-([^&]+)\.html$ /article.php?categoryid=$1&articleid=$2
Looking at each line individually, the first line turns on the rewrite engine. Without it, the RewriteRule lines are ignored, and requests for the friendly URLs, which do not correspond to real files, return 404 errors. The next line tells the web server that, when it encounters a URL matching the pattern, it should apply the substitution that follows. The pattern uses regular expression (regex) syntax, which is worth reading up on before writing rules of your own.
Each rule has two parts. The first part, ^article/([^-]+)-([^&]+)\.html$, is the pattern the server looks for in the requested path. The caret ‘^’ and the dollar sign ‘$’ anchor the pattern to the start and end of the path, so it is good practice to end the pattern with ‘$’; otherwise a longer URL that merely begins with the same text would also match. Each pair of parentheses captures part of the path: ([^-]+) captures everything up to the first hyphen, and ([^&]+) captures the rest up to .html. The second part, /article.php?categoryid=$1&articleid=$2, is the real URL the request is mapped to, with $1 and $2 replaced by the captured text.
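With this rule in place, a visitor or search engine requests the search engine friendly URL:
http://www.somewebsite.com/article/fishing-plastic-baits.html
Behind the scenes, the server rewrites it to the dynamic URL that actually generates the page:
http://www.somewebsite.com/article.php?categoryid=fishing&articleid=plastic-baits
Note that, because the first group stops at the first hyphen, the category portion must not contain a hyphen of its own.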
With Apache, it is really important to understand regular expressions, which are outside the scope of this article.
The above example demonstrates the power of Apache URL rewriting and the ability of the Apache web server to recognize patterns and rewrite large groups of URLs.
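As a hedged sketch of how the rule above is often extended in practice, a slightly more defensive .htaccess might look like the following. The two RewriteCond guards and the [L] and [QSA] flags are additions for illustration and are not part of the original example.

```apache
# Turn on mod_rewrite for this directory. This assumes mod_rewrite is
# enabled on the server and that AllowOverride permits rewrite directives
# in .htaccess files.
RewriteEngine on

# Assumption: requests for files and directories that really exist on disk
# (images, stylesheets, and so on) should be served untouched.
# These conditions apply only to the RewriteRule that follows them.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

# Map /article/<category>-<title>.html to the real script.
#   $1  = everything up to the first hyphen (the category)
#   $2  = everything between that hyphen and ".html" (the article title)
#   L   = stop processing further rules in this pass once this one matches
#   QSA = append any query string from the original request
RewriteRule ^article/([^-]+)-([^&]+)\.html$ /article.php?categoryid=$1&articleid=$2 [L,QSA]
```

The guards matter most once a site has several rules, since they keep a broad pattern from accidentally capturing requests for static files.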
SEO URL Rewriting with Java Server Pages (JSP)
While there is nothing as straightforward as Apache mod_rewrite or the IIS URL Rewrite Module, Java web applications can also take advantage of URL rewriting. Most JSP URL rewrites must be accomplished through third-party plug-ins and servlet filters. A good tool for Java-based URL rewriting is the URL Rewrite Filter, which can work with any J2EE-compliant web application server, such as Resin, Orion, or Tomcat. The filter rewrites URLs before requests reach the application code, using an XML file that contains the set of rules.
301 Redirects are Your Friend
Don’t forget that after you rename or rewrite a URL, the old URL may still be indexed in the search engines, hyperlinked from other websites or blogs, and bookmarked by users. To retain the SEO score that an older page may have earned, and to send users to the right page, have your server automatically redirect all requests for the old URL to the new URL. There are many redirect methods, but only a 301 redirect tells the search engines that the page has moved permanently and that its SEO score should be transferred to the new page. Any SEO or website professional will understand the request if you simply ask them to set up 301 redirects once your new URLs are in place.
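For Apache, a minimal sketch of such redirects in an .htaccess file might look like this. It reuses the article.php example from the Apache section. The paths /old-page.html and /new-page.html are placeholders, and the approach assumes category values that contain no hyphen.

```apache
# Simple case (mod_alias): permanently redirect one old path to its new URL.
# Both paths are placeholders for illustration.
Redirect 301 /old-page.html http://www.somewebsite.com/new-page.html

# Pattern case (mod_rewrite): send visitors who request the old dynamic URL
# to the friendly URL with a 301.
RewriteEngine on

# THE_REQUEST holds the original request line sent by the browser, so this
# condition does not match the internal rewrite to article.php and the
# redirect cannot loop with the rewrite rule.
RewriteCond %{THE_REQUEST} \s/article\.php\?categoryid=([^&\s]+)&articleid=([^&\s]+)

# %1 and %2 refer to the groups captured by the RewriteCond above. The
# trailing "?" drops the old query string from the redirect target.
RewriteRule ^article\.php$ /article/%1-%2.html? [R=301,L]
```

It is worth testing redirects like these on a staging copy of the site first, since browsers cache 301 responses aggressively.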
Other SEO Factors to Consider
A dirty URL is very challenging for the search engines and adds very little SEO score to your website. It used to be the case that search engines simply stopped at certain characters, such as ‘?’, when reading through your URL and subsequent web pages. Google, Bing, and other search engines have gotten smarter over the last few years. But even as search engines develop better algorithms for reading dynamic URLs, they will not, at least any time soon, be able to translate “id=21f” into the intended keyword, e.g. Fishing Tackle. SEO consultants widely regard keywords in the URL as one of the top factors in SEO score, so URL rewriting should continue to give your website a lift no matter how advanced the search engine algorithms get.
Closing Thoughts and Remarks
The methods discussed above are not the only options for each platform and server-side technology. There are a number of third-party tools that offer to simplify URL rewriting for the developer, but they come at a cost. The real question becomes: is it more economical to learn the above methods or to buy a third-party tool? There is no single right answer, as both approaches work well. URL rewriting is one of the most overlooked areas when building a website, yet it can have a significant impact on your search engine rankings. With the right business and technical planning, you can substantially increase the traffic your website receives from each of the search engines.