This is the first part in a series on solid web standards that I plan on writing. There are a lot of simple web standards, or things that I think should be standards, that people either do the hard way or skip, leaving their site less user friendly than it could be. So either you know exactly what I’m talking about here or you’re like, what the heck is canonicalization? Well, according to Wikipedia, canonicalization is:
a process for converting data that has more than one possible representation into a “standard” canonical representation
OK, so that helps a lot!? Let me be a little more specific: I’m talking about www versus non-www canonicalization of your website’s URL. Following me yet? For example, at Wofford www.wofford.edu takes me to the same place as wofford.edu. Big deal, right? Well, actually it is a big deal and it’s important to do this right. You want to make sure that a visitor who comes to your site always gets to the right page whether they use the www or not. But if both versions simply serve the same page, that is a BAD thing from an SEO perspective because you have duplicate content. That’s right, search engines can see www.wofford.edu and wofford.edu as two separate pages, and you’re splitting your link juice between them. Now think about your entire website with this issue…
Not only do you have to worry about this duplicate content, but what about:
- www.wofford.edu/default.aspx
- www.wofford.edu/
- wofford.edu/index.html
- wofford.edu/Default.aspx
See how this problem quickly gets exponentially worse? Not only is this really bad for SEO, but deciding on www or non-www can be an important marketing decision for brand awareness. As a great example we use www.wofford.edu because well that’s how we have traditionally used it for years now and our audience expects it. Not to mention it’s got a nice balanced look to it. On the other end of the spectrum this blog uses doteduguru.com because I like the shorter and quicker look of it. There is no one way that is better than the other, but it is important to make a decision and stick with it.
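To make the list of duplicate URLs above a bit more concrete, here is a rough Python sketch of what "pick one version and stick with it" means. The preferred host, the set of default-document names, and the function name `canonicalize` are all assumptions I made up for illustration; this is not how any particular server does it.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration only: both settings below are hypothetical.
PREFERRED_HOST = "www.wofford.edu"
DEFAULT_DOCUMENTS = {"default.aspx", "index.html"}


def _strip_www(host: str) -> str:
    """Return the host without a leading 'www.' label."""
    return host[4:] if host.startswith("www.") else host


def canonicalize(url: str) -> str:
    """Map www/non-www and default-document variants onto one URL."""
    # Accept bare inputs such as "wofford.edu/index.html" by assuming https.
    if "://" not in url:
        url = "https://" + url
    scheme, host, path, query, _fragment = urlsplit(url)

    # Host names are case-insensitive; fold both variants onto the preferred one.
    host = host.lower()
    if _strip_www(host) == _strip_www(PREFERRED_HOST):
        host = PREFERRED_HOST

    # Drop a trailing default document so the directory URL is the canonical one.
    directory, _, filename = path.rpartition("/")
    if filename.lower() in DEFAULT_DOCUMENTS:
        path = directory + "/"
    if not path:
        path = "/"

    # The fragment is dropped because it is never sent to the server.
    return urlunsplit((scheme, host, path, query, ""))


variants = [
    "www.wofford.edu/default.aspx",
    "www.wofford.edu/",
    "wofford.edu/index.html",
    "wofford.edu/Default.aspx",
]
# All four variants collapse to a single canonical URL.
print({canonicalize(v) for v in variants})
```

The real fix still belongs on the server as a 301 redirect; a function like this is only handy for spotting which of your URLs are duplicates of each other.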
You want to redirect all your pages from one preference to the other using SEO friendly 301 redirects. I would not recommend going the delayed redirect route. (If someone knows someone who works there, please forward them this post so they can fix their https://columbia.edu/ link. I’m sure there are other schools out there that do this, but come on, don’t make your visitors wait or annoy them. It’s all about the user experience; this takes five minutes, and you can automate the entire process for your users instantaneously AND help your SEO. We are talking about a super powerful homepage with a PageRank of 9 splitting resources here.) You also do not want one of these options coming up as a dead link: you’re wasting your visitors’ time by making them type it perfectly, and potentially causing them not to come back.
Luckily there are a few best practices and steps you can do to help stem this problem and get control of your duplicate content.
How do I check and Fix a URL Canonicalization problem?
The first thing I would recommend doing is to run your homepage URL through this Redirect Check Tool.
I’ll talk more about 301 redirects in a later post, but for now all you need to know is that they are SEO friendly and the proper way to redirect a website. If you have handled all your duplicate content properly, the results should end in a single 200 response, with any non-preferred version of the URL getting there through a 301. So how do you go about resolving this issue?
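If you would rather script the check than paste URLs into a tool, something along these lines should do roughly the same job. It is only a sketch using Python’s standard library: the names `head_fetch`, `trace_redirects`, and `is_clean` are my own, and I’m assuming the server answers HEAD requests the same way it answers GET.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def head_fetch(url):
    """Send one HEAD request and return (status, Location header or None)."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=head_fetch, max_hops=10):
    """Follow redirects by hand and return the list of (url, status) hops."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return hops


def is_clean(hops):
    """True when the chain ends in a 200 and every earlier hop is a 301."""
    if not hops:
        return False
    final_status = hops[-1][1]
    return final_status == 200 and all(s == 301 for _, s in hops[:-1])
```

Calling `trace_redirects("https://wofford.edu/")` and passing the result to `is_clean` tells you whether that version of the URL reaches a 200 through 301s only. The `fetch` parameter is there so you can swap in a fake for testing without touching the network.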
Solving URL Canonicalization on Apache
It’s really easy to fix this issue on Apache. All you need to do is edit the .htaccess file in your root directory and add the following.
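RewriteEngine on
# Match any host that does not start with www.doteduguru.com (case-insensitive)
RewriteCond %{HTTP_HOST} !^www\.doteduguru\.com [NC]
# Permanently (301) redirect to the www host, keeping the requested path
RewriteRule (.*) https://www.doteduguru.com/$1 [R=301,L]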
This example redirects all traffic for doteduguru.com to the same URL with www at the beginning.
Solving URL Canonicalization on IIS
Solving a URL canonicalization issue on IIS is a little more difficult than on Apache, but still not very bad and easy enough to resolve in less than five minutes. Instead of trying to explain it in this post, here’s a great post that will tell you all about it, and following it works. How do I know? Because these are the instructions I followed months ago when I set up Wofford’s permissions properly.
Solving Canonicalization with IIS with SEO friendly 301 redirects
Canonicalization Best Practices
Let me also emphasize the importance of setting up all links to directories properly. This means that no matter whether your platform is .NET, PHP, static pages, or whatever, you should always link to the directory, not to your default or index page specifically. An example of this would be linking to www.wofford.edu/admission/, not www.wofford.edu/admission/default.aspx. Not only does this make it easier on your redirects, but it’s good practice should you ever change coding languages down the road. It also helps reduce duplication in your web analytics tracking and many other issues.
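If you want a quick way to find links that break this rule, a small script can scan a page’s HTML and flag any link that points at a default document instead of its directory. This is just a sketch: the list of default-document names and the name `find_default_doc_links` are assumptions of mine, so adjust them for your own platform.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Hypothetical list of default-document names; extend it for your platform.
DEFAULT_DOCUMENTS = {"default.aspx", "index.html", "index.php"}


class DefaultDocLinkFinder(HTMLParser):
    """Collect href values of <a> tags that end in a default document."""

    def __init__(self):
        super().__init__()
        self.flagged = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Look only at the last path segment, ignoring query and fragment.
        filename = urlsplit(href).path.rpartition("/")[2]
        if filename.lower() in DEFAULT_DOCUMENTS:
            self.flagged.append(href)


def find_default_doc_links(html):
    """Return the links in `html` that should point to a directory instead."""
    finder = DefaultDocLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.flagged


page = (
    '<a href="/admission/default.aspx">Admission</a>'
    '<a href="/admission/">Admission</a>'
)
print(find_default_doc_links(page))
```

Only the first link in the sample page gets flagged; the second already points at the directory.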
Want more on URL Canonicalization?
For additional reading on this subject, check out Matt Cutts’ post about URL Canonicalization
I would also recommend reading SEOmoz’s post about Canonical and Duplicate Versions of Content
Thanks for this one Kyle, personally this one I’ve been slacking off on.
Especially on our numerous sub-domains and clients that demand a https://www.project.university.edu
Drives me batty.
Great post, and couldn’t have been more timely… We just fixed this on a server using IIS7. It’s not as easy as Linux mod rewrites but it definitely works. I was going to post on this one myself because of the lack of info on the Web about canonicalization of index.asp files to 301 redirect to the root folder!
~A
@Jon & Adam - glad you guys found this article timely and useful. It’s definitely one of those web standard things that needs to be thought about and just done on every site. Especially considering it takes all of 10 minutes to do properly and can save tons of headaches down the road.
Well, I am just glad to have read this information. I have heard of this before, but you have explained it more clearly and shed some light for me… thanks Kyle!
Nice article. I covered the canonicalization topic when doing a writeup for webmasters at the University of South Florida on SEO. See the article for a simple way to do 301 redirects without having to use URL rewrites from IIS or Apache. It also includes a link to Matt Cutts’ website with more recent information on canonicalization.
Excellent post — this is something I believe not enough SEO’s take seriously. (Not to mention the usability issue of having two apparent URLs.) Search Engine Guide also did a nice piece on the mod_rewrite process: https://www.searchengineguide.com/stoney-degeyter/how-to-use-your-www-to-prevent-duplicate.php
I can handle Perl and PHP, but Apache Mod_Rewrite drives me batty sometimes.
He who can master it, however, shall be greatly pleased at the results. Oh yes.
Wow, thanks!
I was looking all over for this information. I was trying all kinds of permutations of the code. Yours is the one that worked! Thanks!
Wow! I didn’t know this. I guess I better start making sure all of my links point to the same place or use a 301 redirect.
I tried to do this myself and screwed it up. This article will really help me get it done right. One question…what about the link structure of the site you redirect?
This is something that I have been tempted to do, but didn’t really know how effective it would be. Do all links from both domains stick? I would hate to lose what I’ve been able to get up to this point.
You just pointed out some interesting SEO points that my partner and I were discussing, relieved I ran into it, so thank you for that.