As we discussed in our post about redirecting from HTTP to HTTPS, duplicate content is one of the main problems we may have with our SEO.
We, as developers, must make sure that two different URLs never serve the same content (even in a different order). There are three really easy ways to end up with duplicate content without noticing:
- Not setting up a redirection from HTTP to HTTPS: the same content is then available under both the HTTP and the HTTPS version, which are two different URLs.
- Not having a WWW policy: this also leads to duplicate content under two different URLs. This is the point we will discuss today.
- Not having a trailing slash policy: the same page is then reachable both with and without the final slash.
Although the last two problems can be solved by using a canonical tag, I think it's best to perform a 301 redirection in most cases.
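To get a feeling for how quickly these three issues multiply, here is a small Python sketch. The function name `url_variants` and the domain `example.com` are only illustrative assumptions; the code simply lists every scheme, host and trailing-slash combination that could end up serving the same page.

```python
from itertools import product


def url_variants(domain: str, path: str) -> list[str]:
    """List the URLs that could serve the same page when no redirect policy exists."""
    schemes = ("http", "https")
    hosts = (domain, "www." + domain)
    # The same path, without and with a trailing slash.
    base = path.rstrip("/")
    paths = (base, base + "/")
    return [f"{s}://{h}{p}" for s, h, p in product(schemes, hosts, paths)]


variants = url_variants("example.com", "/blog")
for url in variants:
    print(url)
print(len(variants))  # 8
```

Without any redirection in place, a single page could therefore be indexed under up to eight URLs.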
WWW policy
Let's have a look at Unsplash. Open a new tab and enter this URL: unsplash.com
Nothing happened besides the page load, right?
Now, try to load this other one: www.unsplash.com, and, after loading, have a look at the address bar in your browser.
See? They performed a redirection to the other domain! This is what we will call the WWW policy: we choose between using www or not using www in our domain, and perform a redirection in the other case.
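The policy itself boils down to a tiny decision. As a rough sketch in Python (the function name `redirect_host` and the `use_www` flag are made up for this example), it could look like this:

```python
from typing import Optional


def redirect_host(host: str, use_www: bool) -> Optional[str]:
    """Return the host we should redirect to, or None if `host` already follows the policy."""
    has_www = host.lower().startswith("www.")
    if use_www and not has_www:
        return "www." + host
    if not use_www and has_www:
        return host[4:]  # drop the leading "www."
    return None


print(redirect_host("www.example.com", use_www=False))  # example.com
print(redirect_host("example.com", use_www=False))      # None
print(redirect_host("example.com", use_www=True))       # www.example.com
```

Every example in the rest of this post is a variation of this check, written for a different server or framework.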
Should we use the www or non-www policy?
According to John Mu, a search advocate from Google, this decision will not affect your SEO positioning. It's only a decision on how your brand wants to appear online.
As of 2016, you should be able to perform a redirection without affecting your PageRank, so you can change your decision afterward.
In general, we should try to perform the redirection at the web server level instead of in the application code of our preferred language, as the server will also redirect requests for static files to the chosen version.
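Whichever layer ends up performing the redirection, it may be worth checking that it really answers with a 301 and not with a temporary 302. The following Python sketch, which uses only the standard library, requests a URL without following redirects; the helper name `check_redirect` is just an assumption for this example.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit


def check_redirect(url: str) -> Tuple[int, Optional[str]]:
    """Send a HEAD request without following redirects; return (status, Location header)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

With a non-www policy, calling `check_redirect("https://www.YOUR_DOMAIN.com/")` should return status 301 together with the non-www URL in the Location header.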
Redirect to the canonical domain using .htaccess
To perform the 301 using .htaccess we just need to add the following lines to our file:
<IfModule mod_rewrite.c>
    RewriteEngine On
    # The host pattern is a regular expression, so the dots are escaped (\.)
    # Uncomment to redirect to the WWW version
    # RewriteCond %{HTTP_HOST} ^YOUR_DOMAIN\.com [NC]
    # RewriteRule ^(.*)$ https://www.YOUR_DOMAIN.com/$1 [L,R=301]
    # Uncomment to redirect to the non-WWW version
    # RewriteCond %{HTTP_HOST} ^www\.YOUR_DOMAIN\.com [NC]
    # RewriteRule ^(.*)$ https://YOUR_DOMAIN.com/$1 [L,R=301]
</IfModule>
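The RewriteCond pattern is a regular expression, so whether the dots in the domain are escaped makes a difference. Apache uses its own regex engine, but for patterns this simple Python's `re` module should behave the same way, so we can use it for a quick illustration. The host names below are invented, and `re.IGNORECASE` stands in for the `[NC]` flag.

```python
import re

# An unescaped dot matches any character; an escaped dot only matches ".".
unescaped = re.compile(r"^example.com", re.IGNORECASE)
escaped = re.compile(r"^example\.com", re.IGNORECASE)

lookalike = "exampleXcom.test"

matches_unescaped = bool(unescaped.match(lookalike))
matches_escaped = bool(escaped.match(lookalike))
matches_real_host = bool(escaped.match("Example.com"))

print(matches_unescaped)  # True
print(matches_escaped)    # False
print(matches_real_host)  # True
```

In practice the unescaped version rarely causes trouble, because the server only answers for the hosts it is configured for, but escaping the dots keeps the rule matching exactly what we mean.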
Setting up a canonical domain using PHP
In this first example, we will create a function to determine whether we are entering the WWW version or not, and perform the redirection if needed.
<?php
/**
 * Returns a boolean indicating if the current request is performed against www.ourdomain.com or only against ourdomain.com
 *
 * @return boolean
 */
function isWwwVersion () {
    return substr($_SERVER['HTTP_HOST'], 0, 4) === 'www.';
}
if (isWwwVersion()) {
    // Remove the leading "www." from the host
    $domain = substr($_SERVER['HTTP_HOST'], 4);
    // Pass 301 explicitly: a bare Location header is sent as a 302
    header('Location: https://' . $domain . $_SERVER['REQUEST_URI'], true, 301);
    exit; // Don't forget to stop the script execution for a faster redirect
}
// ...
The isWwwVersion() function checks whether the host name starts with www. This code is not perfect, as it would not work for a domain such as www.com, but it's good enough :-)
In this case, we are using a non-www policy.
This would be the code for a www policy:
<?php
/**
 * Returns a boolean indicating if the current request is performed against www.ourdomain.com or only against ourdomain.com
 *
 * @return boolean
 */
function isWwwVersion () {
    return substr($_SERVER['HTTP_HOST'], 0, 4) === 'www.';
}
if (!isWwwVersion()) {
    $domain = $_SERVER['HTTP_HOST'];
    // Pass 301 explicitly: a bare Location header is sent as a 302
    header('Location: https://www.' . $domain . $_SERVER['REQUEST_URI'], true, 301);
    exit; // Don't forget to stop the script execution for a faster redirect
}
// ...
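The prefix check used in both snippets is deliberately simple. If we wanted to cover a few more edge cases (upper-case hosts, a port in the Host header, or the www.com case), a slightly more defensive version could look like the Python sketch below. The name `strip_www` is hypothetical, and the sketch assumes the host is not an IPv6 literal.

```python
def strip_www(host_header: str) -> str:
    """Return the Host header value without a leading 'www.', keeping any port."""
    # Split "host:port"; assumes no IPv6 literal such as "[::1]:8080".
    host, sep, port = host_header.partition(":")
    host = host.lower()
    rest = host[4:] if host.startswith("www.") else host
    # Only strip when what remains still looks like a domain ("www.com" is left alone).
    if rest != host and "." in rest:
        host = rest
    return host + sep + port


print(strip_www("WWW.Example.com:8080"))  # example.com:8080
print(strip_www("www.com"))               # www.com
print(strip_www("example.com"))           # example.com
```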
Using Laravel to redirect to our canonical domain
Laravel makes forcing redirects much easier for PHP developers. First of all, we need to create a new Middleware:
$ php artisan make:middleware RedirectToNonWww
# Or
$ php artisan make:middleware RedirectToWww
Once we have created our middleware, we will check if the current request is running on our www version and, in that case, perform the redirection.
<?php
namespace App\Http\Middleware;
use Closure;
use Illuminate\Support\Facades\App;
use Illuminate\Support\Facades\Redirect;
class RedirectToNonWww
{
    /**
     * Handle an incoming request.
     *
     * @param \Illuminate\Http\Request $request
     * @param \Closure $next
     * @return mixed
     */
    public function handle($request, Closure $next)
    {
        $host = $request->header('host');
        // Only redirect in production, and only when the host starts with "www."
        if (App::environment('production') && substr($host, 0, 4) === 'www.') {
            $request->headers->set('host', substr($host, 4));
            // Send a permanent (301) redirect instead of the default 302
            return Redirect::to($request->path(), 301);
        }
        return $next($request);
    }
}
Or to redirect to the WWW version:
<?php
namespace App\Http\Middleware;
use Closure;
use Illuminate\Support\Facades\App;
use Illuminate\Support\Facades\Redirect;
class RedirectToWww
{
    /**
     * Handle an incoming request.
     *
     * @param \Illuminate\Http\Request $request
     * @param \Closure $next
     * @return mixed
     */
    public function handle($request, Closure $next)
    {
        $host = $request->header('host');
        // Only redirect in production, and only when the host does not start with "www."
        if (App::environment('production') && substr($host, 0, 4) !== 'www.') {
            $request->headers->set('host', 'www.' . $host);
            // Send a permanent (301) redirect instead of the default 302
            return Redirect::to($request->path(), 301);
        }
        return $next($request);
    }
}
We are also checking if the app is running in production to avoid performing the redirection on your localhost.
How to redirect to the canonical domain using Express on NodeJS
Express also makes it super easy to redirect to our canonical domain in just a few lines of code:
// To redirect every request to the non-www version
app.use(function (req, res, next) {
  var host = req.headers.host || '';
  if (/^www\./i.test(host)) {
    // 301 makes the redirect permanent (res.redirect defaults to 302)
    res.redirect(301, req.protocol + '://' + host.replace(/^www\./i, '') + req.originalUrl);
  } else {
    next();
  }
});
// To redirect every request to the www version
app.use(function (req, res, next) {
  var host = req.headers.host || '';
  if (host && !/^www\./i.test(host)) {
    // 301 makes the redirect permanent (res.redirect defaults to 302)
    res.redirect(301, req.protocol + '://www.' + host + req.originalUrl);
  } else {
    next();
  }
});
How to redirect to the canonical domain version using Django
As is almost always the case, Python has the simplest answer of all languages. To redirect to the www version of our domain using Django, we just need to set the following variable inside our settings.py file (it is only used when CommonMiddleware is installed, which is the case in a default project):
PREPEND_WWW = True
And that's it!
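PREPEND_WWW only covers the www policy, and as far as I know Django has no equivalent setting for the opposite direction. One possible approach for a non-www policy is a small WSGI wrapper around the application. The sketch below uses only the standard library; the names `redirect_to_non_www` and `demo_app` are assumptions for this example, not part of Django.

```python
from wsgiref.util import request_uri, setup_testing_defaults


def redirect_to_non_www(app):
    """Wrap a WSGI app so that requests to www.<domain> get a 301 to <domain>."""
    def middleware(environ, start_response):
        host = environ.get("HTTP_HOST", "")
        if host.lower().startswith("www."):
            # Rebuild the full request URL with the "www." prefix removed.
            target = request_uri(dict(environ, HTTP_HOST=host[4:]))
            start_response("301 Moved Permanently",
                           [("Location", target), ("Content-Length", "0")])
            return [b""]
        return app(environ, start_response)
    return middleware


def demo_app(environ, start_response):
    """Stand-in for the real application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


# Simulate a request to the www host and capture the response.
captured = {}
environ = {"HTTP_HOST": "www.example.com", "PATH_INFO": "/blog", "QUERY_STRING": "page=2"}
setup_testing_defaults(environ)
redirect_to_non_www(demo_app)(
    environ, lambda status, headers: captured.update(status=status, headers=dict(headers))
)
print(captured["status"])               # 301 Moved Permanently
print(captured["headers"]["Location"])  # http://example.com/blog?page=2
```

In a Django project, the wrapper would typically be applied in wsgi.py around the object returned by `get_wsgi_application()`.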
Summary
We have seen how to choose between a www and a non-www domain, and how to redirect every request to our site to the correct version, avoiding the duplicate content problem.