A common Mod_rewrite mistake
Thursday, November 26th, 2009
Rewriting dynamic URLs like
www.mysite.ie/information.php?page=information&id=45
to
www.mysite.ie/45/information.htm
using mod_rewrite makes sense for your users. The rewritten address is easier to read, and anybody good enough to give you a link is less likely to mistype it and send visitors to a 404 error.
So what is the mistake?
In two words… “Capital Letters”.
If you rewrite the previous URL to
www.mysite.ie/45/Information.htm, it is not the same as www.mysite.ie/45/information.htm. mod_rewrite patterns are case-sensitive by default, so a link to the wrong version will again lead to a 404 error.
The mistake seems to come about as a result of good coding practice. We might write $myDetails when writing our back-end scripts. This makes our scripts easier to read and makes sense.
However, writing www.mysite.ie/45/myDetails.htm as a URL is asking for trouble.
Good SEO will remove as much user error as possible. As soon as you use caps in your URLs you are walking into a minefield. You WILL end up with bad links and with users who can’t reach pages when they type the address directly, and eventually you will have to rewrite the URLs all over again.
Get it right in the first place. www.mysite.ie/45/my-details.htm is a much better way of doing it.
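If links with the wrong case are already out there, one possible safety net is mod_rewrite’s NC (no case) flag. The following is only a sketch: it assumes the information.php example from the top of this post and that the id is always a number, neither of which may hold on your site.

```apache
# Hypothetical safety net, not a replacement for lowercase URLs.
# NC makes the pattern match case-insensitively, so both
# /45/Information.htm and /45/information.htm reach the same script.
RewriteEngine On
RewriteRule ^([0-9]+)/information\.htm$ information.php?page=information&id=$1 [NC,L]
```

This serves the same page under more than one address, so the all-lowercase version is still the one to publish and link to.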
Which leads to another question: “-” (dash) or “_” (underscore)?
When you look at a raw link (no CSS applied), the answer is clear. An underscore will be swallowed up in the underline that shows it is a link. Therefore a dash is definitely the better option.
“I don’t need to rewrite my URLs. Google can read dynamic URLs just fine.”
That is true, but websites are not just for search engines. Make life as easy as you possibly can for your users and, from an SEO perspective, for those who will give you links.
How do I rewrite a page?
Let us assume you want to rewrite the page
http://www.mysite.ie/myinfo.php?page=myinfo&id=45
to
http://www.mysite.ie/45/my-information.htm
In your .htaccess file (Apache), write the following:
Options +FollowSymLinks
RewriteEngine On
You just need to write that once. Then the rewrite code…
RewriteRule ^(.*)/my-information\.htm/?$ myinfo.php?page=myinfo&id=$1
That’s it. So what does it do?
In plain English, the line says: take any characters that come before /my-information.htm (45 in our example) and capture them as $1. If there were two (.*) groups, the second would become $2. Then rewrite the request to myinfo.php?page=myinfo&id=45, where the 45 is whatever the regular expression (.*) captured and $1 inserted into the final URL.
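To illustrate the $1 and $2 numbering, here is a hypothetical rule with two groups. It assumes myinfo.php would accept the page name from the URL, which the rule in this post does not do, so treat it as a sketch only.

```apache
# Hypothetical: two capture groups, numbered left to right.
# For a request to 45/my-information.htm:
#   $1 = 45              (first group, before the slash)
#   $2 = my-information  (second group, after the slash)
RewriteRule ^(.*)/(.*)\.htm$ myinfo.php?page=$2&id=$1 [L]
```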
Warning: www.mysite.ie/anything/45/my-information.htm will also match this expression. Your $1 would then become “anything/45”, which would undoubtedly mess up the id your script reads from the GET parameters.
You can get around this simply by rearranging the URL. There are other ways too, but this is the simplest method and means you don’t have to learn as many regular expressions.
Instead of rewriting it as www.mysite.ie/45/my-information.htm
we could rewrite it as www.mysite.ie/my-information/45.htm
The mod_rewrite rule would then be
RewriteRule ^my-information/(.*)\.htm/?$ myinfo.php?page=myinfo&id=$1
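One of the “other ways” is to tighten the pattern so that it only accepts digits. This is a minimal sketch that keeps the original 45/my-information.htm layout and assumes the id is always a whole number, which this post does not actually state.

```apache
# Assumption: ids are whole numbers.
# [0-9]+ matches one or more digits and nothing else, so a request for
# anything/45/my-information.htm no longer matches this rule.
RewriteRule ^([0-9]+)/my-information\.htm$ myinfo.php?page=myinfo&id=$1 [L]
```

It costs one extra piece of regular expression, but $1 can then only ever be a number.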
Done and dusted!
