One of the downsides of having spent years messing with my old Drupal blog is that I’ve ended up with a bunch of different permalink styles: to pick three posts at random,
http://zhasper.com/?p=631. Fortunately, I’m only running this blog to give myself a place to vent, so I don’t care about lost traffic. If I did care, this would be a problem.
I’m using the “Platinum SEO pack” plugin, which does a good job of handling URLs that don’t quite match the URL scheme WordPress uses. For instance, if you visit http://zhasper.com/linkbloggery, it’ll figure out that you meant the second URL in the list above. Unfortunately, it’s not perfect, and my old blog had way too many variations for anything to cope with.
So, I’m going through and picking off the low-hanging fruit. URLs in the second form, /YYYY/MM/title, already work fine. URLs in the first form need to have the /zhasper/ removed, and need all the _s turned into -s. I accomplish both of these through a bit of RewriteRule magic:
RewriteRule zhasper/(.*) /$1 [R=301,L]
RewriteRule (.*)_(.*) $1-$2 [R=301,L]
This is quite definitely not the neatest way to achieve this. For a URL such as /zhasper/harry_potter_done, it requires three excess round-trips between the server and the browser:
- Browser requests /zhasper/harry_potter_done
- Server sends a redirect to /harry_potter_done
- Browser requests /harry_potter_done
- Server sends a redirect to /harry_potter-done
- Browser requests /harry_potter-done
- Server sends a redirect to /harry-potter-done
- Browser requests /harry-potter-done
- Server sends a redirect to /2007/07/harry-potter-done/
- Browser requests /2007/07/harry-potter-done/
- Server sends actual content
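To see where the extra hops come from, here is a rough Python simulation of the two rules. It is only a sketch, not anything Apache actually runs. It assumes the rules live in a .htaccess file (so they see the path without its leading slash) and that the first matching rule immediately sends a redirect. The names next_redirect and redirect_chain are made up for illustration. The last hop to /2007/07/harry-potter-done/ comes from WordPress and the plugin, so it isn't modelled here.

```python
import re

# The two rules from the post, as (pattern, substitution) pairs.
# Like RewriteRule, the patterns are unanchored and the substitution
# replaces the whole path, not just the matched part.
RULES = [
    (re.compile(r"zhasper/(.*)"), r"/\1"),
    (re.compile(r"(.*)_(.*)"), r"\1-\2"),
]


def next_redirect(path):
    """Return the redirect target for path, or None if no rule matches."""
    # Assumption: per-directory rules see the path without the leading slash.
    stripped = path.lstrip("/")
    for pattern, substitution in RULES:
        match = pattern.search(stripped)
        if match:
            target = match.expand(substitution)
            return target if target.startswith("/") else "/" + target
    return None


def redirect_chain(path):
    """Follow redirects until no rule matches; return every path visited."""
    chain = [path]
    target = next_redirect(path)
    while target is not None:
        chain.append(target)
        target = next_redirect(target)
    return chain


if __name__ == "__main__":
    for hop in redirect_chain("/zhasper/harry_potter_done"):
        print(hop)
```

Because (.*) is greedy, the second rule only ever replaces the last underscore in the path, which is why a title with n underscores costs n redirects on top of the one that strips /zhasper/.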
The 301 in the RewriteRule means that the server tells the client that this is a permanent redirect – the content will never be at the old address, please update your bookmarks. This doesn’t make much difference to your browser – but crawlers such as Google should use this as a signal to update their index, and send any link-love directed at the old link to the new link.
If you didn’t have the redirect at all, Google wouldn’t know that /zhasper/harry_potter_done and /2007/07/harry-potter-done were the same page – it would think that the latter was just a more-recently-seen page which mysteriously had similar content to the old page.
If you go with a temporary redirect (by just using R on its own, or by stipulating [R=302]), Google won’t know to update its index: it will still come back later and check the old URL, just in case the page has moved back there.
There are definitely better ways to achieve this; suggested enhancements are welcome.
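One possible direction, sketched in Python rather than as rewrite rules: work out the fully cleaned-up path in one go, so that the old URL needs only a single redirect before WordPress takes over. The function name canonical_path and its prefix parameter are hypothetical. Unlike the unanchored zhasper/(.*) rule, this sketch only strips /zhasper/ when it appears at the very start of the path.

```python
def canonical_path(path, prefix="/zhasper/"):
    """Strip the old prefix (if present) and turn every underscore into a hyphen."""
    if path.startswith(prefix):
        # Keep the leading slash while dropping the rest of the prefix.
        path = "/" + path[len(prefix):]
    return path.replace("_", "-")


def needs_redirect(path):
    """True when the requested path differs from its cleaned-up form."""
    return canonical_path(path) != path


if __name__ == "__main__":
    print(canonical_path("/zhasper/harry_potter_done"))
```

A server using this logic would send one 301 to the cleaned-up path whenever needs_redirect is true, instead of one redirect per underscore.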