One of the downsides of having spent years messing with my old Drupal blog is that I've ended up with a bunch of different permalink styles: to pick three posts at random, http://zhasper.com/zhasper/harry_potter_done, http://zhasper.com/2007/09/linkbloggery, http://zhasper.com/?p=631. Fortunately, I'm only running this blog to give myself a place to vent, so I don't care about lost traffic. If I did care, this would be a problem.

I'm using the "Platinum SEO Pack" plugin, which does a good job of handling URLs that don't quite match the permalink scheme WordPress is using - for instance, if you visit http://zhasper.com/linkbloggery, it'll figure out that you meant the second URL in the list above. Unfortunately, it's not perfect - and my old blog had way too many variations for any plugin to cope with.

So, I'm going through and picking off the low-hanging fruit. URLs in the second form, /YYYY/MM/title, already work fine. URLs in the first form need to have the /zhasper/ prefix removed, and need every _ turned into a -. I accomplish both of these through a bit of RewriteRule magic:

RewriteEngine On

RewriteBase /

RewriteRule ^zhasper/(.*)$ /$1 [R=301,L]

RewriteRule (.*)_(.*) $1-$2 [R=301,L]

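This is quite definitely not the neatest way to achieve this. Each rule only fixes one thing per request, and the second rule only replaces one underscore at a time (the last one). In the example above, that means the rules cause three redirects, each a full round-trip between the browser and the server, before the final redirect to the dated permalink: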

  • Browser requests /zhasper/harry_potter_done
  • Server sends a redirect to /harry_potter_done
  • Browser requests /harry_potter_done
  • Server sends a redirect to /harry_potter-done
  • Browser requests /harry_potter-done
  • Server sends a redirect to /harry-potter-done
  • Browser requests /harry-potter-done
  • Server sends a redirect to /2007/07/harry-potter-done/
  • Browser requests /2007/07/harry-potter-done/
  • Server sends actual content
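
The rule-driven part of that chain can be reproduced offline. What follows is only a rough Python model of the two rules, not how Apache itself evaluates them: the names `RULES`, `next_hop` and `redirect_chain` are made up for this sketch, the patterns are matched against the path without its leading slash, and the final hop to /2007/07/... is not modelled.

```python
import re

# Rough stand-ins for the two RewriteRules. The first pattern is anchored
# at the start so that only a leading "zhasper/" is stripped.
RULES = [
    (re.compile(r"^zhasper/(.*)$"), r"\1"),
    (re.compile(r"(.*)_(.*)"), r"\1-\2"),
]


def next_hop(path):
    """Return the redirect target for path, or None if no rule matches."""
    stripped = path.lstrip("/")
    for pattern, template in RULES:
        match = pattern.search(stripped)
        if match:
            # Like a RewriteRule, the whole path is replaced by the substitution.
            return "/" + match.expand(template)
    return None


def redirect_chain(path):
    """Follow rule-driven redirects until no rule matches any more."""
    chain = [path]
    target = next_hop(path)
    while target is not None:
        chain.append(target)
        target = next_hop(target)
    return chain


for hop in redirect_chain("/zhasper/harry_potter_done"):
    print(hop)
```

Because `(.*)` is greedy, the first group swallows everything up to the last underscore, which is why the underscores get replaced from right to left, one redirect at a time.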

The 301 in the RewriteRule means that the server tells the client that this is a permanent redirect - the content will never be at the old address, please update your bookmarks. This doesn't make much difference to your browser - but crawlers such as Google should use this as a signal to update their index, and send any link-love directed at the old link to the new link.

If you didn't have the redirect at all, Google wouldn't know that /zhasper/harry_potter_done and /2007/07/harry-potter-done were the same page - it would think that the latter was just a more-recently-seen page which mysteriously had similar content to the old page.

If you go with a temporary redirect (by just using R on its own, or by stipulating [R=302]), Google won't know to update its index: it will still come back later and check the old URL, just in case the page has moved back there.
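
To make the difference between the flag variants concrete, here is a small Python sketch of which status code a given flag string ends up meaning. It is a simplified reading of the R flag, not Apache's own parser: `redirect_status` and `SYMBOLIC_STATUS` are hypothetical names, and flag names are matched case-sensitively for brevity.

```python
# Symbolic names the R flag accepts in place of a number.
SYMBOLIC_STATUS = {"temp": 302, "permanent": 301, "seeother": 303}


def redirect_status(flags):
    """Return the status code implied by a flag string such as "[R=301,L]".

    Returns None when the flags contain no redirect flag at all.
    """
    for flag in flags.strip("[]").split(","):
        name, _, value = flag.strip().partition("=")
        if name not in ("R", "redirect"):
            continue
        if not value:
            return 302  # a bare R defaults to a temporary redirect
        if value.lower() in SYMBOLIC_STATUS:
            return SYMBOLIC_STATUS[value.lower()]
        return int(value)
    return None


print(redirect_status("[R=301,L]"))  # permanent
print(redirect_status("[R,L]"))      # temporary, the default
```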

There are definitely better ways to achieve this - suggested enhancements are welcome :)
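
As a starting point for one such enhancement, this is the mapping a single redirect would need to compute: strip the prefix and swap every underscore in one go. It is only a Python sketch of the target URL (with `canonical_slug` as a made-up name), and it says nothing about how to wire that into Apache.

```python
def canonical_slug(path):
    """Map an old-style path to the slug the rules eventually arrive at."""
    slug = path.lstrip("/")
    if slug.startswith("zhasper/"):
        slug = slug[len("zhasper/"):]
    # Replace every underscore at once instead of one per redirect.
    return "/" + slug.replace("_", "-")


print(canonical_slug("/zhasper/harry_potter_done"))
```

Redirecting straight to that result would collapse the three rule-driven hops into one, leaving only the final hop to the dated permalink.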