Archive for the ‘slugworthy’ Category.

OpenAustralia/ScraperWiki hackfest: my first ruby code!

This weekend, I’ve been hanging out at my old office, taking part in the OpenAustralia/ScraperWiki “What are you up to next weekend?” hackfest. I’ve been to quite a few OA hackfests before, but always as a host – this is the first time I’ve been to one with the intent to code.

I’ve been meaning to learn Ruby for a while, and this seemed like a good opportunity, so I decided to write a scraper to get some more data into PlanningAlerts.

PlanningAlerts is a project of the OpenAustralia Foundation, and aims to provide you with email alerts of development applications near you. Development applications are scraped from council websites; alerts are sent (via RSS or email) to people who have requested notifications about applications in that area; and the site gives you a simple way to send your feedback back to the council.

Henare from OpenAustralia has written a guide to writing scrapers using the excellent ScraperWiki. Utilising that, cadging from some of his existing scrapers, and asking a few noob questions along the way, I created a scraper that pulls in information about development applications from the Redfern/Waterloo Authority site.

The good parts of the code I’ve scraped together come from the doc or from other samples; the ugly parts are my own invention.

When I started, the provided sample code looked like this:

if ScraperWiki.select("* from swdata where `council_reference`='#{record['council_reference']}'").empty?
  ScraperWiki.save_sqlite(['council_reference'], record)
else
  puts "Skipping already saved record " + record['council_reference']
end

This breaks on a couple of corner cases: if the swdata table doesn’t already exist, this will die. If you want to trample on your existing data, you have to manually comment out 4 lines of code. As well, it results in one select query per record – fine in small cases, but potentially a time-sink for larger ones.

While I was working on the code, the first problem was fixed by changing the first line to:


if (ScraperWiki.select("* from swdata where `council_reference`='#{record['council_reference']}'").empty? rescue true)

I expanded on that (and along the way taught myself a little bit about Ruby classes):

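class Saver
  def initialize
    # If you want to trample on existing data, set this to true
    @trample_data = false
    # One lookup up front. select returns an array of hashes, so keep just the
    # reference strings; if swdata doesn't exist yet, fall back to nil.
    @references = (ScraperWiki.select("council_reference from swdata").map { |row| row['council_reference'] } rescue nil)
  end

  def save(record)
    if record
      # Save when trampling, when there is no table yet, or when the record is new
      if @trample_data || @references.nil? || !@references.include?(record['council_reference'])
        ScraperWiki.save_sqlite(['council_reference'], record)
      else
        puts "Skipping already saved record " + record['council_reference']
      end
    end
  end
end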

This will only do one lookup, and can then do in-memory comparisons to decide if the database needs to be updated for each record. This handles the case where swdata doesn’t exist yet; and if you want to trample on the data, just one word needs to be changed.

There’s some real ugliness in other parts of the code though.

* The entire page uses a tables-based layout, so to find the data I want I have to use page.search('table table table table table table table table tr')
* Both DAs on the site right now have the same data items in the same order; but rather than assume this is consistent, I have my parser iterating over the rows and using a nasty big case to interpret the contents of the second cell based on the value of the first cell in the same row.
* Each DA is on public exhibition from a specific date to another specific date. The two dates are expressed in compact form: if the month/year values are the same for both dates, they’ll only be expressed once, on the second date. There’s another nasty case block to handle the different possible values here and extract useful dates.
* Every time the code encounters the start of a new record, it tries to save the old record. This leads to an attempt to save an empty record at the start of the parsing (hence the if record test in Saver.save); and a need to manually do One Last Save at the bottom of the code.
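For the compact date ranges, here is a minimal sketch of one way to expand them without a case block. This is not the code from the scraper. The input format is an assumption (strings such as "3 - 17 March 2011", with the month and year missing from the first date when they match the second), and parse_exhibition_range is a made-up name.

```ruby
require 'date'

# Hypothetical helper: expands a compact range such as "3 - 17 March 2011"
# into [start_date, end_date]. The exact text format used by the council
# site is an assumption here.
def parse_exhibition_range(text)
  first, last = text.split(/\s*[-–]\s*/, 2)
  finish = Date.parse(last)

  # The first date may be "3", "28 February" or "20 December 2010";
  # borrow whatever is missing from the second date.
  day, month_name, year = first.split
  month = month_name ? Date::MONTHNAMES.index(month_name.capitalize) : finish.month
  year = year ? year.to_i : finish.year

  start = Date.new(year, month, day.to_i)
  # "20 December - 14 January 2011" must have started in the previous year.
  start = start.prev_year if start > finish
  [start, finish]
end
```

Filling the gaps from the second date means the three forms (day only, day and month, full date) all go through the same path.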

The complete code is available on ScraperWiki, and the data is already available on the PlanningAlerts site.

Running multiple instances of Chrome on Mac/Linux

Sometimes it’s handy to be able to have multiple browser instances open at once. For instance, Google’s Multiple Login only allows me to have 3 accounts signed in at once, which isn’t enough for me to have all the personal accounts I want to check plus my work account. Even if it could, I like to keep my personal and work search and browsing histories separate, so that it’s easier for me to find something I vaguely remember seeing recently.

When doing web development, it’s often handy to have one browser signed into the site as an admin, another signed in as a regular user, and one not signed in. Chrome’s “Incognito Window” feature can help with one of these, but you can’t have two Incognito windows at the same time (at least, not on Mac/Linux – I hear tell that the Windows version may have supported multiple incognito sessions at some point, but I don’t know if that’s still the case).

So.

I’ve created a little script. I call it chrome and it lives in ~/bin on all my machines. It detects the platform and calls the appropriate binary.

More importantly, it takes one (optional) parameter, which it uses to figure out which profile to run.

I usually start my day by running this script twice: once as chrome work and once as chrome personal. The order is significant, as clicking on urls in other applications will result in them being opened in the first profile that ran. So, while I’m at work I want most things to open in the work profile; if I’m not working I want a different default behaviour.

If you don’t pass a parameter, the script will invoke the default profile – the one that gets used if you don’t specify a profile at all.
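The real script is the one on github; as a rough illustration of the idea, here is a minimal Ruby sketch. It assumes each profile gets its own directory passed to Chrome via --user-data-dir. The binary paths, the ~/.chrome-profiles location and the function names are all assumptions for the example, not taken from the actual script.

```ruby
require 'rbconfig'

# Assumed binary locations; adjust for your own install.
CHROME_BINARIES = {
  mac:   '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome',
  linux: 'google-chrome'
}

# Map Ruby's host_os string onto the two platforms handled here.
def platform(host_os = RbConfig::CONFIG['host_os'])
  case host_os
  when /darwin/ then :mac
  when /linux/  then :linux
  else raise "Unsupported platform: #{host_os}"
  end
end

# Build the command line. With no profile name, Chrome is started without
# --user-data-dir, so it falls back to its default profile.
def chrome_command(profile = nil, os: platform, home: Dir.home)
  command = [CHROME_BINARIES.fetch(os)]
  if profile
    # Hypothetical layout: one data directory per named profile.
    command << "--user-data-dir=#{File.join(home, '.chrome-profiles', profile)}"
  end
  command
end

# To launch, a wrapper script would end with something like:
#   exec(*chrome_command(ARGV.first))
```

Building the argument list in a function of its own means it can be checked without starting a browser.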

I’ve put the script on github for your amusement and pleasure (and hardcore forking action).

Deprecating your phone number made easy

18 months ago, I ended up with an Optus account – I was on a 12 month contract in order to receive an iPhone. For various reasons, I decided not to port the number I’d been using for almost a decade to Optus, but keep it active on another carrier instead. As of a few weeks ago, I’ve now migrated away from Optus, and I want to switch back to my original number. I want to keep the number I’d been using on Optus active for a while, but I don’t want to be answering it – I just want people who use it to be notified about my new number.

This is made easier by the fact that the SIM lives in my Nexus One (given to me by my employer as a Christmas gift last year, but this post, as always, is entirely my own opinion), which runs Android 2.2. Unlike on an iPhone, this means I can have all sorts of applications always running in the background – and those apps can access the SMS database, respond to incoming SMSes, and send outbound SMSes.

I tried a few apps, but ended up settling on Ultimate SMS. This app allows me to set an auto-response sent in reply to any incoming SMS (‘James does not use this number any more; he can be reached on 0407123456 instead’). This app also forwards a copy of the inbound SMS on to my new number – so I usually get it, and respond to it, while the person who messaged me is still reading my auto-reply.

One last special feature from Telstra makes this twice as useful: SMSes sent from their Message2Text service show the original caller’s number as the origin of the SMS. This means that if anyone calls me and leaves a message, they still get an SMS in response notifying them of my new number. Even better, Ultimate SMS includes the original number when it forwards that SMS to me – so even if their call was from a number that can’t receive SMS, I still get their message on the phone I do carry, and I know what number the message came from.

Update: Between drafting this and posting it, my Nexus One went missing. I’m now doing the same thing on my G1 running Android 1.6.

openwrt, dnsmasq, linuxigd, and Back To My Mac

Simple task: set up my wrt-54g (running openwrt) with linuxigd so that “Back To My Mac” works[1].

linuxigd: trivial. Click a few buttons to enable it, done. I tried miniupnpd first; but although it initially looked good, I couldn’t get it to work consistently.

However, that’s when I start getting the MobileMe prefpane telling me that BTMM couldn’t start because “Your DNS server isn’t responding”. A little bit of searching on Google finds me pages like this one, which baldly state that “Back to My Mac isn’t compatible with dnsmasq.”

Well, dear internets, I’m here to tell you that you are wrong. BTMM is perfectly compatible with dnsmasq. Sure, openwrt’s default settings don’t work, but that doesn’t make the two incompatible.

It did take me a while to figure out what was going on. The clue also came from Apple’s forums, which told me to do this:

betelgeuse:~ james$ echo "show State:/Network/BackToMyMac" | scutil
<dictionary> {
  zhasper.members.mac.com : <dictionary> {
    ExternalAddress : 143.211.101.234
    StatusMessage : GetZoneData failed: _afpovertcp._tcp.username.members.mac.com.
    AutoTunnelExternalPort : 4500
    StatusCode : -65554
    LLQExternalPort : 5353
    RouterAddress : 192.168.0.1
    LastNATMapResultCode : 0
  }
}

The vital clue was the StatusMessage, which tells you exactly which DNS lookup failed. The important thing is that the hostname starts with an underscore.

Take a look at the dnsmasq man page, specifically the filterwin2k option. Once upon a time, SRV records (and records with underscores) really were a sign that you had win2k machines on your network. Once upon a time, “triggering dial-on-demand links” was actually something to be worried about. Those times are long past.

I turned this option off (vi /etc/dnsmasq.conf, add a # at the start of that line to comment the option out, save the file, and run /etc/init.d/S65dnsmasq to restart the service). As expected BTMM now works fine. Well, as fine as you could expect.
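The edit itself is a one-character change, but for anyone who would rather script it, here is a minimal Ruby sketch of the same thing. It assumes the option appears on a line of its own as filterwin2k, and disable_filterwin2k is a made-up name. It only transforms the text, so reading and writing /etc/dnsmasq.conf and restarting the service are left to you.

```ruby
# Hypothetical helper: comments out any active "filterwin2k" line in the
# text of a dnsmasq.conf, leaving every other line untouched.
def disable_filterwin2k(conf_text)
  # ^ matches at the start of each line; lines that already begin with "#"
  # do not match, so running this twice changes nothing the second time.
  conf_text.gsub(/^([ \t]*)filterwin2k\b/) { "#{$1}#filterwin2k" }
end

# Example use on a copy of the config (path shown for illustration only):
#   text = File.read('dnsmasq.conf')
#   File.write('dnsmasq.conf', disable_filterwin2k(text))
```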

[1] I’m ideologically opposed to all things UPnP, and BTMM in particular. What’s the point of having a firewall if you’re going to allow everything inside to poke so many holes in it it may as well not be there? There’s nothing BTMM can give me that a small firewall hole (to allow SSH on a non-standard port) + ssh portforwarding can’t give me in a more controlled way – and without shelling out $$$ to Uncle Steve, too. Nevertheless…

For all your expert travel advice

[image: ads-by-google-1]

QNAP TS-409 Pro: initial setup from a non-windows (linux/mac) machine

I just bought myself a QNAP TS-409 Pro from Skycomp. Very happy with both the device and Skycomp so far.

However, the initial setup was a struggle.

The device has a very limited openwrt-style firmware. Very, very limited: it contains the bare minimum functionality to be able to bootstrap the device with a more capable OS once you have disks installed.

The documented way of doing this is via a “QuickInstall Wizard” that comes on a provided CD in Mac and Windows flavors. I only have Macs on my home network, so the Windows flavor wasn’t usable for me. The Mac flavor is… interesting. I ran into the problem described here. In short, the full firmware isn’t pushed until after the drives are initialized; but the Wizard gets stuck at the “Initializing drives” stage, so the full firmware is never pushed.

I got around it using these instructions – they’re described as being “For linux”, but as it just uses basic tools like telnet and ftpd, it will work on any *nix.

Some notes:

  • Obviously, had to enable file sharing via FTP on my mac first. Did this under “Sharing” prefpane, “File Sharing”, “Share files and folders using FTP”. As the warning states, this involves transmitting your username and password in cleartext: only enable this if you’re confident you’ll only be transmitting them across a safe network. Better, use a username/password you created just for this purpose; which has no special privileges, and which will be turned off as soon as you’re done.
  • Out of the box, the device listens for telnet connections on port 13131. Username and password are “admin”.
  • Once you’ve successfully updated the firmware and rebooted, you won’t find a telnetd on 13131 any more. THIS IS NOT AN ERROR, DON’T PANIC. Instead, you’ll find an sshd listening on port 22.
  • You’ll also find a web interface listening on port 8080. If you visit that, you can start the process of setting up the device.
  • It may be helpful to have let the wizard run to the “Initializing drives” stage at least once. After I thought I knew what I was doing I switched to a new set of disks and tried again; and this time the hard drives weren’t mounted at all, so I couldn’t go through the documented process.
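The port changes in the notes above give a quick way to tell which firmware the device is running. Here is a minimal Ruby sketch; port_open? is a made-up helper, and the address in the usage comment is a placeholder for wherever your device ends up on your network.

```ruby
require 'socket'

# Hypothetical helper: returns true if something accepts TCP connections on
# host:port within the timeout, false otherwise.
def port_open?(host, port, timeout = 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  # Connection refused, unreachable host, failed lookup or timeout.
  false
end

# Example use (192.168.0.50 is a placeholder address):
#   port_open?('192.168.0.50', 13131)  # telnetd, bootstrap firmware only
#   port_open?('192.168.0.50', 22)     # sshd, after the full firmware is on
#   port_open?('192.168.0.50', 8080)   # web interface, same
```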

It’s not clear from the documentation, but the device creates a RAID-1 segment 500MB in size on each disk you insert (/dev/md9 in my case), and mounts this on /mnt/HDA_ROOT. This is where configs for the device, packages you install, and so on are stored.

The device can handle multiple raidsets – although with only 4 disks to play with, you’re not likely to end up with >2 sets. In my case I currently have 3 1TB drives in a RAID-5 set, and a single 500GB disk sitting on its own.

Laundry powder gets huge upgrade

I was in the supermarket getting some laundry powder last night and noticed something really strange: every single brand of concentrated laundry powder was advertising on their packaging the fact that they’re about to be relaunched in a new version. The new powders are all going to be 2x as concentrated, and most brands made a big deal out of the fact that the new packaging will therefore be half the size.

Golly. Every brand? All at once? All deciding to redo their formulation, redo their packaging, and retool their manufacturing plants, all with identical changes to formulation and packaging, all at the same time? Unpossible!

You’d almost think that every brand of powder was actually exactly the same, made at the same plant, and just packaged slightly differently. But that would surely never happen!

Everything old is new again redux

Lindsay did an excellent blog post yesterday titled “Everything old is new again“, about the re-emergence of multi-dimensioned databases.

Great title, but just to prove his point, it applies even better to a post he shared on Google Reader a few days ago, written by Kurt Schrader and titled “Living in a Post Rails World“. To quote that post:

I think that the Ruby world is eventually going to end up in a model like this, writing small simple apps that all talk to each other, and can be replaced or upgraded at any time.

<snip two paragraphs>

All of my hard/long running logic is well tested, encapsulated, and most likely running in little agents on the wire.

Sound familiar? It should. Kurt has re-discovered the same principles that the Holy Fathers of Unix discovered, over a quarter of a century ago. Doug McIlroy, circa 1978:

(i) Make each program do one thing well. To do a new job, build afresh rather than complicate old programs by adding new features.

(ii) Expect the output of every program to become the input to another, as yet unknown, program. Don’t clutter output with extraneous information. Avoid stringently columnar or binary input formats. Don’t insist on interactive input.

Later, he simplified it:

This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.

Of course, Henry Spencer said it the best:

Those who don’t understand UNIX are condemned to reinvent it, poorly.

iPhone/Google Sync tips

Some hints about using the Google Sync for iPhone. These will probably also apply to the Windows Mobile sync – but I’ve not used that, so I’m not sure. I’m going to say “iPhone” consistently – but the same will apply to an iPod touch as well (modulo the things that involve a 3G connection, of course).

  • BACK UP YOUR DATA. Really can’t stress this enough. The process of setting up the sync WILL WIPE ALL YOUR CONTACTS AND CALENDARS. Back up first.
  • You can choose up to 5 calendars (not including your primary calendar) to sync.
    • If you have a gmail/googlemail account, visit m.google.com/sync on your iPhone, follow the prompts, and you’ll be able to choose up to 5 additional calendars to sync.
    • If you have a Google Apps account, visit http://google.com/m/a/<<domain.com>>, then click “More” and then “Sync”. For this to work, your domain administrator will have to have enabled Google Sync for your domain first.
    • [update]It’s been pointed out to me that Apps users can actually access the sync settings from m.google.com/sync. Click on “Google Mobile” on the bottom left-hand corner of the page, and you’ll be taken to a page with lots of icons for different Google services. Scroll down and make sure there’s a link that says “Not in United States?”. If it lists another country, click it and change your country to the United States – this won’t work in any other country. Once you’ve changed that and you’re back at the page with service icons, find the “Google Apps user?” button, and enter your domain into the popup. You’ll now have icons for your Apps domain – including a Sync icon. Click it, and once again just follow the directions from there.[/update]
  • I have one Google Apps account for work and one personal Google Apps account. However, the iPhone only allows me to set up one Exchange account, so I have to pick which of the two I’m going to sync, right? Wrong! I’ve shared my personal calendar with my work account, giving it “Make changes to events” permissions. I’ve then set up my work account to sync with my iPhone, and chosen my personal calendar as one of the additional calendars to sync.
  • If you go with the default setup, it will sync both Calendars and Contacts. This is almost certainly not what you want. It does have the benefit of pushing changes to contacts straight into the cloud – but it also has the effect of breaking the sync between your Google contacts and your Address Book. That is – assuming you used to sync the two – which a lot of people did not, due to Google’s contacts manager being rather broken. However, it’s easy enough to set the sync to Calendar only. If you look at step 13 of the official instructions, you’ll see both Calendar and Contacts selected. If you choose to sync only Calendar, Contacts will still be synced with Address Book by iTunes whenever you sync your iPhone. If you’ve chosen to sync Address Book with Google Contacts as well, that will still happen too.
  • You can sync calendars with both an Exchange and MobileMe cloud at the same time; but as soon as you enable one of them, you can’t sync calendars with iTunes any more. You can only have one MobileMe account and one Exchange account.

I used to have a messy messy setup involving Spanning Sync pulling all my Google Calendars into iCal; then using MobileMe to push them into the cloud; then using the iPhone’s MobileMe sync to pull them onto the phone. Many moving parts, 3 different sync stages for something to go wrong. Only works if you have a permanently online machine that can be doing the translation between the Google cloud and the Apple cloud. I’m much happier with this direct sync.

[update]About the contact sync thing. See, you only get the option to sync your Address Book and your Google Contacts visible in iTunes if you’re syncing contacts with your iPhone. If you’re syncing contacts with the cloud, you’re not syncing with your iPhone, so you don’t get the option. If you do use Google Contacts, that means that the cloud and your iPhone are both up-to-date – but your desktop is not.

If you really want instant syncing between your phone and your desktop, turn on cloud-syncing of your contacts. If you’d prefer to keep your phone, desktop, and the cloud all in sync, turn off cloud-syncing, and let iTunes handle the sync instead. [/update]

[update 2009-09-09] As of Snow Leopard, it’s no longer necessary to have an iPhone/iPod in order to get Address Book <-> Gmail Contacts syncing. So, it’s now perfectly possible to have your iPhone cloud-syncing your contacts AND have your Mac also cloud-syncing. To turn it on on your Mac, just go into the Address Book’s preferences and look under the Accounts tab.[/update]

[update]Facebook Events? Magically pushed into your iPhone calendar? Easy!

Go to your Facebook Events page. On the top left (below the blue Facebook bar; above the big word “Events”) you’ll see “Export Events”. Click on that link, and you’ll get a popup with a long URL. Copy this URL.

Next, go to your Google Calendar. Click “Settings”, “Calendars”, “Import Calendar”, “Add By URL”, and paste that URL into the box.

Now visit the Sync Settings page, and choose your new Facebook calendar as one of the 5 to import. Now if you RSVP to any events in Facebook, that event will appear in your Google Calendar and your iPhone.[/update]

Bad taste in advertising award for the day goes to: SMH!

At first glance, I assumed that this was related to the horrible fires in Victoria. Nope, just advertising. Well done SMH!

[image: badtaste-1]