Inspired by the simplicity of implementing a proximity search using MongoDB, I found myself keen to try out another technology.
It just so happened that I was presented with a fun little problem the other day. Given a latitude and longitude, how do I quickly determine what the time is? Continuing the recent trend, I wanted to solve this problem with Node.JS.
Unsurprisingly, there's a lot of information out there about timezones. Whenever I've worked with timezones in the past, I've always gotten a little bit lost so this time, I decided to actually read a bit and find out what was supposed to happen. In essence, if you're doing this sort of task. you do not want to have to figure out the actual time yourself. Nope. It's quite similar to one of my top web dev rules:
Never host your own video.
(Really, never deal with video yourself. Pay someone else to host it, transcode it and serve it up. It'll will always work out cheaper.)
What you want to do when working with timezones is tie into someone else's database. There are just too many rules around international boundaries, summer time, leap years, leap seconds, countries that have jumped over the international date line (more than once!), islands whose timezone is 30 minutes off the adjacent ones...
To solve this problem, it needs to be split into two: the first part is to determine which timezone the coordinate is in, the second is the harder problem of figuring out what time it is in that timezone. Fortunately, there are other people who are already doing this. Buried near the back of the bottom drawer in very operating system is some version of the tz database
. You can spend hours reading up about it, its controversies and history on Wikipedia if you like. More relevantly, however, is what it can do for us in this case. Given an IANA timezone name – "America/New_York", "Asia/Tokyo" – you can retrieve the current time from the system's tz database
. I don't know how it works. I don't need to know. It works.
Node
Even better for reaching a solution to this problem, there's a node module that will abstract the problem of loading and querying the database. If you use the zoneinfo
module, you can create a new timezone-aware Date object, pass the timezone name to it and it will do the hard work. awsm. The module wasn't perfect, however. It loaded the system database synchronously using fs.readFileSync
which is I/O blocking and therefore a Bad Thing. Boo.
10 minutes later and Max had wrangled it into using the asynchronous, non-blocking fs.ReadFile
. Hooray!
Now all I needed to do was figure out how to do the first half of the problem: map a coordinate to a timezone name.
Nearest-Neighbour vs Point-in-Polygon
There are probably more ways to solve this problem but these were the two routes that jumped to mind. The tricky thing is that the latitude and longitude provided could be arbitrarily accurate. A simple lookup table just wouldn't work. Of course, the core of the problem was that we needed to figure out the answer fast.
Nearest Neighbour
- Create a data file containing a spread of points across the globe, determine (using any slow solution) the timezone at that point.
- Load the data into an easily searchable in-memory data-structure (such as a k-d tree)
- Given a coordinate, find the nearest existing data point and return its value.
Point in Polygon
- Create a data file specifying the geometry of all timezones.
- Given a coordinate, loop over each polygon and determine whether this coordinate is positioned inside or outside the polygon.
- Return the first containing polygon
This second algorithm could be improved by using a coarse binary search to quickly reduce the number of possible polygons that contain this point before step 2.
Despite some kind of qualification in mathematic-y computer-y stuff, algorithm analysis isn't my strong point. To be fair, I spent the first three years of my degree trying to get a record deal and the fourth trying to be a stand-up comedian so we may have covered complexity analysis at some point and I just didn't notice. What I do know, however, is that k-d trees are fast for searching. Super fast. They can be a bit slower to create initially but the point to bear in mind is that you only load it once while you search for data lots. On the other hand, while it's a quick task to load the geometry of a small number of polygons into memory, determining which polygon a given point is in can be slow, particularly if the polygons are complex.
Given this vague intuition, I settled on the first option.
If I wanted to create a spread of coordinates and their known timezones from scratch, it might have been an annoyingly slow process but, the Internet being what it is, someone already did the hard work. This gist contains the latitude and longitude for every city in the world and what IANA timezone it is in. Score! A quick regex later and it looks like this:
module.exports = [
{"latitude": 42.50729, "longitude": 1.53414, "timezone": "Europe/Andorra"},
{"latitude": 42.50779, "longitude": 1.52109, "timezone": "Europe/Andorra"},
{"latitude": 25.56473, "longitude": 55.55517, "timezone": "Asia/Dubai"},
{"latitude": 25.78953, "longitude": 55.9432, "timezone": "Asia/Dubai"},
{"latitude": 25.33132, "longitude": 56.34199, "timezone": "Asia/Dubai"},
etc…
All that's left is to load that into a k-d tree and we've got a fully-searchable, fast nearest neighbour lookup.
Source
The source for this node module is, of course, available on GitHub and the module itself is available for install via npm using:
npm install coordinate-tz
When combined with the zoneinfo
module (or, even better, this async fork of the module), you can get a fast, accurate current time lookup for any latitude and longitude.
Not a bad little challenge for a Monday evening.