During all the research I did for my CSS testing talk, I couldn't help but spot another gap where a testing tool could be useful.
Cucumber
Cucumber is a technology used widely in automated testing setups, mostly for acceptance testing - ensuring that the thing everybody agreed on at the beginning was the thing delivered at the end.
This is accomplished by having a set of plain text files containing descriptions of different scenarios or aspects of the application, usually with a description of the actions performed by an imaginary user. You describe the situation (known as a 'Given' step), describe the user's action ('When') and describe the expected outcome ('Then').
The language (properly known as Gherkin) used in these files is deliberately simple and jargon-free so that all the key stakeholders in the project - designers, developers, product owners - can understand it, but the files are also written in a predictable, structured style so that they can, behind the scenes, be turned into testable code.
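As a flavour of the format, a scenario might look something like this (an illustrative example rather than one of the GhostStory steps):

Feature: Signing in

  Scenario: Successful sign-in
    Given I am on the sign-in page
    When I submit a valid username and password
    Then I should see my account dashboard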
What occurred to me when looking into this area was that there wasn't an agreed terminology for specifying the layout/colour/look and feel of a project in plain text. Surely this would be the perfect place to drop in some cucumber salad.
What we've got now is a project based on SpookyJS - a way of controlling CasperJS (and, therefore, PhantomJS) from NodeJS - which contains the GhostStory testing steps and their corresponding 'behind the scenes' test code. There are only two steps at the moment but they are the most fundamental ones, which future steps can be built up from.
Implemented Steps
Here, "Element descriptor" is a non-dev-readable description of the element you want to test - "Main title", "Left-hand navigation", "Hero area call-to-action". In the project, you keep a mapping file, selectors.json, which translates between these descriptions and the CSS selector used to identify the element in tests.
Then the "Element descriptor" should have "property" of "value"
This uses the computed styles on an element, checking to see if they are what you expect them to be. I talked about something similar in an earlier post. This is related to the 'Frozen DOM' approach that my first attempt at a CSS testing tool, cssert, uses, but this way does not actually involve a DOM snapshot.
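Combined with the descriptors in selectors.json, a concrete step in a feature file might read:

Then the "Main title" should have "font-size" of "24px"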
Then the "Element descriptor" should look the same as before
This uses the 'Image Diff' approach. You specify an element and render the browser output of that element to an image. The next time you run the test, you do the same and check to see if the two images differ. As mentioned many times before, this technique is 'content-fragile' but can be useful for a specific subset of tests or when you have mocked content. It can also be particularly useful if you have a 'living styleguide' as described by Nico Hagenburger. I've got some ideas about CSS testing on living styleguides that I'll need to write up in a later post.
Future Steps
Off the top of my head, there are a couple of other generic steps that I think would be useful in this project.
Then the "Element descriptor" should have a "property" of "value1", "value2", ..., or "valueN"
This variation on the computed style measurement allows an arbitrary-length list of values. As long as the element being tested matches at least one of the rules, the step counts as a pass. This could be used to ensure that all text on a site is one of a certain number of font-sizes or that all links are from the predefined colour palette.
Then the "Element descriptor" should look the same across different browsers.
This would build on the existing image diff step but include multiple browser runners. At the moment, the image diffs are performed using PhantomCSS, which is built on top of PhantomJS and is therefore WebKit-based. This would ideally integrate a Gecko or Trident renderer process so that the images generated from one could be checked against another. I still feel that image diff testing is extremely fragile and doesn't cover the majority of what CSS testing needs to do, but it can be a useful additional check.
The aim
I'm hoping this can sit alongside the other testing tools gathering on csste.st where it can help people get a head-start on their CSS testing practices. What I'm particularly keen on with the GhostStory project is that it can pull in other tools and abstract them into testing steps. That way, we can take advantage of the best tools out there and stuff them into easily digested Cucumber sandwiches.
Try it
The GhostStory project is, naturally, available on GitHub. More usefully, however, I've been working on a fork of SpookyJS that integrates GhostStory into an immediately usable tool.
Please check out this project and let me know what you think. I might rename it to distinguish it from the original SpookyJS if I can figure out exactly how to do that and maintain upstream relationships on GitHub.
This is a collection of code snippets for various common tasks you might need to accomplish with the App.net API. Most of these are focused on creating or reading geo-tagged posts. They require a developer account on app.net and at least one of an App ID, App Code, App Access Token or User Access Token. The calls here are implemented using jQuery but that's just to make it easier to copy-paste into the console to test them out (so long as you fill in the blanks).
An important thing to bear in mind is the possibility for confusion between a 'stream' and 'streams'. By default, a 'stream' is a discrete chunk of the 20 latest posts served at a number of endpoints. This is the open, public, global stream:
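https://alpha-api.app.net/stream/0/posts/stream/global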
On the other hand, 'streams' are long-poll connections that serve up any matching posts as soon as they are created. The connection stays open while there is something there to receive the response. Streams are available under:
https://alpha-api.app.net/stream/0/streams
Totally not confusing. Not at all.
Creating a user access token
Required for any user-specific data retrieval. The only tricky thing you'll need to think about here is the scope you require.
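For the browser-based flow, that means sending the user to the authentication URL with the scopes you need - something along these lines (the parameter layout is from memory, so double-check it against the docs):

https://account.app.net/oauth/authenticate?client_id={CLIENT_ID}&response_type=token&redirect_uri={REDIRECT_URI}&scope={SCOPES}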
Using a user access token to create a post (with annotations)
Requires
User Access Token
text to post
The text is essential if you don't mark a post as 'machine_only'. The annotations here are optional. Annotations don't appear in the global stream unless the requesting client asks for them.
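As a sketch of that call (the endpoint and annotation shape are as I remember them from the App.net docs, so treat the details as assumptions to verify):

$.ajax({
    url: "https://alpha-api.app.net/stream/0/posts",
    type: "POST",
    contentType: "application/json",
    headers: { "Authorization": "Bearer {USER_ACCESS_TOKEN}" },
    data: JSON.stringify({
        "text": "Testing geo posts",
        "annotations": [{
            "type": "net.app.core.geolocation",
            "value": {
                "latitude": 51.5074,
                "longitude": -0.1278
            }
        }]
    }),
    success: function(response) { console.log(response); }
});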
Retrieve the global stream, including geo-annotated posts if there are any
Requires
User Access Token
This is a very basic call to retrieve the global stream but it also instructs the endpoint to return us all annotations and include machine-only posts.
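Something like this should do it (again, the query parameter names are my best recollection of the API, so verify before relying on them):

$.ajax({
    url: "https://alpha-api.app.net/stream/0/posts/stream/global",
    data: {
        "include_annotations": 1,
        "include_machine": 1
    },
    headers: { "Authorization": "Bearer {USER_ACCESS_TOKEN}" },
    success: function(response) { console.log(response.data); }
});

Creating an app access token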
client_credentials is one of the four types of grant_type specified in the OAuth 2.0 specification. I had difficulty getting this to work when using a data object:
var data = {
    "client_id": "{CLIENT_ID}",
    "client_secret": "{CLIENT_SECRET}",
    "grant_type": "client_credentials"
};
The client_credentials kept throwing an error. Instead, sending this as a string worked fine:
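That is, the same parameters form-encoded by hand:

var data = "client_id={CLIENT_ID}&client_secret={CLIENT_SECRET}&grant_type=client_credentials";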
One other thing to note is that this bit should be done server-side. This will throw a bunch of "…not allowed by Access-Control-Allow-Origin…" errors if you do it via jQuery.
Returns
{
    "access_token": "{APP_ACCESS_TOKEN}"
}
Creating a stream
Now you have your app access token, you can use it to tell the service what kind of data you want back. The streams offered in the API have two quite powerful aspects. Firstly, filters allow you to run many kinds of queries on the data before it is streamed to you, so you don't need to receive and process it all. Secondly, the decoupling of filters from streams means you can specify the data structure and requirements you want once, then just access that custom endpoint any time to get the data back.
Requires
App access token
This first example just creates an unfiltered stream endpoint:
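For example (the request body parameters are my recollection of the streams API, so verify them against the docs):

$.ajax({
    url: "https://alpha-api.app.net/stream/0/streams",
    type: "POST",
    contentType: "application/json",
    headers: { "Authorization": "Bearer {APP_ACCESS_TOKEN}" },
    data: JSON.stringify({
        "object_types": ["post"],
        "type": "long_poll"
    }),
    success: function(response) { console.log(response.data); }
});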
Using Filters to create a stream of geotagged posts
We'll specify some requirements for our filter now so that it only returns a subset of posts. The rules we're specifying here are:
At least one item in the "/data/annotations/*/type" field
must "match"
the value "net.app.core.geolocation"
Requires
User access token
The field is specified in 'JSON Pointer' format. Within the response, there is a 'data' object and a 'meta' object. The data contains an 'annotations' object which contains an array of annotations, each of which has a type. This is represented as /data/annotations/*/type.
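Put together, the filter creation might look like this (a sketch - the endpoint and clause format are my recollection of the App.net filters API, and the filter name is made up). You'd then create a stream with the returned filter id, as in the earlier example, and the endpoint in that response is the URL to connect to:

$.ajax({
    url: "https://alpha-api.app.net/stream/0/filters",
    type: "POST",
    contentType: "application/json",
    headers: { "Authorization": "Bearer {USER_ACCESS_TOKEN}" },
    data: JSON.stringify({
        "name": "Geotagged posts",
        "match_policy": "include_any",
        "clauses": [{
            "object_type": "post",
            "operator": "matches",
            "field": "/data/annotations/*/type",
            "value": "net.app.core.geolocation"
        }]
    }),
    success: function(response) { console.log(response.data.id); }
});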
Open that URL up in your browser (seeing as we're testing) and, in a different tab, create a geo-tagged machine-only post (see above). Your post will appear almost instantly after you've submitted it.
It just so happened that I was presented with a fun little problem the other day. Given a latitude and longitude, how do I quickly determine what the time is? Continuing the recent trend, I wanted to solve this problem with Node.js.
Unsurprisingly, there's a lot of information out there about timezones. Whenever I've worked with timezones in the past, I've always gotten a little bit lost, so this time I decided to actually read a bit and find out what was supposed to happen. In essence, if you're doing this sort of task, you do not want to have to figure out the actual time yourself. Nope. It's quite similar to one of my top web dev rules:
Never host your own video.
(Really, never deal with video yourself. Pay someone else to host it, transcode it and serve it up. It'll always work out cheaper.)
What you want to do when working with timezones is tie into someone else's database. There are just too many rules around international boundaries, summer time, leap years, leap seconds, countries that have jumped over the international date line (more than once!), islands whose timezone is 30 minutes off the adjacent ones...
To solve this problem, it needs to be split into two: the first part is to determine which timezone the coordinate is in, the second is the harder problem of figuring out what time it is in that timezone. Fortunately, there are other people who are already doing this. Buried near the back of the bottom drawer in every operating system is some version of the tz database. You can spend hours reading up about it, its controversies and history on Wikipedia if you like. More relevant, however, is what it can do for us in this case. Given an IANA timezone name – "America/New_York", "Asia/Tokyo" – you can retrieve the current time from the system's tz database. I don't know how it works. I don't need to know. It works.
Node
Even better, there's a node module that abstracts the problem of loading and querying the database. If you use the zoneinfo module, you can create a new timezone-aware Date object, pass the timezone name to it and it will do the hard work. awsm. The module wasn't perfect, however. It loaded the system database synchronously using fs.readFileSync, which is I/O blocking and therefore a Bad Thing. Boo.
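In use, it looks roughly like this (a minimal sketch, assuming the module's TZDate API and its PHP-style format strings):

var TZDate = require('zoneinfo').TZDate;

var d = new TZDate();
d.setTimezone("America/New_York");
console.log(d.format("Y-m-d H:i:s")); // the current wall-clock time in New York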
10 minutes later and Max had wrangled it into using the asynchronous, non-blocking fs.readFile. Hooray!
Now all I needed to do was figure out how to do the first half of the problem: map a coordinate to a timezone name.
Nearest-Neighbour vs Point-in-Polygon
There are probably more ways to solve this problem but these were the two routes that jumped to mind. The tricky thing is that the latitude and longitude provided could be arbitrarily accurate. A simple lookup table just wouldn't work. Of course, the core of the problem was that we needed to figure out the answer fast.
Nearest Neighbour
Create a data file containing a spread of points across the globe, determining (using any slow solution) the timezone at each point.
Load the data into an easily searchable in-memory data-structure (such as a k-d tree)
Given a coordinate, find the nearest existing data point and return its value.
Point in Polygon
Create a data file specifying the geometry of all timezones.
Given a coordinate, loop over each polygon and determine whether this coordinate is positioned inside or outside the polygon.
Return the first containing polygon.
This second algorithm could be improved by using a coarse binary search to quickly reduce the number of possible polygons that contain this point before step 2.
Despite some kind of qualification in mathematic-y computer-y stuff, algorithm analysis isn't my strong point. To be fair, I spent the first three years of my degree trying to get a record deal and the fourth trying to be a stand-up comedian so we may have covered complexity analysis at some point and I just didn't notice. What I do know, however, is that k-d trees are fast for searching. Super fast. They can be a bit slower to create initially, but the point to bear in mind is that you only build the tree once while you search it many times. On the other hand, while it's a quick task to load the geometry of a small number of polygons into memory, determining which polygon a given point is in can be slow, particularly if the polygons are complex.
Given this vague intuition, I settled on the first option.
If I wanted to create a spread of coordinates and their known timezones from scratch, it might have been an annoyingly slow process but, the Internet being what it is, someone already did the hard work. This gist contains the latitude and longitude for every city in the world and what IANA timezone it is in. Score! A quick regex later and it looks like this:
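(The field names here are my own reconstruction; the exact format of the original data may differ.)

[
    { "latitude": 51.5074, "longitude": -0.1278, "tz": "Europe/London" },
    { "latitude": 35.6895, "longitude": 139.6917, "tz": "Asia/Tokyo" },
    { "latitude": 40.7128, "longitude": -74.0060, "tz": "America/New_York" }
]

Loading those points into a k-d tree and querying it can then look roughly like this (a sketch using the kd-tree-javascript module; the module choice and filename are mine, not necessarily what the final code uses):

var kdTree = require('kd-tree-javascript').kdTree;
var points = require('./cities.json'); // the data shown above (filename assumed)

// Squared euclidean distance on raw coordinates - good enough for
// "which known point is closest?", ignoring longitude wrap-around
function distance(a, b) {
    var dLat = a.latitude - b.latitude;
    var dLon = a.longitude - b.longitude;
    return dLat * dLat + dLon * dLon;
}

// Build the tree once at startup...
var tree = new kdTree(points, distance, ['latitude', 'longitude']);

// ...then every lookup is a quick nearest-neighbour search
var nearest = tree.nearest({ latitude: 51.5, longitude: -0.1 }, 1);
console.log(nearest[0][0].tz); // e.g. "Europe/London"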
These are included in the package.json but it can't hurt to mention them here:
npm install twitter (node twitter streaming API library)
npm install mongodb (native mongodb driver for node)
npm install express (for convenience with API later)
Start mongod in the background. We don't quite need it yet but it needs to be done at some point, so we may as well do it now.
Create a Twitter App
Fill out the form, then press the button to get the single-user access token and key. I love that Twitter does this now, rather than having to create a full authentication flow for single-user applications.
ingest.js
(open the ingest.js file and read along with this bit)
Using the basic native MongoDB driver, everything must be done in the database.open callback. This might lead to a bit of Nested Callback Fury but if it bothers you or becomes a bit too furious for your particular implementation, there are a couple of alternative Node-MongoDB modules that abstract this out a bit.
// Set up the native driver and the streaming client first. The connection
// details and credentials here are assumed: a local mongod and the
// single-user tokens from your Twitter app.
var mongodb = require('mongodb');
var twitter = require('twitter');

var db = new mongodb.Db('proximity', new mongodb.Server('localhost', 27017), {w: 1});
var twit = new twitter({
    consumer_key: '...',
    consumer_secret: '...',
    access_token_key: '...',
    access_token_secret: '...'
});

// Open the proximity database
db.open(function(err, db) {
    // Open the posts collection
    db.collection('posts', function(err, collection) {
        // Start listening to the global sample stream
        twit.stream('statuses/sample', function(stream) {
            // For each post that arrives...
            stream.on('data', function(data) {
                // ...store only the geo-tagged ones
                if (data.geo) {
                    collection.insert(data);
                }
            });
        });
    });
});
Ensure the collection has a geospatial index on the tweets:
db.posts.ensureIndex({"geo.coordinates" : "2d"})
Standard Geospatial search query:
db.posts.find({"geo.coordinates": {$near: [50, 13]}}).pretty()
(find the closest points to (50,13) and return them sorted by distance)
By this point, we've got a database full of geo-searchable posts and a way to do a proximity search on them. To be fair, it's more down to mongodb than anything we've done.
Next, we extend the search on those posts to allow filtering by query. It's a super simple API; we only have two main query types:
/proximity?latitude=55&longitude=13
/proximity?latitude=55&longitude=13&q=searchterm
Each of these can take an optional 'callback' parameter to enable jsonp. We're using express so the callback parameter and content type for returning JSON are both handled automatically.
api.js
(open the api.js file and read along with this bit)
This next chunk of code contains everything so don't panic.
If you've already implemented the ingest.js bit, the majority of this api.js will be fairly obvious. The biggest change is that instead of loading the data stream then acting upon each individual post that comes in, we're acting on URL requests.
app.get('/proximity', function(req, res) {
For every request on this path, we try to parse the query string to pull out a latitude, longitude and optional query parameter.
if (/^(-?\d+(\.\d+)?)$/.test(latitude) && /^(-?\d+(\.\d+)?)$/.test(longitude)) {
If we do have valid coordinates, we pass through to Mongo to do the actual search:
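A minimal sketch of that search (reusing the posts collection from ingest.js; the optional text filter here is a simple case-insensitive regex on the tweet body):

var query = { "geo.coordinates": { $near: [parseFloat(latitude), parseFloat(longitude)] } };
if (req.query.q) {
    // Optional search term, matched against the tweet text
    query.text = new RegExp(req.query.q, 'i');
}
collection.find(query).limit(20).toArray(function(err, results) {
    res.json(results);
});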