Testing asynchronous background processes in Omeka

I ran into an interesting testing challenge yesterday. In Neatline, there are a couple of controller actions that need to spawn off asynchronous background processes to handle operations that are too long-running to cram inside of a regular request. For example, when the user imports Omeka items into an exhibit, Neatline needs to query a (potentially quite large) collection of Omeka items and insert a corresponding Neatline record for each of them.

Jobs extend Omeka_Job_AbstractJob and define a public perform method:

And can be dispatched asynchronously by getting the job_dispatcher out of the registry and passing the job name and parameters to sendLongRunning:

It’s easy enough to directly unit test the perform method on the job, but, since actual execution of the process is non-blocking, the jobs can’t be tested at the integration level in the ordinary manner. For example, I’d like to just dispatch a request with a mock item query, and check that the correct Neatline records were created. This can’t be asserted reliably, though, since there’s no guarantee that the job will have completed before the testing assertions are executed.

The job itself is non-blocking, but the job invocation in the controller code is blocking, and can be tested pretty easily by replacing the job_dispatcher with a test double and spying on the sendLongRunning method. Since this is a pattern that needs to be implemented in more than one test, I started by adding a mockJobDispatcher method to the abstract test-case class that mocks the job dispatcher and injects it into the registry:

Then, in the test, we can just call this method to mock the dispatcher, assert that the dispatcher is expecting a call to sendLongRunning with the correct job and parameters, and then fire off a mock request to the controller action under test:
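As a rough illustration of the pattern, here is a framework-free sketch written in JavaScript rather than PHP. The registry, the controller action, and the exhibit_id and query parameters are hypothetical stand-ins, not Omeka’s or Neatline’s actual code; only the job_dispatcher key, sendLongRunning, mockJobDispatcher, and the Neatline_ImportItems job name are taken from the post.

```javascript
// Hypothetical stand-ins, not Omeka's API: a tiny registry and a controller
// action that looks up the dispatcher and hands it a job name and parameters.
const registry = new Map();

function importItemsAction(params) {
  const dispatcher = registry.get('job_dispatcher');
  dispatcher.sendLongRunning('Neatline_ImportItems', {
    exhibit_id: params.exhibit_id, // hypothetical parameter name
    query: params.query,           // hypothetical parameter name
  });
}

// Test helper: swap the real dispatcher for a double that records each call
// instead of spawning a background process.
function mockJobDispatcher() {
  const calls = [];
  const double = {
    sendLongRunning(jobName, jobParams) {
      calls.push({ jobName, jobParams });
    },
  };
  registry.set('job_dispatcher', double);
  return calls;
}

// In a test: install the double, fire the "request", inspect the recording.
const calls = mockJobDispatcher();
importItemsAction({ exhibit_id: 1, query: { range: '1-10' } });

console.log(calls.length);     // 1
console.log(calls[0].jobName); // Neatline_ImportItems
```

The double only proves that the controller asked for the right job with the right parameters; nothing in it runs the job itself.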

This is a pretty good solution, but not perfect: The integration test is really asserting an intermediate step in the implementation of the controller action, not the end result – it tests that the job was called with certain parameters, not the final effect of the request. This opens up the door to false positives. For example, in the future, I might make a breaking change to the public API of the Neatline_ImportItems job. Assuming I’ve changed the job’s unit tests to assert against the new API, the test suite would pass even if I completely forgot to update any of the job invocations, since the integration tests are just asserting the structure of the invocation, not the final effects.

I’ve encountered a version of this problem more than once, and I’ve never really found a good solution to it. Short of moving up to something like in-browser Selenium tests, or resorting to hacky execution pauses in the integration tests, has anyone ever come across a better way to do this?

Interactive CSS in Neatline 2.0

[Cross-posted with scholarslab.org]

Neatline 2.0 makes it possible to work with really large collections of records – as many as about 1,000,000 in a single exhibit. This level of scalability opens up the door to a whole range of projects that would have been impossible with the first version of Neatline, but it also introduces some really interesting content management challenges. If the map can display a million records, it also needs utilities to effectively manage content at that scale.

This often involves a shift from working with individual records to working with groups of records. With a million records on the map, it’s pretty unlikely that you’ll want to change the color of just one of them. More likely, that record will exist as part of a large grouping of related records (e.g., “democratic precincts,” or “photographs from 1945”), all of which should share a certain set of attributes. There needs to be a way to slice and dice records into overlapping clusters of related records, and then apply bulk updates to the individual clusters.

Really, this is a familiar problem – it’s structurally identical to the task of styling web pages with CSS, which makes it possible to address groupings of elements with “selectors” and apply key-value styling rules to the groups. Inspired by projects like Mike Migurski’s Cascadenik, Neatline 2.0 makes it possible to use a Neatline-inflected dialect of CSS to update groups of records linked together with “tags,” which can be applied in any combination to the individual records.

Neatline Stylesheet Basics

Let’s take a look at how this works in practice. Imagine you’re plotting results from the last four presidential elections. You load in a big collection of 800,000 records (200,000 precincts for each of the four elections), each representing an individual polling place with a point on the map. Each point is scaled to represent the number of ballots cast at that location, and shaded red or blue according to which party won more votes. In this case, there are really seven different nested and overlapping taxonomies in the data. All of the records are precincts, but each falls into one of the four election seasons – 2000, 2004, 2008, or 2012. And each precinct went either democrat or republican, regardless of which election cycle it belongs to. Each record can be tagged with some combination of these tags:

[Image: tags-input]

Each of the groupings needs to share a specific set of attributes – and also not share some attributes that need to be assigned separate values on individual records. For example, all of the precincts – regardless of date or party – should share the same basic fill-opacity and stroke-width styles. All records in each of the groupings for the four election seasons need to share the same after-date and before-date visibility settings so that the records phase in and out of visibility in unison. All republican records should share one shade of red, and all democratic records one shade of blue. Meanwhile, none of the groupings should define a standard point-radius style, which is used on a per-record basis to encode the number of ballots cast at that location.

Neatline-inflected CSS makes it easy to model these relationships. To start, I’ll define some basic styles for the top-level precinct tag, which is applied to all the records in the exhibit:

Now, when I click “Save,” Neatline instantaneously updates the stroke-width and fill-opacity styles on all records tagged with precinct:

[Image: precinct-styles]

Next, I’ll set the before-date and after-date properties for each of the four election season tags, which ensure that the four batches of records phase in and out of visibility in unison as the timeline is scrolled back and forth:

Now, when I open up any individual record, the before-date and after-date fields will be updated with new values depending on which election the record belongs to:

[Image: dates-fieldset]

Last, I’ll define the coloring rules for the two political parties. First, the Democrats:

Click “Save,” and all democratic precincts update with the new color:

[Image: democrat-styles]
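To make the tag-to-style relationship in this section concrete, here is a minimal JavaScript sketch of the idea. It is not Neatline’s implementation: the record shape, the underscored field names, the applyStylesheet function, and all of the values are assumptions made up for illustration.

```javascript
// Hypothetical data model: each record carries a list of tags and a flat
// set of style fields. point_radius is set per record, never by a rule.
const records = [
  { id: 1, tags: ['precinct', '2004', 'democrat'], point_radius: 12 },
  { id: 2, tags: ['precinct', '2004', 'republican'], point_radius: 30 },
  { id: 3, tags: ['precinct', '2008', 'democrat'], point_radius: 7 },
];

// A parsed stylesheet: tag -> { style: value }. All values are illustrative.
const stylesheet = {
  precinct: { stroke_width: 0, fill_opacity: 0.5 },
  '2004': { after_date: '2004-01-01', before_date: '2008-01-01' },
  democrat: { fill_color: '#0000ff' },
};

// Copy each rule's values onto every record that carries the rule's tag.
// Fields that no rule mentions keep whatever value the record already has.
function applyStylesheet(sheet, recs) {
  for (const [tag, styles] of Object.entries(sheet)) {
    for (const rec of recs) {
      if (rec.tags.includes(tag)) Object.assign(rec, styles);
    }
  }
  return recs;
}

applyStylesheet(stylesheet, records);

console.log(records[0].fill_color);   // #0000ff
console.log(records[1].point_radius); // 30
```

Because tags combine freely, a record tagged precinct, 2004, and democrat picks up values from all three rules, while its point radius is left alone.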

Auto-updating stylesheet values

So far, we’ve just been entering hard-coded values into the stylesheet. This often makes sense for properties that have inherently semantic values (e.g., dates). For other attributes, though (namely colors), it’s much harder to reason in the abstract about what value you want. For example, I know that I want the republican precincts to be “red,” but I don’t know off-hand that #ff0000 is the specific hexadecimal value that I want to use. It makes more sense to open up the edit form for an individual record and use the color picker for the “Fill Color” field to find a color that looks good.

And even for styles that can be reasoned about in the abstract, it’s often easier and more intuitive to use the auto-previewing functionality on one of the record forms to tinker around with different values. Once you’ve decided on a new setting, though, it’s annoying to have to manually propagate the value back into the stylesheets so that all of the record’s siblings stay in sync – you’d have to copy the value, close the form, open up the stylesheet, find the right rule, and paste in the new value. To avoid this, Neatline also automatically updates the stylesheet when individual record values are changed, and immediately pushes out the new value to all of the record’s siblings.

Let’s go back to the election results. For the republican precincts, instead of pasting in a specific hex value for the fill-color style, we’ll just “register” fill-color as being one of the properties controlled by the republican tag by listing the style and assigning it a value of auto:

When I click “Save,” nothing happens, since a value isn’t defined. Now, though, I can just open up any of the individual republican records, choose a shade of red, and save the record. Since we activated the fill-color style for the republican tag, Neatline automatically updates all of the other republican records just as if we had set the value directly on the stylesheet:

[Image: republican-record]

And now, when I go back to the stylesheet, the fill-color rule under republican is automatically updated with the value that we just set in the record form:

[Image: updated-stylesheet]

This also works for styles that already have concrete values. For example, say I change my mind and want to tweak the shade of blue used for democratic precincts. I can just open up any of the individual democrat-tagged records, pick a new value with the color picker, and save the record. Again, Neatline automatically replaces the old value on the stylesheet and propagates the change to all of the other democratic precincts.
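The write-back behavior described in this section can be sketched the same way. This is a guess at the logic, not Neatline’s code: the saveRecord function, the record shape, and the use of the string 'auto' as an in-memory placeholder are all assumptions.

```javascript
// Hypothetical model. 'auto' marks a style as controlled by the tag
// without giving it a value yet.
const sheet = {
  republican: { fill_color: 'auto' },
  democrat: { fill_color: '#0000ff' },
};

const recs = [
  { id: 1, tags: ['republican'], fill_color: null },
  { id: 2, tags: ['republican'], fill_color: null },
  { id: 3, tags: ['democrat'], fill_color: '#0000ff' },
];

// Save one record. For each of its tags, every style that the tag controls
// is written back into the stylesheet and pushed out to all of the records
// that share the tag.
function saveRecord(saved, stylesheet, allRecords) {
  for (const tag of saved.tags) {
    const rule = stylesheet[tag];
    if (!rule) continue; // tag has no rule, nothing to synchronize
    for (const style of Object.keys(rule)) {
      rule[style] = saved[style];
      for (const rec of allRecords) {
        if (rec.tags.includes(tag)) rec[style] = saved[style];
      }
    }
  }
}

// Pick a red on one republican record and save it.
recs[0].fill_color = '#ff0000';
saveRecord(recs[0], sheet, recs);

console.log(sheet.republican.fill_color); // #ff0000
console.log(recs[1].fill_color);          // #ff0000
console.log(recs[2].fill_color);          // #0000ff
```

The same function covers the second case in the text: saving a democrat-tagged record with a new blue would overwrite the concrete value on the democrat rule and on every sibling.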

Announcing Neatline 2.0-alpha1!

[Cross-posted with scholarslab.org]

It’s here! After much hard work, we’re delighted to announce the first alpha release of Neatline 2.0, which migrates the codebase to Omeka 2.0 and adds lots of exciting new things. For now, this is just an initial testing release aimed at developers and other brave folks who want to tinker around with the new set of features and help us work out the kinks. Notably, this build doesn’t yet include the migration to upgrade existing exhibits from the 1.1.x series, which we’ll ship with the first stable release in the next couple weeks once we’ve had a chance to field test the new code.


45 minutes of Neatline 2.0 alpha testing, compressed to 90 seconds, set to Chopin.

In the interest of modularity (more on this later), the set of features that was bundled together in the original version of Neatline has been split into three separate plugins:

  • Neatline – The core map-making toolkit and content management system.
  • NeatlineWaypoints – A list of sortable waypoints, the new version of the vertical “Item Browser” panel from the 1.x series.
  • NeatlineSimile – The SIMILE Timeline widget.

Just unpack the .zip archives, copy the folders into the /plugins directory in your Omeka 2.x installation, and install the plugins in the Omeka admin. For more detailed information, head over to the Neatline 2.0-alpha1 Installation Wiki, and take a look at the change log for a more complete list of changes and additions.

We’re really excited about this code. Since releasing the first version last summer, we’ve gotten a huge amount of incredibly helpful feedback from users, much of which has been directly incorporated into the new release. We’ve also added a carefully-selected set of new features that opens up the door to some really interesting new approaches to geospatial (and completely non-geospatial) annotation. It’s a leaner, faster, more focused, more reliable, and generally more capable piece of software – we’re excited to start building projects with it!

Some of the additions and changes:

  • Real-time spatial querying, which makes it possible to create really large exhibits – as many as about 1,000,000 records on a single map;
  • A total rewrite of the front-end application in Backbone.js and Marionette that provides a more minimal, streamlined, and responsive environment for creating and publishing exhibits;
  • An interactive “stylesheet” system (inspired by projects like Mike Migurski’s Cascadenik) that makes it possible to use a dialect of CSS – built directly into the editing environment – to synchronize large batches of records;
  • The ability to import high-fidelity SVG illustrations created in specialized vector editing tools like Adobe Illustrator and Inkscape;
  • The ability to add custom base layers, which, among other things, makes it possible to annotate completely non-spatial entities – paintings, photographs, documents, and anything else that can be captured as an image;
  • A revamped import-from-Omeka workflow that makes it easier to link Neatline records to Omeka items and batch-import large collections of items;
  • A flexible programming API and “sub-plugin” system that makes it easy for developers to extend the core feature set with custom functionality for specific projects – everything from simple JavaScript widgets (legends, sliders, scrollers, etc.) up to really deep modifications that extend the core data model and add completely new interactions.

Over the course of the next two weeks, I’ll be writing in much more detail about some of the new features. In the meantime – let us know what you think! We’re going to be pushing out a series of alpha releases in pretty rapid succession over the course of the next couple weeks, and we’re really keen to get feedback about the new features before cutting a stable 2.0 release. If you find a bug, or think of a feature that you’d like to see included, be sure to file a report on the issue tracker.

Restarting Marionette applications

Over the course of the last couple months, I’ve been using Derick Bailey’s superb Marionette framework for Backbone.js to build the new version of Neatline. Marionette sits somewhere in the hazy zone between a library and a framework – it’s really a collection of architectural components for large front-end applications that can be composed in lots of different ways. I use Marionette mainly for the core set of message-passing utilities, which make it easy to define interactions among different parts of big applications – pub-sub event channels, command execution, request-response patterns, etc. I’ve come to completely rely on these structures, and can’t really imagine writing non-trivial applications without them anymore.

The only big kink I’ve encountered was in the Jasmine suite. Since almost all of the integration-level test cases mutate the state of the application (trigger routes, open/close views, etc.), I needed to completely burn down the app and re-start it from scratch at the beginning of each test to ensure a clean slate. The top-level Marionette Application has a start method that walks down the tree of modules and runs the initializers. As it exists now, though, start can only be called once during the lifecycle of the application, and does nothing if it’s called again later on.

I was getting around this by defining independently-callable init methods for all of my modules and wiring them up to the regular Marionette start-up system:

And then manually calling all of the init methods in my Jasmine beforeEach() blocks to force-restart the application:

This is icky – I have to exactly recreate a specific start-up order that’s automatically enforced in the application itself by before: and after: initialization events. And it introduces lots of opportunities for false negatives – if you add a module, and forget to explicitly start it in the test suite, everything falls apart.

Really, I wanted to just re-call Neatline.start() before every test. I realized tonight, though, that the application object can be tricked into restarting itself by (a) stopping all of the modules and (b) resetting the top-level Callbacks on the application:

Much cleaner. Assuming all state-holding components are started in the initializers, this has the desired effect of completely rebooting the application.
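For readers who want to see the shape of the trick without the framework, here is a self-contained toy model in plain JavaScript. It only imitates the behavior described above (a start() that runs its initializers once and then becomes a no-op) and is not Marionette’s implementation; the Callbacks and App classes, the initCallbacks and modules properties, and the restart helper are all hypothetical names.

```javascript
// Toy model only: a callback list that runs once and then ignores
// further calls until it is reset.
class Callbacks {
  constructor() { this.fns = []; this.ran = false; }
  add(fn) { this.fns.push(fn); }
  run() {
    if (this.ran) return;        // a second call is a no-op
    this.ran = true;
    this.fns.forEach((fn) => fn());
  }
  reset() { this.ran = false; }  // keep the callbacks, forget that they ran
}

class App {
  constructor() { this.initCallbacks = new Callbacks(); this.modules = {}; }
  addInitializer(fn) { this.initCallbacks.add(fn); }
  start() { this.initCallbacks.run(); }
}

const app = new App();
app.modules.map = { state: null, stop() { this.state = null; } };
app.addInitializer(() => { app.modules.map.state = { zoom: 1 }; });

// (a) stop every module, (b) reset the callbacks, then start again.
function restart(application) {
  Object.values(application.modules).forEach((m) => m.stop());
  application.initCallbacks.reset();
  application.start();
}

app.start();
app.modules.map.state.zoom = 9; // a test mutates application state
app.start();                    // no effect: zoom is still 9
restart(app);                   // initializers run again: zoom is back to 1

console.log(app.modules.map.state.zoom); // 1
```

In a Jasmine suite, a helper like restart would be called from a single beforeEach, so the start-up order stays whatever the application’s own initializers define.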

I’d imagine this is a pretty common issue – is there any philosophical reason for the prohibition against calling Application.start() more than once?

Neatline and Omeka 2.0

[Cross-posted with scholarslab.org]

We’ve been getting a lot of questions about when Neatline plugins will be ready for the newly-released Omeka 2.0. The answer is – very soon! In addition to migrating all of the plugins (Neatline, Neatline Time, Neatline Maps, Neatline Features) over to the new version of Omeka, we’re also using this transition to roll out a major evolution of the Neatline feature-set that incorporates lots of feedback from the first version.

Some of the new, Omeka-2.0-powered things on tap:

  • Real-time spatial querying on the map, which makes it possible to work with really large collections of data (as many as 1,000,000 records in a single exhibit);

  • The ability to import SVG documents from vector-editing programs like Adobe Illustrator, making it possible to render complex illustrations on the map;

  • A portable stylesheet system that allows exhibit-builders to use a CSS-like syntax to apply bulk updates to large collections of records;

  • An improved workflow for displaying Omeka items in Neatline exhibits – mix and match individual Dublin Core fields, entire metadata records, images, and other item attributes;

  • A flexible workflow for adding custom base layers in exhibits, which makes it possible to use Neatline to annotate non-spatial materials: paintings, drawings, abstract maps, and anything else that can be captured as an image;

  • A new set of hooks and filters – both on the server and in the browser – that make it easy for developers to write modular add-ons and customizations for Neatline exhibits – legends, sliders, record display formats, integrations with long-format texts, etc.

The new version is just about feature-complete, and we’re now in the process of tying up loose ends and writing the migration code to upgrade projects built on the 1.1.x releases. We’re on schedule for a public beta by the end of March, and a full release by the end of the semester.

Going forward, we’ll continue supporting the Omeka 1.5.x-compatible releases of Neatline from a maintenance standpoint, but we’re moving all new development efforts into the new versions of the plugins, which only work with Omeka 2.0.

As the final pieces fall into place over the course of the next couple weeks, we’ll start posting a series of alpha releases for developers and other folks who want to test-drive the new feature set. Between now and then, check out some of the feature-preview articles we’ve posted in the last couple weeks:

Neatline Feature Preview – 1,000,000 records in an exhibit
Neatline Feature Preview – Importing SVG documents from Adobe Illustrator

And watch this space for ongoing weekly updates!