NeatlineText – Connect Neatline exhibits with documents

[Cross-posted from scholarslab.org]

Download the plugin

nltext-detail

Today we’re pleased to announce the first public release of NeatlineText, which makes it possible to create interactive, Neatline-enhanced editions of text documents – literary and historical texts, articles, book chapters, dissertations, blog posts, etc. – by connecting individual paragraphs, sentences, and words with objects in Neatline exhibits. Once the associations are established, the plugin wires up two-way linkages in the user interface between the highlighted sections in the text and the imagery in the exhibit. Click on the text, and the exhibit focuses around the corresponding location or annotation. Or click on the map, and the document scrolls to the corresponding section of the text.

We’ve been using some version of this code in internal projects here at the lab for almost two years, and it’s long overdue for a public release. The rationale for NeatlineText is pretty simple – again and again, we’ve found that Neatline projects often go hand-in-hand with some kind of regular text narrative that sets the stage, describes the goals of the project, or lays out an expository thesis that would be hard to weave into the more visual, free-form world of the Neatline exhibit proper. This is an awesome combination – tools like Neatline are really good at displaying spatial, visual, dimensional, diagrammatic information, but nothing beats plain old text when you need to develop a nuanced, closely-argued narrative or interpretation.

The difficulty, though, is that it’s hard to combine the two in a way that doesn’t favor one over the other. We’ve taken quite a few whacks at this problem over the course of the last few years. One option is to present the text narrative as a kind of “front page” of the project that links out to the interactive environment. But this tends to make the Neatline exhibit feel like an add-on, something grafted onto the project as an after-thought. And this can easily become a self-fulfilling prophecy – when you have to click back and forth between different web pages to read the text and explore the exhibit, you tend to write the text as a more or less standalone piece of writing, instead of weaving the narrative in with the conceptual landscape of the exhibit.

Another option is to chop the prose narrative up into little chunks and build it directly into the exhibit – like the numbered waypoints we used in the Hotchkiss projects back in 2012, each waypoint containing a small portion of a longer interpretive essay. But this tends to err in the other direction, dissolving the text into the visual organization of the exhibit instead of presenting it as a first-class piece of content.

NeatlineText tries to solve the problem by just plunking the two next to each other and making it easy for the reader (and the writer!) to move back and forth between the two. For example, NeatlineText powers the interactions between the text and imagery in these two exhibits of NASA photographs from the 1960s:

nltext-gemini

nltext-saturn-v

(Yes, I know – I’m a space nerd.) NeatlineText is also great for creating interactive editions of primary texts. An earlier version of this code powers the Mapping the Catalog of Ships project by Jenny Strauss Clay, Courtney Evans, and Ben Jasnow (winner of the Fortier Prize at DH2013!), which connects the contingents in the Greek army mentioned in Book 2 of the Iliad with locations on the map:

nltext-ships

And NeatlineText was also used in this interactive edition of the first draft of the Gettysburg Address:

nltext-gettysburg

Anyway, grab the code from the Omeka add-ons repository and check out the documentation for step-by-step instructions about how to get up and running. And, as always, be sure to file a ticket if you run into problems!

FedoraConnector 2.0

fedora

[Cross-posted from scholarslab.org]

Hot on the heels of yesterday’s update to the SolrSearch plugin, today we’re happy to announce version 2.0 of the FedoraConnector plugin, which makes it possible to link items in Omeka with objects in Fedora Commons repositories! The workflow is simple – just register the location of one or more installations of Fedora, and then individual items in the Omeka collection can be associated with a Fedora PID. Once the link is established, any combination of the datastreams associated with the PID can be selected for import. For each of the datastreams, FedoraConnector proceeds in one of two ways:

  • If the datastream contains metadata (e.g., a Dublin Core record), the plugin will check to see if it can find an “importer” that knows how to read the metadata format. Out of the box, the plugin can import Dublin Core and MODS, but also includes a really simple API that makes it easy to add in new importers for other metadata standards. If an importer is found for the datastream, FedoraConnector just copies all of the metadata into the item, mapping the content into the Dublin Core elements according to the rules defined in the importer. This creates a “physical” copy of the metadata that isn’t linked to the source object in Fedora – changes in Omeka aren’t pushed back upstream into Fedora, and changes in Fedora don’t cascade down into Omeka.

  • If the datastream delivers content (e.g., an image), the plugin will check to see if it can find a “renderer” that knows how to display the content. Like the importers, the renderers are structured as an extensible API that ships with a couple of sensible defaults – out of the box, the plugin can display regular images (JPEGs, TIFs, PNGs, etc.) and JPEG2000 images. If a renderer exists for the content type in question, the plugin will display the content directly from Fedora. So, for example, if the datastream is a JPEG image, the plugin will add markup to the item show page that points back at the image in Fedora (roughly sketched just after this list).

    Unlike the metadata datastreams, then, which are copied from Fedora, content datastreams pipe in data from Fedora on-the-fly, meaning that a change in Fedora will immediately propagate out to Omeka.
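As a rough illustration of that markup – the hostname, PID, and datastream ID below are placeholders, not values copied from an actual install – the embedded tag looks something like this:

    <!-- Image served straight from the Fedora REST API rather than copied into Omeka -->
    <img alt="uva-lib:1234"
         src="http://fedora.example.edu:8080/fedora/objects/uva-lib:1234/datastreams/IMAGE/content" />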

(See the image below for a sense of what the final result might look like – in this case, displaying an image from the Holsinger collection at UVa, with both a metadata and content datastream selected.)

For now, FedoraConnector is a pretty simple plugin. We’ve gone back and forth over the course of the last couple years about how to model the interaction between Omeka and Fedora. Should it just be a “pull” relationship (Fedora -> Omeka), or also a “push” relationship (Omeka -> Fedora)? Should the imported content in Omeka stay completely synchronized with Fedora, or should it be allowed to diverge for presentational purposes? These are tricky questions. Implementations of Fedora – and the workflows that intersect with it – can vary pretty widely from one institution to the next. The current set of features was built in response to specific needs here at UVa, but we’ve been talking recently with folks at a couple of other institutions who are interested in experimenting with variations on the same basic theme.

So, to that end, if you use Fedora and Omeka and are interested in wiring them together – we’d love to hear from you! Specifically, how exactly do you use Fedora, and what type of relationship between the two services would be most useful? With a more complete picture of what would be useful, I suspect that a handful of pretty surgical interventions would be enough to accommodate most use patterns. In the meantime, be sure to file a ticket on the issue tracker if you find bugs or think of other features that would be useful.

fedora

SolrSearch 2.0

solr

[Cross-posted from scholarslab.org]

Today we’re pleased to announce version 2.0 of the SolrSearch plugin for Omeka! SolrSearch replaces the default search interface in Omeka with one powered by Solr, a blazing-fast search engine that supports advanced features like hit highlighting and faceting. In most cases, Omeka’s built-in searching capabilities work great, but there are a couple of situations where it might make sense to take a look at Solr:

  • When you have a really large collection – many tens or hundreds of thousands of items – and want something that scales a bit better than the default solution.

  • When your metadata contains a lot of text content and you want to take advantage of Solr’s hit highlighting functionality, which makes it possible to display a preview snippet from each of the matching records.

  • When your site makes heavy use of content taxonomies – collections, tags, item types, etc. – and you want to use Solr’s faceting capabilities, which make it possible for users to progressively narrow down search results by adding on filters that crop out records that don’t fall into certain categories. Stuff like – show me all items in “Collection 1”, tagged with “tag 2”, and so forth (roughly sketched just below).
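Under the hood, that kind of progressive narrowing amounts to stacking Solr fq (filter query) parameters onto the base query, with facet.field controlling which facets come back alongside the results. A rough sketch of the query parameters – the field names here are just stand-ins for whatever the SolrSearch schema actually calls them:

    q=ivanhoe
    &fq=collection:"Collection 1"
    &fq=tag:"tag 2"
    &facet=true
    &facet.field=collection
    &facet.field=tag

(In a real request the values would be URL-encoded; this is just the logical shape of the query.)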

To use SolrSearch, you’ll need access to an installation of Solr 4. To make deployment easy, the plugin includes a preconfigured “core” template, which contains all the configuration files necessary to index content from Omeka. Once the plugin is installed, just copy-and-paste the core into your Solr home directory, fill out the configuration forms, and click the button to index your content in Solr.

Once everything’s up and running, SolrSearch will automatically intercept search queries that are entered into the regular Omeka “Search” box and redirect them to a custom interface, which exposes all the bells and whistles provided by Solr. Here’s what the end result looks like in the “Seasons” theme, querying against a test collection that contains the last few posts from this blog, which include lots of exciting Ivanhoe-related news:

solr-search

Out of the box, SolrSearch knows how to index three types of content: (1) Omeka items, (2) pages created with the Simple Pages plugin, and (3) exhibits (and exhibit page content) created with the Exhibit Builder plugin. Since regular Omeka items are the most common (and structurally complex) type of content, the plugin includes a point-and-click interface that makes it easy to configure exactly how the items are stored in Solr – which elements are indexed, and which elements should be used as facets:

solr-config

Meanwhile, if you have content housed in database tables controlled by other plugins that you want to vacuum up into the index, SolrSearch ships with an “addons” system (devised by my brilliant partner in crime Eric Rochester), which makes it possible to tell SolrSearch how to index other types of content just by adding little JSON documents that describe the schema. For example, registering Simple Pages takes only a handful of lines.
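A rough sketch of what such an addon document looks like – the key names here are illustrative of the idea rather than a verbatim copy of the plugin’s schema:

    {
      "simple-pages": {
        "table": "SimplePagesPage",
        "id": "id",
        "title": "title",
        "public": "is_published",
        "fields": [
          { "name": "text" }
        ]
      }
    }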

And the system even scales up to handle more complicated data models, like the parent-child relationship between “pages” and “page blocks” in ExhibitBuilder, or between “exhibits” and “records” in Neatline.

Anyhow, grab the built package from the Omeka addons repository or clone the repository from GitHub. As always, if you find bugs or think of useful features, be sure to file a ticket on the issue tracker!

Project Gemini over Baja California

Launch the Exhibit

gemini-screenshot

A couple weeks ago, somewhere in the middle of a long session of free-association link hopping on Wikipedia, I stumbled into a cluster of articles about Project Gemini, NASA’s second manned spaceflight program. Gemini, I quickly discovered, produced some spectacular photographs – many of them pointed downward towards the surface of the earth, capturing a dizzying opposition between the intelligible scale of the foreground (the 20-foot capsule, 100-foot tethering cords, 6-foot human bodies floating in space) and the completely unintelligible scale of the massive geographic entities below (peninsulas, continents, oceans).

As I started to click through the pictures, I found myself reflexively alt-tabbing back and forth between Chrome and Google Earth to compare them with the modern satellite imagery of the same geographic locations. Which made me think – why not try to actually combine the two into a single environment? Over the course of the next few days, I sketched out a little Neatline exhibit that plasters two photographs of Baja California Sur – taken about a year apart on Gemini 5 (August 1965) and Gemini 11 (September 1966) – around the Mapbox satellite imagery of the peninsula. Instead of lining up the coastlines to make the images overlay accurately on top of the satellite tiles, I just plunked them down on the map off to the side at a scale and orientation that makes it easy to compare the two. (We’ve played around with this before, and I like to think of it as faux – or just especially humanistic! – georectification.)

faux-georectification

Then, using the drawing tools in Neatline, I blocked in some simple annotations that visually wire up the two sets of imagery – outlines around the four islands along the eastern coast of the peninsula, and arrows between the different instantiations of La Paz and San José del Cabo. I also wanted to find a way to visually formalize the difference in perspective between the Gemini photographs (oblique, wide-angle, deliberate) and the Mapbox tiles (flat-on, uniform). Using Illustrator, I created a long, ruler-like vector shape to label the ~200-mile distance between La Paz and the approximate position of the Gemini 5 capsule when the picture was taken, and then used the “Perspective Grid” tool to render the shape in three dimensions and place it on top of the Gemini photograph, as if the same shape were physically positioned in front of the lens. In Illustrator:

200-miles-illustrator

And placed in the Neatline exhibit, first to match the shallow angle of the Gemini shot:

200-miles

And then to match the perpendicular angle of the Mapbox tiles:

200-miles-mapbox

I was also fascinated by the surreal opposition in scale between the Agena Target Vehicle (an unmanned spacecraft used for docking practice in orbit) and Isla San José, which sits serenely in the dark blue of the Gulf of California hundreds of miles below, but occupies almost exactly the same amount of space in the photograph as the 7-foot boom antenna on the Agena. In the space between the two, I dragged out two little shapes that map the sizes of things onto recognizable objects – a 6-foot person in the foreground, Manhattan in the background:

manhattan-person

Perspective and Perspectivelessness

These images fascinate me because they roll together two types of imagery – both ubiquitous on the web – that are almost exact opposites of one another. On the one hand, you have regular pictures, taken by regular (non-astronaut) people. These photographs freeze into place one particular perspective on things. In a literal sense, the world recedes from the lens in three dimensions – walls, buildings, bridges, mountains, valleys, clouds. Close things are big, distant things are small. Some are in focus, others aren’t. And unlike other forms of art like painting, poetry, sculpture, or music, which can claim (overconfidently, maybe) to graft completely new material onto the world, photographs innovate at the level of stance and viewpoint, the newness of the perspective on things that already exist. It’s less about what they add, more about what they subtract in order to whittle the world down to one particular frame. Why that angle? Why that moment? Why that, and not anything else?

On the other hand you have spatial photography – the satellite imagery used in Google Maps, Mapbox, Bing Maps, etc. – which is almost completely perspectiveless, in just about every sense of the word. The world becomes a flat, depthless plane, photographed from a distance at a perpendicular angle. Instead of trying to find interesting new ways to crop down the world, spatial tiles try to be comprehensive and standardized. Instead of showing one thing, in one way, at one moment in time, they try to show everything, in the exact same way, at the exact same moment – now. The companies that source and assemble the satellite imagery race to keep it as current as possible, right at the threshold of the present. Last year, Google announced that its satellite imagery had been purged of all clouds. No doubt this makes it more functional, but it also does away with those wispy, bright-white threads of cloud that used to hang over the rainforests in Peru and Brazil, which were lovely. What’s gained, of course, is the intoxicating grandeur of it all, the ability to hold in a single view a photograph of the entire world – which, if nothing else, is some sort of crazy affirmation of human willpower. I always imagine Whitman, scratching out “Salut au Monde”, panning around Google Maps for inspiration.

Photographs taken by astronauts, though, collapse the distinction in interesting ways. They’re literally “satellite” photography, but they’re also drenched in subjectivity and historical stance. The oceans and continents spread out hundreds of miles below, just like on Google Maps or Mapbox – but the pictures were snapped with regular cameras by the hands of actual people, stuffed into little canisters falling around the world at thousands of miles an hour, which were only up there in the first place due to a crazy mix of socio-political ambitions and anxieties that were deeply characteristic of that particular moment in history. The Gemini imagery is haloed with little bits of space-race technology that instantly historicize the frame – the nose cone of the capsule blocks out a huge swath of desert and ocean, the Agena vehicle hangs just a couple of hundred feet from the camera, tethered by a slight, ribbon-like cord that twists for hundreds of miles across the Gulf of California.

A Neatline-ified Gettysburg Address

Launch the Exhibit

nicolay

This is a project that I’ve been hacking away at for some time, but only found the time (and motivation) to get it polished up and out the door over the weekend – a digital edition of the “Nicolay copy” of the Gettysburg Address, with each of the ~250 words in the manuscript individually traced out in Neatline and wired up with a plain-text transcription of the text on the right side of the screen. I’ve tinkered around with similar interfaces in the past, but this time I wanted to play around with different approaches to formalizing the connection between the digitally-typeset words in the text and the handwritten words in the manuscript. Your eyes tend to dart back and forth between the image and the text, and it’s easy to lose your place – how to reduce that cognitive friction?

To chip away at this, I wrote a little sub-plugin for Neatline called WordLines, which automatically overlays a little visual guideline (under the hood, a d3-wrapped SVG element) on top of the page that connects each pair of words in the two viewports when the cursor hovers on either of the instantiations. So, when the mouse passes over words in the transcription, lines are automatically drawn to the corresponding locations on the image; and vice versa. From a technical standpoint, this turns out to be quite easy – just get the pixel offsets for the element in the transcription and the vector annotation on the map (for the latter, OpenLayers does the heavy lifting with helpers like getViewPortPxFromLonLat, which maps spatial coordinates to document-space pixel pairs), and then draw a line connecting the two points. The one hitch, though, is that this involves placing a large SVG element directly on top of the page content, which, by default, will cover all of the underlying elements (shapes on the map, words in the text) and block them from receiving the cursor events that drive the rest of the UI – including, very problematically, the mouseleave event that garbage-collects old lines and prevents them from getting stuck on the screen.

wordline

The work-around is to put pointer-events: none; on the SVG container, which causes the browser to treat it as a purely visual veneer over the page – cursor events drop through to the underlying content elements, and everything else behaves normally. This is just barely and only very recently cross-browser, but I’m not sure if there’s actually any other way to accomplish this, given the full set of constraints.
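For the curious, here’s a stripped-down sketch of the pattern – the function names and arguments are mine for illustration (not the actual WordLines source), and it assumes the OpenLayers 2 map and the element it renders into are passed in:

    // Full-page SVG overlay that ignores the cursor, so hovers and clicks
    // fall through to the words and map annotations underneath it.
    var overlay = d3.select('body').append('svg')
      .style('position', 'fixed')
      .style('top', 0)
      .style('left', 0)
      .style('width', '100%')
      .style('height', '100%')
      .style('pointer-events', 'none');

    // Draw a guideline from a word in the transcription to its annotation on
    // the map. `wordEl` is the span wrapping the word, `mapEl` is the element
    // the map renders into, `lonlat` is the annotation's location.
    function drawWordLine(wordEl, mapEl, map, lonlat) {
      var word = wordEl.getBoundingClientRect();
      var view = mapEl.getBoundingClientRect();
      var px = map.getViewPortPxFromLonLat(lonlat); // spatial -> viewport pixels
      overlay.append('line')
        .attr('x1', word.left + word.width / 2)
        .attr('y1', word.top + word.height / 2)
        .attr('x2', view.left + px.x)
        .attr('y2', view.top + px.y)
        .attr('stroke', '#666');
    }

    // On mouseleave, garbage-collect any lines still on the screen.
    function clearWordLines() {
      overlay.selectAll('line').remove();
    }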

Modeling intuitions about scale

Originally, I had planned to just leave it at that, but, as is almost always the case with these projects, I ended up learning lots of interesting things along the way, and went back and added another set of annotations that call out some of the more historically noteworthy aspects of the manuscript. Namely, I was interested in the different types of paper used for the two different pages (Lincoln probably wrote the first page in Washington before departing, the second page after arriving in Gettysburg) and the matching fold creases on the pages, which some historians have pointed to as evidence that the Nicolay copy was perhaps the actual “reading copy” that Lincoln used when delivering the speech, since eyewitness accounts describe Lincoln pulling a folded piece of paper out of his coat pocket.

The other candidate is the Hay draft, which includes lots of changes and corrections in Lincoln’s hand, giving it the appearance of a working draft that was prepared just before the event. One problem with the Hay draft, though, is its size – it’s written on larger paper and has just a single fold down the center, which would seem to make it an unlikely thing to tuck into a coat pocket. When I read about this, I realized that I had paid almost no attention to the physical size of the manuscript. On the screen, it’s either extremely small or almost infinitely large – a tiny speck when you zoom far back, and an endless plane of beige-and-black when you zoom in. But, in this case, size turns out to be of great historical significance – the Nicolay copy is smaller than the Hay copy, especially when folded along the set of matching creases clearly visible on the pages.

So, how big is it? This involved a bit of guesswork. The resource page for the manuscript on the Library of Congress website doesn’t include dimensions, and direct Google searches didn’t turn up an easy answer, so I started poking around the internet to see if I could find other Lincoln manuscripts written on the “Executive Stationery” used for the first page. I rooted up a couple of documents for sale by rare book sellers, and in both cases the dimensions are listed at about 5 inches in width and 7-8 inches in height, meaning that the Nicolay copy – assuming the stationery was more or less standardized – would have folded down to a roughly 5 x 2.5-inch rectangle, which seems reasonably pocket-sized. (Again, this is amateur historical conjecture – if I’m wrong, please let me know!)

I sketched out little ruler annotations labeling the width of the page and the height of the fold segment, but, zooming around the exhibit, I realized that I still didn’t have any intuitive sense of the size of the thing. Raw numerical measurements, even when you’re beaten over the head with them, become surprisingly abstract in the a-physical, point-of-reference-less netherlands of deeply-zooming digital landscapes. I dug out a ruler and zoomed the exhibit back until the first page occupied five real-world inches, and then held my hand up to the screen, imagining the sheet of paper in my hand. And then I thought – why not just bake some kind of visual reference directly into the exhibit? I hunted down a CC-licensed SVG illustration of a handprint, and, using the size of my own hand as a reference, used Neatline’s import-SVG feature to position the outline in the whitespace to the right of the first page of the manuscript:

hand2