A Neatline-ified Gettysburg Address

Launch the Exhibit

[Image: nicolay]

This is a project that I’ve been hacking away at for some time, but I only found the time (and motivation) to get it polished up and out the door over the weekend – a digital edition of the “Nicolay copy” of the Gettysburg Address, with each of the ~250 words in the manuscript individually traced out in Neatline and wired up with a plain-text transcription of the text on the right side of the screen. I’ve tinkered around with similar interfaces in the past, but this time I wanted to play around with different approaches to formalizing the connection between the digitally-typeset words in the text and the handwritten words in the manuscript. Your eyes tend to dart back and forth between the image and the text, and it’s easy to lose your place – how to reduce that cognitive friction?

To chip away at this, I wrote a little sub-plugin for Neatline called WordLines, which automatically overlays a visual guideline (under the hood, a d3-wrapped SVG element) on top of the page that connects each pair of words in the two viewports when the cursor hovers over either instance of a word. So, when the mouse passes over words in the transcription, lines are automatically drawn to the corresponding locations on the image; and vice versa. From a technical standpoint, this turns out to be quite easy – just get the pixel offsets for the element in the transcription and the vector annotation on the map (for the latter, OpenLayers does the heavy lifting with helpers like getViewPortPxFromLonLat, which maps spatial coordinates to document-space pixel pairs), and then draw a line connecting the two points. The one hitch, though, is that this involves placing a large SVG element directly on top of the page content, which, by default, will cover all of the underlying elements (shapes on the map, words in the text) and block them from receiving the cursor events that drive the rest of the UI – including, very problematically, the mouseleave event that garbage-collects old lines and prevents them from getting stuck on the screen.
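
For the curious, here’s roughly what that looks like in sketch form – this isn’t the actual plugin code, and the overlay id and the way the coordinates get stitched together are just illustrative:

```javascript
// A simplified sketch of the WordLines behavior (not the actual plugin
// code). It assumes a fixed-position SVG overlay with id "wordlines"
// stretched across the viewport, an OpenLayers `map`, and the centroid
// of the annotation's geometry in map coordinates.

function drawWordLine(wordSpan, centroid) {

  // Pixel position of the word in the transcription pane.
  var rect  = wordSpan.getBoundingClientRect();
  var textX = rect.left + rect.width / 2;
  var textY = rect.top + rect.height / 2;

  // Pixel position of the annotation: OpenLayers converts the spatial
  // coordinates to viewport pixels, which are then offset by the
  // position of the map's viewport element on the screen.
  var px = map.getViewPortPxFromLonLat(
    new OpenLayers.LonLat(centroid.x, centroid.y)
  );
  var view = map.viewPortDiv.getBoundingClientRect();
  var mapX = view.left + px.x;
  var mapY = view.top + px.y;

  // Draw the connecting guideline with d3.
  d3.select('#wordlines').append('line')
    .attr('x1', textX).attr('y1', textY)
    .attr('x2', mapX).attr('y2', mapY);

}

// On mouseleave, remove any lines currently on the screen.
function clearWordLines() {
  d3.select('#wordlines').selectAll('line').remove();
}
```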

[Image: wordline]

The work-around is to put pointer-events: none; on the SVG container, which causes the browser to treat it as a purely visual veneer over the page – cursor events drop through to the underlying content elements, and everything else behaves normally. This is just barely and only very recently cross-browser, but I’m not sure if there’s actually any other way to accomplish this, given the full set of constraints.
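
In practice, that’s a single style assignment on the overlay container (again assuming the hypothetical #wordlines id from the sketch above):

```javascript
// Treat the overlay as a purely visual layer: cursor events pass
// straight through to the map shapes and transcription words below.
d3.select('#wordlines').style('pointer-events', 'none');
```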

Modeling intuitions about scale

Originally, I had planned to just leave it at that, but, as is almost always the case with these projects, I learned lots of interesting things along the way, and I ended up going back and adding another set of annotations that call out some of the more historically significant aspects of the manuscript. Namely, I was interested in the different types of paper used for the two pages (Lincoln probably wrote the first page in Washington before departing, the second page after arriving in Gettysburg) and the matching fold creases on the pages, which some historians have pointed to as evidence that the Nicolay copy was perhaps the actual “reading copy” that Lincoln used when delivering the speech, since eyewitness accounts describe Lincoln pulling a folded piece of paper out of his coat pocket.

The other candidate is the Hay draft, which includes lots of changes and corrections in Lincoln’s hand, giving it the appearance of a working draft that was prepared just before the event. One problem with the Hay draft, though, is its size – it’s written on larger paper and has just a single fold down the center, which would seem to make it an unlikely thing to tuck into a coat pocket. When I read about this, I realized that I had paid almost no attention to the physical size of the manuscript. On the screen, it’s either extremely small or almost infinitely large – a tiny speck when you zoom far back, and an endless plane of beige-and-black when you zoom in. But, in this case, size turns out to be of great historical significance – the Nicolay copy is smaller than the Hay copy, especially when folded along the set of matching creases clearly visible on the pages.

So, how big is it? This involved a bit of guesswork. The resource page for the manuscript on the Library of Congress website doesn’t include dimensions, and direct Google searches didn’t turn up an easy answer, so I started poking around the internet to see if I could find other Lincoln manuscripts written on the “Executive Stationery” used for the first page. I rooted up a couple of documents for sale by rare book sellers, and in both cases the dimensions are listed at about 5 inches in width and 7-8 inches in height, meaning that the Nicolay copy – assuming the stationery was more or less standardized – would have folded down to a roughly 5 x 2.5-inch rectangle, which seems reasonably pocket-sized. (Again, this is amateur historical conjecture – if I’m wrong, please let me know!)

I sketched out little ruler annotations labeling the width of the page and the height of the fold segment, but, zooming around the exhibit, I realized that I still didn’t have any intuitive sense of the size of the thing. Raw numerical measurements, even when you’re beat across the head with them, become surprisingly abstract in the a-physical, point-of-reference-less netherlands of deeply-zooming digital landscapes. I dug out a ruler and zoomed the exhibit back until the first page occupied five real-world inches, and then held my hand up to the screen, imagining the sheet of paper in my hand. And then I thought – why not just bake some kind of visual reference directly into the exhibit? I hunted down a CC-licensed SVG illustration of a handprint, and, using the size of my own hand as a reference, used Neatline’s import-SVG feature to position the outline in the whitespace to the right of the first page of the manuscript:

[Image: hand2]

  • Great work David! I have wanted to do something similar with some manuscript collections I’ve been digitizing.

    • dclure

      Thanks, Tod! I think the big hurdle for doing this at any kind of scale is finding a way to automate the process of outlining the words – for this project, I did it all manually, which is incredibly time-consuming and tedious. I know there was some really interesting work about this at the DH conference last year in Nebraska, though, and I might try to find some time to tinker around with some of those boundary-detection algorithms.

      http://dh2013.unl.edu/abstracts/ab-112.html

      • Right. I wonder if NYPL Labs’ https://github.com/NYPL/map-vectorizer could be trained to at least tag the word shape and then you could map the blobs to the word positions in the text.

        I guess that’s overthinking map-vectorizer’s intent. There’s likely a library out there similar to what you describe/from that DH conference.

  • This is really cool. I’d love to use something similar for a project using d3 and Leaflet. How did you figure out the geometry and x,y values of each word for the hover effect? Did you literally have to measure each word by hand, as your comment seems to indicate?

    • dclure

      Hey Dean, yeah, in this case I manually traced the polygons around each of the words, assigned a plain-text “slug” to each of the records in Neatline, and then wrote an HTML fragment that wraps each of the words in the transcription in a span element with a `data-neatline-slug` attribute that references one of the geometries in the exhibit. Not the kind of thing you could do at any kind of scale.
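
      Roughly, the wiring looks something like this (the slugs here are just made-up examples, not the ones in the exhibit):

      ```javascript
      // Sketch of the transcription markup: each word is wrapped in a span
      // whose data-neatline-slug attribute points at the slug of the traced
      // Neatline record. (These slugs are invented for the example.)
      var words = [
        { text: 'Four',  slug: 'gettysburg-word-1' },
        { text: 'score', slug: 'gettysburg-word-2' },
        { text: 'and',   slug: 'gettysburg-word-3' }
      ];

      var fragment = words.map(function(w) {
        return '<span data-neatline-slug="' + w.slug + '">' + w.text + '</span>';
      }).join(' ');

      // => <span data-neatline-slug="gettysburg-word-1">Four</span> ...
      ```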

      What I’ve thought about, though, is writing some sort of in-browser application that would present the raw image and the text transcription, and the user could just click on a word and then click on the corresponding location in the image. This would still require some manual work, but it would be much, much faster.

      In the longer view, I think the task of fully automating this kind of thing is a really interesting research question. I’m not too savvy with computer vision stuff, but I wonder if we could devise some kind of workflow using libraries like OpenCV (http://opencv.org/) that would automatically draw bounding boxes around the words and try to map them onto words in a plain-text transcription. Again, take a look at this paper from DH2013, which implemented a version of that kind of system, although I believe it still involved some manual input from a user (clicking on words).

      http://dh2013.unl.edu/abstracts/ab-112.html

      • Hmm, interesting stuff. I was thinking an SVG overlay might work. Apparently Hugh Cayless beat me by 5 years on that. Small world – he was the instructor for a class on XML I took in grad school.

        It seems like it would be tough to get the overlay right using svg text nodes, though.

  • Kathy

    David, how do I actually use this plugin? I have it installed, but I am lost as to what I need to do next. Any clue would be greatly appreciated.

    • dclure

      Hey Kathy, probably the best place to start is the documentation over at http://docs.neatline.org, which gives a pretty good high-level overview of the project and then gets into the details of actually building out exhibits. If you run into problems, feel free to shoot me an email!