Livingstone's 1871 Field Diary

A Multispectral Critical Edition

Restoring the Text
Data Hosting
The size of the Livingstone data archive necessitated hosting by an institution that could provide robust, long-term data storage and curation. The NEH grant application stated that the archive would "be hosted by Livingstone Online, while backup data hosting will be facilitated by The Early Manuscripts Electronic Library."
To meet this requirement, Wisnicki, in collaboration with Michael Phelps (Executive Director of EMEL), Todd Grapone (Associate University Librarian for Digital Initiatives and Information Technology at UCLA), and Stephen Davison (Head of the UCLA Digital Library Program), arranged for UCLA both to host the data and to integrate it seamlessly into Livingstone Online. Subsequent funding complications for Livingstone Online, however, forced the team to postpone the integration, and the team devised an alternative publication strategy: hosting by the UCLA Digital Library Program and joint publication by the Library and Livingstone Online.
Creating a Publication Template and XML Encoding
In July 2010, the Livingstone team published the beta version of Letter from Bambarre through EMEL. This represented an interim solution necessitated by funding and resource constraints. The team transferred the Letter to the UCLA Digital Library once the partnership with UCLA had been established in August 2010. Kristin Jensen of Between the Lines Editing, Ireland, provided pro bono editorial work. Wisnicki, in collaboration with Sarina Sinick, a student at UCLA, then began to refine and develop the electronic edition of the Letter. They completed the edition in May 2011 and formally republished the Letter, with announcements sent out to academic sites. To streamline efforts, the team decided to use the new edition of the Letter as the publication template for the 1871 Field Diary.
Figure 1. Simpson during the spectral imaging phase in Edinburgh, June 2010.
Simultaneously, Wisnicki, in collaboration with Kate Simpson, a research assistant recruited from Edinburgh Napier University, began to transcribe and encode the 1871 Field Diary in XML. The transcription and encoding process got underway in February 2011 as the imaging scientists began systematically to produce the processed spectral images. Wisnicki’s prior XML experience was limited to encoding the exceptionally challenging Letter from Bambarre in TEI P4 using the Livingstone Online tagging guidelines. However, the Livingstone team decided to encode the diary in TEI P5.
As a result, Lisa McAulay, the Librarian for Digital Collection Development at UCLA, updated the Letter, created an encoding template for Wisnicki, and provided him with strategic encoding support. Later, James Cummings, Manager of InfoDev (Research Support and Data Solutions) at the University of Oxford, also helped the team address the most difficult encoding issues. Thanks to this assistance, Wisnicki quickly developed TEI P5 proficiency and, in turn, trained Simpson, who had no prior tagging experience. Wisnicki also summarized all encoding decisions in a detailed XML TEI P5 Encoding Practices document.
Figures 2 and 3. Lisa McAulay (left) during a break in supporting the XML encoding of the 1871 Field Diary. James Cummings (right) tackling thorny XML coding issues.
Encoding each folio of Livingstone’s diary separately was one of the Livingstone team’s most important encoding decisions. With some exceptions, each folio of Livingstone’s diary contains two diary pages, one on the left-hand side and one on the right-hand side. Adjacent pages are almost never continuous because Livingstone stacked, then folded the leaves of the diary to make two copy-books (now disassembled). As a result, the team chose to represent the physical artifact (the manuscript as it is) rather than the semantic artifact (what Livingstone "intended"). This choice reflects the current state of the diary and matches the structure of the data archive, where the images for each folio would reside in a separate directory.
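The folio-per-file approach can be illustrated with a minimal Python sketch. It is not the team's actual template: the element and attribute choices (a `div` of type "folio" holding one `div` of type "page" per side), the directory layout, the folio identifier, and the page numbers are all assumptions made for illustration, and the TEI header is omitted for brevity.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)


def tei(tag):
    """Qualify a tag name with the TEI namespace."""
    return f"{{{TEI_NS}}}{tag}"


def build_folio(folio_id, pages):
    """Build one TEI document for one folio (hypothetical structure).

    `pages` maps a side ("left" or "right") to a (page number, text) pair.
    Each page becomes its own division because adjacent pages on a folio
    are usually not continuous.
    """
    root = ET.Element(tei("TEI"))
    body = ET.SubElement(ET.SubElement(root, tei("text")), tei("body"))
    folio = ET.SubElement(body, tei("div"), {"type": "folio", "n": folio_id})
    for side in ("left", "right"):
        if side not in pages:  # assumption: tolerate folios lacking a page
            continue
        number, content = pages[side]
        page = ET.SubElement(
            folio, tei("div"),
            {"type": "page", "subtype": side, "n": str(number)},
        )
        ET.SubElement(page, tei("pb"), {"n": str(number)})
        ET.SubElement(page, tei("p")).text = content
    return root


def folio_xml_path(folio_id):
    """Hypothetical layout: each folio's files sit in their own directory."""
    return f"{folio_id}/{folio_id}.xml"


# Placeholder identifiers and page numbers, not taken from the diary.
folio = build_folio(
    "folio_01",
    {"left": (4, "[left-hand page text]"), "right": (1, "[right-hand page text]")},
)
print(folio_xml_path("folio_01"))
print(ET.tostring(folio, encoding="unicode"))
```

Keeping the two pages as sibling divisions inside one folio file preserves the physical order of the leaf while still letting each page carry its own number, so a reading order could be reconstructed later without changing the files.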
Transcription Outcomes
Livingstone’s words had taken an exceptionally long detour in their travels, but now had almost reached their destination. Wisnicki and Simpson completed the transcription and encoding of the diary in early August 2011. They succeeded in transcribing – and so making accessible – 99% of the diary’s original text for the first time since Livingstone wrote the document 140 years ago. Issues such as fading, blotting, bad handwriting, or missing pieces of the manuscript prevented the remaining 1% from being deciphered. The majority of the issues fell outside the remit of the Livingstone project, which focused on using spectral image processing to separate Livingstone’s words from the printed texts over which he wrote. In other words, the team had a success rate of nearly 100%, an outcome that far exceeded even the most optimistic team predictions and that rendered unnecessary a number of alternative imaging and processing strategies outlined in the original NEH grant application.
Figure 4. Ball while reviewing a photocopy of the 1872 Journal to prepare for XML encoding.
To complete the transcription and encoding process, Wisnicki collaborated with research assistant Heather F. Ball to produce an XML transcription of the corresponding portion of the Last Journals (1874). This transcription refined and corrected the rough transcription available from Project Gutenberg. Wisnicki, Ball, and A.J. Schmitz (Wisnicki’s graduate assistant from Indiana University of Pennsylvania) then corrected and revised the Last Journals transcription to produce an encoded version of the relevant portion of Livingstone’s handwritten 1872 Journal. Schmitz also used macros created for ImageJ by Christens-Barry to produce line-by-line mappings of the 1871 Field Diary.
Thanks to these final efforts, readers of the 1871 Field Diary would be able to study the original words of the diary alongside the revised 1872 Journal and the 1874 published text. They would be able to survey firsthand the vast distance that separated the original and published versions of the diary and, as a result, trace the extent to which subsequent revisions by Livingstone, Waller, and others had transformed the original historical record. The mappings, when incorporated into the XML transcriptions, would also help readers and support software tools to relate passages on the images to the transcriptions and vice versa.
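A line-by-line mapping of this kind can be pictured as a lookup between image regions and transcription lines. The sketch below is only an illustration under stated assumptions: the rectangular pixel regions, the line identifiers, and the function names are hypothetical, and the actual output format of the ImageJ macros is not described here.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class LineRegion:
    """One transcribed line and the image rectangle it occupies (assumed model)."""
    line_id: str  # hypothetical identifier of a line in the XML transcription
    x: int        # left edge in pixels
    y: int        # top edge in pixels
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        """Return True if the pixel (px, py) falls inside this rectangle."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def line_at(regions: Iterable[LineRegion], px: int, py: int) -> Optional[str]:
    """Image to transcription: which line was written at this pixel?"""
    for region in regions:
        if region.contains(px, py):
            return region.line_id
    return None


def region_for(regions: Iterable[LineRegion], line_id: str) -> Optional[LineRegion]:
    """Transcription to image: where on the image does this line sit?"""
    for region in regions:
        if region.line_id == line_id:
            return region
    return None


# Placeholder coordinates, not measured from any diary image.
REGIONS = [
    LineRegion("p1-l1", 100, 50, 800, 40),
    LineRegion("p1-l2", 100, 90, 800, 40),
]

print(line_at(REGIONS, 150, 60))
print(region_for(REGIONS, "p1-l2"))
```

Because the same records answer both questions, a viewer could highlight a transcription line when a reader clicks the image, or outline the image region when a reader selects a line.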
Figure 5. Wisnicki (left) and Schmitz examine the XML transcription of Livingstone's 1872 Journal at the Center for Digital Humanities and Culture, Indiana University of Pennsylvania.
The Electronic Edition
Wisnicki collaborated with the staff of the UCLA Digital Library to develop the final format of the electronic edition. This format drew on the Letter from Bambarre template, and so used a similar layout and included many of the critical elements incorporated into the template. The team also decided, however, to include a number of new features. Developing these features proved quite challenging, and it was only by sheer force of will that the UCLA Digital Library staff finished the beta version of the site in collaboration with Wisnicki during the last month of the project (October 2011). Additional work on refining the site continued until the spring of 2012.
The team designed the new site features to showcase the scholarly and scientific accomplishments of the Livingstone team. Most importantly, the present electronic edition links to, and so serves as the gateway to, the Livingstone spectral image archive created by Emery. This archive allows users to download and study all the XML files and raw and processed images (as 8-bit TIFF files with full metadata) produced by the project. As a result, the archive provides access to the uncompressed images behind all the JPEGs, and to the critically marked-up texts behind all the transcriptions, published through the electronic edition.
Figure 6. The UCLA whiteboard created by Lisa McAulay in the last month of the project to divide up and allocate remaining website tasks.
The electronic edition also contains a selection of custom-designed web pages:
Images and Transcriptions: Enables users to examine processed spectral ratio image versions of individual pages of the 1871 Field Diary alongside transcriptions. The embedded viewer allows users to view, rotate, and enlarge the cropped spectral ratio images.
Color & Spectral Images: Allows users to view color and processed spectral images side by side. Users also have the option to enable and disable synchronized scrolling.
Three Versions of the Text: A comparative page that enables users to study the three versions of Livingstone’s text (1871 Field Diary, 1872 Journal, 1874 published text).
Search Page: Enables users to search through and sort the full text of the 1871 Field Diary by keyword and significant XML-tagged content.
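The two kinds of search offered by the Search Page, by keyword and by XML-tagged content, can be sketched in Python as follows. This is not the site's implementation: the sample document is placeholder text, and the choice of `div type="page"`, `placeName`, and `persName` as the searchable structures is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Placeholder content only; none of this text comes from the diary.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<div type="page" n="1"><p>Placeholder text mentioning
  <placeName>Placeholder Town</placeName>.</p></div>
<div type="page" n="2"><p>Another placeholder line about
  <persName>Placeholder Person</persName>.</p></div>
</body></text></TEI>"""


def search_keyword(root, keyword):
    """Return the sorted page numbers whose full text contains the keyword."""
    hits = []
    for div in root.iter(f"{TEI}div"):
        if div.get("type") != "page":
            continue
        # Flatten the page to plain text and normalise whitespace.
        text = " ".join("".join(div.itertext()).split())
        if keyword.lower() in text.lower():
            hits.append(div.get("n"))
    return sorted(hits)


def search_tagged(root, tag):
    """Return the sorted, de-duplicated contents of every element named `tag`."""
    values = set()
    for element in root.iter(f"{TEI}{tag}"):
        values.add(" ".join("".join(element.itertext()).split()))
    return sorted(values)


root = ET.fromstring(SAMPLE)
print(search_keyword(root, "town"))
print(search_tagged(root, "persName"))
```

Searching on tagged content depends entirely on consistent encoding, which is one reason a written summary of encoding practices matters for a project of this kind.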
Finally, the electronic edition includes a Project History & Archive section (as users who have reached the current page will know), which offers a detailed and, at times, intimate look into the inner workings of The Livingstone Spectral Imaging Project. Users can now not only survey the different phases through which the project evolved, but also download a representative selection of raw working documents produced in the course of the team’s efforts. As a result, the Project History & Archive section both catalogues the people and resources required to undertake a significant spectral imaging and publication project and offers a roadmap for teams wishing to undertake similar projects in the future.
Documents for Download
  1. Livingstone XML Template, McAulay, February 2011
  2. Livingstone XML Template, Wisnicki, March 2011
  3. XML TEI P5 Encoding Practices: Summary Document
  4. XML Element-Attribute-Value list
  5. Website Task List, Wisnicki with notes from McAulay, October 2011
  6. Diagram of Livingstone Website Architecture, McAulay, March 2012
  7. XSLT Rendering Notes, Wisnicki, Spring 2012
  8. Image Rendering Notes, Wisnicki and Schmitz, Spring 2012