File Naming Conventions

David Livingstone Spectral Image Archive

Authors: Doug Emery
Adrian S. Wisnicki
Date: March 30, 2012

1 General Conventions

1.1 Folio Designation

All core data files are prefixed with a three-part folio designation, with the exception of files that begin with the DLC297a designation. The 297a files only have a two-part folio designation. There are 208 separate folio designations. The complete list of files is provided in the folio index.

Sample folio designations are:

  • DLC297b_149-146_012r
  • NLS10703_021_003r
  • PB_002_002r
  • DLC297a_035v

The first segment consists of the initials of the institution holding the given manuscript, followed by the institutional shelfmark. The only exceptions to this rule are the folia from the Peter Beard collection, which do not receive a shelfmark in the first segment.

The second segment indicates Livingstone's own page number(s), if provided. Roman numerals have been changed to Arabic numerals to ease reading. In addition, if a given folio contains two pages, the page number sequence in the second segment reflects the order of the pages from left to right in the image. For any portions of the diary where Livingstone does not provide page numbers, we have numbered the folia consecutively beginning with 001, with the same number being used for the recto and verso:

  • NLS10703_001_036r
  • NLS10703_001_039v
  • NLS10703_002_037r
  • NLS10703_002_038v

The third segment indicates the institutional page number(s), if provided. In addition, this segment includes the letter "r" (recto) or "v" (verso). For any portions of the diary lacking institutional page numbers, we have numbered the folia consecutively beginning with 001, with the same number being used for the recto and verso:

  • DLC1120b_001r_001r
  • DLC1120b_001v_001v

Undertext page numbers are not identified in this scheme.
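
The segment rules above can be sketched as a small parser. This is an illustration only, not part of the archive tooling: the function name and the returned keys are hypothetical, and the sketch assumes that in the two-part DLC297a form the omitted segment is Livingstone's page number.

```python
def parse_folio_designation(designation):
    """Split a folio designation such as 'DLC297b_149-146_012r' into segments."""
    parts = designation.split("_")
    if len(parts) == 3:
        holding, livingstone_pages, institutional = parts
    elif len(parts) == 2:
        # Assumption: the two-part form (DLC297a files) omits the
        # segment holding Livingstone's own page number(s).
        holding, institutional = parts
        livingstone_pages = None
    else:
        raise ValueError(
            f"expected 2 or 3 segments, got {len(parts)}: {designation!r}"
        )

    # The final segment ends in "r" (recto) or "v" (verso).
    side = institutional[-1:]
    if side not in ("r", "v"):
        raise ValueError(f"last segment must end in 'r' or 'v': {designation!r}")

    return {
        "holding": holding,
        "livingstone_pages": livingstone_pages,
        "institutional_folio": institutional[:-1],
        "side": "recto" if side == "r" else "verso",
    }


print(parse_folio_designation("DLC297b_149-146_012r"))
print(parse_folio_designation("DLC297a_035v"))
```

The parser keeps page numbers as strings so that leading zeros and two-page sequences such as 149-146 survive unchanged.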

1.2 File-Dependent Components

The remainder of the file name, including the extension, indicates the file type. Broadly, there are four types of files:

  1. TIFF image files, ending in tif
  2. Text metadata files, ending in txt
  3. MD5 checksum files, ending in md5
  4. XML text files, ending in xml

The meaning of the remaining file name components depends on the file type.
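
As a minimal sketch, the four file types can be told apart by the final extension alone. The dictionary and function names here are hypothetical and chosen only for illustration.

```python
import os

# Labels follow the four file types listed above.
FILE_TYPES = {
    "tif": "TIFF image",
    "txt": "text metadata",
    "md5": "MD5 checksum",
    "xml": "XML text",
}


def file_type(name):
    """Return the file type implied by the last extension of a file name."""
    extension = os.path.splitext(name)[1].lstrip(".").lower()
    return FILE_TYPES.get(extension, "unknown")


print(file_type("DLC297b_149-146_012r_0_A_ratio_by_0940.tif"))
print(file_type("DLC297b_149-146_012r_0_A_ratio_by_0940.tif.md5"))
```

Only the last extension is examined, so a name ending in .tif.md5 is reported as a checksum file rather than an image.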

2 Image Files

Each image file has a structured name that identifies the image content and image capture or processing type. The basic structure is illustrated by the following file name.

  • DLC297b_149-146_012r_0_A_ratio_by_0940.tif

Each segment of the filename is separated by underscore characters. The segments of the sample file are:

  • institutional initials plus shelfmark: DLC297b
  • Livingstone's page numbers, converted from Roman to Arabic numerals: 149-146
  • institutional folio number plus recto or verso designation: 012r
  • shot sequence letter: A
  • imaging and processing details: ratio_by_0940
  • extension: always tif

In other words, the named file is an image of a manuscript held at the David Livingstone Centre (DLC) under the shelfmark 297b. The imaged folio contains Livingstone's pages 149 and 146. The David Livingstone Centre has numbered this folio 012, and this is the recto (r) of the folio. The image belongs to shot sequence A and is a spectral ratio in which the 450 nm, 592 nm and 850 nm wavelengths are divided by the 940 nm wavelength (ratio_by_0940). The shot sequence and processing segments are described below.
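
The structure of the sample name can be sketched with a regular expression. This is an assumption-laden illustration: the pattern and the key names are hypothetical, the sequence letter is limited to A through D, and the numeric segment between the folio designation and the sequence letter (0 in the sample) is captured verbatim because this document does not explain it.

```python
import re

# folio designation, numeric segment, sequence letter, processing details.
IMAGE_NAME = re.compile(
    r"(?P<folio>.+?)_(?P<number>[0-9]+)_(?P<sequence>[A-D])_(?P<details>.+)\.tif"
)


def parse_image_name(name):
    """Split a processed image file name into its main parts."""
    match = IMAGE_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a recognised image file name: {name!r}")
    return match.groupdict()


print(parse_image_name("DLC297b_149-146_012r_0_A_ratio_by_0940.tif"))
```

The folio group is matched lazily, so it stops at the first position where a numeric segment and a sequence letter follow; this lets the same pattern handle both two-part and three-part folio designations.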

2.1 Shot Sequence Letters

Value: A letter or group of letters designating the shot sequence, that is, the group of images to which this image, or its source image or images, belongs.

A in the sample DLC297b_149-146_012r_0_A_ratio_by_0940.tif.

The possible values are: A, B, C, D. Each group of 12 to 16 registered images is assigned a shot sequence. The letters A, B, C, and D indicate the order in which the sequences were taken. The first group of images taken of DLC297b_149-146_012r is called shot sequence DLC297b_149-146_012r_0_A. If a second set was taken, that second sequence would have been called DLC297b_149-146_012r_0_B. A second set may be taken for any number of reasons: for example, an error in the first sequence, an adjustment to the position of the object, or an alternate aperture or exposure setting for one or more of the images in the group.

Note that the sequence letter (A, B, C, or D) says nothing about the quality of the images in one sequence with respect to another. One sequence was selected for each folio as the best for that folio.

2.2 Imaging and Processing Details

Value: One or more segments used to distinguish the image and ensure unique file names.

ratio_by_0940 in the sample DLC297b_149-146_012r_0_A_ratio_by_0940.tif

2.2.1 Image Processing

There are several types of images generated by special processing, using raw spectral images as sources. These images are:

Color Images

Type: color
  • File Suffix: color
  • Each color image is created using registered, 16-bit flattened TIFF images captured under five visible illuminant bands: 638 nm (red), 592 nm (amber), 535 nm (green), 505 nm (cyan), and 450 nm (royal blue). A set of linear formulae is used to calculate calibrated color values from the five bands at each pixel position, and each image is output using the CIE L*a*b* color space.

PCA Image Types

Type: PCA color
  • File Suffix: pca321r_pcolor, pca421r_pcolor, pca621r_pcolor, pca721r_pcolor
  • Pseudocolor image made up of principal component bands with the hue angle rotated
Type: PCA
  • File Suffix: pca321r, pca421r, pca621r, pca721r, pca321r_1, pca321r_2
  • Grayscale image that is extracted from a single channel of the corresponding pca###r_pcolor image (note: the ### indicates the principal component bands used)
Type: PCA multiplied threshold
  • File Suffix: pca321r_adapThresh_multiply, pca321r_1_adapThresh_multiply, pca321r_2_adapThresh_multiply, pca421r_adapThresh_multiply, pca621r_adapThresh_multiply, pca721r_adapThresh_multiply
  • Grayscale image that is the result of the multiplication of the thresholded grayscale image and the corresponding pca###r image (note: the ### indicates the principal component bands used)

Pseudocolor Image Types

Type: intercept
  • File Suffix: intercept
  • The infrared images (700 nm - 940 nm) were fit to a best straight line on a pixelwise basis. This generates "slope" and "intercept" images.
Type: packflat8
  • File Suffix: 0365_packflat8, 0450_packflat8, 0465_packflat8, 0505_packflat8, 0535_packflat8, 0592_packflat8, 0638_packflat8, 0700_packflat8, 0735_packflat8, 0780_packflat8, 0850_packflat8, 0940_packflat8, RABL_packflat8, RABR_packflat8, RAIL_packflat8, RAIR_packflat8
  • A linear contrast stretch applied to the 16-bit single-wavelength images. The black and white values were set 3 standard deviations away from the average value. The values beyond 3 standard deviations were clipped to black or white.
Type: pseudo_0505-0780
  • File Suffix: pseudo_0505-0780
  • The 505 nm and 780 nm wavelengths are combined in a no-veil pseudocolor image with the 780 in the red separation and the 505 in the blue and green separations.
Type: pseudo_0780
  • File Suffix: pseudo_0780
  • The 505 nm and 780 nm wavelengths from one side are put into the red and green separations, respectively. The 505 nm wavelength image of the reverse side is reversed and aligned with the front side, and placed in the blue separation.
Type: pseudoratio_0505-0780
  • File Suffix: pseudoratio_0505-0780
  • The 505 nm and 780 nm wavelengths are divided by the 940 nm wavelength and then combined in a standard pseudocolor image.
Type: RAIPratio
  • File Suffix: RAIPratio
  • Left and right raking infrared images are divided by the non-raking 940 nm image and used in a standard pseudocolor image.
Type: raking_irdiff
  • File Suffix: raking_irdiff
  • The left and right raking images in infrared are differenced, divided by the non-raking 940 nm wavelength, then linearly stretched to fit 6 standard deviations from white to black.
Type: RAPRratio
  • File Suffix: RAPRratio
  • The right raking blue and infrared images are divided by the non-raking 940 nm image and used in a standard pseudocolor image.
Type: RARR
  • File Suffix: RARR
  • The right raking blue image is divided by the right raking infrared image and then linearly contrast stretched.
Type: ratio_by_0940
  • File Suffix: ratio_by_0940
  • The 450 nm, 592 nm and 850 nm wavelengths are divided by the 940 nm wavelength, stretched to fit 6 standard deviations from white to black and put into the red, green and blue separations respectively.
Type: RIRL
  • File Suffix: RIRL
  • Left and right raking infrared images are differenced and linearly contrast stretched.
Type: sharpie_0505-0780
  • File Suffix: sharpie_0505-0780
  • The 505 nm and 780 nm wavelengths are combined in a no-veil pseudocolor image with the 780 in the red separation and the 505 in the blue and green separations. The sharpie image is made by linearly stretching the difference of the red and blue separations of the pseudocolor image.
Type: sharpieratio_0505-0780
  • File Suffix: sharpieratio_0505-0780
  • The 505 nm and 780 nm wavelengths are divided by the 940 nm wavelength and then combined in a standard pseudocolor image. The sharpie image is made by linearly stretching the difference of the red and blue separations of the pseudocolor image.

In the processing position, these images have the text provided in the "File Suffix" sections above.
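
The linear contrast stretch described for the packflat8 images can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the archive's processing code: the function name is hypothetical, the output is assumed to be 8-bit (0 to 255), the standard deviation is taken over all pixels of the image, and a flat list of pixel values stands in for a real 16-bit image.

```python
from statistics import mean, pstdev


def stretch_to_8bit(pixels, n_sigma=3.0):
    """Linearly stretch pixel values so mean +/- n_sigma maps to 0..255."""
    average = mean(pixels)
    sigma = pstdev(pixels)  # assumption: population standard deviation
    black = average - n_sigma * sigma
    white = average + n_sigma * sigma

    if white == black:
        # A constant image has no contrast to stretch; return black.
        return [0 for _ in pixels]

    stretched = []
    for value in pixels:
        scaled = (value - black) / (white - black)
        # Values beyond n_sigma are clipped to black or white.
        scaled = min(1.0, max(0.0, scaled))
        stretched.append(round(scaled * 255))
    return stretched


sample = [1000] * 99 + [60000]  # one very bright outlier pixel
result = stretch_to_8bit(sample)
print(result[0], result[-1])
```

The same function with n_sigma set differently could approximate the wider stretches mentioned for other image types, though how those stretches were actually parameterised is not stated here.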

2.2.2 Image Illumination

The PhotoShoot camera software, which controls the camera and lights, captures all images in a shot sequence, such as DLC297b_149-146_012r_0_A, in a regular order and appends a three-digit serial number to each resulting image: 001, 002, 003, and so on, up to 012 or 016. The captured images are Adobe Systems Digital Negative (TM) files and have the extension .dng. For the sequence DLC297b_149-146_012r_0_A, they are named thus:

  • DLC297b_149-146_012r_0_A_001.dng
  • DLC297b_149-146_012r_0_A_002.dng
  • DLC297b_149-146_012r_0_A_003.dng
  • ...
  • DLC297b_149-146_012r_0_A_012.dng or DLC297b_149-146_012r_0_A_016.dng

The serial number-to-illumination correspondences are:

  • 001 = 365 nm LED illumination
  • 002 = 450 nm LED illumination
  • 003 = 465 nm LED illumination
  • 004 = 505 nm LED illumination
  • 005 = 535 nm LED illumination
  • 006 = 592 nm LED illumination
  • 007 = 638 nm LED illumination
  • 008 = 700 nm LED illumination
  • 009 = 735 nm LED illumination
  • 010 = 780 nm LED illumination
  • 011 = 850 nm LED illumination
  • 012 = 940 nm LED illumination

and, if raking light illuminations were included in the shot sequence:

  • 013 = 940RR raking 940 nm illumination from the right
  • 014 = 465RR raking 465 nm illumination from the right
  • 015 = 940RL raking 940 nm illumination from the left
  • 016 = 465RL raking 465 nm illumination from the left

Note that illumination details including number of sources, wattage, spectral ranges, and their azimuthal angles are provided in the metadata.
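
The serial number correspondences can be expressed as a lookup table. The sketch below is illustrative only: the names are hypothetical, and serial 016 is taken to be 465 nm, following its 465RL tag.

```python
import os

ILLUMINATION_BY_SERIAL = {
    "001": "365 nm LED",
    "002": "450 nm LED",
    "003": "465 nm LED",
    "004": "505 nm LED",
    "005": "535 nm LED",
    "006": "592 nm LED",
    "007": "638 nm LED",
    "008": "700 nm LED",
    "009": "735 nm LED",
    "010": "780 nm LED",
    "011": "850 nm LED",
    "012": "940 nm LED",
    # Present only when raking light was part of the shot sequence.
    "013": "940 nm raking from the right",
    "014": "465 nm raking from the right",
    "015": "940 nm raking from the left",
    "016": "465 nm raking from the left",
}


def illumination_for(dng_name):
    """Return the illumination implied by the serial number of a .dng file."""
    stem, extension = os.path.splitext(dng_name)
    if extension.lower() != ".dng":
        raise ValueError(f"not a .dng file name: {dng_name!r}")
    serial = stem.rsplit("_", 1)[-1]
    if serial not in ILLUMINATION_BY_SERIAL:
        raise ValueError(f"unknown serial number {serial!r} in {dng_name!r}")
    return ILLUMINATION_BY_SERIAL[serial]


print(illumination_for("DLC297b_149-146_012r_0_A_001.dng"))
```

Serial numbers are kept as zero-padded strings so that they match the file names exactly.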

3 Supporting Files

There are two types of supporting files: text metadata 'sidecar' files and MD5 checksum files.

Each image file is accompanied by a text file containing all of the image's metadata and by an MD5 checksum file. Each file type is documented in the external documentation directory of this archive. Each supporting file has the same name as its parent file with either .txt or .md5 appended. For the file DLC297b_149-146_012r_0_A_ratio_by_0940.tif, these files are named:

  1. DLC297b_149-146_012r_0_A_ratio_by_0940.tif.txt, and
  2. DLC297b_149-146_012r_0_A_ratio_by_0940.tif.md5.
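
The naming rule for supporting files, and one possible way to check an image against its checksum file, can be sketched as follows. The function names are hypothetical, and the layout of the .md5 file is an assumption: the sketch expects the hexadecimal digest to be the first whitespace-separated token, as in md5sum output. The actual layout is described in the archive's external documentation.

```python
import hashlib


def sidecar_names(image_name):
    """Return the metadata and checksum file names for an image file."""
    return image_name + ".txt", image_name + ".md5"


def md5_hex(path, chunk_size=1 << 20):
    """Compute the MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(image_path):
    """Compare an image's MD5 digest with the one in its .md5 sidecar."""
    _, md5_path = sidecar_names(image_path)
    with open(md5_path, encoding="ascii") as handle:
        # Assumption: the digest is the first token in the file.
        recorded = handle.read().split()[0].lower()
    return md5_hex(image_path) == recorded


print(sidecar_names("DLC297b_149-146_012r_0_A_ratio_by_0940.tif"))
```

Reading in chunks keeps memory use low, which matters for large TIFF images.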