Monday, April 16, 2018

Specifications for Analog Video Digitization: Examining Community Practices


Here at the Bentley Historical Library, our inventory includes substantial amounts of
moving image materials in a wide variety of obsolete formats. Additionally, we have
prioritized a number of significant moving image collections for digitization in the near
future. To ensure preservation and access over the long term, we are
seeking to formalize our digitization strategy by establishing a contract with a vendor.
To initiate this process, we are writing a Request for Proposals (RFP) for moving
image digitization. One important part of the RFP is outlining detailed specifications
for the transfer of analog materials to digital files. Our main goals for developing
specifications for the RFP are to comply with community best practices as well as
meet the needs of the Library, our technical infrastructure, and our researchers.

There is currently no consensus in the library and archives community on a target
preservation format for analog video. In order to better understand current practices,
I began collecting and comparing specifications across institutions. After a thorough
online search for analog video digitization specifications, I found documentation
from fourteen organizations, including university, public, and state libraries and archives,
that have made their specifications openly available. To make review and comparison of
the specifications easier, I extracted common factors, such as wrapper and codec information,
and recorded them in a spreadsheet. The following findings are synthesized from
the aggregated specifications.


Specification Documentation. Presentation of specifications and terminology varied
widely across institutions; however, most of the specifications themselves were very
similar. Trends emerged, such as common wrapper and codec pairings, color space,
and chroma subsampling. Many institutions provided limited or incomplete
specifications compared to others. For example, a number of institutions did not include
any specifications for accompanying audio. Some documents offered multiple options
for a particular specification, often citing one as preferred and another as acceptable.

Most Common Wrapper and Codec. QuickTime (.mov)/Uncompressed (v210) is the most
commonly used wrapper and codec pairing, followed by Matroska (.mkv)/ffv1 (see table and
charts below). In the past few years, a growing number of institutions have adopted Matroska
(.mkv)/ffv1. This trend may be due to increased community support, including active tool
development and standardization efforts, as well as storage considerations.

Number of Institutions Using Pairing
QuickTime (.mov)/Uncompressed (v210)
Matroska (.mkv)/ffv1
Other pairings include: AVI/ffv1, AVI/JPEG2000, MXF/ffv1, MXF/Uncompressed or JPEG2000

File Size by Codec. The codec selected for a digitization project has a significant impact on
the amount of data produced. Some institutions choose lossless compression, such as
JPEG2000 or ffv1, to reduce data while maintaining a faithful copy of analog source material.
Based on a blind sampling, Indiana University found that the Uncompressed (v210)
codec produced files of approximately 100 GB per hour of content, while the ffv1 codec averaged
33.2 GB per hour. Choosing lossless compression was expected to reduce the amount of data
produced for their entire project by approximately 65%.
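
The roughly 100 GB per hour figure for v210 can be sanity-checked from the format itself. The sketch below assumes NTSC standard definition (720 x 486 at 30000/1001 fps, the common frame size noted later in this post), video only with no audio or wrapper overhead, and decimal gigabytes. The function names are illustrative. v210 packs six 10-bit 4:2:2 pixels into 16 bytes and pads each row to a multiple of 128 bytes.

```python
from fractions import Fraction


def v210_bytes_per_frame(width: int, height: int) -> int:
    """Bytes in one v210 frame: 6 pixels per 16 bytes, rows padded to 128 bytes."""
    # 128 bytes hold 48 pixels, so round the width up to a multiple of 48.
    padded_width = -(-width // 48) * 48
    row_bytes = padded_width * 16 // 6
    return row_bytes * height


def gb_per_hour(bytes_per_frame: int, fps: Fraction) -> float:
    """Video-only data volume for one hour, in decimal gigabytes."""
    return float(bytes_per_frame * fps * 3600 / 10**9)


def percent_reduction(uncompressed_gb: float, compressed_gb: float) -> float:
    """Percentage saved by the smaller figure relative to the larger one."""
    return (1 - compressed_gb / uncompressed_gb) * 100


# Assumption: NTSC SD source, 720 x 486 at 30000/1001 (about 29.97) fps.
NTSC_FPS = Fraction(30000, 1001)
frame_bytes = v210_bytes_per_frame(720, 486)
v210_rate = gb_per_hour(frame_bytes, NTSC_FPS)

print(f"v210: {frame_bytes} bytes/frame, {v210_rate:.1f} GB/hour")
print(f"33.2 vs 100 GB/hour: {percent_reduction(100, 33.2):.1f}% smaller")
```

Under these assumptions the calculation gives about 100.7 GB per hour for v210, in line with the sampled figure. The two per-hour averages on their own imply a reduction of about 67%, close to the approximately 65% expected for the project as a whole.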

Chart: Wrapper/File format (14 responses)

Chart: Video compression/Codec (14 responses)

Beyond Wrapper and Codec

Frame Size and Aspect Ratio. Most institutions specified an aspect ratio of 4:3 with a
frame size of 720 x 486 for standard definition video. Variations included
a frame size of 640 x 480 (SD) or 486 x 720 and an aspect ratio derived from the source
material (“Same as source”).

Color Space and Bit Depth. Color space was always YUV/YCbCr with 4:2:2 chroma
subsampling. YUV and YCbCr are often used interchangeably, but YUV is an analog
encoding whereas YCbCr is digital. Although YCbCr is technically more accurate, YUV
is an industry-accepted term and is understood to mean YCbCr when referring to digital
video. The requirement for bit depth was most often 10-bit; however, a few institutions
allowed for 8- or 10-bit.

Frame Rate and Scanning. Frame rate was most commonly maintained from the analog
source; however, some specified 29.97 or 30 fps. One organization required 60 fps based
on the use of interlaced scanning. Three specifications required interlaced scanning, while
three others maintained the original scanning. Only one organization chose progressive scanning.

Timecode and Closed Captioning. When included, specifications for timecode and
closed captioning were always to maintain the original. For timecode, additional instructions
were sometimes included for adding synthetic timecode when no original exists.

Audio Specifications. When specifications for accompanying audio were included, most
institutions required uncompressed PCM at 48 kHz and 24-bit, with channels the same as the
source. Variations included one organization specifying 2-channel audio and another
allowing for 16- or 24-bit resolution.
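
For storage planning, the audio specification above translates directly into a data volume. This is a minimal sketch, assuming two channels (most specifications say only "same as source") and decimal gigabytes. The function name is illustrative.

```python
def pcm_bytes_per_hour(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Uncompressed PCM size for one hour, ignoring container overhead."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * 3600


# Assumption: stereo audio; compare the 16-bit and 24-bit options at 48 kHz.
for depth in (16, 24):
    size = pcm_bytes_per_hour(48_000, depth, channels=2)
    print(f"48 kHz / {depth}-bit / 2 ch: {size / 10**9:.2f} GB per hour")
```

At either bit depth, stereo audio comes to roughly 1 GB per hour or less, which is small next to the video figures reported in the File Size by Codec section.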

I hope these findings will be helpful to others who might be in the process of writing an
RFP or selecting a preservation format for their video materials. It is important to mention
that this sampling is by no means exhaustive. I have since discovered additional specifications;
however, we found this sample size sufficient for our comparison. Feel free to get in touch with
questions or for more information about this work.

In an upcoming post, Melissa Hernández-Durán, Lead Archivist for Audio Visual Curation,
and I will write about our experiences developing metadata requirements for moving image
digitization. Stay tuned!

Friday, February 2, 2018

Conservation Treatment Tiers: An Aid to Prioritization

Staff members often need to know how much time a repair might take in order to prioritize work or to give an estimate to a donor who would like to sponsor a project. In 2017 the staff in the Bentley Conservation Lab devised a more comprehensible method of estimating repair time. A three-tier system didn’t seem detailed enough, so we started with four and tweaked it over the next couple of months until we settled on our five-tier system.
Our Tier One category (less than one-hour repair) responds to requests for a quick fix (examples below). Tier Five designates projects that are very involved and will take more than ten hours. There is a lot of room between “less than one hour” and “more than ten,” so we broke it down into three more tiers that fit with our most common types of projects.
The legend (below right) hangs in our lab for easy reference. The bar graph is useful in reporting to our administration (through the Business Intelligence Committee) about the types of projects we handle and how long they take. It doesn’t report ongoing work, just the projects that have been completed each month.

Graph for Business Intelligence Report and legend for Conservation Lab

Tier 1: < 1 Hr.

A Tier One item might be popped in between longer projects or at the end of a day when starting a larger project doesn’t seem efficient. A Tier One is often done immediately because it is needed by the digitization lab, which makes it high priority. Another example is when a researcher in our reading room requests an item and Reference staff finds the item in such need of repair that it might be damaged in handling. Some examples are mending small tears, ironing wrinkles, and removing sewing or staples.

Ironing on a quick mend

Tier 2: 1 - 3 Hrs.

Tier Two covers slightly more time-consuming repairs such as making portfolio-style boxes or encapsulating scrapbook leaves when they are too fragile to be rebound but must be protected.

Scrapbook pages encapsulated in polyester film

Inside view of a portfolio-style box

A finished portfolio-style box

Tier 3: 3 - 5 Hrs.

Examples of Tier Three jobs are mending maps or drawings, depending on the extent of the tears and number of items. The photos show tears in a map and previous tape repair that needs to be removed from fragile tracing paper.

Damaged drawings on tracing paper

Multiple types of tape on tracing paper drawing

Map torn and separated at the fold

Tier 4: 5 - 10 Hrs.

This Ann Arbor Film Festival document was a hand-made scroll with many types of tapes and adhesives, definitely Tier Four, as were the founding documents of the University of Michigan Philosophical Society. That book was in pieces and so important to the university’s history that it was given a ¾ leather binding.

12-foot Ann Arbor Film Festival collaged scroll

Two images of the scroll, detailing tape, adhesive, and loose items

University of Michigan Philosophical Society founding documents, before treatment

Inside detail

Detail of rusty staples and worn signature folds

Finishing the ¾ leather binding

Tier 5: > 10 Hrs. 

Tier Five projects are those that take over ten hours, and we try to estimate just how many that might be. In this case we had a scrapbook of extremely acidic and crumbling paper with newspaper articles that were fragile, wrinkled, and torn. We photographed each page before removing the items, then used those photos for proper placement on the new pages. The new scrapbook was larger so the articles could be displayed without overlapping.

Scrapbook, before treatment

Scrapbook pages were numbered and photographed for identification

The photos were used to match fragments of articles for proper placement

Reconstructed articles and polyester film pockets on new scrapbook leaves

Original Cover

Finished scrapbook, at long last!

Our treatment tiers are serving their purpose and mesh well with the Bentley's system of prioritization. (Hint: it involves a COLORFUL spreadsheet!) More about that in our next rip-roaring installment.