Friday, July 17, 2015

ArchivesSpace + Blacklight = ArcLight

We've mentioned elsewhere on this blog that as part of our Mellon-funded Archivematica-ArchivesSpace-DSpace Workflow Integration project, we are exploring other community initiatives (such as ArcLight) to identify possible synergies and integration points with our endeavors. We also mentioned in a post on what ArchivesSpace isn't that we are contributing to (and very excited about!) the ArcLight project out of Stanford. Well, we thought it was time to stop being coy and get on with it. This is a post on ArcLight--what it is, what it isn't, what's next, and how we've contributed to it so far!

ArcLight


Taken straight from their website, ArcLight is "an effort to build a Hydra/Blacklight-based environment to support discovery (and digital delivery) of information in archives, initiated by Stanford University Libraries."

Objectives


ArcLight has three preliminary objectives:
  1. It will support discovery of physical and digital objects, including finding aids described using Encoded Archival Description (EAD) and, for the latter, presentation and delivery of digital materials. 
  2. It will be compatible with Hydra and ArchivesSpace. 
  3. It will be developed, enhanced and maintained by the Hydra/Blacklight community.

With regard to the first objective, we try to support discovery of physical and digital objects, but, as with many things, we could always do it better. We got especially excited about the fact that ArcLight hopes to support presentation and delivery of digital materials. Currently, our users have to jump back and forth between the system we use to display EADs and the system we use to provide access to our born-digital and digitized collections (and on top of that, in order to actually look at those digital objects, end users have to download them), so this would be a major improvement!

With regard to the second objective, compatibility with Hydra and ArchivesSpace is exactly what we need (although not necessarily in that order). We'll be going live with ArchivesSpace (fingers crossed!) sometime between now and March 31st, 2016, and, as part of an MLibrary-wide effort, we'll be moving to a Hydra-based implementation of Deep Blue in the "medium-term" (two to three years). In case you don't know the back story on that latter point, shortly after the grant was awarded, MLibrary decided to adopt Hydra as a repository platform. In light of this development, we hope to fulfill grant requirements by integrating DSpace into the workflow, but will take care to ensure that solutions for sharing data and metadata between systems will also be appropriate in a Hydra environment (and, in general, repository agnostic).

We're also excited about the fact that ArcLight could be integrated with ArchivesSpace. The process now to add or update EADs in our online display is rather cumbersome, involving at least three versions of a finding aid, two versions of the EAD, and up to four people (including one who doesn't even work at the Bentley!) to get them up and make them live. Integration with ArchivesSpace would cut all of those numbers down to one.

Finally, with regard to the third objective, we're interested in something that is developed by the Hydra/Blacklight community for the same reason we're interested in something that is developed by the Archivematica, ArchivesSpace or DSpace community: the community.

ArcLight Design Process


If you're interested in the ArcLight design process, you can read more about it here. In short, it consists of three stages:

Discovery >>> Information Architecture >>> Interaction & Visual Design (and Development)

The second phase of "Discovery," which kicked off their user-centered design process to produce documentation to guide development, and which we have been contributing to, is just now coming to a close.  Stanford hopes to start the actual development of ArcLight by 2016, and I for one can't wait to see what happens!

What It Isn't


Needless to say, we got very excited about all the things that ArcLight might be. But before I go any further, I need to be upfront about three very important things that ArcLight is not:

  1. It isn't the novel.
  2. It isn't the comic.
  3. It isn't the album.


Three of the things that ArcLight is not. [1] [2] [3]

In fact, I think Andrew Berger, Digital Archivist at the Computer History Museum, said it best (in reference to yet another thing that ArcLight is not):


In All Seriousness, What It Isn't


We did actually get very excited about ArcLight, which is why we began contributing to the project in the first place. We see a lot of synergy between what they're doing and what we're doing. In fact, sometimes we get so excited thinking about all the things ArcLight and, for that matter, Hydra, could be that we forget one of their more important characteristics: that they aren't actually anything, at least not yet, at least not for us in a tangible way.

And the danger about that, of course, at least for [over?]eager archivists like ourselves, is that we end up wanting these systems to be everything! As a result, our Stakeholder Interest and Goals document, which I'll discuss in further detail below, is a bit pie-in-the-sky; we can only hope that our high expectations won't set us up for disappointment!

How We've Contributed


We've contributed to the ArcLight project in two significant ways: by conducting some user interviews and by contributing the aforementioned Stakeholder Interest and Goals document. I'll spend a little bit of time talking about the user interviews, but the rest of the post will mostly be devoted to the Stakeholder Interest and Goals document.

User Interviews


First, doing these user interviews was fun! It was especially interesting to hear from the perspective of both archivists (two from Curation and one from Reference) and researchers. It was also a good reminder that, from a usability standpoint, we probably should have been doing interviews (or something like them) all along.

These interviews have been transcribed and will feature heavily in the next stage of the ArcLight design process. In fact, I just received an email this week that they need to comb through the transcripts and note common issues raised, relevant user scenarios, good quotes that indicate user goals, etc. According to Gary Geisler at Stanford, they will likely share some summary/distillation of the interviews with the broader ArcLight design collaborator group later.

Stakeholder Interest and Goals document


We also created a Stakeholder Interest and Goals document. I've embedded it below (just to be fancy), but you can also take a look at it here.


Basically, this document gives an overview of our access stack (is that a thing?), the grant project and our interest in ArcLight, as well as an acknowledgement that we are ignorant about some parts of how this whole thing will work, and that some of the functionality we're about to describe may in actuality reside in a repository rather than in ArcLight.

Archivists


In case you aren't interested in reading the whole thing, here are some of my favorite archivist goals (with commentary):
Attractive, best-in-class discovery interface for archival content, flexible enough to change as quickly as best practices for web design change.
It only seems to take two years for a website to look ten years old. Weird how that happens.
Support and recognize EAD elements for search and disambiguation.  At the same time, move away from presenting EAD finding aids as static objects. 
The current way we display EADs allows for some search and disambiguation of EAD elements, but for the most part, EADs don't end up looking or feeling a whole lot different online than when printed out. We're excited about all the ways a Blacklight interface might enhance searching and browsing online.
Support and recognize PREMIS rights and conditions governing access/use so these metadata are acted upon in conducting and presenting search results.  Limit access to archival components and digital objects in a range of ways based on PREMIS statements, including full embargo of highly sensitive content (such that this content would not display in public search results).
The PREMIS rights data model is certainly a richer representation of rights information than anything machine actionable that we do at the moment. We're actually gearing up here for a big discussion of how rights information will be shared in Archivematica and ArchivesSpace later on today! I'm sure a future post on this blog will be devoted to rights and rights management as it relates to the work we're doing for this grant.
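To make the idea of "acting upon" rights metadata a little more concrete, here is a minimal sketch in Python of how search results might be filtered by access conditions. Everything in it is an assumption for illustration: the record structure, the field names (`restriction`, `embargo_until`) and the function `visible_results` are hypothetical and greatly simplified, and they are not the actual PREMIS schema or anything ArcLight has committed to.

```python
from datetime import date

# Hypothetical, simplified records. This is NOT the real PREMIS schema;
# the field names are made up purely for illustration.
RECORDS = [
    {"title": "Open correspondence", "restriction": None, "embargo_until": None},
    {"title": "Student records", "restriction": "embargo", "embargo_until": date(2040, 1, 1)},
    {"title": "Donor files", "restriction": "reading_room_only", "embargo_until": None},
]


def visible_results(records, today):
    """Return public search results, hiding anything still under embargo."""
    results = []
    for record in records:
        restriction = record["restriction"]
        if restriction == "embargo":
            until = record["embargo_until"]
            if until is None or today < until:
                continue  # still embargoed (or embargoed indefinitely): never shown
            restriction = None  # the embargo has lapsed, so treat the record as open
        results.append({"title": record["title"], "access_note": restriction or "open"})
    return results


if __name__ == "__main__":
    for hit in visible_results(RECORDS, date(2015, 7, 17)):
        print(hit["title"], "-", hit["access_note"])
```

In this sketch, embargoed material is dropped before results are ever displayed, while less restrictive conditions travel along with the result so the interface could explain them to the researcher.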
Integration with other platforms, like Aeon, Mirlyn (local catalog), HathiTrust, Archive-It, Archivematica, search engines and metadata aggregators such as ArchiveGrid and DPLA.
We want it all! Not everybody (and sometimes not even a majority of people) comes to library and archives collections through the website. We want to enable everyone to find what they need however they end up getting here in the first place.
Assessment through search logs, analytics, download reports, etc.
Again, this is something we're trying to be better about. This would probably require authentication, especially as we are more concerned than ever about determining impact of archival collections on, for example, undergraduate education outcomes and success.
Ensure that any updates, revisions, or additions to ASpace descriptive, administrative, and rights metadata are immediately reflected in the ArcLight interface.
This would help with the cumbersome process I described above.

Researchers


Some of my favorite researcher goals include:
Integrate discovery and display/streaming of digitized/born-digital content, so that researchers don’t have to switch from a discovery layer to a repository.
This was mentioned above, but to elaborate, this would actually prevent researchers from having to switch from a discovery layer to multiple repositories and find their way back again, which is currently the case, since we have different repositories for different media types.

Search options:
  • Provide full-text search of indexable content (including OCR’d scans, plain text, PDFs, Word documents, and/or other relevant file formats).
  • Limit searching/browsing/faceting to particular collections.
  • Permit fuzzy searching/approximation so that researchers do not need to know the exact spelling of subjects, names, or other keywords.

A brief "day in the life" look at our server logs showed that users were having a lot of trouble searching our collections. They were getting thrown off by searches that contained typos and by searches that didn't operate Google-style. We'd like for that not to be an issue.
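As a rough illustration of the fuzzy searching we have in mind, here is a minimal sketch that uses Python's standard-library `difflib` to match a misspelled query against a list of index terms. The term list and the `suggest_terms` function are hypothetical, and `difflib` is only a stand-in: we are not suggesting this is how ArcLight or Blacklight would (or should) implement approximate matching.

```python
import difflib

# Hypothetical index terms; a real system would draw these from its search index.
INDEX_TERMS = ["Michigan", "correspondence", "photographs", "Bentley", "athletics"]


def suggest_terms(query, terms=INDEX_TERMS, cutoff=0.75):
    """Return index terms that approximately match a possibly misspelled query."""
    # Compare case-insensitively, but hand back the original spelling of each term.
    lowered = {term.lower(): term for term in terms}
    matches = difflib.get_close_matches(query.lower(), list(lowered), n=3, cutoff=cutoff)
    return [lowered[match] for match in matches]


if __name__ == "__main__":
    print(suggest_terms("Michgan"))
    print(suggest_terms("corespondence"))
```

The `cutoff` parameter controls how forgiving the match is: lower values tolerate more typos but return more false positives, which is the kind of trade-off a real discovery interface would need to tune against actual search logs.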
Communicate relationships between physical and born-digital/digitized components of collections in a usable and meaningful way.
Context is something we talk about a lot here. We don't want researchers to limit themselves to our digital collections at the expense of our physical collections just because they are easier to access. Ideally, this aspect of ArcLight would work well for those with experience with archives and those without.
For complex digital objects, display files/groups of content/archival components in a meaningful way. Allow researchers to view all the files in a folder (or files in a finding aid) without having to go back and forth to the finding aid.
We'd love to see examples of any repository that can do this in a meaningful way for hierarchical, heterogeneous collections of born-digital objects (not disk images). Feel free to leave a comment if this describes your repository!

Library Information Technology


We're currently working this out! As a stakeholder, they represent many other stakeholders, so their task is a bit more daunting!

Recap


So, as a recap, ArcLight is an exciting initiative out of Stanford that you should follow! Our grant project focuses heavily on more of the back-of-the-house aspects of the curation of digital archives, but an attractive and user-friendly interface is the second half of the equation! In fact, it's often what attracts donors in the first place!

If you are interested in participating in the design process for ArcLight, please subscribe to the mailing list used for ArcLight design-related announcements and discussions by sending an email to:

arclight-design-join (at) lists (dot) stanford (dot) edu

[1] "Arc-Light-cover" by Source. Licensed under Fair use via Wikipedia - https://en.wikipedia.org/wiki/File:Arc-Light-cover.jpg#/media/File:Arc-Light-cover.jpg
[2] "Arclight (Marvel villain)" by Source. Licensed under Fair use via Wikipedia - https://en.wikipedia.org/wiki/File:Arclight_(Marvel_villain).jpg#/media/File:Arclight_(Marvel_villain).jpg
[3] "Arclightlau" by Source. Licensed under Fair use via Wikipedia - https://en.wikipedia.org/wiki/File:Arclightlau.jpg#/media/File:Arclightlau.jpg
