Digitization in Archives: An Interview with Kathryn Hujda

Kathryn Hujda is an assistant curator at the Performing Arts Archives and Upper Midwest Literary Archives, University of Minnesota Libraries Archives and Special Collections. This is the second part of a two-part interview.


MG: What is a special collection?

KH: Special collections contain materials that are unique or rare and are therefore subject to specialized storage and security measures. Unlike the books in a circulating library, materials in a special collections library can’t be checked out or loaned. Instead, they must be accessed in person and/or under staff supervision. These stricter policies are in place because the objects in archives or special collections are irreplaceable. Typical formats include manuscripts and drafts, correspondence, diaries, still images, moving images, rare books or books with marginalia, and other ephemera.


MG: Once an archive is established for a particular person, institution, etc., are materials added to the collection?

KH: Yes! And more often than one might suspect! It’s not uncommon for writers (or their families) to discover letters squirreled away in odd places, or boxes forgotten in the basement, months or years after a collection is initially established. Some donors prefer to make small deposits over a period of time. We even have established relationships with a few independent publishers that send new materials to the archives on a regular basis as a part of their basic records management.


MG: Many collections all over the world are being preserved digitally. More and more collections are becoming available online, making it easier for the public to access the information they contain. Are you involved in digitizing collections? Do you see movement in this area at your institution specifically and at other institutions in general?

KH: There’s a very real and urgent need to digitize content in archives and special collections, both where I work and at other repositories. The need partly arises from a desire to stay relevant. In the age of the internet, it’s becoming easier and easier to conduct research without ever leaving your home. Even more, however, the need arises from the desire to increase access. As archivists, we consider ourselves to be guides, not gatekeepers. We strive to keep collections as open and accessible as possible. Digitizing content in order to make it freely available online, or in order to preserve decaying or at-risk artifacts, is just another way to achieve that mission.


MG: Of the collections you assist with preserving, maintaining, and making available to the community, how much is digitized? This seems so valuable in terms of collating information and making research, investigation, and discovery more efficient and available. Are there short- or long-term plans to digitize more content?

KH: When we digitize content at my repository, we typically do so by researcher request or through specific grant projects geared toward digitization and access. Though we have made great strides in the past few years, only a very small percentage of the collections has been digitized. To put things in perspective: the Upper Midwest Literary Archives preserves about 1,500 boxes of materials. Each box includes approximately 1,800 sheets of paper, so that’s about 2,700,000 sheets of paper. Each sheet requires time to create metadata (information to make the object findable) and time to scan. Let’s estimate a modest 5 minutes per object for that work. That’s 13,500,000 minutes, or 225,000 hours, or 5,625 forty-hour workweeks, or well over 100 years of full-time work for one person to digitize our entire repository. And these figures don’t account for objects that are two-sided, or audio/visual materials, or the new materials we continue to see each year. We have to be judicious and take advantage of opportunities as they come because there is simply too much material to digitize in full.
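
The back-of-the-envelope estimate above can be reproduced in a few lines of Python. This is only a rough sketch for readers who want to check the arithmetic or try their own numbers. The box count, sheets per box, and minutes per sheet come from the interview. The 40-hour workweek and 52 working weeks per year are assumptions, chosen because they are what the quoted totals imply, and the function and variable names are illustrative.

```python
# Figures quoted in the interview.
BOXES = 1_500
SHEETS_PER_BOX = 1_800
MINUTES_PER_SHEET = 5

# Assumptions: a 40-hour workweek and 52 working weeks per year,
# for a single full-time worker. These match the quoted totals.
HOURS_PER_WORKWEEK = 40
WORKWEEKS_PER_YEAR = 52


def digitization_estimate(boxes=BOXES,
                          sheets_per_box=SHEETS_PER_BOX,
                          minutes_per_sheet=MINUTES_PER_SHEET):
    """Return the step-by-step totals for scanning every sheet once."""
    sheets = boxes * sheets_per_box
    minutes = sheets * minutes_per_sheet
    hours = minutes / 60
    workweeks = hours / HOURS_PER_WORKWEEK
    years = workweeks / WORKWEEKS_PER_YEAR
    return {
        "sheets": sheets,
        "minutes": minutes,
        "hours": hours,
        "workweeks": workweeks,
        "years": years,
    }


estimate = digitization_estimate()
for label, value in estimate.items():
    # Print each total rounded to a whole number with thousands separators.
    print(f"{label}: {value:,.0f}")
```

Changing `minutes_per_sheet`, or dividing the final figure by the number of staff available, gives a quick sense of how sensitive the total is to each input. Like the original estimate, the sketch ignores two-sided objects, audio/visual materials, and newly arriving collections.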

There are other barriers to digitization besides time and money. When we receive a new collection, we acquire the rights to the physical objects only. For the large majority of our collections, copyright is retained by the original copyright holder. Digitizing objects without the permission of the copyright holder is sketchy territory. And it’s particularly problematic for literary collections, where a single collection can involve hundreds of copyright holders: letter writers, presses, and the writers or their families themselves.

And even if we were somehow able to digitize an entire collection, regardless of the time, money, and permissions required to do so, who’s to say the digital file formats we create will still be accessible 100 years from now? These are some of the issues that archivists struggle with when digitizing materials.


MG: I’ve been researching grants available for preservation of collections. It seems like there are a lot of opportunities out there for institutions to find ways of capturing collections digitally.

KH: Financial support for institutions digitizing content is definitely growing. However, as more and more content is created electronically (camera phone photos or desktop publishing documents, for example), there is also an urgent need to create strategies and best practices for preserving and providing access to born-digital files. Dedicated staff and funding go a long way in helping us untangle issues related to digitization for preservation and barriers to digital access.



Melissa Gordon
Advisory Board Member