Data Lifeboat Update 5: Prototypes and policy

[Spectators taking photographs at tomb of John F. Kennedy. Arlington National Cemetery] (LOC)
at The Library of Congress

We are now past the midpoint of our first project stage, and have our three basic prototype Data Lifeboats. At the moment, they run locally via the command line and generate rough versions of what Data Lifeboats will eventually contain—data and pictures.

The last step for those prototypes is to move them into a clicky web prototype showing the full workflow—something we will share with our working group (but may not put online publicly). We are working towards completing this first prototyping stage around the end of June and writing up the project in July.

We’ve made a few key decisions since we last posted an update, namely about who we’re designing for and what other expertise we need to bring in. We still have more questions than answers, but really, that’s what prototyping is for.

Who might do which bit

It took us a while to get to this decision, but once we had gone through the initial discovery phase, it became clear that we need to concentrate our efforts on three key user groups:

Flickr members – People who’ve uploaded pictures to Flickr, have set licenses and permissions, and may either be happy or not happy for their pictures to be put into Data Lifeboats.
Data Lifeboat creators – Could be archivists or other curatorial types looking to gather sets of pictures to copy into archives elsewhere, whether that be an institution like The Library of Congress, or a family archivist with a DropBox account.
Dock operators – This group is a bit more speculative, but, we envision that Data Lifeboats could actually land (or dock) in specific destinations and be treated with special care there. Our ideal scenario would be to develop a network of docks–something we’ve been calling a “Safe Harbor Network”—made up of members that are our great and good cultural organizations: they are already really good at keeping things safe over the long term.

It’ll be good to flesh the needs and wants of these three groups out in more detail in our next stage. If you are a Flickr member reading this, and want to share your story about what your Flickr account means to you, we’d love to hear it.

Web archive vs object archive

Some digital/web preservation experts take the opinion that it’s archivally important to also archive the user interface of a digital property in order to fully understand a digital object’s context. This has arguably resulted in web archives containing a whole lot more information and structural stuff than is useful or necessary. It’s sort of like archiving the entire house within which the shoebox of photos was found.

We have decided that archiving the flickr.com interface itself is not necessary for a Data Lifeboat, and we will be designing a special viewer that will live inside each Data Lifeboat to help people explore its contents.

Analysing the need for new policy

The Data Lifeboat idea is about so much more than technology. Even though that’s certainly challenging, the more we think about it, the more challenging the social and ethical aspects are. It’s gritty, complex stuff, made moreso by the delicate socio-technical settings available to Flickr members, like privacy, search settings, and licensing. The crosshatch of these three vectors makes managing stable permissions over time harder than weaving a complicated textile!

Once we narrowed down our focus to these specific user groups it also became clear that we need to address the (very) complex legal landscape surrounding the potential for archiving of Flickr images external to the service. It’s particularly gnarly when you start considering how permissions might change over time, or how access might shift for different scales of audience. For example, a Flickr member might be happy for Data Lifeboats containing their images to be shared with friends of friends, but a little apprehensive about them being shared with a recognized cultural institution that would use them for research. They may be much less happy for their Flickr pictures to be fully archived and available to anyone in perpetuity.

To help us explore these questions, and begin prototyping policies for each type of user group we foreses, we have enlisted the help of Dr. Andrea Wallace of the Law School at the University of Exeter. She is working with us to develop legal and policy frameworks tailored to the needs of each of these three groups, and to study how the current Flickr Terms of Service may be suitable for, or need adaption around, this idea of a Data Lifeboat. This may include drafting terms and conditions needed to create a Data Lifeboat, how we might be able to enhance rights management, and exploring how to manage expiration or decay of privacy or licensing into the future.

Data Lifeboat prototypes

We have generated three different prototype Data Lifeboats to think with, and show to our working group:

Photos tagged with “Flickrhq”: This prototype includes thousands of tagged images of ‘life working at Flickr’, which is useful to explore the tricky aspects of collating other people’s pictures into a Data Lifeboat. Creating it revealed a search foible, whereby the result set that is delivered by searching via a tag is not consistent. Many of the pictures are also marked as All Rights Reserved, with 33% having downloads disabled. This raises juicy questions about licensing and permissions that need further discussion.
Two photos from each Flickr Commons Member: We picked this subset because Flickr Commons photos are earmarked with the ‘no known copyright restrictions’ assertion, so questions about copying or reusing are theoretically simpler.
All photos from the Library of Congress (LoC) account: Comprising roughly 42,000 photos also marked as “no known copyright restrictions,” this prototype contains a set that is simpler to manage as all images have a uniform license setting. It was also useful to generate a Data Lifeboat of this size as it allowed us to do some very early benchmarking on questions like how long it takes to create one and where changes to our APIs might be helpful.

Preparing these prototypes has underscored the challenges of balancing the legal, social, and technical aspects of this kind of social media archiving, making clear the need for a special set of terms & conditions for Data Lifeboat creation. They also reveal the limitations of tags in capturing all relevant content (which, to some extent, we were expecting) and the user-imposed restrictions set on images in the Flickr context, like ‘can be downloaded.’

Remaining questions?

OMG, so many. Although the prototypes are still in progress, they have already stimulated great discussion and raised some key questions, such as:

How might user intentions or permissions change over time and how could software represent them?
How could the scope or scale of sharing influence how shared images are perceived, updated, and utilized?
How can we understand how different use cases and how archivists/librarians could engage with the Data Lifeboats?
How important is it to make sure Data Lifeboats are launched with embedded rights information, and how might those decay over time?
How should we be considering the descriptive or social contexts that accompany images, and how should they inform subsequent decisions about expiration dates?

Long term sustainability and funding models

It’s really so early to be talking about this – and we’re definitely not ready to present any actual, reasonable, viable models here because we don’t know enough yet about how Data Lifeboats could be used or under what circumstances. We did do a first pass review of some obvious potential business models, for example:

A premium subscription service that allows Flickr.com users to create personalized Data Lifeboats for their own collections.
A consulting service for institutions and individuals who want to create Data Lifeboats for specific archival purposes.
Developing training and certification programs for digital archivization that uses Data Lifeboats as the foundation.
Membership fees for members of the Safe Harbor network, or charging fees for access to the Data Lifeboat archives.

While there were aspects to each that appealed to our partners, there were also significant flaws so overall, we’re still a long way from having an answer. This is something else we’re planning to explore more broadly in partnership with the wider Flickr Commons membership in subsequent phases of this project.

Next steps

This month we’ll be wrapping up this first prototyping phase supported by the National Endowment for the Humanities. After we’ve completed the required reporting, we’ll move into the next phase in earnest, reaching out to those three user groups more deliberately to learn more about how Data Lifeboats could operate for them and what they would need them to do.

Two upcoming in-person events!

We’re also very happy to be able to tell you the Mellon Foundation has awarded us a grant to support this next stage, and we’re especially looking forward to running two small events later in the year to gather people from our Flickr Commons partner institutions, as well as other birds of a feather, to discuss these key challenges together.

If you’d like to register your interest in attending one of these meetings, please let us know via this short Registration of Interest form. Please note, these will be small, maybe 20ish people at each, and registering interest does not guarantee a spot, and we’ve only just begun planning in earnest.

Flocks of Blue Geese and Snow Stop at the Squaw Creek National Wildlife Refuge near Mound City, Missouri…10/1974
at The U.S. National Archives

Zoology specimens, Australasian Antarctic Expedition Reports, 1911-1914

From Desiderata to READMEs: The case for a C.A.R.E.-full Data Lifeboat Pt. II

The second of a two-part blog post detailing possible prompts for Data Lifeboat creators to encourage ethical and informed collecting

Welcoming our first 2025 Research Fellows

Emily Fitzgerald & Molly Sherman are joining us as our first research fellow pair to continue development of their Reproductive Reproductions project.

From Desiderata to READMEs: The case for a C.A.R.E.-full Data Lifeboat Pt. I

The first of a two-part blog post detailing the origins and approaches to ethical archiving in the Data Lifeboat tool.

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.