Flickr Commons: Grand Galleries, Admired Albums

Sorting, arranging, and displaying images from the Commons and elsewhere on Flickr

Two photographs of women, side by side in a photo album. There is a ghostly face behind the women in each image

[Album with Spirit Photographs] (Preus Museum)

Flickr Commons is a great place to go to illustrate your thoughts. I’ve used it for talking about Daylight Saving Time, birdwatching, and reminding people to take some time off for the weekend.

The rich collection of millions of images–all of them free to use, re-use, and repurpose thanks to the No Known Copyright Restrictions designation–is a source of endless fascination.

The Commons has a sense of curation and organization, and the caring attention of many disparate and diverse conservators, but you can also get the buzz of a personal, serendipitous discovery. It’s the feeling Jessie Ransom describes:

…you can walk in looking for one thing and leave with so much more than you knew you wanted or needed.

Look at one Flickr Commons item and you can see its connections to other items, within the Commons and beyond. The two main organizing methods are Albums and Galleries.

  • Albums and Collections (sets of albums) – a member curating and organizing their own photos
  • Galleries – a member curating photos from others’ collections

Admirable Albums

Here’s an example photo, a favorite from the Library of Congress.

A Library salute to National Photography Month and the photographer’s skill for staging eye-catching compositions  (LOC)

 

Going to that photo’s web page shows where else it appears.

 

screenshot from flickr.com showing six of the 37 galleries this photo has been added to, and the one album it appears in

It’s in one album from the LOC, called Not An Ostrich, and thirty-seven different galleries including “People with books,” “badass women,” and “Taking on the World,” all of which are fun to explore.

Unlike physical photographs, digital images can be in more than one album at once, so this astronaut photo from NASA is in an album called Astronauts and also in one called The Gemini Program.

Apollo 11 LM Interior

Some other fun albums from Commons Members include:

Dog following a caravan

Jumping for joy, in Bulimba, Queensland, 1918

Learn more about creating or managing Albums on Flickr.

Grand Galleries

A Gallery is a way for Flickr users to curate images in other members’ collections.

Color photograph of a girl dressed like flower or butterfly

 

This image of a girl dressed like a butterfly from The Field Museum Library is in their album called Flower Children, but also in six Galleries, including Girl Child, storytellers, and one simply called “4.”

 

Carla Wallenda rides a bicycle on a high wire

Searching the Commons for “fun” reveals this photo of Carla Wallenda from Florida Memory, which is in thirty-nine Galleries, including one called

Eénwielige motorfiets / One wheel motor cycle

Other Flickr users make their own Galleries specifically with Flickr Commons content.

Helen Richey 084

Flickr user wakethesun has created a massive set of Galleries, many of which are entirely Flickr Commons content.

screenshot from wakethesun's gallery page showing four Commons galleries, each of which focuses on a different type of animal: primates, elephants, camels, and "wild ungulates"

 

Poke around and you’re sure to find something you enjoy!

Learn more about creating, adding or sharing Galleries on Flickr.

Florida Memory on Flickr Commons

This is a transcript of an interview with Katrina Harkness and Joshua Youngblood, State Library & Archives of Florida, taken from a book called Web 2.0 Tools and Strategies for Archives and Local History Collections by Kate Theimer. Reprinted with permission.

What made you interested in becoming a member of the Flickr Commons?

The Florida Photographic Collection is a nationally and internationally recognized component of the State Archives of Florida and contains over a million images which are used regularly by book publishers, TV stations, and filmmakers.

Still, the Photographic Collection felt like a hidden, undiscovered treasure. The sheer number of photographs made searching difficult for all but the most determined researchers. If only there were a way to let Floridians and the world know that we have images of important people and events in Florida history, and also a little of the unexpected: flying machines, ostrich racing, mastodon fossils, mermaids, and the largest lightbulb in the world.


Sponge diver John Gonatos: Tarpon Springs, Florida, 1945

What information, tools, and processes did you need to begin?

The first and most important step for participation was consulting with the Commons team, from initial discussions about what our institution could and should offer to strategies for organizing our content and planning updates. Since we have been placing digital images and the accompanying records online for several years, the technology learning curve was not that steep. After receiving approval from the Florida Department of State, we developed disclaimers and information for the Florida Member page based on the models established by other Commons institutions.


Photographer beside mounds of oyster shells: Apalachicola, Florida, 1895

How did you determine what to include?

The Florida Photographic Collection as a whole is composed of hundreds of smaller collections. Some collections are the work of individual photographers, and some are the work of institutions such as the Department of Commerce or the Department of Environmental Protection. We decided to work within this existing framework and highlight the images that best represented these collections. We began with self-standing collections, picking those that were historically interesting, emblematic of Florida, and underutilized. We then added selections from two of the largest collections in the Archives, the Department of Commerce and the Florida Folklife Program. Both contain numerous unique, fascinating, and quirky images, but both are so large that browsing the resources can be daunting.


Pam Maneeratana displays her carved pumpkins: Tallahassee, Florida, 1987

What challenges did you face?

As a state institution, adapting our traditional communication structure to the Web 2.0 culture has been challenging. Having institutions such as the Library of Congress and the Smithsonian as models has helped tremendously.


Waves hit Navarre Pier hard during Hurricane Ivan’s approach: Navarre Beach, Florida, 2004

What kinds of positive results have you had? (And, any negative ones?)

Being part of the Commons has meant being part of a community of people who are passionate about photographs, history, and contributing to public knowledge.

Accessing millions of potential catalogers and researchers—and volunteer ones at that—is very exciting.

We have experienced a steady rise in visits to the Archives’ photos since the Flickr release, and the feedback from Commons viewers has been overwhelmingly positive and very gratifying. Some previously unknown information about specific photos has been provided by Flickr viewers, and we have been adding that information to the catalog entries when appropriate.

We get to see very personal reactions to the photographs that we never got from Web statistics.

We’ve had comments and tags in Spanish, Italian, Portuguese, and Japanese. People have recognized family members, childhood friends, favorite places, or seen intimate glimpses of their own towns in a different era.

About how much time does it take?

Working with the Commons team on the logistics of our participation and the initial launch took about four months. Day to day, it can take an hour or two responding to questions and preparing new batch releases.


Nation’s smallest Post Office in Ochopee, Florida, c. 1940s

What advice would you give an organization wanting to use something similar?

The opportunity to contribute unique historical resources from your institution to an international dialogue is worth the time commitment.


Underwater photography at the springs, c. 1950

See more of the Florida Photographic Collection on Flickr Commons.

Introducing flinumeratr, our first toy

by Alex

Today we’re pleased to release Flinumeratr, our first toy. You enter a Flickr URL, and it shows you a list of photos that you’d see at that URL:

This is the first engineering step towards what we’ll be building for the rest of this quarter: Flickypedia, a new tool for copying Creative Commons-licensed photos from Flickr to Wikimedia Commons.

As part of Flickypedia, we want to make it easy to select photos from Flickr that are suitable for Wikimedia Commons. You enter a Flickr URL, and Flickypedia will work out what photos are available. This “Flickr URL enumerator”, or “Flinumeratr”, is a proof-of-concept of that idea. It knows how to recognise a variety of URL types, including individual photos, albums, galleries, and a member’s photostream.
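To make the idea concrete, here is a minimal sketch of that kind of URL classification. The regular expressions below are illustrative assumptions for this post, not Flinumeratr’s actual implementation:

```python
import re

# Illustrative patterns for the URL types mentioned above. These are an
# assumption for this sketch, not Flinumeratr's real code: more specific
# patterns (albums, galleries) are checked before more general ones.
URL_PATTERNS = [
    ("album", re.compile(r"flickr\.com/photos/[^/]+/albums/\d+")),
    ("gallery", re.compile(r"flickr\.com/photos/[^/]+/galleries/\d+")),
    ("single_photo", re.compile(r"flickr\.com/photos/[^/]+/\d+")),
    ("photostream", re.compile(r"flickr\.com/photos/[^/]+/?$")),
]

def classify_flickr_url(url: str) -> str:
    """Return the first matching URL type, or 'unknown'."""
    for name, pattern in URL_PATTERNS:
        if pattern.search(url):
            return name
    return "unknown"
```

A real implementation would also need to handle short URLs, query strings, and pagination, which this sketch ignores.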

We call it a “toy” quite deliberately – it’s a quick thing, not a full-featured app. Keeping it small means we can experiment, try things quickly, and learn a lot in a short amount of time. We’ll build more toys as we have more ideas. Some of those ideas will be reused in bigger projects, and others will be dropped.

Flinumeratr is a playground for an idea for Flickypedia, but it’s also been a context for starting to develop our approach to software development. We’ve been able to move quickly – this is only my fourth day! – but starting a brand new project is always the easy bit. Maintaining that pace is the hard part.

We’re all learning how to work together, I’m dusting off my knowledge of the Flickr API, and we’re establishing some basic coding practices: things like a test suite, documentation, checks on pull requests, and other guard rails that will help us keep moving. Setting those up now will be much easier than trying to retrofit them later. There’s plenty more we have to decide, but we’re off to a good start.

Under the hood, Flinumeratr is a Python web app written in Flask. We’re calling the Flickr API with the httpx library, and testing everything with pytest and vcrpy. The latter in particular has been so helpful – it “records” interactions with the Flickr API so I can replay them later in our test suite. If you’d like to see more, all our source code is on GitHub.

You can try Flinumeratr at https://flinumeratr.glitch.me. Please let us know what you think!

A millions-of-things pile: Why we need a Collection Development Policy for Flickr Commons

Flickr is a photo-sharing website and has always been about connecting people through photography. It is different from a generic image-hosting service. Flickr Commons, the program launched in 2008 for museums, libraries, and archives to share their photography collections, is different again: it’s about sharing photography collections with a very big audience, and providing tools to help people to contribute information and knowledge about the pictures, ideally to supplement whatever catalogue information already exists.

A collection development policy is a framework for information institutions like libraries, archives and museums to define what they collect, and importantly, what they don’t collect. It’s an important part of maintaining a coherent and valuable collection while trends and technologies change and advance around the organisation. We think it’s time for the Flickr Commons to have a policy like this.

As the Flickr Commons collection grows, we’re seeing all kinds of images in there: photographs, maps, documents, drawings, museum objects, book scans, and more. Therefore, one aspect of the policy is to ask our members to use Flickr’s “Content-Type” field to improve the way their images can be categorised and found in search.

Why are we asking Flickr Commons members to categorise their images?

Since the program launched in 2008, the Flickr Commons has grown to also include illustrations, maps, letters, book scans, and other imagery. The default setting for uploads across all accounts is content_type=Photo, so if you don’t alter that default for new uploads, every image is classified as a photo. This starts to break down if you upload, say, the Engrossed Declaration of Independence, or a wood engraving of Bloodletting Instruments.

One of the largest Flickr Commons accounts is the great and good British Library, which famously published 1 million illustrations into the program in 2013, announcing:

The images themselves cover a startling mix of subjects: There are maps, geological diagrams, beautiful illustrations, comical satire, illuminated and decorative letters, colourful illustrations, landscapes, wall-paintings and so much more that even we are not aware of… We are looking for new, inventive ways to navigate, find and display these ‘unseen illustrations’.

A million first steps by Ben O’Steen, 12 December 2013

Because the default setting for uploads is content_type=Photo, every search on Flickr Commons was inundated with “the beige 19th Century”: millions of pictures from 17th-, 18th-, and 19th-century books, all categorised by default as Photos.

Earlier this year, the British Library team adjusted the images in their account, setting them as “Illustration/Art” rather than Photos. But that had the effect of “hiding” their content from general, default-set searches. This unintentional hiding raised some alarm among their followers (who were used to seeing the book scans in their search results), some of whom wrote in to ask what had happened. And rightly so, because it had yet to be explained to them by us or by the search interface.

The Backstory

In any aggregated system of cultural materials, you get colossal variegation. Humans describe things differently, no matter how many professional standards we try to implement. Last year, in 2022, the Flickr Commons was mostly a vast swathe of images from scanned book pages: not photographs per se, or things created first as photographs.

There have been two uploads into Flickr Commons of over one million things. The first was in 2013, by the British Library, whose intention was to ask the community to help describe the million or so book illustrations they had carefully organised with book-structure metadata and described using clever machine tags. The BL team was also careful not to annoy the Flickr API spirits, pacing their uploads so as not to trigger any alerts. In the decade since, they have built a community around the collection, cultivating creative reuse, inspiration, and research in the imagery, primarily through the British Library Labs initiative.

The second gigantic upload, in 2014, was also mostly images cropped by a computer program. Created by a solo developer on a Yahoo Research fellowship, the code was run over an extensive collection of content in the Internet Archive (IA) book digitization program to crop out images from scanned book pages. Those were shoved into flickr.com using the API. The developer immediately reached the free account limits, so they negotiated through Yahoo senior management that these millions of images should become part of the Flickr Commons program in an Internet Archive Book Images (IABI) account. Since the developer was also loosely associated with the Internet Archive, IA agreed to be the institutional partner in the Flickr Commons. That’s a requirement of joining the program—the account must be held by an organisation, not an individual.

These two uploads utterly overwhelmed the smaller Flickr Commons photography collections, even though the two approaches were so different.

Here’s a graph from April 2022 data that shows all Commons members on the x-axis, and their upload counts on the y-axis.


The IABI account is 5x larger than all the other accounts combined. If you remove the two giants from the data, the average upload per account is just under 3,000 pictures.

These whopper accounts both have billions of views overall. These view counts are unsurprising, given that they completely dominated all search results in Flickr Commons. While the Flickr Commons’ first goal has always been to “increase public access to photography collections”, its secondary—and in my opinion, much more interesting—goal is to “provide a way for the public to contribute information.”

You can see from the two following graphs that a big photo count doesn’t imply deeper engagement. In fact, we’ve seen the opposite is true: the Flickr Commons members who enjoy the strongest engagement are those who put in the time and effort to engage. Drip-feeding content—and not dumping it all at once—also helps viewers keep up and get a good view of what is being published.


The fifth account in the most-faved data is the fabulous National Library of Ireland, with about 3,000 photos then, which excels at community engagement, demonstrated by its 181,000 faves.


In the comments data, IABI ranks 21st (~3,000), and British Library 27th (~2,000). The top-commented accounts are all in a groove of stellar community engagement.

Employees working in small archives (or large ones, for that matter) simply cannot compete with content-production software that auto-generates a crop of an image from a book scan along with its automated, many-word metadata. At the Flickr Foundation, we have a place in our hearts for the smaller cultural organisations and want to actively support their online engagements through the Flickr Commons program.

I remember when the IABI account went live. Even though I wasn’t working at Flickr or at the Flickr Foundation at the time, I thought it was a mistake to allow such a vast blast of not-photographs into the Flickr Commons, particularly the second massive collection, mainly because it had been so broadly described, meaning it would turn up content in every search.

Fast forward to last year, in April, when—as my strange first step as Executive Director—I decided in consultation and agreement with the staff at IA to act. We agreed to delete the gargantuan Internet Archive Book Images (IABI) account.

A couple of weeks later, people realised it had happened, and a riot of “Flickr is destroying the public domain” posts popped up. I had not prepared for this reaction, which strikes the opposite tone to the one I want the Flickr Foundation to set! I’d consulted with the Internet Archive, and a consensus had been reached. But I was also ignorant of the community enjoying the IABI account—I had presumed there was no community engagement, since nobody had logged into the IABI account since just after the giant upload in 2014. That was a mistake, I readily admit, but in my defence, the IA team echoed that same impression when we discussed it. The lone developer (who didn’t work at IA) had uploaded the millions of book images and did not engage with the community. The images were generated from many different institutions’ collections digitised through the Internet Archive’s wonderful book scanning initiative. Unfortunately, correct attribution for each institution had not been included in the initial metadata produced for each image. (This was later rectified by a code rewrite by Smithsonian Libraries and Archives, with support from Flickr engineering.) In some cases the content was known to have no copyright, so it didn’t fit the Flickr Commons’ “no known copyright restrictions” assertion and could, and should, have been declared public domain material. Add to that the content_type=Photo declaration and broad, auto-generated metadata (along with some tagging to group images into their books, for example), and you have, in other words, a millions-of-things mess.

Despite my hesitation, we decided to restore the entire account. This scale of restoration is an incredible engineering feat and an indication of the world-class team working behind the scenes at Flickr. We also set the correct content type designation and adjusted the licences on the restored images to CC0 as Internet Archive does not claim any rights for them. This has the benefit of making them more clearly classified for reuse. 

What we are doing about it

We need to be more restrained when it comes to digital commonses. These huge piles of stuff sound great, but they are not often made with care by people. They’re generated en masse by computers and thrown online. (As a related aside, look at the millions of licensed pieces of content that are mined and inhaled to train AI programs while their licences are ignored.)

The British Library acknowledged this, asking for interaction and effort from interested people, and stated explicitly that their 1 million images were “wholly uncurated.” People ultimately enjoyed hunting around in a millions-of-things pile for illustrations of things and made some beautiful responses to them. Indeed, one person managed to add 45,000 tags to the British Library’s Flickr Commons content. 45,000!

Perhaps I’m about to contradict myself again and say this scale of access was, at a base level, good, at least for computers and computation. But it wasn’t good inside the Flickr Commons program, and that’s why we need the Collection Development Policy: so we can encourage and nurture the seeing, enjoyment of, and contribution to our shared photographic history that we always wanted.

And that’s why we’re drafting the new policy in collaboration with the membership, so we can help Flickr Commons members know how to hold the shape of the container we’ve created instead of bursting it. 

With thanks to Josh Hadro, Martin Kalfatovic, Nora McGregor, Mia Ridge, Alexis Rossi, and Jessamyn West for your time and feedback on this post.

Flickr Commons: Grand Galleries, Admired Albums

Sorting, arranging, and displaying images from the Commons and elsewhere on Flickr

This is a sister post to A millions-of-things pile: Why we need a Collection Development Policy for Flickr Commons. We’re writing this because our new policy changes what turns up in Flickr Commons searches.

Images can be categorised as Photos, Screenshots, Illustration/Art, Virtual Photos, or Videos on Flickr. The default setting for uploads across all accounts is content_type=Photo, so if you don’t alter that default for new uploads, every image is classified as a photo. This starts to break down if you upload, say, the Engrossed Declaration of Independence, or a wood engraving of Bloodletting Instruments.

Therefore, we’ve launched our new Collection Development Policy to ask Flickr Commons members to classify their images more specifically.
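For members with large back catalogues, reclassification can also be done programmatically via the Flickr API’s flickr.photos.setContentType method. The sketch below only builds the request parameters; the key, photo id, and numeric content-type code are placeholders, so check the current API documentation for the accepted values (the write call itself also needs OAuth signing, which is omitted here):

```python
# Hedged sketch: flickr.photos.setContentType is a real Flickr API method,
# but the values below are placeholders -- consult the Flickr API docs for
# the currently accepted content_type codes before using this in earnest.
FLICKR_REST = "https://api.flickr.com/services/rest/"

def set_content_type_params(api_key: str, photo_id: str, content_type: int) -> dict:
    """Build POST parameters for a setContentType call (OAuth omitted)."""
    return {
        "method": "flickr.photos.setContentType",
        "api_key": api_key,
        "photo_id": photo_id,
        "content_type": str(content_type),
        "format": "json",
        "nojsoncallback": "1",
    }
```

An authorised client would then POST these parameters to FLICKR_REST once per photo being reclassified.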

Default search settings

Searching on Flickr defaults to only showing content_type=Photos and Videos. That default means that if one of the Flickr Commons members does change the content type for their uploads, those other types will fall out of the default search results.


This is the default setting: Photos and Videos

We know this can come as a surprise to viewers who were familiar with how things worked before we started asking Flickr Commons members to use the new policy. That surprise isn’t great, so we’re working on addressing it, and working with the flickr.com Customer Support team to get documentation online.

Part of that work is to show how the search works, so you can broaden it to include other content types. To do this, you open up the Advanced Search panel—on the right, under the header search box—and look for the “Content” heading. You can select or remove the different types of content as you wish.


Here you see a different selection: Photos and Illustration/Art

If you want to share a list of search results that also contains, say, images cropped from page scans of old books (which would now be marked as content_type=Illustration/Art), note that these settings show up in the search URL as parameters when you change them, like this:

https://flickr.com/search/?is_commons=1&text=smile&content_types=0%2C2

The content_types parameter tells you the search is filtering for Photos [0] and Illustration/Art [2], separated by a URL-encoded comma [%2C]. So, as you adjust your content type settings, you can share URLs that will take other people straight there without needing to adapt their Advanced settings.
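As a small illustration, a helper like the following (a hypothetical sketch using only the parameters visible in the example URL above) builds such shareable search URLs:

```python
from urllib.parse import urlencode

# Content-type codes as seen in the example URL above:
# 0 = Photos, 2 = Illustration/Art.
def commons_search_url(text: str, content_types: list[int]) -> str:
    """Build a Flickr Commons search URL filtered to the given content types."""
    query = {
        "is_commons": "1",
        "text": text,
        # The comma separator is percent-encoded as %2C by urlencode.
        "content_types": ",".join(str(c) for c in content_types),
    }
    return "https://flickr.com/search/?" + urlencode(query)
```

Calling commons_search_url("smile", [0, 2]) reproduces the example URL above.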

We know this is a bit fiddly, but your default settings—whether on upload or as you search—should stick if you ever adjust them.

When Past Meets Predictive: An interview with the curators of ‘A Generated Family of Man’

by Tori McKenna, Oxford Internet Institute

Design students Juwon Jung and Maya Osaka, the inaugural cohort of the Flickr Foundation’s New Curators program, embarked on a journey exploring what happens when you interface synthetic image production with historic archives.

This blog post marks the release of Flickr Foundation’s A Generated Family of Man, the third iteration in a series of reinterpretations of the 1955 MoMA photography exhibition, The Family of Man.

Capturing the reflections, sentiments and future implications raised by Jung and Osaka, these working ‘field notes’ function as a snapshot in time of where we stand as users, creators and curators facing computed image generation. At a time when Artificial Intelligence and Large Language Models are still in their infancy, yet have been recently made widely accessible to internet users, this experiment is by no means an exhaustive analysis of the current state of play. However, by focusing on a single use-case, Edward Steichen’s The Family of Man, Jung and Osaka were able to reflect in greater detail and specificity over a smaller selection of images — and the resultant impact of image generation on this collection.

Observations from this experiment are phrased as a series of conversations, or ‘interfaces’ with the ‘machine’.

Interface 1: ‘That’s not what I meant’

If the aim of image generation is verisimilitude, the first thing to remark upon when feeding captions into image generation tools is that there are often significant discrepancies and deviations from the original photographs. AI produces images based on most-likely scenarios, and it became evident from certain visual elements that the generator was ‘filling in’ what the machine ‘expects’. For example, when replicating the photograph of an Austrian family eating a meal, the image generator resorted to stock food and dress types. To gain greater accuracy, as Jung explained, “we needed to find key terms that might ‘trick’ the algorithm”. These included supplementing with descriptive prompts of details (e.g. ‘eating from a communal bowl in the centre of the table’), as well as more subjective categories gleaned from the curators’ interpretations of the images (’working-class’, ‘attractive’, ‘melancholic’). As Osaka remarked, “the human voice in this process is absolutely necessary”. This constitutes talking with the algorithm, a back-and-forth dialogue to produce true-to-life images, further centering the role of the prompt generator or curator.

This experiment was not about producing new fantasies, but about testing how well the generator could reproduce historical context or reinterpret archival imagery. Adding time-period prompts, such as “1940s-style”, results in approximations based on the narrow window of historical content within the image generator’s training set. “When they don’t have enough data from certain periods AI’s depiction can be skewed”, explains Jung. This risks reflecting or reinforcing biased or incomplete representations of the period at hand. When we consider that more images were produced in the last 20 years than in the 200 years before, image generators have a far greater quarry to ‘mine’ from the contemporary period and, as we saw, often struggle with historical detail.

Key take-away:
Generated images of the past are only as good as their training bank of images, which themselves are very far from representative of historical accuracy. Therefore, we ought to develop a set of best practices for projects that seek communion between historic images or archives and generated content.

Interface 2: ‘I’m not trying to sell you anything’

In addition to synthetic image generation, Jung & Osaka also experimented with synthetic caption generation: deriving text from the original images of The Family of Man. The generated captions were far from objective or purely descriptive. As Osaka noted, “it became clear the majority of these tools were developed for content marketing and commercial usage”, with Jung adding, “there was a cheesy, Instagram-esque feel to the captions with the overuse of hashtags and emojis”. Not only was this outdated style instantly transparent and ‘eyeroll-inducing’ for savvy internet users, but in some unfortunate cases, the generator wholly misrepresented the context. In Al Chang’s photo of a grief-stricken American soldier being comforted by his fellow troops in Korea, the image generator produced the following tone-deaf caption:

“Enjoying a peaceful afternoon with my best buddy 🐶💙 #dogsofinstagram #mananddog #bestfriendsforever” (there was no dog in the photograph).

When these “Instagram-esque” captions were fed back into image generation, naturally they produced overly positive, dreamy, aspirational images that lacked the ‘bite’ of the original photographs – thus creating a feedback loop of misrecognition and misunderstood sentiment.

The image and caption generators that Jung & Osaka selected were free services, in order to test what the ‘average user’ would most likely first encounter in synthetic production. This led to another consideration around the commercialism of such tools; as the internet adage goes, “if it’s free, you’re the product”. Using free AI services often means relinquishing input data, a fact that might be hidden in the fine print. “One of the dilemmas we were internally facing was ‘what is actually happening to these images when we upload them’?” as Jung pondered, “are we actually handing these over to the generators’ future data-sets?”. “It felt a little disrespectful to the creator”, according to Osaka, “in some cases we used specific prompts that emulate the style of particular photographs. It’s a grey area, but perhaps this could even be an infringement on their intellectual property”.

Key take-away:
The majority of synthetic production tools are built with commercial uses in mind. If we presume there are very few ‘neutral’ services available, we must be conscious of data ownership and creator protection.

Interface 3: ‘I’m not really sure how I feel about this’

The experiment resulted in hundreds of synthetic guesses, which induced surprising feelings of guilt among the curators. “In a sense, I felt almost guilty about producing so many images”, reports Jung, with e-waste and resource-intensive processing power front of mind. “But we can also think about this another way”, Osaka continues, “the originals, being in their analogue form, were captured with such care and consideration. Even their selection for the exhibition was a painstaking, well-documented process”.

We might interpret this as simply a nostalgic longing for the finiteness of a bygone era, and our disillusionment at today’s easy, instant access. But perhaps there is something unique to synthetic generation here: the more steps the generator takes from the original image, the more degraded the original essence, or meaning, becomes. In this process, not only does the image get further from ‘truth’ in a representational sense, but also further from the original intention of the creator. If the underlying sense of warmth and cooperation in the original photographs disappears along the generated chain, is there a role for image generation in this context at all? “It often feels like something is missing”, concludes Jung, “at its best, synthetic image generation might be able to replicate moments from the past, but is this all that a photograph is and can be?”

Key take-away: Intention and sentiment are incredibly hard to reproduce synthetically. Human empathy must first be deployed to decipher the ‘purpose’ or background of the image. Naturally, human subjectivity will be input.

Our findings

Our journey into synthetic image generation underscores the indispensable role of human intervention. While the machine can be guided towards accuracy by the so-called ‘prompt generator’, human input is still required to flesh out context where the machine may be lacking in historic data.

At its present capacity, while image generation can approximate visual fidelity, it falters when it attempts to appropriate sentiment and meaning. Witness the uncanny distortions in so many of the images of A Generated Family of Man: monstrous fingers, blurred faces, and melting body parts are now so common in artificially generated images that they’ve become almost a genre in themselves. These appendages and synthetic ad-libs undermine our human identification with the image. This lack of empathic connection, the inability to bridge the divide, is perhaps what feels so disquieting when we view synthetic images.

As we have seen when feeding these images into caption generators to ‘read’ the picture, only humans can reliably extract meaning from them. Trapped within this image-to-text-to-image feedback loop, as creators or viewers we’re ultimately left calling out to the machine: Once More, with Feeling!
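The degradation along that loop can be illustrated with a deliberately crude toy model in plain Python. This has nothing to do with how the actual generators are built; it is only an analogy. If each captioning step keeps part of the scene and each generation step renders only what the caption says, detail that falls out of one caption never comes back:

```python
def toy_caption(image_description):
    # A lossy "caption generator": it keeps only the first four words
    # of the scene, a stand-in for all the nuance a real caption drops.
    return " ".join(image_description.split()[:4])

def toy_generate(caption):
    # A toy "image generator": it renders exactly (and only) what the
    # caption says. Anything the caption omitted is gone for good.
    return caption

scene = "a judge reading thick open books with a glum expression"
for step in range(3):
    scene = toy_generate(toy_caption(scene))
print(scene)  # -> "a judge reading thick"
```

After the first round trip, the glum expression is unrecoverable; further iterations can only preserve or erode what remains, never restore it.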

We hope projects like this spark a flourishing of similar experiments by users of image generators who are critical and curious about the current state of artificial “intelligence”.

Find out more about A Generated Family of Man in our New Curators program area.

Making A Generated Family of Man: Revelations about Image Generators

Juwon Jung | Posted 29 September 2023

I’m Juwon, here at the Flickr Foundation for the summer this year. I’m doing a BA in Design at Goldsmiths. There’s more background on this work in the first blog post on this project that talks about the experimental stages of using AI image and caption generators.

“What would happen if we used AI image generators to recreate The Family of Man?”

When George first posed this question in our office back in June, we couldn’t really predict what we would encounter. Now that we’ve wrapped up this uncanny yet fascinating summer project, it’s time to make sense out of what we’ve discovered, learned, and struggled with as we tried to recreate this classic exhibition catalogue.

Bing Image Creator generates better imitations when humans write the directions

We used the Bing Image Creator throughout the project and now feel quite familiar with its strengths and weaknesses. There were a few instances where the Bing Image Creator would produce photographs surprisingly similar to the originals when we wrote the captions, as can be seen below:

Here are the caption iterations we made for the image of the judge (shown above, on the right page of the book):

1st iteration:
A grainy black and white portrait shot taken in the 1950s of an old judge. He has light grey hair and bushy eyebrows and is wearing black judges robes and is looking diagonally past the camera with a glum expression. He is sat at a desk with several thick books that are open. He is holding a page open with one hand. In his other hand is a pen. 

2nd iteration:
A grainy black and white portrait shot taken in the 1950s of an old judge. His body is facing towards the camera and he has light grey hair that is short and he is clean shaven. He is wearing black judges robes and is looking diagonally past the camera with a glum expression. He is sat at a desk with several thick books that are open. 

3rd iteration:
A grainy black and white close up portrait taken in the 1950s of an old judge. His body is facing towards the camera and he has light grey hair that is short and he is clean shaven. He is wearing black judges robes and is looking diagonally past the camera with a glum expression. He is sat at a desk with several thick books that are open. 

Bing Image Creator is able to demonstrate such surprising capabilities only when the human user accurately directs it with sharp prompts. Since Bing Image Creator uses natural language processing to generate images, the ‘prompt’ is an essential component to image generation. 

Human description vs AI-generated interpretation

We can compare human-written captions to the AI-generated captions made by another tool we used, Image-to-Caption.io. Since the primary purpose of Image-to-Caption.io is to generate ‘engaging’ captions for social media content, the captions it produced contained cheesy descriptors, hashtags, and emojis.

Using screenshots from the original catalogue, we fed images into that tool and watched as captions came out. This nonsensical response emerged for the same picture of the judge:

“In the enchanted realm of the forest, where imagination takes flight and even a humble stick becomes a magical wand. ✨🌳 #EnchantedForest #MagicalMoments #ImaginationUnleashed”

As a result, all of the images generated from AI captions looked like they were from the early Instagram era of 2010: highly polished, with strong, vibrant color filters.

Here’s a selection of images generated using AI prompts from Image-to-Caption.io

Ethical implications of generated images?

As we compared all of these generated images, our natural instinct was to wonder about the actual logic and dataset the generative algorithm was operating on. There were also certain instances where the Bing Image Creator could not generate the correct ethnicity of the subject in the photograph, despite the prompt clearly specifying it (over the span of 4-5 iterations).

Here are some examples of ethnicity not being represented as directed: 

What’s under the hood of these technologies?

What does this really mean though? I wanted to know more about the relationship between these observations and the underlying technology of the image generators, so I looked into the DALL-E 2 model (which is used in Bing Image Creator). 

DALL-E 2 and most other image generation tools today use a diffusion model to generate a new image that conveys the same, or at least the closest possible, semantic information as the input caption. In order to correctly match visual semantic information to the corresponding textual semantic information (e.g. matching the image of an apple to the word apple), these generative models are trained on large datasets of images and image descriptions gathered online.
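The core matching idea can be sketched in a few lines of Python: text and images are mapped into a shared embedding space, and a prompt is matched to whatever sits closest to it there. This is a drastic simplification of DALL-E 2, and the vectors below are invented by hand purely for illustration; in a real model they are learned from millions of image-caption pairs so that matching pairs end up close together.

```python
import numpy as np

# Hand-invented toy embeddings, purely for illustration.
text_vecs = {
    "apple": np.array([0.9, 0.1, 0.0]),
    "car":   np.array([0.0, 0.9, 0.2]),
}
image_vecs = {
    "photo_of_apple": np.array([0.8, 0.2, 0.1]),
    "photo_of_car":   np.array([0.1, 0.8, 0.3]),
}

def cosine(a, b):
    # Cosine similarity: closer to 1.0 means the vectors point the same way.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def best_match(prompt):
    # Return the image whose embedding sits closest to the prompt's.
    return max(image_vecs,
               key=lambda name: cosine(text_vecs[prompt], image_vecs[name]))

print(best_match("apple"))  # -> photo_of_apple
```

A diffusion model then goes one step further: instead of retrieving the nearest existing image, it synthesises a brand-new one whose embedding lands near the prompt’s.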

OpenAI has admitted that the “technology is constantly evolving, and DALL-E 2 has limitations” in their informational video about DALL-E 2.

Such limitations include:

  • If the data used to train the model is flawed and contains images that are incorrectly labeled, it may produce an image that doesn’t correspond to the text prompt (e.g. if there are more images of a plane matched with the word car, the model can produce an image of a plane from the prompt ‘car’).
  • The model may exhibit representational bias if it hasn’t been trained enough on a certain subject (e.g. producing an image of any kind of monkey rather than the species from the prompt ‘howler monkey’) 
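The first of these limitations is easy to see in a deliberately broken toy model. The training data below is hypothetical and hand-made to be flawed; the “model” simply reproduces whichever image type most often accompanied a word during training, so mislabelled plane photos crowd out the real car:

```python
from collections import Counter

# A deliberately flawed, hypothetical training set: three plane
# photos have been mislabelled "car", outnumbering the one real car.
training_pairs = [
    ("car", "plane_photo"),
    ("car", "plane_photo"),
    ("car", "plane_photo"),
    ("car", "car_photo"),
    ("plane", "plane_photo"),
]

def most_likely_image(prompt):
    # Return whichever image type most often accompanied this word
    # during training - garbage in, garbage out.
    counts = Counter(image for text, image in training_pairs if text == prompt)
    return counts.most_common(1)[0][0]

print(most_likely_image("car"))  # -> plane_photo
```

Real models are far more sophisticated than a frequency count, but the principle holds: systematic labelling errors or gaps in the training data surface directly in the generated output.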

From this brief research, I realized that these subtle errors of Bing Image Creator shouldn’t be simply overlooked. Whether or not Image Creator is producing relatively more errors for certain prompts could signify that, in some instances, the generated images may reflect the current visual biases, stereotypes, or assumptions that exist in our world today. 

A revealing experiment for our back cover

After having worked with very specific captions for hoped-for outcomes, we decided to zoom way out to create a back cover for our book. Instead of anything specific, we spent a short period after lunch one day experimenting with very general captioning to see the raw outputs. Since the theme of The Family of Man is the oneness of mankind and humanity, we tried entering the short words, “human,” “people,” and “human photo” in the Bing Image Creator.

These are the very general images returned to us: 

What do these shadowy, basic results really mean?
Is this what we, humans, reduce down to in the AI’s perspective? 

As we stared at these images on my laptop in the Flickr Foundation headquarters, we were all stunned by the reflections of us created by the machine. Consisting mainly of elementary, undefined figures, the generated images representing the word “humans” ironically conveyed something that felt inherently opposite.

This quick experiment at the end of the project revealed to us that perhaps having simple, general words as prompts instead of thorough descriptions may most transparently reveal how these AI systems fundamentally see and understand our world.

A Generated Family of Man is just the tip of the iceberg.

These findings aren’t concrete, but suggest possible hypotheses and areas of image generation technology that we can conduct further research on. We would like to invite everyone to join the Flickr Foundation on this exciting journey, to branch out from A Generated Family of Man and truly pick the brains of these newly introduced machines. 

Here are the summarizing points of our findings from A Generated Family of Man:
  • The ability of Bing Image Creator to generate images with the primary aim of verisimilitude is impressive when the prompt (image caption) is either written by humans or accurately denotes the semantic information of the image.
  • In certain instances, the Image Creator produced relatively more errors when determining the ethnicity of the subject. This may indicate underlying visual biases or stereotypes in the datasets the Image Creator was trained on.
  • When entering short, simple words related to humans into the Image Creator, it responded with undefined, cartoon-like human figures. Using such short prompts may reveal how the AI fundamentally sees our world and us. 

Open questions to consider

Using these findings, I thought that changing certain parameters of the investigation could make interesting starting points for new investigations, whether we spent more time on this at the Flickr Foundation or someone else continued the research. Here are some parameters that could be explored:

  • Frequency of iteration: increase the number of trials of prompt modification or general iterations to create larger data sets for better analysis.
  • Different subject matter: investigate specific photography subjects that will allow an acute analysis on narrower fields (e.g. specific types of landscapes, species, ethnic groups).
  • Image generator platforms: look into other image generation software to observe the distinct qualities of different platforms.

How exciting would it be if different groups of people from all around the world participated in a collective activity to evaluate the current status of synthetic photography, and really analyzed the fine details of these models? That might not scientifically reverse-engineer the models, but findings emerge even from qualitative investigations. What more will we be able to find? Will there be a way to cross-compare the qualitative and even quantitative investigations to reach a solid (if not definitive) conclusion? And if these investigations were to take place at intervals over time, which variables would change?

To gain inspiration for these questions, take a look at the full collection of images of A Generated Family of Man on Flickr!

Creating A Generated Family of Man

Author: Maya Osaka

Find out about the process that went into creating A Generated Family of Man, the third volume of A Flickr of Humanity.

A Flickr of Humanity is the first project in the New Curators program, revisiting and reinterpreting The Family of Man, an exhibition held at MoMA in 1955. The exhibition showcased 503 photographs from 68 countries, celebrating universal aspects of the human experience. It was a declaration of solidarity following the Second World War.

For our third volume of A Flickr of Humanity, we decided to explore the new world of generative AI, using Microsoft Bing’s Image Creator to regenerate The Family of Man catalog (30th Anniversary Edition). The aim of the project was to investigate synthetic image generation and create a ‘companion publication’ to the original, one that acts as a timestamp showcasing the state of generative AI in 2023.

Project Summary

  1. We created new machine-generated versions of photographs from The Family of Man by writing a caption for each image and passing it through Microsoft Bing’s Image Creator. These images will be referred to as Human Mediated Images (HMIs).
  2. We fed screenshots of the original photographs into ImageToCaption, an AI-powered caption generator which produces cheesy Instagramesque captions, including emojis and hashtags. These computed captions were then passed into Bing’s Image Creator to generate an image mediated only by computers. These images will be referred to as AI-generated Images (AIGIs).

We curated a selection of these generated images and captions into the new publication, A Generated Family of Man.

Image generation process

It is important to note that we decided to use free AI generators because we wanted to explore the most accessible generative AI.

Generating images was time-consuming. In our early experiments, we generated several iterations of each photograph to try and get it as close to the original as possible. We’d vary the caption in each iteration to work towards a better attempt. We decided it would be more interesting to limit our caption refinements so we could see and show a less refined output. We decided to set a limit of two caption-writing iterations for the HMIs.

For the AIGIs we chose one caption from the three from the first set of generated responses. We’d use the selected caption to do one iteration of image generation, unless the caption was blocked, in which case we would pick another generated caption and try that. 

Once we had a good sense of how much labour was required to generate these images, we set an initial target of generating half of the images in the original publication. The initial image generation process, in which we regenerated roughly 250 of the original photographs, took around four weeks. We then had roughly 500 generated images (about half HMIs and half AIGIs), and we could begin the layout work.

Making the publication

The majority of the photographs featured in The Family of Man are still in copyright, so we were unable to feature the original photographs in our publication. The exceptions are the two Dorothea Lange photographs we decided to feature, which have no known copyright restrictions.

We decided to design the publication to act as a ‘companion publication’ to the original catalog. As we progressed making the layout, we imagined the ideal situation: the reader would have an original The Family of Man catalogue to hand to compare and contrast the original photographs and generated images side by side. With this in mind we designed the layout of the publication as an echo of the original, to streamline this kind of comparison.

It was important to demonstrate the distinctions between HMI and AIGI versions of the original images, so in some cases we shifted the layout to allow this.

Identifying HMIs and AIGIs

There was a lot of discussion around whether a reader would identify an image as an HMI or AIGI. All of the HMI images are black and white—because “black and white” and “grainy” were key human inputs in our captions to get the style right—while most of the AIGI images came out in colour. That in itself is an easy way to identify most of the images. We made the choice to use different typefaces on the captions too.

It is fascinating to compare the HMI and AIGI imagery, and we wanted to share that in the publication. So, in some cases, we’ve included both image types so readers can compare. Most of the image pairs can be identified because they share the same shape and size. All HMIs also sit on the left-hand side of their paired AIGI.

In both cases we decided that a subtle approach might be more entertaining, as it would leave it in the readers’ hands to interpret or guess which images are which.

To watermark, or not to watermark?

Another issue that came up was how to make it clear which images are AI-generated, since a few images in the publication are actual photographs. All AI images generated by Bing’s Image Creator come out with a watermark in the bottom left corner. As we made the layout, some of the original watermarks were cropped or moved out of the frame, so we added the watermarks back into the AI-generated images in the bottom left corner, ensuring there is always a way to identify which images are AI-generated.

Captions and quotes

In the original The Family of Man catalog, each image has a caption to show the photographer’s name, the country the photograph was taken in, and any organizations the photograph is associated with. There are also quotes featured throughout the book.

For A Generated Family of Man we decided to use the same typefaces and font sizes as the original publication. 

We decided to display the captions that were used to generate the images because we wanted to illustrate our inputs, and also those that were computer-generated. Our captions are much longer than the originals, so to prevent the pages from looking too cluttered, we added captions to a small selection of images. We decided to swap out the original quotes for quotes that are more relevant to the 21st century.

Below you can see some example pages from A Generated Family of Man.

Reflection

I had never really thought about AI that much before working on this project. I’ve spent weeks generating hundreds of images and I’ve become familiar with communicating with Bing’s Image Creator. I’ve been impressed by what it can do, while being amused and often horrified by the weird humans it generates. It feels strange to be able to produce an image of such high quality in a matter of seconds, especially when we look at images that are not photo-realistic but done in an illustrative style. In ‘On AI-Generated Works, Artists, and Intellectual Property’, Ryan Merkley says, ‘There is little doubt that these new tools will reshape economies, but who will benefit and who will be left out?’. As a designer it makes me feel a little worried about my future career, as it feels almost inevitable, especially in a commercial setting, that AI will leave many visual designers redundant.

Generative AI is still in its infancy (Bing’s Image Creator was only announced and launched in late 2022!) and soon enough it will be capable of producing life-like images that are indistinguishable from the real thing, if it isn’t already. For this project we used Bing’s Image Creator, but it would be interesting to see how this project would turn out with another image generator such as Midjourney, which many consider to be at the top of its game.

There are bound to be many pros and cons to being able to generate flawless images and I am simultaneously excited and terrified to see what the future holds within the field of generative AI and AI technology at large.

British Library & Flickr Commons: The many hands (and some machines) making light work

By Nora McGregor, Digital Curator in the Digital Scholarship Department of the British Library

Over a recent cup of coffee, George Oates, the indefatigable founder of Flickr Commons and now Executive Director of the Flickr Foundation, asked me if any memorable moments stood out from our long relationship with the Commons since the British Library first joined nearly a decade ago. Of course a multitude of inspired engagements instantly filled my mind like some exploding word cloud, and I could’ve easily prattled on until our cups dried up and the shop shutters went down. But one memory emerges from all the rest as the most shining example, and that is what we’ve come to call “The tale of Chico vs the Machine”.

 

British Library digitised image from page 57 of "A Strange Elopement. ... Illustrations by W. H. Overend"

 

Our Flickr Commons story began in 2013 when we were looking for inventive ways to improve the discoverability of a new and exceedingly eclectic collection of 19th century illustrations we’d recently collated. Plucked from the pages of our digitised books by an algorithm built by Ben O’Steen in British Library Labs, this unique and sizable image collection was largely untagged and undescribed. Each image had associated with it only the title of the book and page it came from, but no other details to describe it, such as what the image itself depicted. We needed a curious, smart, engaged, and global audience to set their eyes and collective expertise on it, to help us tag and describe them so we could create meaningful subcollections and improve searchability. We also needed a powerful API to enable working with such a large collection, and the millions of interactions it may potentially garner, at scale. We happily found both in the Flickr Commons. 

In late 2014, we had been chatting with artist Mario Klingemann aka Quasimondo who had happened upon this wild, wonderful and wholly uncurated collection of ours in Flickr Commons and was keen to create a series of artworks using the images. As part of his craft he was mixing automatic image classification with manual confirmation to identify and tag tens of thousands of the images – ranging from maps to ships, portraits to stones – to discover more from within the collection, and in turn, make them more discoverable for others.

 

16 x 16 Colourful Faces from the British Library Collection

By Mario Klingemann

The result of data mining the British Library Commons Collection, identifying colorful plates using some image analysis and subsequently using face detection to extract the faces contained therein.

 

As we were running some statistics on the algorithmically generated tags Mario was creating and adding back to individual images for us via the Flickr API (something in the region of 30,000 at that point, if I recall), we noticed that yet another user had already contributed something in the region of 45,000 tags to the collections. Assuming this user was similarly a dab hand with an image classification algorithm, we were absolutely gobsmacked to discover, on closer inspection, that no, actually, these contributions had all been added by hand! Not only were these invaluable image tags being manually contributed by one person, they were expertly and thoughtfully crafted, one by one. They did not simply identify general objects or themes in each image like “ship”, which in itself would have been of incalculable value for improving search where no such descriptions existed at all. These tags were of a rare and profound quality. To illustrate: for 19th century biblical images, the user, known to us only by their handle, had added the specific biblical passage numbers to which the depicted scenes referred!

 

British Library digitised image from page 394 of “The eventful voyage of H.M. Discovery Ship ‘Resolute’ to the Arctic Regions in search of Sir J. Franklin. … To which is added an account of her being fallen in with by an American Whaler after her abandonment … and of her presentation to Queen Victoria by the Government of the United States”

 

The sheer scale, quality and value of this singular Flickr user’s personal contribution was so staggering, we immediately sought them out to thank them personally and to ask if we could recognise their work publicly through our BL Labs Award programme, at the very least. And yet, more surprises were to come. When we approached them with our gratitude and our offer of recognition, we were very politely rebuffed! They shared with us that, as they had been bedbound, it was they who wanted to express gratitude for the opportunity to remain active in the world in some meaningful way. They told us that days spent trawling through and tagging such a wild and unruly collection, in the knowledge that they were helping others find these same gems, was reward enough. I can tell you, it was a response that no one in our team will ever forget. We attempted a few more times to shower them with accolades in some agreeable way, but every time our overtures were politely declined on the same grounds.

This memory makes my heart swell, and it’s a tale that so perfectly encapsulates the variety of valuable interactions, from the very intimate and human to the technologically innovative and computationally driven, that the Flickr Commons community and platform have supported.

To give just one example, since 2015, 50,000 maps have been found and tagged by humans and machines working alongside each other, individually or as part of community events. They’ve all been georeferenced and are now being added back into the British Library catalogue as individual collection items in their own right – bringing direct benefits to current and future users of our historical image collections as more wonderful images are surfaced.

Screenshot of an old map on a newer map

Explore the georeferencer or the British Library’s Flickr Albums.

Every tag contributed, whether expertly crafted by human hand, or machine learned by an algorithm, has helped to make thousands, if not millions of unseen historical images from British Library collections more discoverable and we simply could not have gotten this far in curating this massive and wonderful collection without the Flickr Commons. 


Juwon Jung | Posted 18 July 2023

Kickstarting A Generated Family of Man: Experimenting with Synthetic Photography 

Ever since we created our Version 2 of A Flickr of Humanity, we’ve been brainstorming different ways to develop this project at the Flickr Foundation headquarters. Then we came across the question: what would happen if we used AI image generators to recreate The Family of Man?

  • What could this reveal about the current generative AI models and their understanding of photography?
  • How might it create a new interpretation of The Family of Man exhibition?
  • What issues or problems would we encounter with this uncanny approach?

 

We didn’t know the answers to these questions, or what we might even find, so we decided to jump on board for a new journey to Version 3. (Why not?!)

We split our research into three main stages:

  1. Research into different AI image generators
  2. Exploring machine-generated image captions
  3. Challenges of using source photography responsibly in AI projects

And, we decided to try and see if we could use the current captioning and image generation technologies to fully regenerate The Family of Man for our Version 3.

 

Stage 1. Researching different AI image generators

Following the rapid advancement of generative artificial intelligence over the last couple of years, hundreds of image-generating applications, such as DALL-E 2 or Midjourney, have been launched. In the initial research stage, we tested different platforms by writing short captions for roughly ten images from The Family of Man and observing the resulting outputs.

Stage 1 Learnings: 

  • Image generators are better at creating photorealistic images of landscapes, objects, and animals than close-up shots of people. 
  • Most image generators, especially free ones, have caps on the number of images that can be produced in a day, slowing down production speed.
  • Some captions had to be altered because they violated terms and policies of the platforms; certain algorithms would censor prompts with potential to create unethical, explicit images (e.g. Section A photo caption – the word “naked” could not be used for Microsoft Bing)

We decided to use Microsoft Bing’s Image Creator for this project because it produced the highest quality images (across all image categories) with the most flexible limits on the quantity of images that could be generated. We also tested other tools, including Dezgo, Veed.io, Canva, and Picsart.

 

Stage 2. Exploring image captions: AI Caption Generators

Image generators today primarily operate based on text prompts. This realisation meant we should explore caption generation software in more depth. There was much less variety in the caption-generating platforms compared to image generators. The majority of the websites we found seemed intended for social media use. 

Experiment 1: Human vs machine captions

Here’s a series of experiments done by rearranging and comparing different types of captions—human-written and artificially generated—to observe how the caption alters the generated images: their expression and, in some cases, their meaning:

Stage 2 Learnings: 

  • It was quite difficult to find caption-generating software that produced a variety of styles, because most platforms only generated “cheesy” social media captions.
  • In the platforms that generated other styles of captions (not for social media), we found the depth and accuracy of the description was really limited, for example, “a mountain range with mountains.”

 

Stage 3. Challenges of using AI to experiment with photography?!

Since both the concept and process of using AI to regenerate The Family of Man is experimental, we encountered several dilemmas along the way:

1. Copyright Issues with Original Photo Use 

  • It’s very difficult to obtain proper permission to use photos from the original publication of The Family of Man, since the exhibition contains photos from 200+ photographers, taken in different locations and for different publications. Hence, we decided not to include the original photos of The Family of Man in the Version 3 publication.
  • This is disappointing because having the original photo alongside the generated versions would allow us to create a direct visual comparison between authentic and synthetic photographs.
  • All original photos of The Family of Man used in this blog post were photographed using the physical catalogue in our office.

2. Caption Generation 

  • Even to generate captions, we are required to plug in the original photos of The Family of Man, so we’ve had to take screenshots of the online catalogue available at The Internet Archive. This can still be a violation of copyright policies, because we’re adopting the image within our process even if we don’t explicitly display it. We also have a copy of The Family of Man publication, purchased by the Flickr Foundation, here at the office.

 

3. Moving Forward

Keeping these dilemmas in mind, we will try our best to show respect to the original photographs and photographers throughout our project. We’ll also continue to apply this process of experimentation to the rest of the images in The Family of Man to create a new Version 3 in our A Flickr of Humanity project.