Selfiecity (www.selfiecity.net) investigates how people photograph themselves with mobile phones in 5 cities around the world. The project analyzes 3200 Instagram selfies shared in New York, Moscow, Berlin, Bangkok, and Sao Paulo (640 from each city).
Selfies have already been the subject of many discussions in popular media. However, if we simply scan images tagged as selfies on Instagram, or observe people around us taking self-portraits, it’s hard to quantify patterns, or to systematically compare selfies from multiple cities taken by people who differ in age and gender. Are all selfies taken by young people? Do men take many selfies? Are we all trying to copy celebrities in choosing how we represent ourselves? Are there any significant differences between selfies shared in New York and Moscow, or Berlin and Bangkok? Selfiecity is the first project to investigate such questions systematically, using a carefully assembled large sample of selfie photos and the tools of statistics, data science and data visualization.
In 2013 Nadav Hochman, together with Lev Manovich and Jay Chow, analyzed and visualized 2.3 million Instagram photos collected in 13 global cities (phototrails.net). Building on the experience of this project, Lev Manovich and Daniel Goddemeyer of DigitalThoughtFacility, who are collaborating on a range of projects, decided to assemble a larger team to continue the work with Instagram photographs.
The new team includes media theorists, an art historian, data scientists, visual designers and programmers. Working between New York, Germany and California, the team put together a new project that combines the perspectives of its members. Lev Manovich (a pioneer in the analysis of visual social data) coordinated the project, while Moritz Stefaner (one of the leading visualization designers in the world) was responsible for creative direction and visualizations.
Selfiecity combines findings about the demographics of people taking selfies and their poses and expressions; a number of media visualizations (imageplots) which assemble thousands of photos to reveal interesting patterns; and an innovative interactive application (selfiexploratory) which allows visitors to explore the whole set of 3200 photos, sorting and filtering it to find new patterns.
In addition, selfiecity includes essays (Theory section) about the history of photography and the selfie phenomenon, the functions of images in social media, and the media visualization method.
People take fewer selfies than often assumed
Depending on the city, only 3-5% of images we analyzed were actually selfies.
Significantly more women
In every city we analyzed, there are significantly more selfies by women than by men (from 1.3 times as many in Bangkok to 1.9 times as many in Berlin). Moscow is a strong outlier: there, we have 4.6 times as many female as male selfies!
(While we don’t have this data for other countries, in the U.S. the ratio of female to male Instagram users is close to 1:1.)
A young people’s sport? Indeed.
Most people in our photos are pretty young (estimated median age of 23.7). Bangkok is the youngest city (21.0), whereas NYC is the oldest (25.3). Men’s average age is higher than that of women in every city. Surprisingly, more older men (30+) than older women post selfies on Instagram.
Bangkok, Sao Paulo are all smiles
Our mood analysis revealed that you can find lots of smiling faces in Bangkok (0.68 average smile score) and Sao Paulo (0.64). People taking selfies in Moscow smile the least (only 0.53 on the smile score scale).
Women strike more extreme poses, especially in Sao Paulo
Women’s selfies show more expressive poses; for instance, the average amount of head tilt is 50% higher than for men (12.3° vs. 8.2°). Sao Paulo is most extreme – there, the average head tilt for females is 16.9°!
These findings present only some of the patterns we found. Other findings will be presented in a series of blog posts at softwarestudies.com. In general, we discovered that each of our five cities is an outlier in its own unique way. Depending on which dimension of comparison we choose, one of the cities usually stands out. However, when we combine many dimensions together, Moscow and Bangkok stand out from the other cities.
What can we learn from social media?
What can we learn when we analyze social media, such as selfie photos shared by people on Instagram?
Do we learn about society – cultural and social differences in different locations around the world?
Or do we learn about popular photography in the age of Instagram and mobile phones – what people like to photograph, preferred compositions, points of view, colors and so on?
Or do we learn about particular software mediums, their affordances and conventions, and particular creative options they favor? (For example, all Instagram photos are square; all users have access to the same set of filters; selfie compositions are limited by what can be captured by a phone held by the person taking a photo of herself/himself.)
We believe that projects such as selfiecity (and our earlier phototrails) allow us to ask all these questions. At the same time, it may be very hard or even impossible to separate the three dimensions: Instagram as a window into social reality, as contemporary vernacular photography, and as a software medium. (The same would apply to other social platforms such as Twitter and Facebook.)
Collecting the selfie photos from Instagram took many steps. When you browse Instagram, at first it looks like it contains a large proportion of selfies. A closer examination reveals that a large percentage of these are not selfies, but photos taken by other people. We wanted to use only true single-person selfies for the project.
The team partnered with Gnip, the world’s largest provider of social data (gnip.com). After developing the software that interfaces with the Gnip service, in September 2013 we started to collect Instagram photos in different locations. After many tests, we focused on central areas in five cities located in North America, Europe, Asia, and South America. In each city we chose the central area, keeping these areas approximately the same size.
We wanted to collect images and data under equal conditions, so we selected a particular week (Monday through Sunday, Dec 4-Dec 12, 2013) for the project. The following are the numbers of photos shared on Instagram in the central areas of our 5 cities, according to Instagram data provided by Gnip:
Sorted by size:
NYC – 207K
Bangkok – 162K
Moscow – 140K
Sao Paulo – 123K
Berlin – 24K
Total: 656K photos.
We have placed the locations of all these photos on the maps available online, so you can see what areas are used and how selfies are distributed (maps can be zoomed):
Sao Paulo: http://cdb.io/1jEK3b7
To locate selfie photos, we randomly selected 120,000 photos (20,000-30,000 photos per city) from the total of 656,000. Two to four Amazon Mechanical Turk workers tagged each photo. We experimented with different forms of the question, and the simplest one produced the best results: “Does this photo show a single selfie?”
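The random selection step can be illustrated with a minimal sketch. It is not the project's actual code: the function name `sample_photos`, the `per_city` target and the toy photo ids are all assumptions made for illustration, and the only idea taken from the text is drawing a random subset from each city's photos.

```python
import random

def sample_photos(photo_ids_by_city, per_city, seed=0):
    """Draw a uniform random sample of photo ids from each city.

    per_city is a hypothetical target size; a city with fewer photos
    than the target contributes everything it has.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    sample = {}
    for city, ids in photo_ids_by_city.items():
        k = min(per_city, len(ids))
        sample[city] = rng.sample(ids, k)  # sampling without replacement
    return sample

# Toy data standing in for the photo ids collected per city (assumed).
photos = {
    "NYC": [f"nyc-{i}" for i in range(2070)],
    "Berlin": [f"ber-{i}" for i in range(240)],
}
picked = sample_photos(photos, per_city=300)
print({city: len(ids) for city, ids in picked.items()})
```

Capping the sample at the number of available photos matters for a city like Berlin, which had far fewer photos than the others.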
We then selected the top 1000 photos for each city (i.e., photos which at least 2 workers tagged as a single-person selfie). We submitted these photos to Mechanical Turk, asking three “master workers” not only to verify that a photo shows a single selfie, but also to tag the gender and guess the age of the person.
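A rough sketch of the "at least 2 workers" rule follows. The tuple layout of the tags, the function name and the ranking by number of "yes" votes are assumptions for illustration; the text only states the two-vote threshold and the limit of 1000 photos per city.

```python
from collections import defaultdict

def select_selfie_candidates(tags, min_yes=2, limit=1000):
    """Pick photos that enough workers tagged as a single-person selfie.

    tags: iterable of (photo_id, worker_id, is_selfie) tuples (assumed layout).
    Returns up to `limit` photo ids, ordered by number of "yes" votes.
    """
    yes_votes = defaultdict(int)
    for photo_id, _worker_id, is_selfie in tags:
        if is_selfie:
            yes_votes[photo_id] += 1
    qualified = [p for p, n in yes_votes.items() if n >= min_yes]
    # Most-agreed photos first; the photo id breaks ties so the order is stable.
    qualified.sort(key=lambda p: (-yes_votes[p], p))
    return qualified[:limit]

# Toy tags for three photos (made up for the example).
tags = [
    ("a", "w1", True), ("a", "w2", True), ("a", "w3", True),
    ("b", "w1", True), ("b", "w2", False),
    ("c", "w1", True), ("c", "w4", True),
]
print(select_selfie_candidates(tags))  # ['a', 'c']
```

Photo "b" is dropped because only one worker tagged it as a selfie.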
As the final step, at least one member of the project team examined all these photos manually. While most photos were tagged correctly (apparently most Mechanical Turk workers knew what a selfie was), we found some mistakes. We wanted to keep the data size the same (to make visualizations comparable), so our final set contains 640 selfie photos for every city.
The sample set of selfie photos was analyzed using state-of-the-art face analysis software from Orbeus Inc. (rekognition.com). The software analyzed the faces in the photos, generating a number of measurements, including face size, orientation, emotion, presence of glasses, presence of smile, whether eyes are closed or open, and others.
We used these measurements in two ways: 1) to compare photos across cities, genders and ages (see Findings); 2) to include some of them in the selfiexploratory interactive application.
The software also guessed gender and age of a person in each photo. We found that the gender guesses were generally consistent with the guesses of Mechanical Turk workers, whereas the age estimates differed significantly.
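One way to make such a comparison concrete is sketched below. The record fields, function names and toy values are assumptions and do not reflect the project's data or the Orbeus output format; the sketch only shows how agreement on gender and the gap between age estimates could be measured.

```python
from statistics import median

def gender_agreement(records):
    """Fraction of photos where software and human gender labels match."""
    matches = sum(1 for r in records if r["human_gender"] == r["software_gender"])
    return matches / len(records)

def median_age_gap(records):
    """Median of (software age estimate - human age guess) across photos."""
    return median(r["software_age"] - r["human_age"] for r in records)

# Toy records, invented for the example.
records = [
    {"human_gender": "f", "software_gender": "f", "human_age": 22, "software_age": 27},
    {"human_gender": "m", "software_gender": "m", "human_age": 30, "software_age": 24},
    {"human_gender": "f", "software_gender": "m", "human_age": 19, "software_age": 25},
    {"human_gender": "m", "software_gender": "m", "human_age": 25, "software_age": 33},
]
print(gender_agreement(records))  # 0.75
print(median_age_gap(records))    # 5.5
```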
Typically, data visualization shows simple data such as numbers. However, a single number can’t summarize a photo. It is not a “data point” but a whole world, rich in meanings, emotions and visual patterns. This is why showing all photos in the visualizations is the key strategy of the project. We call this approach “media visualization.” (The approach was described in a number of articles by Lev Manovich available on softwarestudies.com.)
“Showing the high level patterns in the data — the big picture — as well as the individual images has been an important theme in our project. How can we find summarizations of big data collections, which still respect the individuals, and don’t strip away all the interesting details? This has become a quite central question to us, not only with respect to selfies,” reflects Moritz Stefaner, the lead information visualization designer on the project.
Blended Video Montages
We present video montages of 640 selfies from each city. The images are aligned with respect to eye position and sorted by the head tilt angle. The animations combine individual photos to create more abstract representations, which still show details of these images and the context. These animations represent an artistic reflection on the tension between individual shots and high-level patterns, and are meant to provide the audience not only with a way to rapidly experience a high number of images, but also present the “aggregate face of a city”.
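The sort-and-blend idea can be sketched on tiny grayscale images. This is an illustrative assumption, not the method used to produce the videos: the images are taken to be already aligned on eye position, and the sliding-window average, the `window` parameter and the data layout are hypothetical.

```python
def blend(images):
    """Average equally sized grayscale images given as lists of pixel rows."""
    n = len(images)
    height, width = len(images[0]), len(images[0][0])
    return [
        [sum(img[y][x] for img in images) / n for x in range(width)]
        for y in range(height)
    ]

def montage_frames(photos, window=2):
    """Sort photos by head tilt, then blend each run of `window` neighbours."""
    ordered = sorted(photos, key=lambda p: p["tilt"])
    return [
        blend([p["pixels"] for p in ordered[i:i + window]])
        for i in range(len(ordered) - window + 1)
    ]

# Three toy 2x2 "photos" with made-up tilt angles in degrees.
photos = [
    {"tilt": 12.0, "pixels": [[0, 0], [0, 0]]},
    {"tilt": -5.0, "pixels": [[100, 100], [100, 100]]},
    {"tilt": 3.0, "pixels": [[200, 200], [200, 200]]},
]
frames = montage_frames(photos)
print(len(frames), frames[0][0][0], frames[1][0][0])  # 2 150.0 100.0
```

Each frame mixes neighbouring photos in tilt order, so individual faces remain partly visible while the sequence as a whole shows the aggregate pattern.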
Case-by-case inspection of photos can reveal a lot of detail, but it is difficult to quantify the patterns observed. We created visualizations that are composed of single images (imageplots). They show distributions of genders and ages in different cities. At the same time, they make it possible to reflect on and validate these high-level patterns through inspection of individual images.
In the selfiexploratory, an exploratory visualization, visitors can filter the photos by demographic variables, city, and estimated face features extracted by software. The application combines both human judgments and computer measurements, two ways of seeing the photos. The gender and age graphs on the right use human tags and guesses. All other graphs on the left use face measurements made by software. Whenever a selection is made, the graphs are updated in real time, and the bottom area displays all photos that match. The result is an innovative, fluid method of browsing and spotting patterns in large sets of media. “We see a big potential in this type of interface and plan to extend it to other applications, such as museum collections or personal media,” explains Dominikus Baur, lead developer and UI designer.
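The filter-then-update behaviour described above might look roughly like the following sketch. It is not the application's code: the field names, the `ranges` argument and the toy records are assumptions, and the real interface is an interactive web application rather than a script.

```python
from collections import Counter

def filter_photos(photos, city=None, gender=None, ranges=None):
    """Return photos matching every active selection.

    ranges maps a numeric field name to an inclusive (low, high) pair.
    """
    ranges = ranges or {}
    selected = []
    for p in photos:
        if city is not None and p["city"] != city:
            continue
        if gender is not None and p["gender"] != gender:
            continue
        if any(not (lo <= p[field] <= hi) for field, (lo, hi) in ranges.items()):
            continue
        selected.append(p)
    return selected

def counts_by(photos, field):
    """Recompute one graph's counts from the current selection."""
    return Counter(p[field] for p in photos)

# Toy records, invented for the example.
photos = [
    {"id": 1, "city": "Moscow", "gender": "f", "age": 22, "smile": 0.40},
    {"id": 2, "city": "Bangkok", "gender": "f", "age": 20, "smile": 0.80},
    {"id": 3, "city": "Bangkok", "gender": "m", "age": 27, "smile": 0.70},
    {"id": 4, "city": "Berlin", "gender": "m", "age": 31, "smile": 0.55},
]
smiling = filter_photos(photos, ranges={"smile": (0.6, 1.0)})
print([p["id"] for p in smiling])  # [2, 3]
print(counts_by(smiling, "city"))
```

Recomputing every graph from the same filtered list is what keeps the graphs and the photo grid consistent after each selection.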