MozFest 2014: Gotta lotta analog data? Crowdsourcing may make it useful for you and fun for readers

When we think of data, we almost always think of computers. But when it comes to data that was created before the digital era (handwritten notes, ancient maps or printed documents, for example), nothing beats human eyes to quantify and verify. And when many human eyes are needed, journalists have the option to crowdsource their data.

At MozFest this weekend, Mike Tigas of ProPublica and Jeremy B. Merrill of The New York Times facilitated a session that touched upon four projects leading the movement in crowdsourcing data. Here's a look at those projects and why people want to get involved:

Free the Files

ProPublica’s Free the Files tool began crowdsourcing in 2012, when it asked users to “free a file” about political TV ads by recording the advertiser, agency and gross dollar amount spent on an ad. Since the project launched during the 2012 elections, about 18,000 of more than 43,000 documents have been "freed."

Free the Files asks users to verify information that computers can't automatically ascertain from scanned files.

It takes two reports of identical information to verify a file’s data, according to Merrill. Once this occurs, the file’s information has been freed, and ProPublica publishes the information.
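
The two-report rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ProPublica's actual code: the field names (`advertiser`, `agency`, `gross_amount`), the `normalize` and `verified_entry` helpers, and the idea of lowercasing and trimming text before comparing are all hypothetical. The only detail taken from the session is that two identical reports verify a file.

```python
from collections import Counter

# From the article: two reports of identical information verify a file.
REQUIRED_MATCHES = 2


def normalize(report):
    """Reduce a report to a comparable tuple.

    Assumption: trimming and lowercasing text, and rounding the dollar
    amount, is enough for two honest entries to count as identical.
    """
    return (
        report["advertiser"].strip().lower(),
        report["agency"].strip().lower(),
        round(float(report["gross_amount"]), 2),
    )


def verified_entry(reports, required=REQUIRED_MATCHES):
    """Return the entry that at least `required` reports agree on, else None."""
    counts = Counter(normalize(r) for r in reports)
    if not counts:
        return None  # nobody has transcribed this file yet
    entry, matches = counts.most_common(1)[0]
    return entry if matches >= required else None


# Hypothetical reports from three readers looking at the same scanned file.
reports = [
    {"advertiser": "Acme PAC", "agency": "Media Buyers Inc", "gross_amount": "12500"},
    {"advertiser": "Acme PAC ", "agency": "media buyers inc", "gross_amount": 12500.00},
    {"advertiser": "Acme Pack", "agency": "Media Buyers Inc", "gross_amount": "1250"},
]
print(verified_entry(reports))
```

In this sketch the first two reports match after normalization, so the file would count as freed; the third, with its typo and different amount, is simply outvoted. A real system would also have to decide how to handle files where no two reports ever agree.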

NYPL's Building Inspector

The New York Public Library crowdsourced maps in 2013 via a project called Building Inspector. The tool asks users to identify colors, realign building boundaries and input addresses of 1850s New York City. By relying on the input of users to verify information, the data stored in these scanned images of discolored maps can be used with modern cartographic tools.

Madison

Additional historical information from New York City has made its way to the crowdsourcing platform via The New York Times’ Madison project. Madison asks users to click one of several buttons to indicate whether the image shown is an advertisement or some variation of one. Although the digitization of The Times’ archives has largely focused on editorial content, advertisements can be a telling indication of history as well. With Madison, The Times intends to unlock this data.

Madison takes a look at the ads, not articles, that exist in old New York Times newspapers.

CrowData

Another crowdsourcing data tool, CrowData, was released this year and is available on GitHub. CrowData comes out of an initiative by La Nacion called VozData, which enlists the public to verify records released about political spending, much like ProPublica does with Free the Files.

Why do readers contribute to data projects?

Free the Files and VozData incorporate gamification into verification by asking users to log in and displaying a leaderboard of the top contributors. With gamification, “you make people more engaged with some data that they otherwise would maybe never have looked at,” Tigas said.

Projects like Free the Files are alternatives to paying many people for hours of work poring over data. Because they rely on free work from the public, the verifiers must be interested in the information and believe that the news company is going to do something beneficial with the data.

Merrill said that an ideal case for crowdsourcing data is one that provides an exchange that goes both ways. “Where you learn something meaningful about the world that you didn’t know before by doing this,” he said.

Others feel a personal connection to the projects. “There are people who are really interested in their neighborhood histories,” Tigas said. “And so they get something out of [Building Inspector] because they learn what their neighborhood looked like.”

“There’s a whole segment of readers that are really interested in being involved with the news that they read,” he said.

About the author

Mallory Busch

Undergraduate Fellow
