Read this article in Spanish.
How might we use AI technologies to innovate newsgathering and investigative reporting techniques? This was the question we posed to a group of seven newsrooms in Latin America and the US as part of the Americas Cohort during the 2021 JournalismAI Collab Challenges. The Collab is an initiative that brings together media organizations to experiment with AI technologies and journalism.
This year, JournalismAI (a project of Polis, the journalism think-tank at the London School of Economics and Political Science, powered by the Google News Initiative) invited the Knight Lab to be its regional partner in the Americas and to facilitate the collaboration among participants from the Western Hemisphere.
From May to November, the participants from La Nación in Argentina, MuckRock and Bloomberg News in the US, Data Crítica in México, AzMina in Brazil, CLIP in Latin America, and Ojo Público in Perú worked in three different collaborative project teams.
The teams explored ways in which artificial intelligence technologies can support investigative journalists around the world in their work of holding the powerful accountable and shining a light on underreported issues.
The journey of collaboration, experimentation and research materialized in three projects:
- DockIns, to analyze large sets of documents
- The Political Misogynistic Discourse Monitor, to detect gender violence in social media
- From Above, centered on the analysis of satellite imagery for reporting on environmental issues.
This is the story of the Americas Cohort’s work.
Facilitating collaboration and experimentation
For the Collab, the Knight Lab designed an approach that encouraged a flexible, open mindset about where the journey would lead. This meant that the path of experimentation was not pre-established: “the work we do is rooted in human-centered design processes, we don’t answer the questions that we pose, but we try to create a process that enables participants and us, facilitators, to figure out the answers together,” says Jeremy Gilbert, Medill’s Knight Chair in Digital Media Strategy.
“When there are new areas sometimes there isn’t really a curriculum, so part of learning something new is acknowledging that it is kind of a frontier that there is a need to explore and discover how things fit together,” explains Joe Germuska, director of the Knight Lab.
At the beginning of the Collab, the Americas Cohort’s participants and facilitators met once a week for a month and a half. During these sessions, participants shared what brought them to the Collab, their experience, and the journalism challenges they face. They brainstormed ideas and discussed projects that they would be interested in exploring during the following months. Because of the limitations of connecting remotely, these conversations were intentionally designed so that participants talked with different people at every opportunity.
At the end of this period, participants were invited to pitch ideas they wanted to pursue. All three ideas presented to the group were selected, and every individual participant chose which team they wanted to join. From here, the teams set monthly milestones and identified the technical and editorial work needed to achieve them. They also had weekly meetings to discuss their progress, roadblocks, and solutions. Once a month, all the teams met together, along with the Knight Lab team, in sessions dedicated to sharing updates and feedback across projects.
Along with me, as project manager for the JournalismAI Collab Challenges in the Americas, the Knight Lab team members who guided this process were Jeremy Gilbert, Professor, Knight Chair in Digital Media Strategy; Joe Germuska, Director and Chief Nerd; and Scott Bradley, Senior Engineer.
The projects
DockIns
Team DockIns tackled two challenges: 1) structuring public data that is usually hidden in PDFs, and 2) the reality that natural language processing (NLP) solutions are not as powerful in other languages as they are in English. To address them, the team developed Project DockIns and the tool SideKick, a machine learning platform that hosts, reads, classifies, and gives insights into documents. The tool is designed to help journalists interrogate large sets of documents in both English and Spanish. It will also help automate ongoing accountability coverage, so that watchdog reporters can monitor for trends, outliers, and insights in even the largest document collections without intensive technical training or complicated setup and maintenance. SideKick can work with the DocumentCloud platform or as an open-source stand-alone version.
Being part of the Collab is an investment from the present to the future. It is to learn, to collaborate, and to continue moving forward.
Momi Peralta, La Nación (Team DockIns)
- Read about Project DockIns.
- Read SideKick’s documentation.
- Read Testing two Named Entity Recognition models on Spanish documents.
Participants and organizations:
Delfi Arambillet (La Nación - Argentina), Rigo Carvajal (CLIP - Costa Rica), Claudia Chávez (Ojo Público - Perú), Gianco Huamán (Ojo Público - Perú), Mitch Kotler (MuckRock - US), Michael Morisy (MuckRock - US), Martín Pascua (La Nación - Argentina), Momi Peralta (La Nación - Argentina), Gianfranco Rossi (Ojo Público - Perú).
Political Misogynistic Discourse Monitor
The Political Misogynistic Discourse Monitor was conceived to investigate how gender violence spreads on social media, especially when it is initiated or stimulated by political figures on Twitter. The team built on MonitorA, a project previously developed by AzMina. They developed a natural language processing (NLP) model that works in both Portuguese and Spanish, creating a more effective way to automate the analysis of large numbers of tweets and determine whether they are misogynistic. The work of tagging data and training the algorithm led the team to collaborate with researcher Iván Meza-Ruiz from Mexico, who helped fine-tune the NLP model. The result is the first stage of a prototype that can tell whether a tweet is misogynistic. In the future, the team will make the tool available to others interested in mapping gender violence on social media.
Being at the Collab helped me to understand what pieces we need to start... What I appreciate the most is building these networks, these communities. The real win is for the community. We now have a space for an innovative project.
José Luis Peñarredonda, CLIP (Team Political Misogynistic Discourse Monitor)
Participants and organizations:
Fer Aguirre (Data Crítica - México), Helena Bertho (AzMina - Brazil), Gaby Bouret (La Nación - Argentina), Bárbara Libório (AzMina - Brazil), Marina Gama Cubas da Silva (AzMina - Brazil), José Luis Peñarredonda (CLIP - Colombia).
- Explore the GitHub repository.
From Above
Team From Above set out to use artificial intelligence and satellite imagery to identify visual indicators that could lead to stories. The team chose satellite imagery to get around language barriers, to experiment with open-source tools, and to take a different approach to understanding and investigating biodiversity loss on the planet. Their exploration was also intended to demystify AI and to create a better understanding of what it means to use computer vision algorithms and to train a model. The team navigated the excitement of learning together alongside the challenge of limited access to high-quality images. The collective learning process, with its ups and downs, motivated the creation of A Journalist’s Guide to using AI + Satellite Imagery for Storytelling.
I think it's incredibly valuable [being in the project] because I think everybody that is involved now has a much better understanding of the space. We all have a much better understanding of what can and can't be done.
David Ingold, Bloomberg News (Team From Above)
Participants and organizations:
David Ingold (Bloomberg - US), Flor Coelho (La Nación - Argentina), Gibrán Mena (Data Crítica - México), María Teresa Ronderos (CLIP - Colombia), Shreya Vaidyanathan (Bloomberg - US).
- Read A Journalist's Guide to using AI + Satellite Imagery for Storytelling.
- Explore the GitHub repository.
The learnings
The 2021 JournalismAI Collab Challenges has been a unique opportunity to learn how each context creates different challenges for investigative journalists, such as unequal access to resources or language limitations. Nevertheless, by working together it is possible to create solutions that help many, because the problems to investigate are very similar and don’t belong to one country or one continent.
While in investigative journalism it is not always straightforward to create solutions that deliver automation at scale, working towards innovation together can set an easier path for everyone. These are three of the big lessons from the Americas Cohort:
- Better to experiment now to be ready for the future. Not all experiments will bring positive results, and you may need to shelve an experiment for a while and let it marinate. But doing it now will help you to be an early adopter soon.
- Let the problem lead you. When working on a story, you may want to have specific tools or resources. But as obstacles appear, reframe your questions and interrogate what you are really looking for. This can lead you to change your first approach and push you to think creatively so you can continue experimenting.
- And, overall, collaboration is the key!
Get in touch with the Knight Lab if you’re interested in these tools, or in similar collaborations and initiatives such as The Data-Driven Reporting Project.
You can explore all the projects developed by participants from around the globe in the sessions of the 2021 JournalismAI Festival on the YouTube channel of Polis at LSE.
About the author