As a computer scientist about to graduate from the Medill School of Journalism, I have a front-row seat at the intersection of data and journalism. Unfortunately, as Alberto Cairo has pointed out, there is still a lot of work to be done to properly combine the two.
Knight-Mozilla OpenNews’ first-ever #SRCCON sought to mind-meld data-savvy journalists, many of whom also attended NICAR. While most conferences are a collection of talks that might have a Q&A session at the end, the sessions at #SRCCON were built around fostering discussions.
Since I have an internship at the Washington Post starting in six weeks, I decided to start the conference at the session “How NOT to Skew with Statistics.”
One of the recurring themes was that newsrooms, especially legacy ones, still have a gap in understanding between data journalists, digital journalists and everyone else.
Quality data journalism, just like investigative journalism, requires a substantial amount of time and effort. Unfortunately, many editors perceive the technologically adept as being able to do more things faster, and set unrealistic deadlines.
It is “difficult to do quality, news sensitive projects for TV,” said Latoya Peterson of Al Jazeera. Especially when “deadline is at 4pm.”
Since data journalists are a new addition to most teams, we focused the discussion on how to peacefully incorporate ourselves into existing processes. Everyone agreed that it was best for all parties to meet without any looming deadlines so that expectations could be set. That way the data journalists could educate everyone else about realistic timeframes.
Feeling confused about verifying data? ProPublica made a checklist (don't get scared by the 1st header): https://t.co/RKlNkWOXhP #SRCCon
— Latoya Peterson (@LatoyaPeterson) July 24, 2014
Simple graphs, assuming the data has been vetted, may only take 15 minutes. More difficult displays would take an hour or two, while intensive graphics require a full day or more.
Basically: explain that not everything is quick and simple.
Someone suggested that if anyone tries to rush something to production, you should inform the other party in person, with witnesses around. That way, unlike with an email, they can’t just ignore it. Someone else yelled out to tell them “this would be publishing lies.”
Ironically, a perfect example of a misleading graph was displayed to those who filled out a survey and then viewed the results.
Continuing on the same theme, I spent the afternoon at the session “Science Vs. Journalism.” We discussed the similarities and differences between scientists and data journalists, and someone drew a pseudo-Venn diagram on the whiteboard.
The basic premise was that data journalism was starting to carry the perceived significance of scientific discovery without going through the same rigor. Scientists tend to be specialists who work more slowly, while data journalists are general practitioners who work under deadline and are expected to be quick.
Highlights from @veltman’s #srccon session:
Science: can’t quote anonymous sources
Journalism: aggression
Both: mistakes are forever
— Nicole Zhu (@nicolelzhu) July 24, 2014
For both, mistakes are “high stakes” and can taint the perpetrator’s reputation “forever.”
At the end, a “wish list” was created. One of the items suggested that data journalists provide better validation and replication of each other’s work, similar to the scientific method. Another suggestion was to hire an in-house scientist, which currently sounds extreme. Then again, a decade ago it was extreme to have a data journalist in the newsroom.
The Thursday sessions gave me insight into both the rewards and the hardships of being a data scientist in a newsroom. It was eye-opening to listen to practicing data journalists get things off their chests, and the solution-generating atmosphere showed me how to deal with those issues.
As Tasneem Raja from Mother Jones said at the end of the first session, “this is like therapy.”