NICAR16: A glimpse at an Excel-free future thanks to relational databases

Some people come to NICAR with goals, whether they were networking goals or technical goals or partying goals. I came to my first NICAR with next to nothing. I wasn’t quite sure what to expect, other than it being half a week’s worth of sessions and panels about data, journalism, and everything in between, which it turns out, is quite a bit.

One thing I had been meaning to learn for a while, though, was how to use and query relational databases. I’ve worked in MongoDB and other schemaless databases before (read: I’ve worked with big JSON files), mostly because of their convenience and because strangers on the Internet who ran tutorial websites told me to. But I also knew in the back of my mind that I should probably learn how to use relational databases too, since the two do serve very different use cases and each has their pros and cons.

I went from knowing of something to knowing about it, but I still wasn’t quite ready to drop the prepositions and just start knowing it

It was only natural then that I started NICAR off with a 9 a.m. session titled “Intro to database manager MySQL I” with Paula Lavigne, a reporter at ESPN. The talk wasn’t a deep dive into MySQL or even into relational databases in general, nor was it trying to be. The talk, however, was a good first introduction to MySQL using Navicat to supplement the basic operations covered with some helpful displays.

Immediately after the session, I downloaded and set up a MySQL server on my laptop and started playing around with a couple of CSVs I had lying around on my computer. I quickly realized that 1) I was looking at what could be the beginning of an Excel-free future, 2) I’d probably been going way too long not knowing how use and query relational databases, and 3) I still didn’t know how to use and query relational databases.

I was in that state everyone finds himself in at one point or another: I went from knowing of something to knowing about it, but I still wasn’t quite ready to drop the prepositions and just start knowing it. There was still a lot I wasn’t sure of: what are best practices for using relational databases? What can relational databases do, and what can’t they do? Also, just by virtue of being exposed to data journalism through the Knight Lab, I was aware of at least two other relational database managers other than MySQL, namely SQLite and PostgreSQL. What were the differences between the three, and why would I want to use one over the other?

Luckily, there was another session at 10:15 a.m. on Friday called “Counting and summing with SQL” presented by Meghan Hoyer, a data journalist with the Associated Press. That talk was concerned mostly with using SQL commands to perform operations across records in tables, and used SQLite and the Firefox SQLite Manager. There, I learned a few important things about SQLite that set it apart from MySQL: an entire SQLite database is contained in a single file, which makes it easier to share if you’re working on a project where multiple people need to touch and modify the data at the same time or if you’d like to hand the data off to someone else or just keep a portable copy with you. It also doesn’t require you to spin up a server like MySQL does, which makes it extremely lightweight. There are some other key differences between the database management systems (which I highly recommend you read about here if you’re interested), including many that weren’t included in either session, but the exposure and hands-on experience with the tools were what really gave me the kick I needed to look into it more and make decisions for myself about which systems suited my requirements best. Relational databases are by no means a revolutionary or novel idea, but they are important for any data journalist to have in his or her toolkit.

About the author

Josh Shi

Student Fellow

Latest Posts

  • A Google Spreadsheets change affecting TimelineJS users

    Google recently changed something about their Sheets service which is causing many people to run into an error when they are making a new timeline. Note: there should be no impact on existing timelines! After this change, many of you click on the "preview" and get this message: An unexpected error occurred trying to read your spreadsheet data [SyntaxError] Timeline configuration has no events. There is a straightforward work-around, but it requires those of you who have...

    Continue Reading

  • How Americans think and feel about gun violence

    A man killed his wife, then himself. I want you to see his face and learn that he enjoyed fishing with his grandchildren. A small-time drug dealer is shot by two men in a parking lot. I find his Facebook profile and a photo shows him striking a playfully irreverent pose, giving the camera the middle finger. The photo’s comments take a mournful turn after a certain date. “Rest easy bro ???” Gun Memorial runs...

    Continue Reading

  • Software developers interested in journalism: Northwestern and The Washington Post want you!

    Northwestern University and The Washington Post are offering a unique opportunity for two talented software developers interested in applying their programming skills in media and journalism. Here’s the proposition: (1) a full-tuition scholarship to earn a master’s degree in journalism at Northwestern University, followed by (2) a six-month paid internship with The Post’s world-class engineering team, with the possibility of subsequent full-time employment. These opportunities are made possible by the John S. and James L....

    Continue Reading

  • What happened when Gun Memorial let anyone contribute directly to victim profiles

    If you’re reporting local or niche news, there’s a good chance that your audience collectively knows more about the story than you do. That’s especially true for us at Gun Memorial, a small publication with a nationwide mission of covering every American who is shot dead. In our latest, mostly successful, experiment, we let readers add to our stories without editor intervention. This article shares some lessons from that experience. Asking for reader contributions A...

    Continue Reading

  • How conversational interfaces make the internet more accessible for everyone

    This story is part of a series on bringing the journalism we produce to as many people as possible, regardless of language, access to technology, or physical capability. Find the series introduction, as well as a list of published stories here. In 2004, human-computer interaction professor Alan Dix published the third edition of Human-Computer Interaction along with his colleagues, Janet Finley, Gregory Abowd, and Russell Beale. In a chapter called “The Interaction,” the authors wrote...

    Continue Reading

  • Three tools to help you make colorblind-friendly graphics

    This story is part of a series on bringing the journalism we produce to as many people as possible, regardless of language, access to technology, or physical capability. Find the series introduction, as well as a list of published stories here. I am one of the 8% of men of Northern European descent who suffers from red-green colorblindness. Specifically, I have a mild case of protanopia (also called protanomaly), which means that my eyes lack...

    Continue Reading

Storytelling Tools

We build easy-to-use tools that can help you tell better stories.

View More