NICAR16: A glimpse at an Excel-free future thanks to relational databases

Some people come to NICAR with goals, whether they were networking goals or technical goals or partying goals. I came to my first NICAR with next to nothing. I wasn’t quite sure what to expect, other than it being half a week’s worth of sessions and panels about data, journalism, and everything in between, which it turns out, is quite a bit.

One thing I had been meaning to learn for a while, though, was how to use and query relational databases. I’ve worked in MongoDB and other schemaless databases before (read: I’ve worked with big JSON files), mostly because of their convenience and because strangers on the Internet who ran tutorial websites told me to. But I also knew in the back of my mind that I should probably learn how to use relational databases too, since the two do serve very different use cases and each has their pros and cons.

I went from knowing of something to knowing about it, but I still wasn’t quite ready to drop the prepositions and just start knowing it

It was only natural then that I started NICAR off with a 9 a.m. session titled “Intro to database manager MySQL I” with Paula Lavigne, a reporter at ESPN. The talk wasn’t a deep dive into MySQL or even into relational databases in general, nor was it trying to be. The talk, however, was a good first introduction to MySQL using Navicat to supplement the basic operations covered with some helpful displays.

Immediately after the session, I downloaded and set up a MySQL server on my laptop and started playing around with a couple of CSVs I had lying around on my computer. I quickly realized that 1) I was looking at what could be the beginning of an Excel-free future, 2) I’d probably been going way too long not knowing how use and query relational databases, and 3) I still didn’t know how to use and query relational databases.

I was in that state everyone finds himself in at one point or another: I went from knowing of something to knowing about it, but I still wasn’t quite ready to drop the prepositions and just start knowing it. There was still a lot I wasn’t sure of: what are best practices for using relational databases? What can relational databases do, and what can’t they do? Also, just by virtue of being exposed to data journalism through the Knight Lab, I was aware of at least two other relational database managers other than MySQL, namely SQLite and PostgreSQL. What were the differences between the three, and why would I want to use one over the other?

Luckily, there was another session at 10:15 a.m. on Friday called “Counting and summing with SQL” presented by Meghan Hoyer, a data journalist with the Associated Press. That talk was concerned mostly with using SQL commands to perform operations across records in tables, and used SQLite and the Firefox SQLite Manager. There, I learned a few important things about SQLite that set it apart from MySQL: an entire SQLite database is contained in a single file, which makes it easier to share if you’re working on a project where multiple people need to touch and modify the data at the same time or if you’d like to hand the data off to someone else or just keep a portable copy with you. It also doesn’t require you to spin up a server like MySQL does, which makes it extremely lightweight. There are some other key differences between the database management systems (which I highly recommend you read about here if you’re interested), including many that weren’t included in either session, but the exposure and hands-on experience with the tools were what really gave me the kick I needed to look into it more and make decisions for myself about which systems suited my requirements best. Relational databases are by no means a revolutionary or novel idea, but they are important for any data journalist to have in his or her toolkit.

About the author

Josh Shi

Student Fellow

Latest Posts

  • Introducing StorylineJS

    Today we're excited to release a new tool for storytellers.

    StorylineJS makes it easy to tell the story behind a dataset, without the need for programming or data visualization expertise. Just upload your data to Google Sheets, add two columns, and fill in the story on the rows you want to highlight. Set a few configuration options and you have an annotated chart, ready to embed on your website. (And did we mention, it looks great on phones?) As with all of our tools, simplicity...

    Continue Reading

  • Join us in October: NU hosts the Computation + Journalism 2017 symposium

    An exciting lineup of researchers, technologists and journalists will convene in October for Computation + Journalism Symposium 2017 at Northwestern University. Register now and book your hotel rooms for the event, which will take place on Friday, Oct. 13, and Saturday, Oct. 14 in Evanston, IL. Hotel room blocks near campus are filling up fast! Speakers will include: Ashwin Ram, who heads research and development for Amazon’s Alexa artificial intelligence (AI) agent, which powers the...

    Continue Reading

  • Bringing Historical Data to Census Reporter

    A Visualization and Research Review

    An Introduction Since Census Reporter’s launch in 2014, one of our most requested features has been the option to see historic census data. Journalists of all backgrounds have asked for a simplified way to get the long-term values they need from Census Reporter, whether it’s through our data section or directly from individual profile pages. Over the past few months I’ve been working to make that a reality. With invaluable feedback from many of you,......

    Continue Reading

  • How We Brought A Chatbot To Life

    Best Practice Guide

    A chatbot creates a unique user experience with many benefits. It gives the audience an opportunity to ask questions and get to know more about your organization. It allows you to collect valuable information from the audience. It can increase interaction time on your site. Bot prototype In the spring of 2017, our Knight Lab team examined the conversational user interface of Public Good Software’s chatbot, which is a chat-widget embedded within media partner sites.......

    Continue Reading

  • Stitching 360° Video

    For the time-being, footage filmed on most 360° cameras cannot be directly edited and uploaded for viewing immediately after capture. Different cameras have different methods of outputting footage, but usually each camera lens corresponds to a separate video file. These video files must be combined using “video stitching” software on a computer or phone before the video becomes one connected, viewable video. Garmin and other companies have recently demonstrated interest in creating cameras that stitch......

    Continue Reading

  • Publishing your 360° content

    Publishing can be confusing for aspiring 360° video storytellers. The lack of public information on platform viewership makes it nearly impossible to know where you can best reach your intended viewers, or even how much time and effort to devote to the creation of VR content. Numbers are hard to come by, but were more available in the beginning of 2016. At the time, most viewers encountered 360° video on Facebook. In February 2016, Facebook......

    Continue Reading

Storytelling Tools

We build easy-to-use tools that can help you tell better stories.

View More