MozFest 2013: Journalists should command the command line

Journalists who want to learn more technology often jump into HTML, CSS and JavaScript. Those are great places to start (as Knight Lab and others have written before), but if you want to maximize the potential of your computer, one of the first things you should learn is the command line!

Some quick background: Regular computer users access the computer via a graphical user interface (GUI). This interface allows you to interact with the machine using a mouse and images on your screen to make the computer do what you want.

But you can also navigate and control your computer by using text-based commands on something called the command line. By firing up a program called Terminal (at least on Mac), you can enter text commands and navigate through your computer.

Noah Veltman, a 2013 Knight-Mozilla Fellow at BBC News, led a session at MozFest called “Solve A Murder Mystery on the Command Line.” The game involved a folder of .txt files with thousands of lines. To solve the murder mystery, we had to search through the files using the command line to pull out specific phrases.

After channeling my inner detective during the session, it occurred to me that I didn’t know how to use the command line until a couple months after I started programming. Once I started using the command line, I understood my machine much more and learned programming more quickly.

I talked to Noah after the session to gather a couple more ideas on why journalists should learn the command line.

It makes it easier to work with text data


When it comes to working with text data, especially large and messy files, command line tools are your best friends. Datasets might come to you as .csv files from a government agency, or as .txt files from local companies. The command line helps you navigate these files and extract useful and interesting information quickly.

You might wonder: why not just open Excel or Access to look at the data? While Excel works great with a 50MB file, the software isn't nearly as nimble with a 5GB file. By way of example, Illinois School Board School Scorecard files include more than 10,000 columns and precious little metadata to explain what’s contained in each one. Files this big and complicated are hard to comprehend, which makes analyzing the data a challenge.

With the command line, a few basic commands will help you analyze text data:

cat filename
The cat command lets you read a file: it prints the contents of a text file in the terminal. Type the word cat followed by a space and the name of the file.

grep option pattern filename
The grep command searches text for specific patterns. For example, to search for the word “corgi” in a file called corgi.txt, I would enter the command:
grep "corgi" corgi.txt

head option filename
The head command prints the first few lines of a file (10 by default). For example, to look at the first 20 lines of a file, I would enter the command:
head -n 20 corgi.txt
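
To try these three commands without a real dataset, here is a minimal sketch you could paste into Terminal. The file contents are made up for illustration, and the -c option to grep (which counts matching lines instead of printing them) is an extra not covered above.

```shell
# Create a small, made-up sample file so the commands have something to read.
# (The contents are hypothetical, for illustration only.)
printf 'The corgi ran home.\nA cat slept.\nAnother corgi barked.\n' > corgi.txt

# Print the whole file.
cat corgi.txt

# Print only the lines that contain "corgi".
grep "corgi" corgi.txt

# Count the matching lines instead of printing them.
match_count=$(grep -c "corgi" corgi.txt)
echo "$match_count"

# Show just the first 2 lines of the file.
first_lines=$(head -n 2 corgi.txt)
echo "$first_lines"
```

On a real dataset you would skip the printf step and point the commands at your own file.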

It lets you use tools built just for journalists


Tons of tools built for journalists involve using the command line, if only during the installation phase. You’ll often hear that “software package X is exactly what you need” for a particular task, but then you wind up looking at a GitHub repo where you learn that installation of that software involves various command line operations.

csvkit, for example, converts files to .csv and helps you to clean up and standardize data. Using csvkit and the command line, you can filter a .csv down to a subset of columns, search and filter rows, join various .csv files and more. These tasks are done with simple commands like in2csv, csvcut and csvgrep. As opposed to copying and pasting from one Excel file to another, these short commands allow you to clean and organize your data in a matter of seconds.
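
As a rough sketch of what csvcut and csvgrep look like in practice, the snippet below builds a tiny, invented schools.csv and runs both against it. The file name, column names and values are all hypothetical. Because csvkit has to be installed separately (typically with pip install csvkit), the csvkit commands only run if they are found on your machine.

```shell
# Hypothetical sample data, invented for illustration.
printf 'name,city,enrollment\nLincoln,Chicago,500\nWashington,Evanston,300\nJefferson,Chicago,450\n' > schools.csv

# csvkit is a separate install, so only run its commands if it is available.
if command -v csvcut >/dev/null 2>&1; then
    # Keep only the name and city columns.
    csvcut -c name,city schools.csv

    # Keep only the rows whose city column matches "Chicago".
    chicago_rows=$(csvgrep -c city -m Chicago schools.csv)
    echo "$chicago_rows"
fi
```

Here -c names the column or columns to work on, and -m gives the text that csvgrep should match in that column.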

It’s a gateway to full scripting languages


Noah referred to command line tools as “a gateway drug” to actually learning a full scripting language like Python. Once you begin to feel comfortable with a few basic commands, you can begin to combine them in all sorts of ways to get what you need even if you cannot write a custom data processing script.
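
One way to picture combining commands: the pipe character (|) sends the output of one command into the next. The sketch below is only an illustration. The file statements.txt and its contents are made up, and sort and uniq are standard tools not covered above.

```shell
# Hypothetical sample data: one made-up witness statement per line.
printf 'Witness saw a corgi\nWitness saw a cat\nWitness saw a corgi\nNo witness\n' > statements.txt

# Find the lines mentioning "corgi", then keep only the first match.
first_match=$(grep "corgi" statements.txt | head -n 1)
echo "$first_match"

# Count how often each distinct line appears and show the most frequent one.
# sort groups identical lines, uniq -c counts them, sort -rn ranks by count.
top_line=$(sort statements.txt | uniq -c | sort -rn | head -n 1)
echo "$top_line"
```

Each command does one small job, and the pipe lets you chain those jobs together without writing a script.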

There are plenty of reasons to learn the command line. Hopefully your curiosity is sparked. If it is, click the links below to learn more:

http://cli.learncodethehardway.org/book/
http://www.youtube.com/watch?v=Fzn6jbaw6O0&feature=related
http://www.linuxjournal.com/content/downloading-entire-web-site-wget
http://csvkit.readthedocs.org/en/latest/

About the author

KK Rebecca Lai

Undergraduate Fellow
