Travis Swicegood's real world data lessons from Texas Tribune

Travis Swicegood

Travis Swicegood, director of technology at Texas Tribune, spoke this week at the latest Hacks/Hackers Chicago Meet-up about the challenges of working with public data, or “real world data,” as Swicegood calls it.

There are plenty of challenges in collecting, managing and presenting data from a state the size of Texas: 26 million people, 254 counties, five major cities and a gross state economy of $1.2 trillion. Swicegood shared just a few of them:

“Data frequently, frequently, frequently disappears.”

Swicegood described “real world data” as dirty and unpredictable. Bad as those traits are, lost data is worse. He keeps a copy of every dataset he works with at the Tribune, a habit he formed after seeing data go missing from local locations as well as government websites. He also distinguished the Texas Tribune’s civic datasets from “big data.” While Tribune datasets may include millions of records, Swicegood noted that the data collections of, say, social media organizations are much larger and constantly growing.
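The habit of keeping a copy of every dataset can be sketched in a few lines of Python. This is only an illustration, not the Tribune's actual workflow: the function name archive_copy, the file name salaries.csv and the archive folder layout are all assumptions made for the example. Each copy is stored under a name that includes a hash of its contents, so a changed file on the source site never overwrites an earlier version.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def archive_copy(source: Path, archive_dir: Path) -> Path:
    """Copy a dataset into archive_dir under a content-addressed name.

    Hypothetical helper: the hash prefix means a changed file gets a
    new name instead of silently replacing the earlier copy.
    """
    digest = hashlib.sha256(source.read_bytes()).hexdigest()
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"{digest[:12]}-{source.name}"
    if not target.exists():
        shutil.copy2(source, target)
    return target


# Demonstration with a made-up file in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "salaries.csv"
    source.write_text("agency,salary\nExample Agency,50000\n")
    saved = archive_copy(source, Path(tmp) / "archive")
    saved_name = saved.name
    saved_matches = saved.read_bytes() == source.read_bytes()
```

Running the same function again on an unchanged file is a no-op, which makes it safe to call every time a dataset is downloaded.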

“Making assumptions about the data is something that can get you in lots of trouble.”

Tribune beat reporters act as domain experts for data and for in-depth “explorers.” The dedicated data staff may be best suited to manipulation and visualization, but a beat reporter can help a developer understand a dataset and explain apparent discrepancies. Time on the beat also gives reporters a sense of what makes the data really interesting.

“This is why more citizens don't grab big government datasets.”

Swicegood recounted the difficulties of collecting inconsistent school data from the 254 Texas counties for the Tribune's Public Schools Explorer, not to mention the occasional 700-column-wide CSV file and the reports from 141 agencies that had to be compiled for the Tribune's famous state employee salary database.

Swicegood also shared his data tools and techniques for working with dirty, inconsistent data. He recommended analysts capture their calculations in scripts, document their data munging and use version control so their operations can be replicated when new data is released.
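As a rough illustration of capturing calculations in a script, the Python sketch below puts every cleaning rule in one function so the same steps can be rerun when a new file arrives. The column names (agency, salary), the sample row and the cleaning rules are invented for the example and are not taken from the Tribune's code.

```python
import csv
import io


def clean_row(row):
    """Normalize one raw record. Each rule is written down here
    rather than applied by hand in a spreadsheet."""
    return {
        # Collapse stray whitespace and standardize capitalization.
        "agency": " ".join(row["agency"].split()).title(),
        # Strip currency formatting so the value can be used in math.
        "salary": float(row["salary"].replace("$", "").replace(",", "")),
    }


def clean(raw_file):
    """Apply clean_row to every record in a CSV file object."""
    return [clean_row(row) for row in csv.DictReader(raw_file)]


# Hypothetical raw input with messy spacing and currency formatting.
SAMPLE = 'agency,salary\n  texas  education agency ,"$52,000.00"\n'
rows = clean(io.StringIO(SAMPLE))
```

Kept under version control, a script like this doubles as documentation of the data munging: the commit history shows when and why each rule changed.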

"Our CEO likes to say we're The Boy Who Lived."

The Tribune launched within months of two other non-profit news organizations, Chicago News Cooperative and Bay Citizen. Four years later Texas Tribune is the only one still running as originally envisioned — Chicago News Coop shut down in 2012, and Bay Citizen has partnered and rebranded a few times. Swicegood said the Tribune’s business model and focus on Texas (a state with plenty of wealthy donors) have helped it succeed.

“I like to say we're a technology company that produces a journalism-based product.”

Texas Tribune has always been an online-only publication. Swicegood’s slowly winning converts to his position … to the chagrin of some in the newsroom, he said.
