Posts

Archive of posts with the tag

  • Five data scraping tools for would-be data journalists

    This past fall, I spent time with the NPR News Apps team (now known as NPR Visuals) coding up some projects, working mainly as a visual/interaction designer. But over the last few months, I’ve been working on a project that involves scraping newspaper articles and pulling data from Twitter APIs. I was a relative beginner with Python: I’d pair-coded a bit with others and written some basic programs, but nothing too complicated. I knew...

    Continue Reading

  • Hack or Hacker? Know when it is appropriate to access data and when it is not

    Attending NICAR14 as a computer science student without a journalism background was an interesting experience, to say the least. Never have I been surrounded by so many journalists (and developers) who were so passionate about data and the tools that can help them obtain it. As the journalism and developer worlds converge and access to information becomes ever more important, the question of “when it is appropriate to access data and when is...

    Continue Reading

  • Web scrapers for journalists: Haystax and other graphical interface systems

    I’ve spent my last weeks as a Knight Lab student fellow exploring web scrapers for non-programmers through an open-source browser plugin called Haystax. As a journalism student who picked up computer science, I love scraping because it lets you create a program that acts like a reporter, tracking down the information you want from the web pages you specify. It’s a useful technique that saves journalists time spent copying and pasting data from an organization’s website, and scraping can...

    Continue Reading
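
    To make the Haystax post’s idea of a “program that acts like a reporter” concrete, here is a bare-bones scraping sketch in Python. It is not Haystax itself (which requires no code); it simply uses the requests and BeautifulSoup libraries against a placeholder URL and a placeholder table class, both of which are assumptions you would swap for a real page.

        # Minimal scraping sketch: fetch a page, pull the rows of a data table,
        # and save them to CSV. Requires the requests and beautifulsoup4 packages.
        # The URL and the "data-table" class below are placeholders.
        import csv

        import requests
        from bs4 import BeautifulSoup

        URL = "https://example.com/agency/reports"  # placeholder page with an HTML table

        response = requests.get(URL, timeout=30)
        response.raise_for_status()

        soup = BeautifulSoup(response.text, "html.parser")
        table = soup.find("table", class_="data-table")  # placeholder class name

        rows = []
        for tr in table.find_all("tr"):
            cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
            if cells:
                rows.append(cells)

        # Write everything to a CSV so it can go straight into a spreadsheet.
        with open("reports.csv", "w", newline="") as f:
            csv.writer(f).writerows(rows)

    Writing the result out as CSV keeps the data in a format any newsroom spreadsheet or analysis tool can open directly.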

subscribe via RSS