
Archive of posts with the tag

  • Five data scraping tools for would-be data journalists

    This past fall, I spent time with the NPR News Apps team (now known as NPR Visuals) coding up some projects, working mainly as a visual/interaction designer. But in the last few months, I’ve been working on a project that involves scraping newspaper articles and querying Twitter APIs for data. I was a relative beginner with Python: I’d pair-coded a bit with others and made some basic programs, but nothing too complicated. I knew...


  • Web scrapers for journalists: Haystax and other graphical interface systems

    I’ve spent my last few weeks as a Knight Lab student fellow exploring web scrapers for non-programmers through an open source browser plugin called Haystax. As a journalism student who picked up computer science, I love scraping because you create a program that acts like a reporter, tracking the information you want from web pages you specify. It’s a useful technique that saves journalists the time they would otherwise spend copying and pasting data from an organization’s website, and scraping can...

