A Northwestern University joint initiative of Medill School of Journalism, Media, Integrated Marketing Communications and the Robert R. McCormick School of Engineering & Applied Science.

twXplorer — A smarter way to search Twitter

TwXplorer, a new social-media research tool launched today by the Knight Lab, started with one journalist who told us he had a problem.

Peter Slevin, a Medill faculty member, has been working on a book about Michelle Obama. As part of that work, he periodically tracks her place in the “global conversation” by searching Twitter for references to the first lady.

What he gets back: a long list of tweets mentioning Michelle Obama. He can do little more than scroll through them, jot down notes about what he finds and tweak his search terms. As a way of keeping tabs on what people are saying about the first lady, it isn’t very effective — or efficient.

“The problem when searching Twitter for a very common term such as ‘Michelle Obama’ is that there are few if any filters on what you get back,” Slevin said. “Especially in breaking news situations, you find yourself scrolling through a very long and undifferentiated list of tweets.”

Slevin’s problem inspired the first iteration of twXplorer this spring, the work of a team of journalism and computer science students in a class led by me and Larry Birnbaum of the Knight Lab faculty.  Over the summer, the Lab’s staff built the production version.

TwXplorer adds value to Twitter searches in several ways. Here are the key things you can do with it:

See search results four ways

For any search terms you enter into twXplorer, you get four different ways to see your search results:

  1. Up to 500 recent tweets containing the terms you entered.
  2. A bar graph showing the most popular other words that appear in tweets containing your search terms.
  3. The most popular hashtags included in tweets containing your search terms.
  4. The most popular links in tweets containing your search terms.
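
As a rough illustration of those four views, the sketch below computes them from a list of tweet dictionaries. The field names (`text`, `hashtags`, `urls`) and the function `four_views` are assumptions made for this example, not twXplorer's actual data model, and the word count is a plain lowercase split without the stop-word and stemming steps twXplorer applies.

```python
from collections import Counter

def four_views(tweets, query, limit=500, top_n=10):
    """Build four result views from a list of tweet dicts.

    Each tweet is assumed (hypothetically) to look like:
    {"text": str, "hashtags": [str], "urls": [str]}
    """
    recent = tweets[:limit]                   # view 1: up to `limit` tweets
    query_words = set(query.lower().split())  # excluded from the "other words" view
    words, hashtags, links = Counter(), Counter(), Counter()
    for tweet in recent:
        # Sets are used so each tweet adds at most one count per item.
        words.update(set(tweet["text"].lower().split()) - query_words)
        hashtags.update({tag.lower() for tag in tweet["hashtags"]})
        links.update(set(tweet["urls"]))
    return {
        "tweets": recent,
        "terms": words.most_common(top_n),        # view 2: data for the bar graph
        "hashtags": hashtags.most_common(top_n),  # view 3
        "links": links.most_common(top_n),        # view 4
    }

if __name__ == "__main__":
    sample = [
        {"text": "Bike sharing launches in New York",
         "hashtags": ["bikenyc"], "urls": ["http://example.com/a"]},
        {"text": "New York bike share helps tourists",
         "hashtags": ["bikenyc", "nyc"], "urls": ["http://example.com/a"]},
    ]
    for name, view in four_views(sample, "new york").items():
        print(name, view)
```

Counting each item once per tweet keeps a single repetitive tweet from dominating a view.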

Understand what Twitter users are talking about

At its core, twXplorer is a tool for searching Twitter in order to understand the global conversation about a topic. TwXplorer improves on Twitter search (and other search tools such as Topsy) by displaying a bar graph of the words and phrases that appear most often in tweets alongside your search terms. In counting the most popular terms, twXplorer groups together any terms that share a word stem; for instance, “president” and “presidency” are combined. Add it all up and you get a good visual overview of what people are saying about your topic.
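
To make the stem grouping concrete, here is a minimal sketch that uses a toy suffix stripper. The article does not say which stemming algorithm twXplorer uses, so `crude_stem`, `group_by_stem` and the suffix list are all assumptions for illustration; a production tool would more likely rely on an established stemmer.

```python
from collections import defaultdict

# Toy suffix list, checked in this order; an assumption for illustration only.
SUFFIXES = ("ency", "ing", "ent", "s")

def crude_stem(word):
    """Strip at most one known suffix, keeping at least three letters of stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def group_by_stem(words):
    """Map each stem to the sorted list of distinct word forms that share it."""
    groups = defaultdict(set)
    for word in words:
        groups[crude_stem(word)].add(word.lower())
    return {stem: sorted(forms) for stem, forms in groups.items()}

if __name__ == "__main__":
    print(group_by_stem(["president", "presidency", "look", "looks", "looking"]))
```

With this toy rule set, “president” and “presidency” both reduce to the stem “presid” and would be counted as one term.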

Discover unexpected, but relevant content

The links section of twXplorer is a great way to find news coverage and commentary related to your search terms. For instance, the class team at one point searched for “New York bike sharing.” They discovered articles about how useful the city’s new bike-sharing program would be for tourists — a topic they hadn’t thought about.

Find good hashtags to follow or add to your tweets

The list of most popular hashtags could be useful to you in at least a couple of ways. First, you might discover a hashtag that you’d like to keep track of regularly. Second, you might find a good hashtag or two to add to a planned tweet of your own to maximize its audience.

Refine your search through a “drill-down” approach

It’s well established that most people who use a search engine type in only a word or two, even though longer queries often produce more relevant results. TwXplorer helps you refine your search with a simple approach: click on a term or hashtag on the search results page and you see information only for the subset of tweets that include that term. I, for one, would love to see a similar feature on other kinds of search tools, even Google’s web search.
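
The drill-down idea can be sketched as a filter that keeps only the tweets containing every term clicked so far. The names `terms_in` and `drill_down` are hypothetical, and the matching here is on exact lowercase words without stemming, which is a simplification rather than a description of twXplorer's code.

```python
import string

def terms_in(text):
    """Lowercased words of a tweet, with surrounding punctuation ('#', '@', etc.) stripped."""
    return {word.strip(string.punctuation).lower() for word in text.split()}

def drill_down(tweets, selected_terms):
    """Return the tweets that contain every one of the selected terms."""
    wanted = {term.lower() for term in selected_terms}
    return [text for text in tweets if wanted <= terms_in(text)]

if __name__ == "__main__":
    tweets = [
        "Chicago Bulls win again",
        "Chicago weather is cold",
        "#Bulls trade rumors",
    ]
    print(drill_down(tweets, ["chicago"]))
    print(drill_down(tweets, ["chicago", "bulls"]))
```

Each additional clicked term can only shrink the result set, since a tweet must contain all selected terms to remain.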

Save your searches

Journalists and others are often interested in understanding what’s being said on Twitter at different points in time. Slevin, for instance, wanted to be able to go back and see what Twitter users had been saying about Michelle Obama. TwXplorer allows you to save your search, capturing all four views of your search results, any time you want to. Then you can go back and explore those results, drilling down to refine your searches. You can also delete saved searches that are no longer useful to you.

See hot topics on your Twitter lists

Once we realized how useful twXplorer could be, we looked for other ways to apply its technology within the limits of Twitter’s Application Programming Interface. We discovered that we could apply the basic twXplorer search approach not only to recent tweets, but to the latest tweets being posted by members of any Twitter lists you have created or subscribed to. If you use lists to collect tweets from Twitter users you are interested in, this feature can be incredibly helpful. For instance, I have created a “hackshackers” list consisting of journalists who do computer programming and data analysis. But I scan tweets from this list only rarely. Using twXplorer, at any given time, I can see what the hot topics are among these Twitter users.

* * *

The first version of twXplorer was built by two undergraduate computer science students (Jeanette Huang and Allen Zeng) and journalism master’s student Miguel Huerta.  They formed one of eight teams in the most recent of our “collaborative innovation” classes, which are a great platform to test new software ideas, to generate creative solutions and to involve students in the development of new tools for journalists, publishers and media consumers.

“It was one of the most successful projects in the spring class,” Birnbaum said. “We could immediately see the value for journalists and other users, and it was clear that the Knight Lab could launch it for public use in a reasonable time frame.”

Working with Larry and me, Knight Lab developer Jennifer Wilson built the new version of twXplorer this summer, with help from the Lab’s design/research fellow Jessica Soberman (MSJ 2013) and art director Aaron Salmon.

You might be interested in knowing a little more about how twXplorer works. Here’s an overview:

* OAuth sign-in: Before using twXplorer, you sign in with your Twitter ID. This has several implications. First, it means Twitter will identify your search request as coming from you rather than twXplorer as a website, which means many people can use twXplorer concurrently without fear of running up against the limits of Twitter’s API. Second, it means we can access your Twitter lists for the twXplorer lists feature. And finally, it means we can associate saved searches with your Twitter account rather than making you create a separate twXplorer login.

* Search by language: By default, the twXplorer search looks for tweets in the language associated with your Twitter profile. But you can search for tweets in 12 languages. Twitter offers “best-effort” language detection, which is not perfect but can help you find tweets written in those languages.

* Find up to 500 tweets: To provide a relatively swift response and comply with Twitter’s API limits, twXplorer retrieves up to 500 of the most recent tweets that include your search terms, then hides those that Twitter codes as retweets (“new style” retweets, as opposed to those where the content is preceded by RT). If twXplorer retrieves the full 500 and reports finding 400 tweets, it means it found 400 unique tweets, which you can scroll through, and 100 “new style” retweets.

* Zero in on the most relevant terms: TwXplorer excludes common words like “the.” Then it looks not only for single words, but also for user mentions (such as @KnightLab) and hashtag text (#chicago counts as “chicago”). It also looks for “bigrams” (two-word phrases) that show up more than once. If a bigram is common (say, “white house”), twXplorer doesn’t count its two words again as separate terms. TwXplorer also groups terms together when they have a common stem (“look,” “looks” and “looking” are counted as the same term).

* Count the terms: The number that appears next to any term, hashtag or link is the number of tweets that include that term. The counts include terms used in retweets, although the retweets are not all displayed.

* Drill down: When you click on any term, hashtag or link in your search result, twXplorer returns only the subset of search results containing the term you clicked on. If you click on a second term, the subset of tweets is narrowed even further. For instance, if you filter separately by “chicago” and “bulls,” twXplorer will display search results only for tweets containing both terms.
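
The retweet handling and term counting described in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not twXplorer's implementation: the `is_retweet` field, the tiny stop-word list, the rule that a “common” bigram is one found in more than one tweet, and all function names are hypothetical, and stemming is left out to keep the example short.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the real list is not published here).
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "at", "is", "and", "from"}

def tokenize(text):
    """Lowercase words, @mentions and hashtag text ('#chicago' becomes 'chicago')."""
    text = re.sub(r"https?://\S+", "", text.lower())  # drop links before splitting
    tokens = [t.lstrip("#") for t in re.findall(r"[@#]?\w+", text)]
    return [t for t in tokens if t not in STOP_WORDS]

def split_retweets(tweets):
    """Separate displayable originals from tweets flagged as retweets."""
    originals = [t for t in tweets if not t["is_retweet"]]
    retweets = [t for t in tweets if t["is_retweet"]]
    return originals, retweets

def bigrams(tokens):
    """Adjacent token pairs, in order of appearance."""
    return list(zip(tokens, tokens[1:]))

def count_terms(tweets):
    """For each term, count the number of tweets (retweets included) that contain it."""
    token_lists = [tokenize(t["text"]) for t in tweets]
    # First pass: find bigrams that occur in more than one tweet.
    bigram_counts = Counter()
    for tokens in token_lists:
        bigram_counts.update(set(bigrams(tokens)))
    common = {pair for pair, n in bigram_counts.items() if n > 1}
    # Second pass: count each common bigram as a phrase and drop its
    # member words from that tweet's single-word terms.
    counts = Counter()
    for tokens in token_lists:
        terms = set(tokens)
        for pair in set(bigrams(tokens)):
            if pair in common:
                terms -= set(pair)
                terms.add(" ".join(pair))
        counts.update(terms)
    return counts

if __name__ == "__main__":
    tweets = [
        {"text": "Statement from the White House #chicago", "is_retweet": False},
        {"text": "White House press briefing", "is_retweet": True},
        {"text": "@KnightLab visits Chicago", "is_retweet": False},
    ]
    originals, retweets = split_retweets(tweets)
    print(len(originals), "displayed;", len(retweets), "hidden retweets")
    print(count_terms(tweets).most_common())
```

In this sketch the counts are computed over all retrieved tweets while only the originals are displayed, which mirrors the 400-displayed, 100-hidden arithmetic in the example above.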


Northwestern University Knight Lab advances news media innovation through exploration, experimentation and education. The Lab's free publishing tools help to make information more meaningful and promote quality storytelling on the Internet.

About the author

Posted on September 16, 2013, by

Rich Gordon

Professor and Director of Digital Innovation, Medill School of Journalism
Journalism/tech intersection, my passion for 25 years: data journalism, Miami Herald web director, now hacker journalism. #iloveinnovation
  • Congrats to the entire team! This is an incredible tool and I can already see it improving how I do searches and devise strategies #iloveinnovationtoo