Accessing Twitter data: the firehose

I’ve been using the Twitter API to search tweets, which has been good for ad hoc queries and low-volume analysis, returning hundreds to low thousands of tweets. I’d heard about the Twitter ‘firehose’ and the potential to access all tweets, and thought I’d dig a bit deeper.

Twitter provide three forms of access (see BrightPlanet):

  1. Twitter’s Search API (pull): limited to the most recent 3200 tweets for a given user, only accesses the last 5000 tweets for a keyword, and the number of requests allowed in a given time period is limited
  2. Twitter’s Streaming API (push): users register a set of criteria (keywords, usernames, locations, named places, etc.) and as tweets match the criteria, they are pushed directly to the user. Provides a sample of tweets – anywhere between 1% and 40% of available tweets
  3. Twitter’s Firehose: guaranteed to provide 100% of tweets that match your search criteria
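To make the pull/push distinction concrete, here is a toy sketch in Python. It does not call Twitter at all: the tweet records, the function names (search_pull, stream_push, matches) and the simple substring matching are all assumptions made up for illustration, not Twitter's actual API or matching rules.

```python
def matches(tweet, keywords):
    """Return True if any keyword appears in the tweet text (case-insensitive)."""
    text = tweet["text"].lower()
    return any(keyword.lower() in text for keyword in keywords)


def search_pull(archive, keyword, max_results):
    """Pull model: the client asks once and gets the newest matches, up to a cap."""
    hits = [tweet for tweet in reversed(archive) if matches(tweet, [keyword])]
    return hits[:max_results]


def stream_push(incoming, keywords, on_tweet):
    """Push model: criteria are registered once, then each matching tweet
    is handed to the callback as it arrives."""
    delivered = 0
    for tweet in incoming:
        if matches(tweet, keywords):
            on_tweet(tweet)
            delivered += 1
    return delivered


# Hypothetical tweets, oldest first.
tweets = [
    {"user": "a", "text": "Drinking from the firehose"},
    {"user": "b", "text": "Lunch time"},
    {"user": "c", "text": "Firehose pricing is steep"},
    {"user": "d", "text": "FIREHOSE access for academics"},
]

# Pull: one request, capped at two results, newest first.
recent = search_pull(tweets, "firehose", max_results=2)

# Push: every matching tweet is delivered to the callback.
received = []
count = stream_push(tweets, ["firehose"], received.append)
```

In the pull case the cap means older matches are silently dropped, which is the kind of limitation option 1 imposes. In the push case nothing is capped in this toy version, whereas the real Streaming API only delivers a sample.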

Option 3 sounds wonderful, so why would one ever bother with the limitations of 1 and 2? Because 1 and 2 are free and 3 incurs a substantial charge. Twitter work with four companies, which have access to the Firehose and provide that access to data users:

  1. Gnip
  2. Topsy
  3. DataSift
  4. NTT Data

Apple bought Topsy in December 2013 and Twitter acquired Gnip in April 2014, leaving DataSift and NTT Data (who are based in Japan and focus on Japanese tweet data) as the independent providers of Firehose data (The Guardian).

With access to the Firehose starting at around $500 to $3000 per month, depending on the specificity of access, it’s out of reach for all but the most well-funded academics. In early 2014 Twitter gave limited (and competitive) access to the Firehose (via Gnip) to academics (Wired, Poynter). The scheme closed on 15 March 2014 and is not currently accepting submissions. To get an idea of the ways in which Twitter data is used for research, visit the Twitter engineering site.
