Library of Congress almost done archiving 170 billion tweets

Published time: January 07, 2013 18:08
Edited time: January 07, 2013 22:08
(AFP Photo / Lionel Bonaventure)

Don’t let that dwindling number of Twitter followers drag you down: users of the successful social networking service are about to have a huge new audience as the Library of Congress nears archiving every public tweet ever sent.

In a new statement released by the largest library in the world, Washington’s premier research center says it is almost done with the first steps in a project involving a massive trove of micromessages sent over Twitter, going all the way back to when the site first got off the ground in 2006.

In April 2010, Twitter announced that every public tweet published since its inception would be added to the Library of Congress so that the United States’ top researchers could have access to a then-untapped form of correspondence that was thought to be on its way to becoming as commonplace as snail mail. Today, the Library says it has almost reached that goal.

“The Library’s first objectives were to acquire and preserve the 2006-10 archive; to establish a secure, sustainable process for receiving and preserving a daily, ongoing stream of tweets through the present day; and to create a structure for organizing the entire archive by date. This month, all those objectives will be completed,” the Library announced last week.

Of course, newer users of Twitter won’t be forgotten either. Once the Library secured a method of collecting all archived tweets, it couldn’t just end there. In February 2011 it began receiving “current” tweets sent after the 2010 cutoff, and as of last month it estimated that its archive holds roughly 170 billion public tweets in all.

As the social networking site keeps adding users, that figure is expected to keep growing.

“The volume of tweets the Library receives each day has grown from 140 million beginning in February, 2011 to nearly half a billion tweets each day as of October, 2012,” the library claims.

As one can imagine, such a spectacular amount of information isn’t exactly easy to make sense of. The Library says they are sitting on around 133.2 terabytes of tweets at the moment — so many messages that running a search for a single keyword can take as long as 24 hours right now.

“This is an inadequate situation in which to begin offering access to researchers, as it so severely limits the number of possible searches,” the Library explains. “The Library’s focus now is on confronting and working around the technology challenges to making the archive accessible to researchers and policymakers in a comprehensive, useful way.”

“It is clear that technology to allow for scholarship access to large data sets is lagging behind technology for creating and distributing such data. Even the private sector has not yet implemented cost-effective commercial solutions because of the complexity and resource requirements of such a task.”
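The figures the Library cites can be put in rough perspective with some back-of-the-envelope arithmetic. The sketch below is only an illustration: it assumes that a terabyte means 10^12 bytes, that the 133.2 terabytes represent a single copy of the archive, and that a 24-hour keyword search reads the whole archive exactly once. None of those assumptions comes from the Library, and the function names are invented for this example.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the article.
# Assumptions (not from the Library): 1 TB = 10**12 bytes, the 133.2 TB
# figure is one copy of the archive, and a search scans all of it once.

TWEET_COUNT = 170e9               # roughly 170 billion public tweets
ARCHIVE_BYTES = 133.2e12          # around 133.2 terabytes
SEARCH_SECONDS = 24 * 60 * 60     # a search can take as long as 24 hours


def bytes_per_tweet(archive_bytes: float, tweet_count: float) -> float:
    """Average stored size of one tweet, including any metadata."""
    return archive_bytes / tweet_count


def scan_rate_bytes_per_second(archive_bytes: float, seconds: float) -> float:
    """Read rate implied by scanning the whole archive in the given time."""
    return archive_bytes / seconds


if __name__ == "__main__":
    per_tweet = bytes_per_tweet(ARCHIVE_BYTES, TWEET_COUNT)
    rate = scan_rate_bytes_per_second(ARCHIVE_BYTES, SEARCH_SECONDS)
    print(f"Average bytes per tweet: {per_tweet:.0f}")
    print(f"Implied scan rate: {rate / 1e9:.2f} GB/s")
```

Under those assumptions the average works out to several hundred bytes per tweet, far more than 140 characters of text, which suggests that most of the stored data is metadata accompanying each message rather than the message itself.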

The Library says it is now pursuing partnerships with the private sector “to allow some limited access capability in our reading rooms,” and Gawker reports that it has already received requests from over 400 researchers who want to feast their eyes on the billions upon billions of tweets. Don’t think for a minute, though, that this means just anyone is invited over to comb through the collection. Under its contract with Twitter, the Library can only allow access to public tweets sent more than six months ago, and only to “bona fide researchers” who are prohibited from conducting commercial research at the Library.

Gnip, a Colorado-based social media enterprise company picked by Twitter to handle moving the tweets from Silicon Valley to the nation’s capital, tells Talking Points Memo that it thinks the final product will be amazing in terms of what it can do for modern researchers.

“Gnip believes Twitter represents the largest archive of human behavior to have ever existed. We’re thrilled that we’re able to partner with the Library of Congress to help make this data available to researchers. At Gnip, we believe that the value from social data is limitless and often get inquiries from academic researchers looking to analyze social data from Twitter. We’re excited by the progress the Library of Congress has made so far,” the company states.

Twitter says that they will be completely caught up on collecting all older tweets sometime during January.

Comments (12)

registered (unregistered) 09.01.2013 17:43

Of course they don't care about the average user's tweets; they are looking for the ones by 'terrorists' such as Occupy Wall Street, NRA supporters, antiwar activists, anti/pro abortion and so on...

MonsantoBioTerroristUSA (unregistered) 09.01.2013 09:28

Welcome to East Germany USA. USA Naziland is putting a science fiction film into practice. Remember the interesting science fiction film Colossus: The Forbin Project? It's about two super computers (one from the USSR and one from the U-SS-A) that take control of the world. In the film, however, the computers get rid of the politicians and, to prove to humans that they are insects, nuke a big city. These globalists are enemies of humanity. They belong in prison. The USA is NOT, I repeat, the land of the brave and the free. Never was. We now see the real face of the scum running the USA.

Waste and experiment of human behaviour (unregistered) 09.01.2013 02:45

Like any other "social" network, Twitter is an experiment and a grab of info for tyrannical governments like the US to use against their own people. It only works for people who make money out of it, or attention whores. Get a f***ing MEGAPHONE instead to let people know what you are doing every day. What a shame! Congress should be fixing the budget instead! Good thing I don't have a Twitter account. What a f***ing waste of time.
