The Library of Congress Has a Massive Twitter Archive

BY SHANLEY REYNOLDS
ANCHOR ZACH TOOMBS

The Library of Congress now holds every tweet posted in Twitter's first four years of existence. Yes. All of them. KTVU (http://www.ktvu.com/) and NY1 (http://www.ny1.com/) explain.

"It's taken four years to hit 21 billion tweets. But now the Library of Congress will be archiving and indexing all of them. The Library of Congress is not clear on how the ongoing archive will be used."

"Twitter is donating its entire archive to the library. The library says it has collected about 170 billion public tweets from Twitter since the first one was posted in 2006."

The project started in April 2010, when Twitter and the Library teamed up to create the archive. The Library reports it has now received all of the tweets from 2006 to 2010. In all, that's...

"...approximately 170 billion tweets totaling 133.2 terabytes for two compressed copies."

The Library of Congress might have all that information, but, Buzzfeed reports (http://www.buzzfeed.com/jwherrman/library-of-congress-falls-behind-on-twitter-archiv), it doesn't really seem to know what to do with it.

"In fact, it appears that the federal institution, famed for its preservation of physical documents, has bitten off more than it can technically chew on a big data project of immense scope and expense."

However, the Library of Congress said in its blog (http://blogs.loc.gov/loc/2010/04/how-tweet-it-is-library-acquires-entire-twitter-archive/) that this venture really shouldn't be that surprising, because the Library isn't JUST about books. It has been collecting web-based material since 2000.

(GFX_Library of Congress)
"Today we hold more than 167 terabytes of web-based information, including legal blogs, websites of candidates for national office, and websites of Members of Congress."
http://blogs.loc.gov/loc/2010/04/how-tweet-it-is-library-acquires-entire-twitter-archive/

And the reasoning for not making the completed archive accessible to the public?

Because of the volume of tweets, as of now, it could take up to 24 HOURS to search for a term in the archive, which Mashable says (http://mashable.com/2013/01/05/library-of-congress-twitter/) "isn't a workable option."

Mashable also says that once the Library of Congress does come up with a reasonable way of searching the archive, the average Joe shouldn't expect easy access.

"The Library has agreed not to make most of the archive easily downloadable on its website, and researchers must agree not to use the data in the archive for commercial purposes in order to gain access to it."