This special report by the Pew Research Center's Project for Excellence in Journalism on the social media reaction to the first 2012 presidential debate employed media research methods that combine PEJ's content analysis rules developed over more than a decade with computer coding software developed by Crimson Hexagon. This report is based on separate examinations of more than 5.8 million tweets, 6,300 blog posts and 262,000 Facebook posts.
Crimson Hexagon is a software platform that identifies statistical patterns in words used in online texts. Researchers enter key terms using Boolean search logic so the software can identify relevant material to analyze. PEJ draws its analysis samples from several million blogs, all public Twitter posts and a random sample of publicly available Facebook posts. Then a researcher trains the software to classify documents using examples from those collected posts. Finally, the software classifies the rest of the online content according to the patterns derived during the training.
According to Crimson Hexagon: "Our technology analyzes the entire social internet (blog posts, forum messages, Tweets, etc.) by identifying statistical patterns in the words used to express opinions on different topics." Information on the tool itself can be found at http://www.crimsonhexagon.com/, and the in-depth methodologies can be found at http://www.crimsonhexagon.com/products/whitepapers/.
Crimson Hexagon measures text in the aggregate, and the unit of measure is the ‘statement’ or assessment, not the post or Tweet. One post or Tweet can contain more than one statement if multiple ideas are expressed. The results are reported as a percentage of the overall conversation.
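To make the unit of measure concrete, the following minimal sketch shows how a percentage of the overall conversation could be computed when the statement, rather than the post, is the unit counted. It is an illustration only and not Crimson Hexagon's algorithm: the function name `conversation_shares`, the category labels and the sample posts are all hypothetical.

```python
from collections import Counter


def conversation_shares(statement_labels):
    """Return each category's share of all statements, in percent."""
    counts = Counter(statement_labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical data: three posts, the second of which expresses two ideas
# and therefore contributes two statements.
posts = [
    ["favors_obama"],
    ["favors_romney", "joke"],
    ["favors_romney"],
]

# Flatten posts into statements; the statement is the unit being counted.
labels = [label for post in posts for label in post]
shares = conversation_shares(labels)
```

Here three posts yield four statements, so each percentage is taken out of four rather than three.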
The time frame for the analysis runs from 9:00 p.m. EST on October 3, 2012, through 9:00 a.m. EST on October 4, 2012.
PEJ used Boolean searches to narrow the universe to relevant posts. Because the terminology users commonly post varies by platform, PEJ used slightly different search filters for each.
For blogs, PEJ used the following search filter:
(barack OR obama OR Mitt OR Romney OR Lehrer)
For Twitter and Facebook, the search filter was:
(barack OR obama OR debate OR Mitt OR Romney OR Lehrer OR moderator OR zinger OR bigbird OR "big bird" OR 2012 OR candidate OR president OR PBS)
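As a rough illustration of how an OR filter like the ones above selects posts, the sketch below keeps any text that contains at least one of the listed terms. The term list is copied from the Twitter and Facebook filter; everything else is an assumption, including the function name `matches_filter`, the case-insensitive comparison and the whole-word matching. Crimson Hexagon's actual matching rules may differ.

```python
import re

# Terms copied from the Twitter and Facebook search filter.
TERMS = [
    "barack", "obama", "debate", "Mitt", "Romney", "Lehrer",
    "moderator", "zinger", "bigbird", "big bird", "2012",
    "candidate", "president", "PBS",
]

# Assumption: terms match case-insensitively and only as whole words
# (or as a whole phrase, in the case of "big bird").
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in TERMS) + r")\b",
    re.IGNORECASE,
)


def matches_filter(text):
    """Return True if the text contains at least one search term."""
    return PATTERN.search(text) is not None


# Hypothetical posts used only to exercise the filter.
sample_posts = [
    "Romney had the better night",
    "I still love Big Bird",
    "Nice weather today",
]
relevant = [post for post in sample_posts if matches_filter(post)]
```

Under the whole-word assumption, a post containing only "mittens" would not match "Mitt". A platform that matches substrings or word stems would behave differently.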