Facebook IPO Not Selling on Social Media Methodology
This special report by the Pew Research Center's Project for Excellence in Journalism on the news coverage surrounding the Facebook IPO employed media research methods that combine PEJ's content analysis rules, developed over more than a decade, with computer coding software developed by Crimson Hexagon. The report is based on separate examinations of more than 6 million tweets, 120,000 blog posts and 6,800 Facebook posts, and looks at the volume and framing of the conversation related to the company’s IPO.

Crimson Hexagon is a software platform that identifies statistical patterns in words used in online texts. Researchers enter key terms using Boolean search logic so the software can identify relevant material to analyze. PEJ draws its analysis samples from several million blogs, all public Twitter posts and a random sample of publicly available Facebook posts. A researcher then trains the software to classify documents using examples from those collected posts. Finally, the software classifies the rest of the online content according to the patterns derived during the training.

According to Crimson Hexagon: "Our technology analyzes the entire social internet (blog posts, forum messages, Tweets, etc.) by identifying statistical patterns in the words used to express opinions on different topics." Information on the tool itself can be found at http://www.crimsonhexagon.com/ and the in-depth methodologies can be found at http://www.crimsonhexagon.com/products/whitepapers/.

Crimson Hexagon measures text in the aggregate, and the unit of measure is the "statement" or assessment, not the post or Tweet. One post or Tweet can contain more than one statement if multiple ideas are expressed. The results are reported as a percentage of the overall conversation.

The time frame for the analysis is May 14-20, 2012. PEJ used Boolean searches to narrow the universe to relevant posts.
Because “Facebook” and “FB” are common terms used on social media to discuss things other than the company’s IPO, PEJ added to each search a set of exclusion terms that appeared in numerous irrelevant messages, which sped up the software’s training in identifying off-topic posts. Common terminology posted by users varies by platform, so PEJ used slightly different search filters for each.

For blogs, PEJ used the following search filter:

(Zuckerberg OR Zuckerburg OR Facebook OR FB OR IPO OR Saverin) AND -("fb friends" OR "Like us on" OR "Like our facebook" OR "Morning FB" OR "like our page" OR "Like us on" OR "notify you via" OR "check us out" OR "please share it with" OR "join us on" OR "FB link" OR "follow me on" OR "Share this on")

For Twitter, the search filter was:

(Zuckerberg OR Zuckerburg OR Facebook OR FB OR IPO OR Saverin) AND -(mother OR mothers OR "fb friends" OR birthday OR Goodnight OR Played OR "Like us on" OR "Like our facebook" OR "Morning FB" OR "like our page" OR "10 Facebook Tips" OR "#fb" OR "join us on")

For posts on Facebook, the search filter was:

(Zuckerberg OR Zuckerburg OR IPO OR Saverin) AND -(mother OR mothers OR "fb friends" OR birthday OR Goodnight OR Played OR "Like us on" OR "Like our facebook" OR "Morning FB" OR "like our page")

Note: During the Crimson Hexagon (CH) training process, the program learns to identify relevant posts and to exclude unrelated messages from the results. For this project, only posts about Facebook’s IPO, the growth of Facebook, the economic impact of Facebook, Mark Zuckerberg or Eduardo Saverin were included. All other messages, including those about an individual’s personal use of Facebook, were excluded.
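To illustrate the logic of the search filters above, the following is a minimal Python sketch of the Facebook-post filter: a post passes if it contains any of the inclusion terms and none of the exclusion terms. The term lists are copied from the filter quoted above. The function name `passes_filter` and the matching rules (case-insensitive, whole-word or whole-phrase matching) are assumptions made for illustration; the report does not describe how Crimson Hexagon itself tokenizes or matches text.

```python
import re

# Terms copied from the Facebook-post filter quoted above.
INCLUDE_TERMS = ["Zuckerberg", "Zuckerburg", "IPO", "Saverin"]
EXCLUDE_TERMS = [
    "mother", "mothers", "fb friends", "birthday", "Goodnight", "Played",
    "Like us on", "Like our facebook", "Morning FB", "like our page",
]


def _compile(terms):
    # Assumption: case-insensitive matching on whole words or whole phrases.
    # The actual matching rules of the Crimson Hexagon platform are not
    # documented in this report.
    alternatives = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


INCLUDE_RE = _compile(INCLUDE_TERMS)
EXCLUDE_RE = _compile(EXCLUDE_TERMS)


def passes_filter(text: str) -> bool:
    """Return True for (any inclusion term) AND NOT (any exclusion term)."""
    return bool(INCLUDE_RE.search(text)) and not EXCLUDE_RE.search(text)


if __name__ == "__main__":
    samples = [
        "Zuckerberg rings the bell as the IPO opens",
        "Happy birthday! Hope the IPO makes you rich",
        "Bought a new tripod for the trip",
    ]
    for sample in samples:
        print(passes_filter(sample), "-", sample)
```

A filter of this kind only narrows the pool of candidate posts. As the note above explains, deciding which of the remaining posts are actually about the IPO is left to the trained classifier.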