November 9, 2011

Cain's Bad Stretch--A Campaign Coverage Update

About the Study

The findings are based on research that combines conventional content analysis conducted by human researchers with algorithmic technology developed by the company Crimson Hexagon. In this combined approach, researchers analyze media content for tone, using PEJ’s traditional rules and strict intercoder testing methods to ensure reliability and accuracy. Those researchers then train the algorithm until it can replicate the results they arrived at themselves. Because computers can code massive quantities of content in ways that replicate the human coding, the study can examine what comes close to a census of all the news media offered to Americans via RSS feeds, providing a much deeper and more powerful sense of the media in the U.S. than traditional "sampling" can give. Samples of media offer a useful proxy, but only that. Setting the elite sample against the broad spectrum of media shows how those two cuts of news media compare. To ensure that the algorithm remains accurate and current, researchers "retrain" it each week with new content and test that it continues to produce accurate results.[1]
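As a rough illustration of the kind of accuracy check described here, agreement between human and algorithmic coding can be expressed as the share of items on which the two tone codes match. The sketch below is an assumption made for illustration only: the function name `percent_agreement`, the tone labels, and the sample data are hypothetical and do not represent PEJ's or Crimson Hexagon's actual procedure or results.

```python
def percent_agreement(human_codes, machine_codes):
    """Return the fraction of items where the two sets of codes match.

    Hypothetical helper for illustration; not the study's actual method.
    """
    if len(human_codes) != len(machine_codes):
        raise ValueError("both lists must code the same items")
    if not human_codes:
        raise ValueError("at least one coded item is required")
    # Count the items where the human and machine assigned the same tone.
    matches = sum(h == m for h, m in zip(human_codes, machine_codes))
    return matches / len(human_codes)


# Made-up tone codes for four items (illustrative data only).
human = ["positive", "negative", "neutral", "negative"]
machine = ["positive", "negative", "neutral", "neutral"]

agreement = percent_agreement(human, machine)
print(f"{agreement:.0%}")  # prints "75%"
```

In this made-up example the codes match on three of four items. Simple percent agreement does not correct for matches that would occur by chance, which is one reason reliability testing often reports additional statistics.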

To arrive at the "elite" media sample, PEJ created a list of outlets that mirrored those in the Project’s weekly News Coverage Index. That sample involves media from five different sectors of national media: print, cable, broadcast, radio or audio, and online. The list is designed to include a broad range of outlets representative of the traditional or elite media universe.[2]

A number of people at the Project for Excellence in Journalism worked on this report. Associate Director Mark Jurkowitz and Director Tom Rosenstiel wrote the report. The creation of the monitors using the Crimson Hexagon software was supervised by Tricia Sartor, the manager of the weekly news index, and senior researcher Paul Hitlin. Researchers Kevin Caldwell and Nancy Vogt and content and training coordinator Mahvish Shahid Khan created and ran monitors using the computer technology. Tricia Sartor and researcher Steve Adams produced the charts. Dana Page handled the web and communications for the report.


FOOTNOTES 

[1] Extensive testing by Crimson Hexagon and PEJ has demonstrated that the tool is 97% reliable; that is, in 97% of cases analyzed, the technology’s coding has been shown to match human coding. In addition, PEJ conducted examinations of human intercoder reliability to show that the training process for complex concepts is replicable. Those tests came up with results that were within 85% of each other.

[2] Using Crimson Hexagon’s technology, which retrieves media content via RSS feeds, researchers found that five of those outlets could not be included. The radio programs of Rush Limbaugh, Ed Schultz and Sean Hannity did not have RSS feeds, though Hannity’s cable web content is coded through the FoxNews feed. Crimson Hexagon’s technology could not retrieve data from the Wall Street Journal’s RSS feeds because of a paywall, and the content from Google News, an aggregation of material produced by others, was already coded elsewhere in Crimson Hexagon’s sample.