New Media Index Methodology

Overview

The New Media Index is a weekly report that captures the news agenda of social media, with a focus on blogs, Twitter and YouTube. These platforms are an important part of today's news information narrative and shape the way Americans interact with the news. The expansion of online blogs and other social media sites has allowed news consumers and others outside the mainstream press to have more of a role in agenda setting, dissemination and interpretation. Through this New Media Index, PEJ aims to find out what subjects in the national news the online sites focus on, and how that compares with the narrative in the traditional press.

Social media and the technologies available to help track it are also continuously changing and evolving. To stay current and reflect the social media conversation as it relates to news, PEJ constantly re-evaluates its methods of tracking and analysis.

After studying new tracking options, as well as noting adjustments made by some of the web tracking sites PEJ has used to gather posts and tweets, the Project made changes to the methodology beginning in August 2011 to both update and diversify the organizations it uses to gather and sort the top news stories each day.

From the time PEJ began monitoring social media in January 2009, it had relied on the tracking site Icerocket to determine the most linked-to stories on blogs and on Tweetmeme for the most linked-to stories on Twitter. For a time, PEJ also compiled blog material from a news tracking list on Technorati, but that tracking was suspended.

The new method continues to use links as a proxy for measuring what social media are discussing, but allows for a wider range of sources and expands the possible types of subjects that appear among the most discussed. While the old method relied on one source for each type of social media, the new method relies on four in all (two for blogs and two for Twitter), making the selection criteria more robust and the list more diverse.

Universe and Calculations

For blogs, both Technorati and Icerocket provide lists of the most-discussed stories at any given moment. Those will be monitored daily and a list for the week compiled. For the discussion on Twitter, two tracking sites that list the top news subjects each day will be used: Tweetmeme and Twitturly.

The new method differs from the previous method in two primary ways. In the previous method, the top stories were reported in terms of the percent of links. Now, since the new tracking lists do not offer the precise number of links going to each URL, the ranking will be determined by the number of times each subject appears in the daily list each week. (See below for a specific description of how the current list is calculated.)

As a result of these changes, statistical comparisons between reports issued before August 2011 and those after are difficult to make. The earlier method included a percentage of links for each top story, but the new method instead offers a simple ranking of stories in order. This new system of ranking is stronger and more intuitive because offering percentages for online content is problematic. The internet is ever-changing and growing, and there is no constant baseline or denominator to calculate percentages from.

Every weekday at 9 a.m. EST, a PEJ researcher captures the lists from each of the four tracking web sites and records the five top stories on each. Each site uses a different method to create its list.

For the top stories on blogs:

  • Technorati indexes more than a million English-language blogs. The site uses its own algorithm to determine the daily list of the “Hottest Blogosphere Items,” which tracks the number of blogs linking to a given story, along with the authority and influence of those blogs. Technorati does not disclose the details of its algorithm, but there are two reasons why PEJ believes the site is accurate and reliable. One, the stability of Technorati and the acceptance of its measures as an industry standard make the site a frequently used resource. Two, an examination of the top stories on Technorati over several months shows that its lists comport well with those of other aggregator sites.
  • Icerocket’s list of “Top Blog Posts” aggregates the top stories discussed in the blogosphere at any given time. Based on the tracking of more than 3,000 blog posts a day, Icerocket’s algorithm incorporates the number of blogs linking to a specific article along with the “rank” or popularity of different blogs. In the previous methodology, the NMI used the page that tracked the most popular “news” stories. That list, however, has become limited to more traditional news sites. Thus, PEJ has switched to a broader Icerocket list, its Top Blog Posts page, to allow for a wider range of sources.

Each of the top stories (10 combined from the two sites each weekday) is coded by PEJ staff for its primary storyline or focus. At the end of the week, researchers count the number of times each storyline appeared out of the 50 stories and determine the ranking of subjects based on those frequencies. If two or more stories appeared the same number of times, the final ranking is determined by factoring in how highly the tied stories appeared throughout the week.
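The weekly tally described above can be sketched in a few lines of Python. This is only an illustration, not PEJ's actual procedure: the function name, the sample storylines, and the data layout are hypothetical, and because the text does not spell out how ties are "factored in," the sketch assumes that a tie is broken by the average position a storyline held on the daily lists (a lower average means it appeared higher).

```python
from collections import defaultdict

def rank_storylines(captures):
    """Rank storylines from a week of captured list entries.

    captures: iterable of (position, storyline) pairs, where position 1
    is the top of a daily list. Both names are illustrative.
    """
    counts = defaultdict(int)         # how often each storyline appeared
    position_sum = defaultdict(int)   # running total of its list positions
    for position, storyline in captures:
        counts[storyline] += 1
        position_sum[storyline] += position

    # Primary key: frequency, highest first.
    # Tie-break (an assumption): average list position, lowest first.
    # Final key: the name, so the ordering is deterministic.
    return sorted(
        counts,
        key=lambda s: (-counts[s], position_sum[s] / counts[s], s),
    )

# A small, made-up sample (a real week would hold 50 entries).
sample = [
    (1, "economy"), (2, "election"), (3, "economy"),
    (1, "election"), (2, "economy"), (4, "weather"), (1, "weather"),
]
print(rank_storylines(sample))
```

In this sample, "economy" appears three times and ranks first; "election" and "weather" each appear twice, and "election" ranks higher under the assumed tie-break because its average position is closer to the top.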

For the top stories on Twitter:

  • Tweetmeme tracks all the links from public tweets and lists which stories were linked to most often over the previous 24 hours. As with Icerocket, PEJ has changed the page on Tweetmeme that is followed. Previously, the NMI included the “News” page. Currently, the NMI follows the main “Everything” page so that more sources and topics can be measured.
  • Twitturly uses a similar methodology. The site tracks every time someone tweets a URL and ranks the URLs that get the most links over the previous 24 hours. PEJ uses Twitturly’s “News” page, which includes any type of URL except for pictures and videos.

The method for determining the top subjects on Twitter is the same as is used for blogs.

For all the sites captured, only stories written in English are included in the sample. Links to pictures on yfrog and videos on Twitcasting are also excluded.

Differences from the NCI

In addition to the base calculation, there are three differences between the NMI and the NCI to note: 

1. While the capture times for the Web sites included in the News Coverage Index rotate each day, PEJ decided to keep the times the same for the New Media Index. The reasoning is that since these lists compile the number of links to stories over a 24- or 48-hour window, rotating the time of capture would result in uneven intervals of time between captures. Through testing, PEJ has discovered that the stories on the lists change significantly more over a 24-hour period than they do over a 12- or 16-hour period.

2. While the News Coverage Index consists primarily of U.S.-based media outlets, the aggregators of blogs and other social media include both U.S. and non-U.S. blogs. In addition, stories that are linked to can be from non-U.S. sources.

3. PEJ’s weekly News Coverage Index includes Sunday newspapers while the New Media Index is Monday through Friday.
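The point about capture times in the first item can be illustrated with some simple arithmetic. The sketch below is only an illustration: the rotating schedule (alternating 9 a.m. and 4 p.m.) is hypothetical and is not the News Coverage Index's actual rotation, and the start date is arbitrary.

```python
from datetime import datetime, timedelta

def capture_gaps(hours):
    """Return the hours elapsed between captures on consecutive days.

    hours: the hour of day at which each day's capture is taken.
    """
    start = datetime(2011, 8, 1)  # arbitrary start date
    stamps = [start + timedelta(days=i, hours=h) for i, h in enumerate(hours)]
    return [(b - a).total_seconds() / 3600 for a, b in zip(stamps, stamps[1:])]

fixed = capture_gaps([9, 9, 9, 9, 9])       # same time every weekday
rotating = capture_gaps([9, 16, 9, 16, 9])  # hypothetical rotation

print(fixed)     # every gap is 24 hours
print(rotating)  # gaps alternate between 31 and 17 hours
```

With a fixed capture time, every list is compared with one taken exactly 24 hours earlier. Under the hypothetical rotation, some captures would fall 17 hours apart and others 31, so lists built on a 24-hour window would overlap by differing amounts from day to day.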

YouTube Videos

The New Media Index also includes a section on the most popular news videos on YouTube each week.

Each Friday at noon EST, a PEJ staff member captures the list of most viewed news and politics videos on YouTube over the previous week. These videos are categorized as such on the YouTube site and are often a mix of mainstream news reports, raw footage relating to breaking events, or other types of public affairs clips. PEJ determines the top five most viewed videos as they are listed on YouTube’s page at the time of capture.

Note: After consulting various reference guides and outside consultants on usage, the Project has chosen to refer to its several weekly content analysis reports as “indexes”—the version largely accepted in journalism—instead of “indices”—a term used more frequently in scientific or academic writing.