March 13, 2014

Social, Search and Direct

Methodology

A number of people contributed to this report. Amy Mitchell, Director of Journalism Research, oversaw the research project and served as lead author of the report.

The report was written by Amy Mitchell, Associate Director Mark Jurkowitz and Research Associate Kenneth Olmstead. Graphics Director Michael Keegan and Research Analyst Katerina Eva-Matsa developed the graphics and tables. The report was number checked by Kenneth Olmstead and copy edited by Research Associate Jan Boyles.

comScore Methodology

comScore is a leading global web analytics firm that contracts with thousands of different companies, from publishers such as the Financial Times to companies like Best Buy. comScore’s metrics of online audience size and behavior are based primarily on a recruited panel of Internet users worldwide, supplemented with server-side metrics from websites tagged by comScore (referred to by comScore as “server-centric”). comScore combines these two methods to create what it calls Unified Digital Measurement™.

A “tag” (sometimes called a “tracking pixel”) is placed on content providers’ websites in comScore’s network of partner sites and clients. Each time a user visits a site that contains one of these tags, an “event” is logged on the tag host server (where comScore stores information about all of the tags in its network). An “event” in this context can be a visit to a website, the loading of a video or the delivery of an ad to the consumer. Along with the event, several other pieces of information are recorded, such as the Internet Protocol (IP) address of the computer visiting the site and thus the “tag,” the type of browser (e.g., whether the user is using Chrome or Firefox) and cookies on that user’s browser.

The worldwide panel consists of 2 million panel members in 170 countries. This study is based on the U.S. panel only, which consists of 1 million panel members. This main panel covers desktop and laptop Internet usage only. For mobile devices, comScore has a smaller, entirely separate U.S. mobile panel of 14,000 panelists, which is used for a small section of this report.

comScore recruits members for its main panel in two ways. The first is through banner ads across a variety of affiliate websites, through which users can volunteer to be on comScore’s panel. The second is through third-party application providers, where a user is offered a free piece of software in exchange for being part of comScore’s panel. Panel members download software on their computers that tracks their online visits without attaching any personally identifiable information to the traffic data, thereby ensuring anonymity.

There are two resulting panels from this process: the home panel and the work panel. The home panel consists of users who are accessing the Internet using a device in their home. The work panel consists of users accessing the Internet through a device owned by their employer. comScore then weights the panel to build a representative sample of U.S. Internet users ages two and older.

Unlike in a random-digit-dial survey, this is not a randomly selected and nationally representative group of U.S. Internet users. But comScore is one of only three or four analytics firms in the world that have a panel of this size.

Though the data are not a random sample, once the panel measurements are collected they are adjusted to reflect the demographic profile of the country in which the panelists live. These demographic measurements are generally based on national censuses, though in some cases comScore does its own demographic surveys.

For example, after it obtains traffic data from a panel, comScore analysts may discover they have a smaller percentage of males than the online population at large. They then add more results from males so that they are correctly represented. Again, these measurements are corrected to the Internet-using population, not the general population of a country.
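The adjustment described above can be illustrated with a simple post-stratification weight. This is only a minimal sketch: the report does not describe comScore’s actual procedure, and the panel composition, the target shares and the function name are hypothetical values chosen for illustration.

```python
from collections import Counter

def poststratification_weights(panel_groups, target_shares):
    """Return one weight per group so the weighted panel matches target_shares.

    Each weight is (share in the target population) / (share in the panel).
    """
    counts = Counter(panel_groups)
    total = len(panel_groups)
    return {g: target_shares[g] / (counts[g] / total) for g in counts}

# Hypothetical panel: 40% male, 60% female.
panel = ["male"] * 40 + ["female"] * 60

# Hypothetical online population: 50% male, 50% female.
weights = poststratification_weights(panel, {"male": 0.5, "female": 0.5})

# After weighting, males count for half of the panel total.
weighted_total = sum(weights[g] for g in panel)
weighted_male_share = sum(weights[g] for g in panel if g == "male") / weighted_total
```

In this sketch each male panelist counts for 1.25 panelists and each female panelist for about 0.83, which has the same effect on the totals as adding more results from the underrepresented group.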

In addition to demographic weighting, comScore attempts to ensure they are tracking the Internet usage of the correct panelists in a household with more than one panel member. comScore says that 60.7% of the devices being tracked are single-user devices (so there is no need to ascertain who is using the device while their Internet usage is being tracked).

To identify the user in any given session, comScore uses a process it refers to as “Session Assignment Technology,” one of a host of methods by which comScore can identify who is using that device.

Of the remaining 40% or so of sessions that are on multi-user devices, about half can be identified through what comScore refers to as “session markers.” These are cues such as e-mail addresses or a form in which a user indicates his or her age and gender. In these cases, comScore uses these cues to ascertain which member of the household is using the machine.

For the remaining half of multi-user machines that cannot be identified with session markers, other methods are employed to ascertain who the user is. Biometric markers can be used, for most adults have a unique style and rhythm to their typing, which comScore can identify. In other cases, comScore can identify usage patterns over time, such as time of day modeling or by identifying the types of sites that users repeatedly visit. In these ways, comScore can establish a pattern of individual users.
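The rough shares described above can be tallied as follows. This is a back-of-the-envelope sketch that assumes the 60.7% figure and the “about half” split can be applied to the same base; the text cites devices in one place and sessions in another, so the results are approximate.

```python
# Share that comScore reports as single-user (no identification needed).
single_user = 0.607

# Remaining share on multi-user devices ("40% or so" in the text).
multi_user = 1 - single_user

# "About half" of the multi-user share is resolved with session markers
# (assumed here to be exactly half for illustration).
via_session_markers = multi_user / 2

# The rest relies on other methods, such as typing rhythm or usage patterns.
via_other_methods = multi_user - via_session_markers

print(f"multi-user: {multi_user:.1%}")
print(f"session markers: {via_session_markers:.1%}")
print(f"other methods: {via_other_methods:.1%}")
```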

All of these methods are validated against a subset of U.S. panel machines, where self-identifying pop-ups occasionally ask a user “who are you?” during a browsing session.

comScore’s full methodology can be viewed here.

Email Referral Data Analysis

As mentioned above, referral data is the information that each visitor brings to a site, which tells comScore where that user came from before arriving at the site (how that user got to the site). This data is imperfect. In some cases all referral data is stripped out and, therefore, cannot be counted by comScore in any category. Referral data can also occasionally be labeled incorrectly. Email tends to be the trickiest in this regard: visitors coming from an email to a site can occasionally be labeled as coming directly to the site. The Pew Research Center asked comScore to examine the extent to which this could occur in its data.

For an email referral to be mistaken as a direct referral in the comScore system, a visitor would most likely need to be logged into their email around the time they visited the news site. Thus, comScore took three of the most trafficked news websites (CNN.com, Huffingtonpost.com and Yahoo! U.S. News), and examined all referrals that came to those sites within a 1-hour window of a user being logged into an email session in one of the five largest email providers (Gmail, Yahoo! Mail, Outlook, AOL Email, and Xfinity WebMail). Given these parameters, comScore found that the greatest possible false labeling at these sites could be as follows:

  • 5% of sessions that were directly referred to CNN occurred within or directly after an email session
  • 9% of sessions at Huffingtonpost.com
  • 9% of sessions at Yahoo U.S. News

These data illustrate that 9% of the sessions characterized as ‘direct’ on Huffingtonpost.com, for example, could in actuality have been sessions coming to Huffingtonpost.com from email. Again, this number reflects a possible mischaracterization of a session as a direct referral rather than an email referral. Carried through, that would be 5% of the 20% of sessions that are direct, so ultimately 1% of all sessions.

While this is not a perfect measure, it indicates that, for three of the biggest news sites at least, this mislabeling happens infrequently.
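The upper-bound check described above can be sketched as follows. This is an illustration under stated assumptions, not comScore’s implementation: the function names and timestamps are hypothetical, all sessions are assumed to belong to a single panelist, and a direct visit is flagged if it began during an email session or within one hour after that session ended.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=1)  # the 1-hour window described in the text

def possibly_email(direct_start, email_sessions, window=WINDOW):
    """True if a direct visit began during an email session or within
    `window` after it ended (assumed reading of the 1-hour rule)."""
    return any(start <= direct_start <= end + window
               for start, end in email_sessions)

def max_mislabeled_share(direct_starts, email_sessions):
    """Greatest possible share of direct visits that were really email referrals."""
    flagged = sum(possibly_email(t, email_sessions) for t in direct_starts)
    return flagged / len(direct_starts)

# Hypothetical data for one panelist: one email session, four direct visits.
email_sessions = [(datetime(2013, 5, 1, 9, 0), datetime(2013, 5, 1, 9, 20))]
direct_starts = [
    datetime(2013, 5, 1, 9, 10),   # during the email session
    datetime(2013, 5, 1, 10, 5),   # 45 minutes after it ended
    datetime(2013, 5, 1, 14, 0),   # several hours later
    datetime(2013, 5, 2, 8, 0),    # the next day
]
share_of_direct = max_mislabeled_share(direct_starts, email_sessions)

# Carrying a rate through to all sessions, using the figures in the text:
# 5% of direct sessions, with direct sessions making up 20% of all sessions.
share_of_all = 0.05 * 0.20
```

Because every direct visit inside the window is counted, whether or not it actually came from an email, the result is a ceiling on mislabeling rather than an estimate of it.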

Methodology for This Report

The 26 news sites studied here were pulled from three lists of top news sites averaged over April, May and June of 2013. The first was comScore’s top “general news sites” based on monthly unique audience. The second was comScore’s list of top newspaper sites, again based on monthly unique audience. The third was Facebook’s internal list, shared with Pew Research, of most-shared Facebook pages of news outlets, based on the first two weeks of May and first two weeks of June.

The result was a list of 26 news sites, 15 of which appeared on at least two of those lists. The Pew Research Center’s Journalism Project then asked comScore to provide several kinds of data for each website.

First, comScore identified monthly visitors coming to the site from four different paths: users directly typing in the URL or using a bookmark; users clicking on a search result; users following a link on Facebook; or users who came through a mix of other referral methods such as email, other social media and other news sites. These data reflect the number of visitors following each path. Some individuals could use more than one of these paths in a given month. Therefore, the percentages of visitors coming from each method do not add up to 100.
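A small example shows why the path percentages need not sum to 100. The visitor IDs and path memberships below are hypothetical; the point is only that a visitor who uses several paths is counted once in each.

```python
# Hypothetical monthly visitors to one site, grouped by referral path.
visitors_by_path = {
    "direct": {"u1", "u2", "u3"},
    "search": {"u2", "u3", "u4"},
    "facebook": {"u3", "u5"},
    "other": {"u1", "u5"},
}

# Unique visitors across all paths (each person counted once).
all_visitors = set().union(*visitors_by_path.values())

# Share of unique visitors who used each path at least once.
shares = {path: len(users) / len(all_visitors)
          for path, users in visitors_by_path.items()}

# Overlapping visitors push the sum of the shares above 100%.
total_share = sum(shares.values())
print(shares, f"sum = {total_share:.0%}")
```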

Second, comScore provided engagement metrics for each site overall, as well as the traffic coming to the site through each of the four pathways. There were three main measurements: average monthly minutes per visit, average monthly pages per visitor and average monthly visits per visitor.

The average monthly minutes per visit is how long a user spends, on average, viewing a given site per visit. For example, per visit (sometimes referred to as a ‘session’) the average visitor to abcnews.go.com spends 2 minutes and 54 seconds viewing the site before closing the page or navigating away from abcnews.go.com to another site.

The average pages viewed per month is one way to measure the amount of content a viewer consumes. A “page” is a single URL on a website that a user is viewing. In the case of a news outlet, this page could hold any kind of content from a text story to a video. For example, visitors to bbc.co.uk consumed 7.1 pages per visit on average per month.

Finally, average monthly visits per visitor is a measure of how many times a user returns to a site over the course of a month (both overall and then via each pathway). Overall, for example, the average visitor to Buzzfeed.com returns to the site 2.8 times over the course of a month.
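The three engagement measures can be sketched from a log of visits as follows. The `Visit` record, its field names and the sample values are hypothetical, and pages are averaged per visitor, following the list of measurements given earlier; comScore’s own calculations may differ.

```python
from dataclasses import dataclass

@dataclass
class Visit:
    visitor_id: str   # anonymous panelist identifier (hypothetical)
    minutes: float    # time spent on the site during this visit
    pages: int        # pages viewed during this visit

def engagement(visits):
    """Compute the three monthly engagement measures for one site."""
    n_visits = len(visits)
    n_visitors = len({v.visitor_id for v in visits})
    return {
        "minutes_per_visit": sum(v.minutes for v in visits) / n_visits,
        "pages_per_visitor": sum(v.pages for v in visits) / n_visitors,
        "visits_per_visitor": n_visits / n_visitors,
    }

# Hypothetical month of traffic: visitor "a" comes twice, visitor "b" once.
visits = [Visit("a", 3.0, 4), Visit("a", 2.0, 2), Visit("b", 4.0, 6)]
metrics = engagement(visits)
print(metrics)
```

Running the same function on only the visits that arrived through one pathway would give the per-pathway versions of each measure.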

comScore also provided demographic data on all of the sites. Demographic breakdowns for age, gender and household income were provided for all the sites. In addition to a site’s overall demographic breakdown, comScore was also able to provide demographic breakdowns by referral type.

Separately, Pew Research asked comScore to provide statistics demonstrating traffic to the media outlets’ mobile websites, based on its mobile panel. In five cases, comScore was also able to provide traffic data for a site’s mobile applications, but it was limited to those five because of the mobile panel’s size (discussed in detail in the mobile section above).