January 29, 2018

Sources Shared on Twitter: A Case Study on Immigration

An analysis of 9.7 million tweets reveals that news organizations, more than other information providers, played the largest role in the content that was linked to

As news organizations battle charges of “fake news,” compete with alternate sources of information, and face low levels of trust from a skeptical public, a new Pew Research Center study suggests that news outlets still play the largest role in content that gets shared on Twitter, at least when it comes to one contentious issue in the news: immigration.

The study, which aimed to better understand the types of information sources that users on one popular social media platform may see about a major national policy issue, finds that news organizations play a far larger role than other types of content providers, such as commentary or government sites. During the first month of Donald Trump’s presidency, roughly four-in-ten of the 1,030 most linked-to sites in immigration-related tweets (42%) were outlets that purport to do original reporting – what the study refers to as the News Organizations category. And the prominent role these sites played becomes even greater when looking at the frequency with which they were shared: Fully 75% of the tweets during this time period linked to News Organizations.

The study also finds little clear evidence that “fake news” sites were a major factor in the information stream on Twitter around immigration. Overall, just 2% of the sites catalogued in the study appeared on at least one of three external lists of “fake news” sites, and the vast majority of sites classified as News Organizations were established at least a year before the 2016 election, suggesting they were not created solely for influence during the election.

While the study does not directly address the broader question of “fake news” entities’ influence on the public, or examine who is sharing what types of sites, it does shed light on the degree to which consumers are exposed to different types of information providers on a policy issue debated in the news.

Immigration as a case study

The focus of this analysis is tweets about immigration, a subject chosen because of its key role in news during the first month of Trump’s presidency. Between Jan. 25-27, 2017, Trump signed a series of executive orders that altered federal rules around immigration. Most notably, this included the executive order that restricted entry to the U.S. by people from certain countries. In the following days, protests erupted across the country, particularly at airports where the status of some international travelers subject to the executive order was unclear. A few weeks later, on Feb. 9, 2017, the 9th U.S. Circuit Court of Appeals blocked enforcement of this executive order. A revised version of the administration’s immigration order is still pending before the U.S. Supreme Court.

Researchers found that this topic received considerable attention on Twitter – more than 20 million tweets that matched immigration-related keywords were posted from Jan. 20-Feb. 20, 2017, the first month of the Trump presidency; 11.5 million of these tweets had links to external sources and were the focus of this analysis.

Terminology/Attributes measured

  • Broad category and specific grouping: The different kinds of sites linked to in immigration-related tweets were grouped into three broad categories: 1) News Organizations, which include legacy and digital-native news organizations; 2) Other Information Providers, which include digital-native commentary/blogs, nonprofit/advocacy organizations, government institutions/public officials, digital-native aggregators and academic/polling sites; and 3) Other Sites, which include consumer products and internet services sites, foreign/non-English sites, spam sites, discontinued sites, content delivery tools, celebrity/sports/parody/satire sites, and other sites.
  • News Organizations (category): Sites in this category all showed evidence of original reporting (such as interviews, eyewitness accounts or referral to source documents) in the top five most linked-to articles on Twitter during this time period and in the top five articles on their homepage at the time of coding.
  • Other Information Providers (category): Sites in this category were focused on current events or public affairs information but were not news organizations.
  • Other Sites (category): Sites in this category did not provide current events information or else could not be coded.
  • Age: The date the site began posting content. This variable was coded for those established before or after Jan. 1, 2015, to capture sites that were created before the 2016 election season.
  • Self-described ideology: A site’s specified ideology or partisanship – as stated on its “about” page, the about sections of associated social media profiles (any social media profile linked to on the about page of the site was included, with most sites linking to both their Facebook and Twitter accounts) or in interviews with site founders – was grouped into three broad categories: 1) liberal, including Democratic and progressive; 2) conservative, including Republican; and 3) no self-identified ideology.
  • Establishment orientation: A site’s specified orientation toward the media or political establishment, as stated on its “about” page or associated social media profiles. Sites that say, for example, that they are “exposing the lies of the media” or “taking the fight to the political establishment” are categorized as anti-establishment.

To gain some purchase on the kinds of sources being shared in the “Twitterverse” around the contentious issue of immigration, researchers identified all tweets on the topic of immigration during the first month of the Trump administration, Jan. 20-Feb. 20, 2017, and then focused on the 11.5 million tweet subset that included at least one external link. Any site that was linked to at least 750 times during this period was included in the study. This resulted in a list of the 1,030 sites most frequently referenced during a month-long Twitter discussion around immigration (these sites were linked to in 9.7 million tweets).
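The inclusion rule described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' actual pipeline: the function name `select_sites`, the input format (one list of linked site names per tweet) and the choice to count a site once per tweet are all assumptions. Only the 750 threshold comes from the study.

```python
from collections import Counter

LINK_THRESHOLD = 750  # from the study: sites linked to at least 750 times


def select_sites(tweet_links, threshold=LINK_THRESHOLD):
    """Return {site: tweet_count} for sites that meet the threshold.

    tweet_links is a hypothetical input format: an iterable with one
    list of linked site names per tweet.
    """
    counts = Counter()
    for sites in tweet_links:
        # Assumption: a site is counted once per tweet, even if the
        # tweet links to it more than once.
        counts.update(set(sites))
    return {site: n for site, n in counts.items() if n >= threshold}
```

Under these assumptions, a site that appears in 749 tweets would be dropped, while one that appears in 750 or more would be kept.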

The analysis reveals that legacy and digital-native news organizations – entities that show evidence of original reporting in their most prominent articles – represented about four-in-ten of the most commonly linked-to sites (42%). And legacy news organizations accounted for twice as many sites as digital-native news organizations: 28% of all sites compared with 14%, respectively.

Another roughly three-in-ten sites (29%) linked to during this time are a mix of sites in the category of Other Information Providers, which are focused on current events and public affairs, such as nonprofit/advocacy organizations, digital-native commentary/blog sites or government sites.

Finally, nearly three-in-ten sites (29%) were not clearly oriented toward current events and are classified as Other Sites in this study. These include entities such as consumer product companies, foreign/non-English sites, and discontinued or spam sites. (Full definitions of each site category and grouping are available in Chapter 1.)1

Although verifying the accuracy of all reporting was beyond the scope of this study, researchers found that few of the 1,030 sites carry the attributes of sites generally identified as publishers of “made-up” political news. First, only 18 sites – just 2% of all sites included in this study, including those that have since been discontinued – were found on at least one of three widely circulated “fake news” lists created by external organizations (BuzzFeed, FactCheck.org and Politifact).2 These sites tended to be either digital-native commentary/blogs (eight sites) or digital-native news organization sites (six sites).

Second, the large majority of the sites in the News Organizations category (94%) were created before Jan. 1, 2015. That cutoff date was selected to identify sites created before the 2016 election campaign began, since reporting after the 2016 election identified many “fake news” sites as having been created during the lead-up to the election season. The 94% figure includes nearly all legacy news organizations (99%) and the vast majority of digital-native news organizations (85%).

Even sites in the Other Information Providers category tended to be older: Virtually all academic/polling (100%) and government institution/public official (97%) sites were created before 2015, as were at least seven-in-ten nonprofit/advocacy organizations (77%) and digital-native commentary/blog sites (73%). The one grouping with a more even mix of older and younger sites is digital-native aggregators – sites that compile and distribute content created by others (52% were created before Jan. 1, 2015, and 48% since then).

Further, to get a sense of the degree to which the most linked-to content providers outwardly specified an ideological orientation, researchers analyzed the “about” pages on the official websites and social media profiles of sites in the News Organizations and Other Information Providers categories. Just 14% of these sites clearly specify a conservative or liberal ideological orientation. Even fewer sites stated that their mission is to produce news and information not being covered by traditional media or politicians – which researchers coded as “anti-establishment orientation.” Sites that explicitly include this language may be attempting to position themselves as outside traditional media organizations or the political establishment; for example, Raw Story’s Facebook page says it offers “stories often ignored in the mainstream media.” Only 8% of sites in the News Organizations and Other Information Providers categories – 57 sites in all – include this type of language.

Legacy news outlets among most frequently shared sites in tweets

Looking at the data another way – by the sites that appeared most frequently in tweets – underscores the prominent role news organizations played in this discussion. While sites in the News Organizations category made up 42% of the sites most linked to on Twitter, 75% of the tweets in this study contained links to them. Furthermore, 56% of tweets contained links to legacy news organizations – such as print or broadcast organizations – about three times the share that contained links to digital-native news organizations (19% of tweets). In other words, while the primary analysis treats all 1,030 sites that met the threshold of 750 tweets with equal weight, some sites were linked to far more than 750 times while others were closer to that cutoff. Looking at the frequency of shares, then, identifies which site groupings in this mix – as well as which individual sites – were the most prominent in the Twitter conversation about immigration. In this analysis, News Organizations was the largest category of sites and also had an outsized role in what traveled through Twitter.
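The difference between a category's share of sites and its share of tweets can be illustrated with a small Python sketch. The function name `category_shares`, the input format and the toy numbers are hypothetical and are not the study's data. The sketch also makes a simplifying assumption that each tweet links to exactly one site, which may not hold in the study itself.

```python
from collections import defaultdict


def category_shares(sites):
    """Return {category: (share_of_sites, share_of_tweets)}.

    sites is a hypothetical input: a list of (category, tweet_count)
    pairs, one pair per site. Assumes each tweet links to one site.
    """
    site_counts = defaultdict(int)
    tweet_counts = defaultdict(int)
    for category, tweets in sites:
        site_counts[category] += 1
        tweet_counts[category] += tweets
    total_sites = len(sites)
    total_tweets = sum(tweet_counts.values())
    return {
        category: (site_counts[category] / total_sites,
                   tweet_counts[category] / total_tweets)
        for category in site_counts
    }


# Toy data (not from the study): four sites with unequal tweet counts.
toy_sites = [
    ("News Organizations", 6000),
    ("News Organizations", 1500),
    ("Other Information Providers", 1000),
    ("Other Sites", 1500),
]
shares = category_shares(toy_sites)
# News Organizations: half of the toy sites, three-quarters of the toy tweets.
```

In this toy example, weighting every site equally gives News Organizations a 50% share, while weighting by tweets gives it 75%, which mirrors the kind of gap the study reports between 42% of sites and 75% of tweets.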

Additionally, about one-in-ten tweets (13%) contained links to sites in the Other Information Providers category, while about a quarter of tweets (26%) contained links to sites in the Other Sites category.

Looking at individual sites, two legacy news organizations – The New York Times and The Hill (7% each) – were among the most commonly shared in this study, as were CNN (4%), The Washington Post (4%) and Fox News (3%).3

These are some of the findings from a new Pew Research Center analysis of all English-language tweets about immigration with external links posted in the 32 days following President Trump’s inauguration – Jan. 20-Feb. 20, 2017. That amounted to 11.5 million tweets. Researchers organized all the links according to the main entity (website domain or social media page, both of which are referred to as sites) in the link.4 All sites linked to at least 750 times during those 32 days are included in the main data set for analysis, which amounts to a final sample of 1,030 sites shared in 9.7 million tweets.
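The report gives one example of how links were organized by site: www.cnn.com and edition.cnn.com were both combined into cnn.com. A rough Python sketch of that consolidation follows. The function name `consolidate_domain` is hypothetical, and keeping only the last two labels of the hostname is a naive assumption that would mishandle suffixes such as .co.uk; the study does not describe its exact procedure.

```python
from urllib.parse import urlparse


def consolidate_domain(url):
    """Reduce a URL to a naive site domain (last two hostname labels)."""
    # urlparse only recognizes the host when a scheme or a leading
    # "//" is present, so add "//" to bare hostnames.
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    host = urlparse(url).hostname or ""
    labels = host.split(".")
    # Assumption: the site is identified by its last two labels.
    return ".".join(labels[-2:])
```

With this sketch, `consolidate_domain("www.cnn.com")` and `consolidate_domain("https://edition.cnn.com/politics")` both return `"cnn.com"`. A production version would more likely rely on a maintained list of public suffixes than on the two-label shortcut.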

A team of in-house coders classified each of these 1,030 sites into 14 specific groupings under three broad categories: News Organizations, Other Information Providers and Other Sites. Researchers conducted additional analysis on sites in the first two categories: News Organizations and Other Information Providers (sites in the Other Sites category were not coded for additional analysis because most of the sites were not focused on news or current events).

It is important to keep in mind that this is an analysis of one social media site that measures the presence of different types of external sources rather than who shares them (whether different types of users or bots), who receives them, or the influence these sites may have on different parts of the public. Instead, this analysis examines the types of sites that a user may encounter on Twitter. Indeed, past Pew Research Center findings reveal that certain established legacy news outlets can carry more weight among distinct political groups. Nonetheless, the analysis sheds light on an important area of concern that emerged in the months following the 2016 presidential election: the presence and role of alternative information providers, particularly around debated issues in the news.

While these findings do not directly address broader questions of “made-up” news sites’ ability to influence opinion among certain parts of the public, or the larger impact of the ease of publishing and promoting content on the web, they do help put the role of these types of entities and the implications of this environment into some perspective.

  1. Foreign/non-English sites are those based outside the U.S. or Europe or that primarily publish in a language other than English.
  2. The Politifact, BuzzFeed and FactCheck.org lists were selected because they met the criteria of having staff from these organizations directly evaluate the content of each website included rather than compiling them from other existing lists. Additionally, these lists were cited in media reports as reputable sources of information or were part of fact checking initiatives, such as the Facebook fake news initiative.
  3. Researchers followed links from content delivery mechanisms like link shorteners and coded the destination site appropriately if the link was still active when analysis was conducted in summer 2017; those that are no longer active are included in the content delivery tools grouping under the Other Sites category.
  4. Specifically, researchers reduced links to their domain names and consolidated links from the same domain. For example, both www.cnn.com and edition.cnn.com were combined into cnn.com.