May 23, 2005

The Gender Gap

Methodology

SAMPLING AND INCLUSION

Two distinct categories of media were studied as part of the 2005 PEJ Media Report Card project.

The first, text-based media, included newspapers and Internet news sites. Princeton Survey Research Associates International conducted coding for those media.

The second, electronic media, included both broadcast network and cable network news. The School of Journalism at Michigan State University conducted coding for Broadcast Network News. The Institute for Communication Research of the College of Communication & Information Sciences at the University of Alabama conducted coding for Cable Network News. Esther Thorson of the University of Missouri School of Journalism conducted the statistical and methodological work for the report and put forth the original idea for this examination of gender.

Print, broadcast network and cable were each subject to a specific methodological approach to sampling, selection and coding. In all, the study examined some 16,800 stories. This included 6,589 newspaper stories, 1,903 online stories, 1,768 stories from network television and about 6,550 stories on cable news (the cable news study included two parts, a 20-day sample and a five-day sample, in which some stories overlapped).

I. TEXT-BASED MEDIA

NEWSPAPERS

Newspaper Selection

Individual newspapers were selected to present a meaningful assessment of the content that is widely available to the public. Selections were made on both a geographic and a demographic basis, as well as diversity of ownership.

First, newspapers were divided into four groups based on daily circulation: Over 750,000; 300,001 to 750,000; 100,001 to 300,000, and 100,000 and under.

We included four newspapers over 750,000: USA Today, the Los Angeles Times, The New York Times, and The Washington Post. (The Wall Street Journal, which also falls in this category, was excluded as a specialty publication.)

Four newspapers were chosen in each of the remaining three categories. To ensure geographical diversity, each of the four newspapers within a circulation category was selected from a different geographic region of the U.S. Regions were defined according to the parameters established by the U.S. Census Bureau. 1

The newspapers in circulation groups two through four were selected through the following process:

First, using the Editor and Publisher Yearbook, we created a list of every daily newspaper in the U.S. Within each category, newspapers were selected at random until all categories were filled. To be eligible for selection, a newspaper was required to a) have a Sunday section, b) have a daily sports section, c) have its stories indexed in a news database, to be available to coders, and d) not be a tabloid. Newspapers not meeting those criteria were skipped over. In addition, an effort was made to ensure diversity in ownership.

Circulation Group 1

Los Angeles Times, New York Times, USA Today, Washington Post

Circulation Group 2

Cleveland Plain Dealer, Dallas Morning News, Philadelphia Inquirer, Sacramento Bee

Circulation Group 3

Albuquerque Journal, Asbury Park Press, Kansas City Star, San Antonio Express-News

Circulation Group 4

Bloomington (Illinois) Pantagraph, Hanover (Pennsylvania) Evening Sun, McAllen (Texas) Monitor, Vacaville (California) Reporter

Newspaper Study Operative Dates, 2004

Random sampling was used to select a sample of individual days for the study. By choosing individual days rather than weeks, we hoped to provide a broader look at news coverage that more accurately represented the entire year. To account for variations related to the different days of the week, the 28 days that were sampled included 4 of each day of the week. Dates were chosen from January 1 to October 13, a span of 286 days. October 13 was made the cutoff date to allow time for coding. Omitted dates included those of the Olympics and the Republican and Democratic National Conventions.

The following dates were generated and make up the 2004 sample.

January- 13, 16, 23
February- 2, 13, 23, 29
March- 8, 12, 13, 14, 19, 24
April- 8, 15
May- 1, 4, 20
June- 8, 9, 16
July- 19, 25
August- 10, 12
September- 4, 22, 26
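
The day-selection procedure described above can be illustrated with a short sketch. It is not the code used for the study. The exclusion windows below are assumptions, since the text names the events but not the exact dates omitted, and the function name, seed and variable names are hypothetical.

```python
import random
from datetime import date, timedelta

# Assumed exclusion windows (the report names the events, not the dates).
EXCLUDED_RANGES = [
    (date(2004, 7, 26), date(2004, 7, 29)),  # Democratic National Convention
    (date(2004, 8, 13), date(2004, 8, 29)),  # Summer Olympics
    (date(2004, 8, 30), date(2004, 9, 2)),   # Republican National Convention
]


def is_excluded(day):
    """Return True if the day falls inside any assumed exclusion window."""
    return any(start <= day <= end for start, end in EXCLUDED_RANGES)


def sample_days(start, end, per_weekday=4, seed=None):
    """Draw the same number of random dates for each day of the week."""
    rng = random.Random(seed)
    by_weekday = {wd: [] for wd in range(7)}  # 0 = Monday ... 6 = Sunday
    day = start
    while day <= end:
        if not is_excluded(day):
            by_weekday[day.weekday()].append(day)
        day += timedelta(days=1)
    chosen = []
    for wd in range(7):
        chosen.extend(rng.sample(by_weekday[wd], per_weekday))
    return sorted(chosen)


# Hypothetical seed; any seed yields a valid 28-day sample.
sample = sample_days(date(2004, 1, 1), date(2004, 10, 13), seed=2004)
```

Stratifying by weekday in this way guarantees four Mondays, four Tuesdays and so on, which a simple random draw of 28 dates would not.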

Story Procurement, Selection, and Inclusion

Stories were procured via hard copies of daily publications, supplemented by a combination of electronic databases (DIALOG, FACTIVA, and NEXIS).

All stories with distinct bylines that appeared on a particular newspaper's front page (Page A1), on the first page of the Local/Metro section, or on the first page of the sports section were selected for analysis. Each year the Annual Report rotates the third section-front examined. The 2004 study examined the style section-front.

INTERNET NEWS SITES

To select the Internet news sites to be coded, the Nielsen/NetRatings top 20 news sites list was consulted to determine the most prominent sites. The list contained four basic types of sites: news aggregators, 2 newspaper sites, network news sites, and cable news sites. Two sites were chosen for each of those categories. For aggregators, AOL and Yahoo were selected; they were the only two aggregators in the top 20 list. For network news outlets, two sites were randomly chosen from among ABC, CBS, and MSNBC. (MSNBC appeared on both the network and cable lists because it is the news site for both NBC News and the MSNBC cable channel.) For cable sites, CNN and Fox News were chosen, since MSNBC had already been chosen from among the broadcast networks. For newspapers, the first site was chosen randomly from the four newspapers in Circulation Group 1, and the second was chosen randomly from the 12 newspapers in Groups 2 through 4. To be selected, the newspaper had to have an active daily Web site. In addition, a local-TV news site was chosen. The market for local TV was chosen by randomly selecting one of the 15 markets from the newspaper sample and then randomly choosing among ABC, CBS, NBC, and Fox.

The following sites were included in the 2004 study:

ABC News (www.ABCNEWS.com), AOL (news section front page), Bloomington Pantagraph (www.pantagraph.com), CBS 11 TV – Dallas (www.cbs11tv.com), CNN (www.cnn.com), Fox News (www.foxnews.com), MSNBC (www.msnbc.com), Washington Post (www.washingtonpost.com), Yahoo! (news.yahoo.com)

Internet News Sites – Operative Dates 2004

The 2004 Internet study had two components. The first was a twenty-day sample that matched the dates of the newspaper sample, Mondays through Fridays. Weekends were not included for Internet, broadcast or cable sites. Again, the eligible dates ranged from January 1 to October 13, a period of 286 days.

The following dates were generated and constitute the 2004 Internet News Site sample.

January- 13, 16, 23
February- 2, 13, 23, 29
March- 8*, 12, 13, 14, 19*, 24
April- 8, 15
May- 1, 4, 20
June- 8, 9, 16*
July- 19, 25
August- 10*, 12*
September- 4, 22, 26
*Multiple Download Dates

In addition to the main sample, we conducted a further study of five of those days in order to replicate the freshness variable studied in 2003. From the 20-day sample, one date was randomly selected for each weekday.

Story Procurement, Selection, and Inclusion

For the main 20-day sample, each site was visited once a day. The download time rotated each day among four different hours: 9:00 a.m., 1:00 p.m., 5:00 p.m. and 9:00 p.m. ET. The order in which the sites were visited was also rotated for each capture time. Each download took approximately twenty minutes.

For the five-day sample, each site was visited four times on each day – 9:00 a.m. ET, 1:00 p.m. ET, 5:00 p.m. ET, and 9:00 p.m. ET – to download stories. The order in which the sites were visited was rotated for each capture time. Each download took approximately twenty minutes.
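
As an illustration of the rotation used for the main sample, the sketch below builds a capture schedule in which the download hour cycles through the four times and the site order shifts by one position each day. The text says only that times and site order were rotated; the specific rotation shown here, the function name and the day labels are assumptions.

```python
from collections import deque

CAPTURE_TIMES = ["9:00 a.m.", "1:00 p.m.", "5:00 p.m.", "9:00 p.m."]  # all ET
SITES = [
    "ABC News", "AOL", "Bloomington Pantagraph", "CBS 11 TV", "CNN",
    "Fox News", "MSNBC", "Washington Post", "Yahoo!",
]


def build_schedule(days, sites, times):
    """Pair each day with one capture time and one site visiting order."""
    order = deque(sites)
    schedule = []
    for i, day in enumerate(days):
        # Cycle through the capture times: day 0 -> first time, day 4 -> first again.
        schedule.append((day, times[i % len(times)], list(order)))
        # Assumed rotation: move the first site to the end for the next day.
        order.rotate(-1)
    return schedule


# Hypothetical day labels standing in for the sampled dates.
schedule = build_schedule(["day 1", "day 2", "day 3", "day 4", "day 5"],
                          SITES, CAPTURE_TIMES)
```

Rotating both the hour and the visiting order keeps any one site from always being captured first or always at the same time of day.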

Each time, the following method was used to determine which stories to capture:

On the news home page of each of the sites, we identified featured stories. A story at the top of a page tied in to a graphic element – commonly a picture of an event or person – was counted as a featured story and captured for study. Multiple stories on the page relating to the same graphic element were also captured as featured stories. Pages with more than one graphic element were considered to have more than one featured story, and all such stories were studied.

After the featured stories, we included the next three most prominent stories without graphics starting from the top and moving down. Those stories were recorded as non-featured.

The following rules were put into place in selecting stories:

  • For the sample, the following were omitted from study: video, audio, charts, maps, background/archival information, news tickers, chat and polls.
  • Any headline that linked to an outside Web site was also omitted. (Stories attributed to other outlets but hosted on the site being studied were counted.)
  • Links to secondary stories about the same topic were counted as unique stories for the non-featured-stories category.
  • A graphic attached to a non-story item (i.e., video, audio, charts, maps, background/archival information, "complete coverage," chat and polls) was not counted as a story.
  • If there were no stories associated with a graphic, then only the top three stories were coded and none were considered featured.
  • If there was no graphic present, then no story was considered as featured, and the top three stories were counted as non-featured.
  • When news headlines with the same font and type size appeared in side-by-side columns, stories were prioritized in a left-to-right, line-by-line zigzag pattern.
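
The capture rules above can be summarized in a simplified sketch. The page representation is hypothetical: it assumes each home-page item has already been listed in order of prominence (top to bottom, with the left-to-right zigzag applied) and tagged with its type, any graphic it is tied to, and whether it links off-site. It also simplifies the rules by taking the three most prominent stories without graphics regardless of where the featured stories sit on the page.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PageItem:
    headline: str
    kind: str = "story"               # "story", "video", "audio", "chart", ...
    graphic_id: Optional[str] = None  # graphic element the item is tied to
    external: bool = False            # True if the headline links off-site


def select_stories(items: List[PageItem]) -> Tuple[List[PageItem], List[PageItem]]:
    """Split a prominence-ordered page into featured and non-featured stories."""
    # Drop non-story items and headlines that link to outside sites.
    eligible = [it for it in items if it.kind == "story" and not it.external]
    # Every story tied to a graphic element counts as featured.
    featured = [it for it in eligible if it.graphic_id is not None]
    # The three most prominent stories without graphics are non-featured.
    non_featured = [it for it in eligible if it.graphic_id is None][:3]
    return featured, non_featured


# Hypothetical home page used only to exercise the rules.
page = [
    PageItem("Storm hits coast", graphic_id="g1"),
    PageItem("Storm: evacuation routes", graphic_id="g1"),
    PageItem("Storm video", kind="video", graphic_id="g1"),
    PageItem("Senate vote"),
    PageItem("Wire story on partner site", external=True),
    PageItem("Markets close up"),
    PageItem("Local election"),
    PageItem("Sports roundup"),
]
featured, non_featured = select_stories(page)
```

If a graphic is attached only to a non-story item, or no graphic is present, the featured list comes back empty and only the top three stories are kept, matching the last rules in the list.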

Text-Based Media Coding Procedures

General practice called for a coder to work through no more than seven days/issues from any newspaper outlet during a coding session. After completing up to seven days/issues from one publication, coders switched to another text-based-media outlet, and continued to code up to seven days/issues.

All coding personnel rotated through all circulation groups and publications/sites, with the exception of the designated control publications. A control publication was chosen in each category of text media. The designated control publication/date was initially handled by only one coder. That work was then over-sampled during intercoder reliability testing.

Intercoder Reliability

Intercoder reliability measures the extent to which two coders, operating individually, reach the same coding decisions. The principal coding team for text media comprised four people who were trained as a group. One coder was designated as a general control coder, and worked off-site for the duration of the project. In addition, one newspaper was designated as a control source.

At the completion of the general coding process, each coder, working alone and without access to the initial coding decisions, re-coded publications originally completed by another coder. Intercoding tests were performed on 5% of all cases in connection with inventory variables, and agreement rates exceeded 98% for those variables. For the more difficult content variables, 20% of all publications/sites were re-coded, and intercoder agreement rates were as follows:

Topic: 92%

Female Sources: 98%

Male Sources: 97%
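
For readers unfamiliar with the measure, the sketch below shows how a simple percent-agreement rate of this kind can be computed for one variable. The function name and the example codes are hypothetical and are not data from the study.

```python
def percent_agreement(codes_a, codes_b):
    """Share of cases, in percent, on which two coders chose the same value."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same set of cases")
    if not codes_a:
        raise ValueError("at least one case is required")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Hypothetical topic codes assigned to five stories by two coders.
coder_1 = ["politics", "crime", "sports", "business", "politics"]
coder_2 = ["politics", "crime", "sports", "politics", "politics"]
rate = percent_agreement(coder_1, coder_2)  # the coders match on 4 of 5 stories
```

Percent agreement is easy to interpret but does not adjust for agreement that would occur by chance.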

II. BROADCAST NETWORK NEWS

The ability to make direct comparisons between newspaper and broadcast network findings was a project design goal, so the weekday sample dates for those two news categories are identical. Because of preemptions and schedule changes, weekend network news broadcasts do not always appear in all markets, so Saturday and Sunday broadcast network news programs were excluded from the study.

On a handful of the sample dates, special events pre-empted the evening newscasts. In such instances an alternate date for the same day of the week was selected at random. The final dates were as follows:

January- 13, 16, 23
February- 2, 23
March- 8, 12, 19, 24
April- 8, 15
May- 4, 20
June- 8, 9, 16
July- 19
August- 10, 12
September- 15, 22
June 9 commercial network newscasts were not used because the programming was preempted by the ceremonies remembering President Ronald Reagan. NewsHour was studied on this date. September 15 was used as a substitute for June 9 for the network newscasts.

BROADCAST NETWORK MORNING NEWS PROGRAMS

(7:00 a.m. – 7:59 a.m. Eastern Time Airings)

ABC Good Morning America, CBS The Early Show, NBC The Today Show

BROADCAST NETWORK EVENING NEWS PROGRAMS

(Full program as broadcast in New York market)

ABC World News Tonight, CBS Evening News, NBC Nightly News, PBS NewsHour

Program Procurement and Story Selection and Inclusion

The morning and evening broadcasts were procured through both transcripts and video tape. Transcripts were obtained through the Nexis electronic database. Videotaped programs were captured live in the New York City market by ADT Research. For the evening newscasts, that represents each day's 6:30 P.M. East Coast feed. PBS supplied the Project with tapes of the NewsHour.

In the mornings, the following content was analyzed: stories read by the newscaster in the half-hourly news blocks; feature and interview segments outside of the news blocks; and banter between members of the anchor team whose import was other than to tease coming segments in that day's program or to promote the network's programming at some later time. One-fifth (20%) of the sample was coded for teasers and promos and analyzed separately. Excluded from the analysis were the content of the weather blocks, local news inserts, commercials, and other content-free editorial matter such as logos, studio shots, openings and closings.

In the evenings the same rules applied, but because the content of the newscasts is less variegated, concerns about news blocks, banter, weather blocks and local news inserts were not applicable.

Broadcast Network Coding Procedures

Faculty and graduate students in the School of Journalism at Michigan State University conducted this part of the project. The two faculty members who supervised the project have more than 40 years of combined social-science experience in conducting such studies, and are two of the most published academic researchers in the field. Two students in the mass-media Ph.D. program at MSU, one a third-year student and the other a second-year student, coded most of the stories, assisted by a master's-degree graduate of the MSU Department of Communication. In addition, two current master's-degree students in the School of Journalism coded parts of the newscasts. Coding was done independently, working from the protocol, without consultation among the coders.

The coding protocol was provided by the Project for Excellence in Journalism.

Inter-Coder Reliability Testing for Broadcast Network News

A coder reliability assessment for each completed network was then conducted with a random sample of dates taken from those supplied by the State of the Media project. The assessment usually drew on one or two of the sampled days, resulting in a sample of 5% to 10% of the total stories coded.

Percentage-of-agreement calculations were made to assess the coding for each of the variables requiring categorical choices among variable values.

Fifty-three stories from the evening newscasts and 69 from the morning newscasts (a total of 122 stories, or 7% of all stories) were used to test reliability. All of the variables used in the State of the Media analysis presented here achieved at least 90% inter-coder agreement, except story topic. The original story-topic coding scheme involved more than 300 subcategories, and reliability was below 80%. But when the coding was collapsed into the 12 categories used in this analysis, the inter-coder agreement reached 83% for all stories.

The content categories used in this analysis and their inter-coder agreement were: big story, 95%; story topic, 83%; female sources, 98%; male sources, 94.5%.

 

Coder Reliability Summary for Evening News, Morning News, and All News Programs

Variable                Evening News (N=53)   Morning News (N=69)   All (N=122)
Dateline                99%                   97%                   98%
Big Story               93%                   97%                   95%
Story Topic             81%                   86%                   83.4%
Female Source Number    99%                   97%                   98%
Male Source Number      94%                   95%                   94.5%

Totals may not equal 100 due to rounding
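
As a consistency check on the reliability summary, the combined rate for each variable should be close to the average of the evening and morning rates weighted by the number of stories in each group. The sketch below recomputes those weighted averages from the published figures. Because the published rates are rounded, the recomputed values are expected to match the "All" column only approximately. The function and variable names are hypothetical.

```python
EVENING_N, MORNING_N = 53, 69

# (evening %, morning %, published combined %) from the reliability summary.
TABLE = {
    "Dateline": (99, 97, 98),
    "Big Story": (93, 97, 95),
    "Story Topic": (81, 86, 83.4),
    "Female Source Number": (99, 97, 98),
    "Male Source Number": (94, 95, 94.5),
}


def pooled_rate(rates_and_ns):
    """Average of agreement rates weighted by the number of stories in each group."""
    total_n = sum(n for _, n in rates_and_ns)
    return sum(rate * n for rate, n in rates_and_ns) / total_n


recomputed = {
    name: pooled_rate([(evening, EVENING_N), (morning, MORNING_N)])
    for name, (evening, morning, _published) in TABLE.items()
}

for name, value in recomputed.items():
    published = TABLE[name][2]
    print(f"{name}: recomputed {value:.1f}%, published {published}%")
```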

III. CABLE NEWS

Cable News Programming – Outlet Selection and Operative Dates 2004

As with the online sample, the 2004 Cable study had two components. The first was a twenty-day sample that matched the dates of the newspaper sample on Mondays through Fridays. Weekends were not included for the Internet, broadcast or cable. Again, the eligible dates ranged from January 1 to October 13, a period of 286 days. On a handful of the sample dates, special events pre-empted the evening newscasts. In such instances an alternate date for the same day of the week was selected at random.

The following dates were generated and make up the 2004 cable news sample:

January- 13, 16, 23
February- 2*, 23
March- 8, 12, 19*, 24
April- 15
May- 4, 20
June- 8, 9, 16
July- 19
August- 5, 10, 12
September- 22

Story Procurement and Inclusion

To assess the nature of the 24-hour news cycle as presented on cable news programming, CNN, Fox News, and MSNBC were selected because they were the three most-viewed cable news channels in 2003.

For the twenty-day sample, we selected three program types to study at each network: daytime programming, the closest thing to a traditional newscast, and the highest-rated prime-time talk show. The following programs were captured and analyzed:

DAYTIME PROGRAMMING

The 11-to-12 o'clock hour for each network

NEWSCAST/NEWS DIGEST PROGRAMS

CNN's NewsNight with Aaron Brown, FOX's Special Report with Brit Hume, MSNBC's Countdown with Keith Olbermann

PRIME-TIME TALK PROGRAMS

CNN's Larry King Live, FOX's O'Reilly Factor, MSNBC's Hardball with Chris Matthews

All cable programming was procured through both videotape and transcripts, although transcripts were not available for the Fox News programming at the 11:00 a.m. hour. Transcripts were obtained through the Nexis electronic database. Videotaped programs were captured live in the Washington, D.C. market. In some instances tapes were provided to us by VMS, a commercial third-party monitoring service.

Cable News Coding Procedures

The cable news coding was conducted by faculty members, graduate students, and research staff people affiliated with the Institute for Communication and Information Research at the University of Alabama. Six coders were involved throughout the coding process. All coders worked independently, without consulting one another regarding specific coding decisions.

Cable News Inter-coder Reliability Testing

As noted, three program types were studied for each of the three cable news networks. To assess reliability within and across program types, we randomly selected six of the 60 hours of daytime programming, six of the 60 hours of news-digest programming, and six of the 60 hours of prime-time talk programming. In other words, the reliability sample was stratified by program type.

The reliability sample was also stratified by network. Within the six hours for each program type we included two hours from each of the three networks.

This 18-hour sample represents 10% of the 180 hours of programming included in the study; the 6-hour sample for each program type represents 10% of the 60 hours dedicated to each of the three program types.
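
The stratified design described above can be sketched as follows. The sketch assumes one program hour per network, program type and sampled day, which is what the figure of 60 hours per program type implies. The function name, day numbering and seed are hypothetical.

```python
import random
from itertools import product

NETWORKS = ["CNN", "Fox News", "MSNBC"]
PROGRAM_TYPES = ["daytime", "news digest", "prime-time talk"]
N_DAYS = 20  # sampled days, numbered 1..20 for illustration


def reliability_sample(hours_per_cell=2, seed=None):
    """Pick the same number of program hours from every type-by-network cell."""
    rng = random.Random(seed)
    picks = []
    for program_type, network in product(PROGRAM_TYPES, NETWORKS):
        days = rng.sample(range(1, N_DAYS + 1), hours_per_cell)
        picks.extend((program_type, network, day) for day in sorted(days))
    return picks


picks = reliability_sample(seed=1)  # hypothetical seed
total_hours = len(NETWORKS) * len(PROGRAM_TYPES) * N_DAYS  # 180 hours
share = len(picks) / total_hours  # 18 of 180 hours
```

Drawing two hours from each of the nine cells yields six hours per program type and six per network, so no network or program type is over-represented in the reliability check.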

Percentage-of-agreement calculations were made to assess the coding for each of the variables requiring categorical choices among variable values.

All of the variables used in the State of the Media analysis presented here achieved at least 88% inter-coder agreement.