A UNC statistics class analyzed data from the 2018 North Carolina general election and found that more than 1,500 absentee ballots were unaccounted for in Bladen and Robeson counties, which are part of North Carolina’s 9th Congressional District.
Richard Smith, distinguished professor in the Department of Statistics and Operations Research, teaches STOR 556, “Advanced Methods of Data Analysis,” a class of approximately 70 students, and had them analyze election data from all 100 North Carolina counties.
“It started out as a homework exercise,” Smith said. “There was an example in the text which was about the election results from the Bush v. Gore election, which is 19 years ago, and I said to the students, ‘Why are we talking about this example that is 19 years old when we have lots of data right here? There is an argument going on in North Carolina.’ So I went online, downloaded some data and made it a homework exercise. I wanted to see what students would make of it.”
Smith’s assignment centered on the undecided U.S. House race involving Republican candidate Mark Harris in North Carolina’s 9th Congressional District.
Harris finished the November 2018 election 905 votes ahead of Democrat Dan McCready, but the North Carolina State Board of Elections refused to certify the results after allegations of absentee ballot fraud raised doubts about their legitimacy. The State Board of Elections held a hearing this week on those allegations.
The Board could order a new election if the 9th Congressional District results aren’t certified.
“One of the issues is, can we actually prove that the number of missing votes was greater than 905?” Smith said.
Smith said he collected data on the number of absentee ballots for all 100 counties in North Carolina. He found that in 98 counties fewer than 4 percent of absentee ballots were not returned, slightly above the usual rate of 1 to 2 percent, while in both Bladen and Robeson counties the figure was around 11 percent.
Students were first instructed to create a linear model for all 100 counties.