How did we do? A look back at Daily Kos Elections' 2018 race ratings
Elections are never truly over: pundits start talking about the next one as soon as the last one is in the books, before the lame-duck session has even ended. That’s especially the case this year, since we’re most likely headed for a full re-do of the fraud-tainted election in North Carolina’s 9th Congressional District. But with the votes finally counted in the vote-by-mail western states, we’re done enough that we can take a quick retrospective look at how Daily Kos Elections’ race ratings performed this year.
As you might know, we took two entirely different approaches to rating the races, which provide two different perspectives. First, there are our qualitative ratings (Senate, gubernatorial, and House), which use the Tossup/Lean/Likely/Safe formulation that a number of other prognosticators use. These are based on polling data, of course, but also on a gestalt mixture of other factors: the state of the overall national environment; candidate fundraising; which races the major outside groups like the DCCC or House Majority PAC are spending on; what rumors about the candidates’ chances are getting leaked to the press; and other intangibles, such as whether a candidate’s messaging or ads sound confident or defensive.
Second, there is our quantitative system, which uses Bayesian trendlines to average the polls in each race. We chose not to build a full-on predictive model this year, meaning one that assigns specific probabilities to each race or to a cumulative event like flipping control of the Senate (which means that, unlike in 2014, we can’t say “Wow, we had the best ‘Brier score’ of any model”). By comparing our quantitative averages to the actual results, however, we have the chance to delve a little more deeply into how accurate both our quantitative and our qualitative ratings were. (Also, different analysts, such as FiveThirtyEight or RealClearPolitics, arrive at slightly different averages, so that can be a basis for comparison too. Averages can vary depending on which pollsters get included, whether bad pollsters [or internal polls] get downweighted, and the rate at which older polls’ influence decays.)
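To make those three knobs concrete, here is a minimal sketch of a weighted polling average in Python. It is not our Bayesian trendline method, just a much simpler exponentially decayed weighted mean, and every name and default in it (`Poll`, `quality`, `half_life_days`, `internal_discount`) is a hypothetical choice made for illustration. Which polls you pass in stands for the inclusion decision, `quality` and `internal_discount` stand for downweighting, and `half_life_days` sets how quickly older polls fade.

```python
from dataclasses import dataclass


@dataclass
class Poll:
    margin: float           # Democratic minus Republican margin, in points
    days_old: float         # age of the poll in days
    quality: float = 1.0    # hypothetical pollster weight (1.0 = full weight)
    internal: bool = False  # True for a campaign or party internal poll


def poll_weight(poll, half_life_days=14.0, internal_discount=0.5):
    """Weight for one poll: quality x internal discount x age decay.

    The defaults are illustrative assumptions, not values used by any
    real polling average.
    """
    # A poll loses half of its influence every `half_life_days` days.
    decay = 0.5 ** (poll.days_old / half_life_days)
    discount = internal_discount if poll.internal else 1.0
    return poll.quality * discount * decay


def weighted_average(polls, half_life_days=14.0, internal_discount=0.5):
    """Weighted mean of poll margins using the weights above."""
    weights = [poll_weight(p, half_life_days, internal_discount) for p in polls]
    total = sum(weights)
    if total == 0:
        raise ValueError("no polls with positive weight")
    return sum(w * p.margin for w, p in zip(weights, polls)) / total


# Made-up example: a fresh poll at D+4 and a two-week-old poll showing a tie.
example = [Poll(margin=4.0, days_old=0), Poll(margin=0.0, days_old=14)]
print(round(weighted_average(example), 2))  # 2.67
```

With a 14-day half-life, the older poll counts half as much as the fresh one, so the average lands at about D+2.7 rather than the D+2 of a simple mean. Shorten the half-life or discount one of the pollsters and the number moves again, which is why two analysts working from the same polls can publish different averages.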