11/4/2024 (DRAFT)
I am trying to see what “user visible” options need to be allowed, where the user is, e.g., Colorado DOS or a consultant working with them to configure the library for their needs. So I wanted to leave out all the research questions (it's our job to answer those), and I picked the best answers according to my understanding so far.
I don't know of any reason to use sampling with replacement.
If a jurisdiction can't create CVRs, then use Polling; otherwise use Comparison.
For the risk function, use AlphaMart (or the equivalent BettingMart) with ShrinkTrunkage, which estimates the true population mean (theta) as a weighted average of an initial estimate (eta0) and the actual sampled mean. Use the reported winner’s mean as eta0. The only settable parameter is d, which is used for estimating theta at each sample draw:
estTheta = (d*eta0 + sampleSum_i) / (d + sampleSize_i)
which trades off smaller sample sizes when theta = eta0 (large d) against adapting quickly when theta < eta0 (small d).
A few representative plots showing the effect of d are at meanDiff plots.
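To make the role of d concrete, here is a minimal sketch of the weighted-average estimate above. The function name and the numbers are made up for illustration, and any truncation step that keeps the estimate inside its allowed range is left out.

```python
def est_theta(d, eta0, sample_sum, sample_size):
    """Weighted average of the initial estimate eta0 and the sampled mean.

    d acts like a count of "pseudo-samples" that all equal eta0.
    """
    return (d * eta0 + sample_sum) / (d + sample_size)


# Hypothetical case: the reported winner's mean is 0.55,
# but the first 100 draws average only 0.50.
eta0 = 0.55
sample_sum, sample_size = 50.0, 100

for d in (10, 100, 1000):
    print(d, round(est_theta(d, eta0, sample_sum, sample_size), 4))
# 10 0.5045    (small d: follows the sample quickly)
# 100 0.525
# 1000 0.5455  (large d: stays close to eta0)
```

Before any samples are drawn the estimate is just eta0, and as the sample size grows the estimate converges to the sampled mean regardless of d.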
The requirements for Comparison audits:
For the risk function, use BettingMart with AdaptiveBetting. AdaptiveBetting needs estimates of the rates of overstatements and understatements. If these estimates are correct, one gets optimal sample sizes. AdaptiveBetting uses a variant of ShrinkTrunkage that takes a weighted average of the initial estimates (aka priors) and the actual sampled rates.
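A rough sketch of what a weighted average of priors and sampled rates could look like, assuming it has the same form as the polling estimate (prior weighted by d, observed counts weighted by the number of samples). The function name and the error category names are hypothetical, and the real AdaptiveBetting may differ in detail, for example by bounding the rates away from zero.

```python
def est_rates(d, prior_rates, error_counts, sample_size):
    """Shrink each observed error rate toward its prior.

    prior_rates:  dict of category name -> prior rate
    error_counts: dict of category name -> errors seen so far in the sample
    """
    return {
        name: (d * prior + error_counts.get(name, 0)) / (d + sample_size)
        for name, prior in prior_rates.items()
    }


# Hypothetical priors and counts after 500 sampled cards.
priors = {"one_vote_over": 0.001, "one_vote_under": 0.001}
counts = {"one_vote_over": 3}
print(est_rates(100, priors, counts, 500))
```

A category with no observed errors drifts below its prior as the sample grows, while a category with more errors than expected drifts above it.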
TODO: quantify how things go when rate estimates are incorrect. A first pass is at Ballot Comparison using Betting Martingales.
The nice thing about SHANGRLA is that it cleanly separates the risk function from the sampling strategy. All of the above is the risk function. What follows concerns the sampling strategy.
Risk-limiting audits (RLAs) can use information about which ballot cards contain which contests (card-style data, CSD); see Stylish Risk-Limiting Audits in Practice. According to that paper, using CSD can greatly reduce sample sizes.
CSD requires:
This information is available for any election. The question is whether it's included on the CVR.
TODO: quantify how much difference CSD makes.
With all of the above, we can estimate sample sizes by simulation, knowing only the reported margin and risk limit. The main settable parameter here is the target percentage (aka quantile) of runs that should finish within the estimated sample size.
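As a sketch of the simulation idea, the code below estimates a sample size for a two-candidate polling contest, sampling without replacement and assuming the reported margin is correct. It uses an ALPHA-style test statistic with the weighted-average estimate of theta. All function names, the population size, the number of runs, and the default d are assumptions for illustration, not the library's API.

```python
import math
import random


def alpha_sample_size(draws, n_cards, eta0, d=100, risk_limit=0.05, eps=1e-6):
    """Number of draws until the test statistic reaches 1/risk_limit.

    draws: 0/1 values sampled without replacement (1 = vote for reported winner).
    Returns n_cards (a full count) if the threshold is never reached.
    """
    t, total = 1.0, 0.0
    for i, x in enumerate(draws):  # i = number of draws already seen
        # Mean of the remaining cards if the true overall mean were exactly 1/2.
        m = (n_cards * 0.5 - total) / (n_cards - i)
        if m <= 0:
            return i  # the winner already has half the votes in the sample
        if m >= 1:
            return n_cards  # the null can no longer be rejected by sampling
        est = (d * eta0 + total) / (d + i)
        eta = min(max(est, m + eps), 1 - eps)  # keep the estimate above m, below 1
        t *= x * eta / m + (1 - x) * (1 - eta) / (1 - m)
        total += x
        if t >= 1 / risk_limit:
            return i + 1
    return n_cards


def estimate_sample_size(margin, n_cards=10_000, quantile=0.8, runs=200, seed=1, **kw):
    """Sample size within which the given fraction of simulated audits finish."""
    rng = random.Random(seed)
    winner = round(n_cards * (1 + margin) / 2)
    population = [1] * winner + [0] * (n_cards - winner)
    eta0 = winner / n_cards
    sizes = sorted(
        alpha_sample_size(rng.sample(population, n_cards), n_cards, eta0, **kw)
        for _ in range(runs)
    )
    return sizes[math.ceil(quantile * runs) - 1]


if __name__ == "__main__":
    for q in (0.5, 0.8, 0.95):
        print(q, estimate_sample_size(margin=0.10, quantile=q))
```

A higher quantile gives a larger, more conservative estimate: fewer audits will need a second round, at the cost of pulling more cards in the first one.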
TODO: show sample size estimates as function of quantile and reported margin.
Here we create a consistent sample across all contests under audit. I don't think there are any user-visible options here.
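One way to picture consistent sampling: give every card a pseudo-random number once, and for each contest take the cards containing that contest with the smallest numbers. Cards then get reused across contests wherever possible. The sketch below assumes card-style data is available as a mapping from card id to contests. The function name and data layout are hypothetical, and Python's `random` module stands in for the publicly verifiable PRNG a real audit would use.

```python
import random


def consistent_sample(cards, sample_sizes, seed):
    """Pick cards so that each contest gets its required number of cards.

    cards:        dict of card id -> set of contest names on that card
    sample_sizes: dict of contest name -> number of cards wanted
    Returns the chosen card ids, ordered by their assigned random number.
    """
    rng = random.Random(seed)
    # One number per card, assigned in a fixed order so the result is reproducible.
    number = {card_id: rng.random() for card_id in sorted(cards)}
    chosen = set()
    for contest, size in sample_sizes.items():
        containing = sorted(
            (cid for cid, contests in cards.items() if contest in contests),
            key=number.get,
        )
        chosen.update(containing[:size])  # smallest numbers first
    return sorted(chosen, key=number.get)


# Hypothetical card-style data: every card has "governor", half also have "mayor".
cards = {i: {"governor", "mayor"} if i % 2 == 0 else {"governor"} for i in range(20)}
print(consistent_sample(cards, {"governor": 5, "mayor": 3}, seed=12345))
```

Because both contests draw from the same ordering, the union is usually smaller than the sum of the per-contest sample sizes.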
Jurisdiction Scenarios
Audit types