Magnitude measures (mean |error|, RMSE, max boost, spread) come back null across all four sponsor buckets within race × week. The signal is rank-disagreement, not noise. other_firm polls understate the eventual race winner by 1.83 pp vs media polls (p = 0.010) and call the winner
Motivated by the Goiás IPOP cross-cycle pattern
(paper.tex §Setting, lines 265–281). 2020:
357 IPOP polls were self-contracted (the firm registered itself as
contratante) — that's pollster_self in our taxonomy, and the
2020 polls were the fraudulent ones prosecuted in Operação Leão
de Neméia. 2024: IPOP (68 polls) and Alcateia Outsourcing
(41 polls) routed every Goiás 2024 mayoral poll through
FacUnicamps, a private faculdade in Goiânia — that's other_firm
in our taxonomy. The shell channel migrated across cycles in
response to enforcement. A 4-bucket accuracy split therefore
has to ask not just "does other_firm look slanted in 2024" but
"does pollster_self look slanted too" and "what kind of
slant — magnitude or direction."
H13 (shell-contratante)
named the prediction and was queued as "small-N behind the LLM
extractor." Protocol-level counts say no: other_firm is 1,553
polls (22 %) and pollster_self is 1,464 polls (21 %) — both
larger than the candidate-linked bucket (1,026, 15 %). The
four-bucket split is well-powered without the LLM pass.
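
As a quick arithmetic check on those shares, here is a minimal sketch. It assumes the percentages are taken over all 7,032 polls, and it borrows the media count (2,948) from the raw cell-means table in Results; neither choice is confirmed by the script itself.

```python
# Bucket counts as quoted in this note; the media count comes from the
# raw cell-means table. The denominator (all 7,032 polls) is an assumption.
counts = {
    "media": 2948,
    "other_firm": 1553,
    "pollster_self": 1464,
    "candidate": 1026,
}
total_polls = 7032

# Share of each bucket, rounded to whole percentage points.
shares = {bucket: round(100 * n / total_polls) for bucket, n in counts.items()}

# Polls covered by the four buckets, and the remainder outside them.
in_buckets = sum(counts.values())
outside_buckets = total_polls - in_buckets

print(shares)
print(in_buckets, outside_buckets)
```

Under these assumptions the four buckets cover 6,991 polls, which matches the four-bucket sample size quoted in Caveats. The note does not say how the remaining polls are handled.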
Results
Raw cell means (no FE, all 7,032 polls)
| Bucket | n | mean \|err\| | RMSE | max+ | spread | poll-leader err | winner err | calls winner #1 |
|---|---|---|---|---|---|---|---|---|
| media | 2,948 | 8.15 | 9.03 | 10.63 | 18.80 | +7.25 | −0.16 | 73.2 % |
| other_firm | 1,553 | 8.17 | 8.93 | 10.35 | 19.11 | +7.39 | −0.84 | 69.3 % |
| pollster_self | 1,464 | 8.38 | 9.39 | 11.44 | 19.74 | +7.58 | −0.51 | 70.8 % |
| candidate | 1,026 | 9.22 | 9.97 | 11.53 | 18.83 | +9.12 | +1.43 | 72.5 % |
Two observations from the raw cells matter for picking the right measure:
- The boost handed to the poll's #1 cand is similar across buckets (poll_leader_error ≈ 7.3–7.6 pp for everything except candidate-linked at 9.1). Every poll inflates whoever it puts on top by roughly the same amount. Magnitude-of-boost is not where the sponsor signal lives.
- What differs is who that #1 cand is. winner_error ranges from −0.84 to +1.43 across buckets, and poll_calls_winner_first from 69 % to 73 %. The slant signature is "which candidate the poll points to," not "how big a boost the slant delivers."
This reframes the AN-082 search: the right measure is rank-based, not magnitude-based.
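
To make the magnitude vs rank distinction concrete, here is a minimal sketch of how the per-poll outcomes might be computed. The definitions are assumptions inferred from the column names (signed error taken as poll share minus realized share, spread as max minus min signed error); the actual definitions live in the analysis script and may differ. The function name `poll_outcomes` and the toy shares are hypothetical.

```python
from math import sqrt

def poll_outcomes(poll, result):
    """Per-poll accuracy outcomes from two dicts: candidate -> share in pp.

    Every definition here is an assumption inferred from the column names.
    """
    # Signed error per candidate: poll share minus realized share.
    errors = {c: poll[c] - result[c] for c in poll}
    poll_leader = max(poll, key=poll.get)    # candidate the poll ranks #1
    winner = max(result, key=result.get)     # eventual race winner
    abs_errors = [abs(e) for e in errors.values()]
    return {
        # Magnitude measures
        "mean_abs_error": sum(abs_errors) / len(abs_errors),
        "rmse": sqrt(sum(e * e for e in errors.values()) / len(errors)),
        "max_signed_error": max(errors.values()),
        "spread_error": max(errors.values()) - min(errors.values()),
        # Rank-based measures
        "poll_leader_error": errors[poll_leader],
        "winner_error": errors[winner],
        "poll_calls_winner_first": int(poll_leader == winner),
    }

# Hypothetical race and two hypothetical polls of it.
result = {"A": 48.0, "B": 42.0, "C": 10.0}
poll_right = {"A": 55.0, "B": 38.0, "C": 7.0}   # inflates the true winner
poll_wrong = {"A": 41.0, "B": 49.0, "C": 10.0}  # inflates the runner-up

out_right = poll_outcomes(poll_right, result)
out_wrong = poll_outcomes(poll_wrong, result)
print(out_right)
print(out_wrong)
```

In this toy case both polls hand their #1 candidate the same 7 pp boost and have the same mean absolute error, yet winner_error flips sign and only one of them calls the winner first. That is the pattern the raw cells suggest at the bucket level.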
Race × Week FE (Spec B — preferred)
Coefficients on the bucket dummies vs media reference. The four magnitude outcomes are uniformly null:
| Outcome | is_candidate | is_pollster_self | is_other_firm |
|---|---|---|---|
| mean_abs_error | +0.13 (p=0.84) | −0.09 (p=0.73) | +0.02 (p=0.95) |
| rmse | +0.25 (p=0.72) | −0.08 (p=0.80) | +0.06 (p=0.88) |
| max_signed_error | +0.50 (p=0.57) | +0.00 (p=1.00) | +0.23 (p=0.68) |
| spread_error | +0.92 (p=0.58) | −0.27 (p=0.71) | +0.41 (p=0.70) |
The rank-based outcomes are where the signal is:
| Outcome | is_candidate | is_pollster_self | is_other_firm |
|---|---|---|---|
| poll_leader_error | −1.65 (p=0.114) | −0.43 (p=0.40) | +0.22 (p=0.77) |
| winner_error | −2.26 (p=0.089) | +0.56 (p=0.30) | −1.83 (p=0.010) |
| poll_calls_winner_first | +0.013 (p=0.81) | +0.074 (p=0.009) | −0.048 (p=0.15) |
Spec C adding log(sample_size) leaves these results essentially unchanged.
n=2,190 polls in 885 race × week cells for Spec B (cells with only one poll are dropped — no within-variation).
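
The mechanics of Spec B (cell fixed effects plus the singleton drop) can be sketched with a plain within transformation. This is an illustration under stated assumptions, not the code in the analysis script: the helper `within_fe_ols`, the row layout, and the toy data are all hypothetical, and the sketch returns point estimates only (no standard errors or p-values).

```python
from collections import defaultdict

def within_fe_ols(rows, outcome, regressors, cell_key):
    """OLS of `outcome` on `regressors` with cell fixed effects.

    Demeans every variable inside each cell, then solves the normal
    equations. Singleton cells are dropped: they have no within variation.
    """
    cells = defaultdict(list)
    for r in rows:
        cells[cell_key(r)].append(r)
    kept = [c for c in cells.values() if len(c) > 1]

    X, y = [], []
    for cell in kept:
        n = len(cell)
        y_bar = sum(r[outcome] for r in cell) / n
        x_bar = [sum(r[v] for r in cell) / n for v in regressors]
        for r in cell:
            y.append(r[outcome] - y_bar)
            X.append([r[v] - m for v, m in zip(regressors, x_bar)])

    # Augmented normal equations [X'X | X'y].
    k = len(regressors)
    A = []
    for i in range(k):
        row = [sum(x[i] * x[j] for x in X) for j in range(k)]
        row.append(sum(x[i] * t for x, t in zip(X, y)))
        A.append(row)
    # Gauss-Jordan elimination with partial pivoting. This fails with a
    # ZeroDivisionError if a regressor has no within-cell variation.
    for col in range(k):
        piv = max(range(col, k), key=lambda i: abs(A[i][col]))
        A[col], A[piv] = A[piv], A[col]
        for i in range(k):
            if i != col:
                f = A[i][col] / A[col][col]
                A[i] = [a - f * b for a, b in zip(A[i], A[col])]
    beta = {v: A[i][k] / A[i][i] for i, v in enumerate(regressors)}
    return beta, sum(len(c) for c in kept), len(kept)

def row(race, week, winner_error, other=0, cand=0):
    return {"race": race, "week": week, "winner_error": winner_error,
            "is_other_firm": other, "is_candidate": cand}

# Toy data built so that other_firm = -2 and candidate = -3 vs media.
rows = [
    row("r1", 1, 2.0), row("r1", 1, 0.0, other=1), row("r1", 1, -1.0, cand=1),
    row("r2", 3, -5.0), row("r2", 3, -7.0, other=1),
    row("r3", 2, 9.0, other=1),  # singleton cell, dropped
]
beta, n_polls, n_cells = within_fe_ols(
    rows, "winner_error", ["is_other_firm", "is_candidate"],
    cell_key=lambda r: (r["race"], r["week"]),
)
print(beta, n_polls, n_cells)
```

The toy data has very different cell-level means (2 vs −5), but the within transformation removes them and recovers the bucket gaps, while the lone r3 poll contributes nothing. The same logic is why the estimation sample shrinks so sharply once cells are defined at race × week.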
Interpretation
Magnitude doesn't move; direction does. None of the magnitude measures separates the four buckets within race × week. Shell-suspect polls are not noisier, don't have larger boosts, don't have wider spreads, don't have higher RMSE. Whatever "shell" means here, it doesn't show up as a degradation of poll quality.
The signal is rank-disagreement. Within race × week:
- other_firm polls understate the eventual race winner by 1.83 pp vs media polls (p = 0.010) and are directionally 4.8 pp less likely to call the winner #1 (p = 0.15).
- candidate-linked polls show the same anti-winner signature at a similar point estimate (−2.26 pp on winner_error), with ~2× wider SE (1,026 polls, even thinner after race × week absorption).
- pollster_self polls are 7.4 pp MORE likely to call the winner #1 than media polls within race × week (p = 0.009). That looks paradoxical given the 2020 Goiás IPOP precedent, but it's a 2024-sample-only result. Two readings:
  - Selection: they get the easy races. Pollster_self firms self-finance their flagship polls in races where they're confident; within-race × week comparison rewards that. Raw-means call-the-winner rate (70.8 %) is below media (73.2 %) precisely because pollster_self polls cluster in tougher races. The FE flips the sign.
  - The Goiás fraud channel is no longer pollster_self. After the 2020 prosecution, IPOP and Alcateia rerouted through FacUnicamps; the residual 2024 pollster_self pool is dominated by reputational firms doing showcase polls.
The other_firm result is the H13 signature. A hidden sponsor with sender-identity concealment buys the same product as a self-sponsoring candidate: point the poll at a non-leader. The average boost magnitude doesn't change — every poll inflates its #1 by ~7 pp — but the choice of #1 is different. Within race × week, that shows up as the eventual winner being understated. other_firm and candidate-linked share the signature; media and pollster_self don't.
How this changes the H13 reading
Earlier headlines from the v1 draft of AN-082 framed the signal as "shell polls are −1.8 pp more inaccurate." That overclaims. The corrected reading:
- Inaccuracy is not the metric. Magnitude is null.
- Direction is the metric. Shell-suspect polls misrank the eventual winner — they put a different cand on top, who turns out not to win.
- The mass-on-which-candidate is the slant. The poll's rank-1 choice is the slant declaration; its identity differs across buckets even though its inflation magnitude doesn't.
Caveats
- Race × week absorption is sharp. 6,991 polls in the four-bucket sample → 2,190 rows in Spec B because most cells have only one poll. Surviving cells are the well-polled, attention-competitive races. Race FE (Spec A) on the full 5,750 gives is_other_firm β=−0.63 on winner_error (p=0.24): directionally identical, weaker.
- other_firm is a heuristic regex residual (source/assemble/poll.py:_norm). False positives include civic associations and unions. The LLM pass (H13 to-do) would split (suspected shell / civic association / unknown) and should sharpen the signal rather than weaken it, since false positives that are pure noise would attenuate the within-FE −1.83.
- Cross-cycle channel migration is the load-bearing interpretive caveat. The 2024 result codes the shell channel as other_firm; the 2020 IPOP fraud was pollster_self. A 2020 re-run of AN-082 would be expected to show pollster_self carrying the anti-winner signature, and other_firm not; that is a prediction, not yet a result. The bucket meaning is cycle-specific.
- pollster_self +7.4 pp on poll_calls_winner_first is novel and not predicted. Two competing readings (selection into easy races vs post-prosecution pool purification) cannot be separated in this analysis. Worth a separate check via the per-firm β cross-cut from AN-016/017.
Follow-ups
- LLM classification of the other_firm bucket (H13 data requirement). With the regex residual already showing −1.83 pp / p = 0.010 on winner_error, the LLM pass should let us split other_firm into (shell-suspect / civic association / unknown) and see whether the slant concentrates in the shell sub-bucket.
- Cross-reference to the named 2024 Goiás case. Filter to IPOP + Alcateia CNPJs, confirm: (a) their 2024 polls are coded other_firm (FacUnicamps as contratante), (b) their poll-level winner_error magnitudes match the universe-wide −1.83.
- Cross-cycle re-run on 2020 polls. Replicate AN-082 on 2020 data: the prediction is pollster_self carries the anti-winner signature there (where IPOP self-contracted), and other_firm is null. A confirming cross-cycle pattern would be the cleanest evidence that the bucket meaning is shifting with enforcement, not the underlying mechanism.
- Race × month FE intermediate spec. Race × week drops 69 % of the sample. Race × month keeps more polls per cell and would clarify whether the rank-disagreement signal is robust to less aggressive timing absorption.
- Wire to the paper. Currently §Setting frames the Goiás case as motivating the shell-sponsor caveat. AN-082 turns it into a quantified rank-disagreement signal (other_firm −1.83 pp anti-winner, p = 0.010). Worth one paragraph in §Results alongside the headline +7 pp, with the cross-cycle channel-migration framing.
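
For the race × month follow-up, the retention gain can be previewed before running any regression by counting how many polls sit in cells with at least two polls under each cell definition. The sketch below is illustrative only: it assumes ISO calendar weeks (the script may define weeks differently, for example relative to election day), and the field names and toy dates are hypothetical.

```python
from collections import Counter
from datetime import date

def retained(polls, key):
    """Count polls that sit in a cell holding at least two polls."""
    sizes = Counter(key(p) for p in polls)
    return sum(1 for p in polls if sizes[key(p)] > 1)

def week_cell(p):
    iso = p["date"].isocalendar()
    return (p["race"], iso[0], iso[1])  # race, ISO year, ISO week

def month_cell(p):
    return (p["race"], p["date"].year, p["date"].month)

# Hypothetical poll dates for three races.
polls = [
    {"race": "r1", "date": date(2024, 9, 2)},
    {"race": "r1", "date": date(2024, 9, 4)},
    {"race": "r1", "date": date(2024, 9, 20)},
    {"race": "r2", "date": date(2024, 9, 10)},
    {"race": "r2", "date": date(2024, 9, 25)},
    {"race": "r3", "date": date(2024, 10, 1)},
]
kept_week = retained(polls, week_cell)
kept_month = retained(polls, month_cell)
print(kept_week, kept_month)
```

In the toy data only the two r1 polls from the same week survive race × week, while race × month keeps five of six. Running the same count on the real poll table would show how much of the 69 % loss the coarser cell recovers.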
Artifacts
- Script: source/analysis/an-082-accuracy-by-sponsor-bucket.py
- Spec-level coefficients: build/table/an-082-accuracy-by-sponsor-bucket.csv
- Raw cell means: build/table/an-082-accuracy-by-sponsor-bucket__cells.csv
- Headline JSON: build/table/an-082-accuracy-by-sponsor-bucket.json
Related
- H13 shell-contratante hypothesis
- AN-006 sponsor-route split
- AN-017 customer-mix refresh
- Paper §Setting: Goiás IPOP cross-cycle pattern (2020 pollster_self / 2024 FacUnicamps shell), paper/paper.tex lines 265–281, citing jornalopcao2024lista.