Restricting comparators to media-or-pollster-self polls in the same race within the same week (Spec 3c) gives β = +6.95 pp (p=0.008) on 60 race-week cells. This is the tightest design and the one most robust to the timing-of-commission alternative.
Comparator definition. Spec 3c's "clean comparator" pool is polls whose `sponsor_types` are a subset of `{media, pollster_self}` (`poll_is_independent == 1` in `build/assemble/cand_poll.parquet`, set in `source/assemble/poll.py:65,110`). Pollster-self polls, where the sponsor CNPJ equals the pollster CNPJ, are included because the pollster is fielding without a paying client and so has no sponsor-side bias incentive. See AN-002b for a media-only sensitivity that excludes pollster-self.
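As a rough illustration of the subset rule, the flag could be computed along these lines. This is a minimal sketch, not the code in `source/assemble/poll.py`: the function signature, the treatment of an empty sponsor list, and the `"candidate"` label in the usage note are assumptions made for the example.

```python
# Hypothetical sketch of the clean-comparator rule described above.
# Only the two sponsor types and the flag name come from the note.
CLEAN_SPONSOR_TYPES = frozenset({"media", "pollster_self"})


def poll_is_independent(sponsor_types):
    """Return 1 if every sponsor type is media or pollster_self, else 0."""
    types = set(sponsor_types)
    # Assumption: a poll with no recorded sponsor type is not counted as clean.
    return int(bool(types) and types <= CLEAN_SPONSOR_TYPES)
```

Under these assumptions, a poll with sponsors `["media", "pollster_self"]` is flagged 1, while `["media", "candidate"]` is flagged 0 because one sponsor type falls outside the clean set.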
Results
Table: Timing-controlled specs on the clean-comparator sample
All three specs restrict to rows where the poll is either self-sponsored OR sponsored exclusively by independent media / pollster-self.
| Spec | β (sponsored_by) | SE | p | sample |
|---|---|---|---|---|
| Spec 3a (clean + candidate FE) | +6.32 | 1.46 | <0.001 | 21,453 rows |
| Spec 3b (clean + race × month FE) | +7.77 | 1.65 | <0.001 | 21,453 rows |
| Spec 3c (clean + race × week FE) | +6.95 | 2.57 | 0.008 | 409 rows / 60 cells |
(from source/analysis/regressions.py → build/table/regressions.csv)
Spec 3c identifies β off 60 (race × week) cells where the data contains BOTH a self-sponsored poll AND an independent poll fielded in the same week. The within-cell comparison strips any race-week trend (campaign-phase momentum, news cycle effect) by construction.
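The within-cell logic can be sketched as follows, assuming one row per poll observation with hypothetical keys `race`, `week`, and `share` alongside `sponsored_by`. It is a bare within estimator with no controls or standard errors, so it illustrates the identification idea only and is not the regression in `source/analysis/regressions.py`.

```python
from collections import defaultdict


def identifying_cells(rows):
    """Group rows by (race, week) and keep cells holding both poll types."""
    cells = defaultdict(list)
    for row in rows:
        cells[(row["race"], row["week"])].append(row)
    # A cell identifies beta only if sponsored_by takes both values in it.
    return {
        key: cell
        for key, cell in cells.items()
        if {r["sponsored_by"] for r in cell} == {0, 1}
    }


def within_cell_beta(rows):
    """Slope of share on sponsored_by after demeaning both within each cell."""
    num = 0.0
    den = 0.0
    for cell in identifying_cells(rows).values():
        n = len(cell)
        mean_d = sum(r["sponsored_by"] for r in cell) / n
        mean_y = sum(r["share"] for r in cell) / n
        for r in cell:
            d = r["sponsored_by"] - mean_d
            num += d * (r["share"] - mean_y)
            den += d * d
    return num / den
```

Cells with only one poll type contribute nothing to either sum, which is why the estimate rests on the 60 mixed cells rather than on the full clean-comparator sample.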
Interpretation
- Strictest design, same magnitude. Spec 3c is the tightest design the project supports given the data, and it gives β ≈ +7 pp — the same magnitude as the within-candidate FE alone (AN-001).
- Non-overlapping identifying variation. The two specs share no identifying variation: AN-001 identifies off within-candidate cross-poll variation; Spec 3c identifies off within-(race × week) cross-poll variation. Both converging on +7 pp is the load-bearing robustness fact: the headline isn't an artifact of either identification structure.
- Timing-of-commission alternative ruled out. The "candidate commissions when leading" concern cannot mechanically generate the Spec 3c estimate. To produce a 7 pp jump in Spec 3c, the candidate would need to commission a poll within 4-7 days of an independent poll of the same race AND have their private-belief-about-leading correlate with that 7-day window, which is implausible at the magnitudes observed.
- No race-month-level confound. The Spec 3a → 3b → 3c progression also confirms there's no race-month-level confound: 3a (candidate FE, no race FE) = +6.3, 3b (race × month FE) = +7.8, 3c (race × week FE) = +7.0. Tightening the time window doesn't move β meaningfully.
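One quick way to see how close the three estimates are is to compare rough 95% intervals built from the β and SE columns of the table. The sketch below uses a normal approximation, which is an assumption: the actual inference in `source/analysis/regressions.py` may use clustered errors or a t reference distribution, and overlapping intervals are not a formal test of equality.

```python
from statistics import NormalDist

# (beta, SE) in percentage points, copied from the results table above.
SPECS = {"3a": (6.32, 1.46), "3b": (7.77, 1.65), "3c": (6.95, 2.57)}


def normal_ci(beta, se, level=0.95):
    """Two-sided interval under a normal approximation (an assumption)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return beta - z * se, beta + z * se


intervals = {name: normal_ci(b, se) for name, (b, se) in SPECS.items()}

# Range shared by all three intervals; it exists only if low < high.
overlap_low = max(lo for lo, _ in intervals.values())
overlap_high = min(hi for _, hi in intervals.values())
```

Under this approximation the three intervals share a common range of roughly 4.5 to 9.2 pp, consistent with the reading that tightening the time window leaves β about where it was.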
Confidence rationale (green). Three timing-controlled specs converge on β ≈ +7 pp, with the tightest (race × week FE) significant at p=0.008 despite identifying off only 60 cells. The convergence with AN-001's within-candidate FE — which uses non-overlapping identifying variation — makes the headline robust to the leading alternative explanation (timing-of-commission). Remaining uncertainty is about the comparator definition, not the magnitude.
Follow-ups
- 60 race-week cells is still thin. The 2022 cycle extension (queued in docs/todo.md) would add governor-race cells and let us run Spec 3c with much more power.
- Spec 3c implicitly assumes "independent media + pollster-self sponsored" really are independent of candidate strategy. The candidate-classifier LLM refinement (queued) would let us tighten this comparator.