AN-002: Does β survive race × week FE with an independent-only comparator?

Restricting comparators to media-or-pollster-self polls in the same race within the same week (Spec 3c) gives β = +6.95 pp (p=0.008) on 60 race-week cells. This is the tightest design and the most robust to the timing-of-commission alternative.

Hypothesis: H1: Self-sponsored polls overstate the sponsoring candidate
Confidence: green
Type: robustness

Design

Sample: estimulado-non-aggregate-match2
Specification: error ~ sponsored_by | candidate FE + race×week FE, cluster-robust SE at muni level; sample restricted to self-sponsored OR independent-sponsored polls, where 'independent' = media-or-pollster-self (poll_is_independent flag in cand_poll.parquet)
Comparator: media_or_pollster_self
Time window: race_week
Cluster: muni
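
As a reading aid, the specification above can be written out as a single equation. This is a sketch under assumed notation, not a transcription of regressions.py: the indices p (poll), c (candidate), r(p) (race of poll p) and w(p) (week of poll p), and the symbols α, γ and ε, are introduced here for illustration only.

```latex
\mathrm{error}_{pc}
  = \beta \,\mathrm{sponsored\_by}_{pc}
  + \alpha_{c}
  + \gamma_{r(p),\,w(p)}
  + \varepsilon_{pc}
```

Here α_c is the candidate fixed effect and γ is the race × week fixed effect; standard errors are clustered at the muni level, and the sample is restricted to self-sponsored or independent-sponsored polls.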

Script: source/analysis/regressions.py
Target: build/table/regressions.csv
Commit: 2548d50
Status: interpreted · 2026-06-02
Created: 2026-06-02

Comparator definition. Spec 3c's "clean comparator" pool is polls whose sponsor_types are a subset of {media, pollster_self} (poll_is_independent==1 in build/assemble/cand_poll.parquet, set in source/assemble/poll.py:65,110). Pollster-self polls — where the sponsor CNPJ equals the pollster CNPJ — are included because the pollster is fielding without a paying client and so has no sponsor-side bias incentive. See AN-002b for a media-only sensitivity that excludes pollster-self.
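
The subset rule above can be illustrated with a few lines of Python. This is a hypothetical restatement, not the code in source/assemble/poll.py: the function name, the "candidate" sponsor type, and the treatment of polls with no recorded sponsor are all assumptions made for the example.

```python
# Hypothetical restatement of the poll_is_independent rule; the real logic
# lives in source/assemble/poll.py and may differ in detail.
INDEPENDENT_TYPES = frozenset({"media", "pollster_self"})


def is_independent(sponsor_types):
    """Return 1 if every sponsor type is media or pollster_self, else 0."""
    types = set(sponsor_types)
    # Assumption: a poll with no recorded sponsor is NOT counted as independent.
    return int(bool(types) and types <= INDEPENDENT_TYPES)


# Illustrative sponsor lists; "candidate" is an assumed label for a campaign sponsor.
examples = {
    "media only": ["media"],
    "media + pollster_self": ["media", "pollster_self"],
    "media + candidate": ["media", "candidate"],
    "no sponsor recorded": [],
}
flags = {label: is_independent(types) for label, types in examples.items()}
print(flags)
```

A poll co-sponsored by media and a campaign fails the subset test, which is what keeps mixed-sponsor polls out of the comparator pool.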

Results

Table: Timing-controlled specs on the clean-comparator sample

All three specs restrict to rows where the poll is either self-sponsored OR sponsored exclusively by independent media / pollster-self.

Spec	β (sponsored_by, pp)	SE (pp)	p	Sample
Spec 3a (clean + candidate FE)	+6.32	1.46	<0.001	21,453 rows
Spec 3b (clean + race × month FE)	+7.77	1.65	<0.001	21,453 rows
Spec 3c (clean + race × week FE)	+6.95	2.57	0.008	409 rows / 60 cells

(from source/analysis/regressions.py → build/table/regressions.csv)
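
As a quick arithmetic check on the table, the t-statistics and two-sided p-values can be recomputed from β and SE. The sketch below uses a standard normal reference distribution, which is an assumption: the note does not say which reference distribution or degrees of freedom regressions.py uses.

```python
import math

# (beta, SE) in percentage points, copied from the table above.
specs = {"3a": (6.32, 1.46), "3b": (7.77, 1.65), "3c": (6.95, 2.57)}


def t_and_normal_p(beta, se):
    """Return the t-statistic and a two-sided p-value under a normal reference."""
    t = beta / se
    # P(|Z| > |t|) for a standard normal Z equals erfc(|t| / sqrt(2)).
    p = math.erfc(abs(t) / math.sqrt(2))
    return t, p


results = {name: t_and_normal_p(beta, se) for name, (beta, se) in specs.items()}
for name, (t, p) in results.items():
    print(f"Spec {name}: t = {t:.2f}, normal-approx p = {p:.4f}")
```

For Spec 3c this gives t of about 2.70 and a normal-approximation p of roughly 0.007, slightly below the reported 0.008. A gap of that size is what a finite-degrees-of-freedom t reference would produce, though that explanation is a guess rather than something the table states.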

Spec 3c identifies β off the 60 (race × week) cells in which the data contain BOTH a self-sponsored poll AND an independent poll fielded in the same week. The within-cell comparison strips out any race-week trend (campaign-phase momentum, news-cycle effects) by construction.
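
The cell logic can be sketched on toy data. Everything below is invented for illustration (rows, race names, error values), and the simple average of within-cell gaps is a stand-in for intuition only: the actual Spec 3c estimate also absorbs candidate FE and comes from the regression in regressions.py.

```python
from collections import defaultdict
from statistics import mean

# Toy rows, invented for illustration: (race, week, poll_type, error_pp).
rows = [
    ("race_A", 1, "self", 8.0),
    ("race_A", 1, "independent", 1.0),
    ("race_A", 2, "self", 5.0),         # no independent poll in this cell
    ("race_B", 1, "independent", 2.0),  # no self-sponsored poll in this cell
    ("race_B", 3, "self", 6.0),
    ("race_B", 3, "independent", 0.0),
    ("race_B", 3, "independent", 2.0),
]

# Group errors by (race, week) cell and by poll type.
cells = defaultdict(lambda: {"self": [], "independent": []})
for race, week, poll_type, error in rows:
    cells[(race, week)][poll_type].append(error)

# A cell contributes to identification only if both poll types are present.
identifying = {k: v for k, v in cells.items() if v["self"] and v["independent"]}

# Within-cell gap: mean self-sponsored error minus mean independent error.
gaps = {k: mean(v["self"]) - mean(v["independent"]) for k, v in identifying.items()}
avg_gap = mean(list(gaps.values()))
print(sorted(identifying), gaps, avg_gap)
```

Only two of the four toy cells contain both poll types, mirroring how the full sample shrinks to the 60 identifying cells. Any shock common to a race-week cancels when the two means are differenced within the cell.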

Interpretation

Strictest design, same magnitude. Spec 3c is the tightest design the project supports given the data, and it gives β ≈ +7 pp — the same magnitude as the within-candidate FE alone (AN-001).
Non-overlapping identifying variation. The two specs share no identifying variation: AN-001 identifies off within-candidate cross-poll variation; Spec 3c identifies off within-(race × week) cross-poll variation. Both converging on +7 pp is the load-bearing robustness fact: the headline isn't an artifact of either identification structure.
Timing-of-commission alternative ruled out. The "candidate commissions when leading" concern cannot mechanically generate a gap of this size. To produce a 7 pp gap in Spec 3c, the candidate would need to commission a poll within 4-7 days of an independent poll of the same race AND have their private belief about leading shift within that same 7-day window, which is implausible at the magnitudes observed.
No race-month-level confound. The Spec 3a → 3b → 3c progression also confirms there's no race-month-level confound: 3a (no race FE) = +6.3, 3b (race × month FE) = +7.8, 3c (race × week FE) = +7.0. Tightening the time window doesn't move β meaningfully.
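
One way to make "doesn't move β meaningfully" concrete is to compare approximate 95% intervals across the three specs. This is a descriptive sketch only: the 1.96 critical value is a normal approximation (an assumption for cluster-robust SEs), and the three specs are estimated on overlapping samples, so interval overlap is not a formal equality test.

```python
# (beta, SE) in percentage points, copied from the results table.
specs = {"3a": (6.32, 1.46), "3b": (7.77, 1.65), "3c": (6.95, 2.57)}
Z = 1.96  # normal critical value; an approximation, not taken from the script

# Approximate 95% interval for each spec.
intervals = {name: (b - Z * se, b + Z * se) for name, (b, se) in specs.items()}

# Range shared by all three intervals (empty if common_low >= common_high).
common_low = max(low for low, _ in intervals.values())
common_high = min(high for _, high in intervals.values())

for name, (low, high) in intervals.items():
    print(f"Spec {name}: [{low:.2f}, {high:.2f}]")
print(f"Shared range: [{common_low:.2f}, {common_high:.2f}]")
```

All three point estimates fall inside the range shared by the three intervals, which fits the reading that tightening the time window leaves β essentially unchanged.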

Confidence rationale (green). Three clean-comparator specs converge on β ≈ +7 pp, with the tightest (race × week FE) significant at p=0.008 despite identifying off only 60 cells. The convergence with AN-001's within-candidate FE, which uses non-overlapping identifying variation, makes the headline robust to the leading alternative explanation (timing-of-commission). Remaining uncertainty is about the comparator definition, not the magnitude.

Follow-ups

60 race-week cells is still thin. The 2022 cycle extension (queued in docs/todo.md) would add governor-race cells and let us run Spec 3c with much more power.
Spec 3c implicitly assumes that polls sponsored by independent media or by the pollster itself really are independent of candidate strategy. The candidate-classifier LLM refinement (queued) would let us tighten this comparator.