AN-064: Inference sensitivities for spec 2 and spec 3c

Two-way (muni × pollster) clustering moves the Spec 2 SE from 1.04 to 1.07; candidate-level clustering moves it from 1.04 to 1.14. Wild-cluster restricted (WCR) bootstrap p-values are below 1/2000 for both Spec 2 and Spec 3c (the small-cluster spec, where it matters most). Inference is robust to all three sensitivities.

Hypothesis: H1: Self-sponsored polls overstate the sponsoring candidate
Confidence: green
Type: robustness

Design

Sample: estimulado-non-aggregate-match2
Specification: Spec 2: error ~ sponsored_by + opponent_sponsored + log_sample_size + days_to_election + days_to_election_sq | candidate FE + pollster FE; cluster sensitivities below. Spec 3c: error ~ sponsored_by | candidate FE + race × week FE; WCR bootstrap at muni level.
Cluster: ['muni', 'muni × institute', 'politico_id']
Notes: WCR uses FWL residualization (AN-031 scaffold generalized) + Rademacher draws at the muni level, B = 2000.

Script: source/analysis/regressions.py
Target: build/table/regressions.csv
Status: interpreted · 2026-06-14
Created: 2026-06-14

Question

Two concerns flagged by GPT-5-pro's pre-submission review (2026-06-14):

Multiway clustering. Spec 2 is estimated with cluster-robust SE at the muni level. With pollster effects partly correlated within muni, two-way clustering on muni and pollster (the institute variable in the tables below) is the right sensitivity. A candidate-level cluster is also worth checking.
Finite-sample inference for Spec 3c. Spec 3c identifies β off 60 muni-week cells (28 in the media-only sensitivity AN-063). The asymptotic cluster-robust p-value may be unreliable in this regime. Wild-cluster restricted (WCR) bootstrap p-values are the standard finite-sample remedy.

Design

All three sensitivities run on the same samples as the headline specs:

Sensitivity	Sample	Method
Spec 2 baseline	22,665 rows / 2,669 munis	Cluster on muni
Spec 2 multiway	22,665 rows	Cluster on (muni, institute)
Spec 2 candidate-cluster	22,665 rows	`cluster_entity=True` (politico_id)
Spec 2 WCR p-value	22,665 rows	FWL-residualize → Rademacher bootstrap at muni level, B = 2000
Spec 3c WCR p-value	288 rows / 60 cells	Same FWL bootstrap
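
One common reading of the multiway row is the two-way estimator that adds the muni-clustered and institute-clustered variance "meats" and subtracts the one for their intersection. The sketch below shows that construction with plain NumPy, under stated assumptions: the function names are hypothetical (not taken from regressions.py), no small-sample correction is applied, and fixed effects are assumed to be already absorbed into X. Whether the script uses exactly this estimator is not stated in this note.

```python
import numpy as np

def cluster_meat(X, resid, groups):
    """Sum over clusters g of (X_g' u_g)(X_g' u_g)'."""
    scores = X * resid[:, None]                  # n x k score contributions
    _, inv = np.unique(groups, return_inverse=True)
    inv = inv.ravel()
    S = np.zeros((inv.max() + 1, X.shape[1]))
    np.add.at(S, inv, scores)                    # sum scores within each cluster
    return S.T @ S

def twoway_cluster_se(X, y, g1, g2):
    """OLS coefficients with two-way cluster-robust SEs (no small-sample correction)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ (X.T @ y)
    resid = y - X @ beta
    # Intersection clusters: one label per distinct (g1, g2) pair.
    g12 = np.array([f"{a}|{b}" for a, b in zip(g1, g2)])
    meat = (cluster_meat(X, resid, g1)
            + cluster_meat(X, resid, g2)
            - cluster_meat(X, resid, g12))
    V = XtX_inv @ meat @ XtX_inv
    # V is not guaranteed positive semi-definite; a negative diagonal yields NaN.
    return beta, np.sqrt(np.diag(V))

if __name__ == "__main__":
    # Synthetic illustration only; these are not the poll data.
    rng = np.random.default_rng(0)
    n = 500
    muni = rng.integers(0, 50, size=n)
    institute = rng.integers(0, 8, size=n)
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = X @ np.array([0.5, 2.0]) + rng.normal(size=n)
    print(twoway_cluster_se(X, y, muni, institute))
```

When the two groupings coincide, the three terms collapse to the ordinary one-way clustered variance, which is a convenient sanity check.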

The WCR implementation in regressions.py:wcr_p_value generalizes the AN-031 route-bootstrap scaffold to arbitrary panel models. It residualizes both the target regressor (sponsored_by) and the dependent variable (error) against the other controls + FE, then applies cluster-level Rademacher draws with the null H0: β = 0 imposed.
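
The procedure described above can be sketched roughly as follows. This is a minimal illustration, not the code in regressions.py: the name wcr_p_value_sketch, its signature, and the dense control matrix Z (controls plus FE dummies, full column rank, including an intercept) are assumptions. The sketch re-residualizes each bootstrap outcome against Z so that the bootstrap t-statistic matches the one from the full regression; whether the script does the same is not stated here. Small-sample scale factors are omitted because they multiply the observed and bootstrap t-statistics equally.

```python
import numpy as np

def residualize(v, Z):
    """FWL step: residual of v after least-squares projection on Z."""
    coef, *_ = np.linalg.lstsq(Z, v, rcond=None)
    return v - Z @ coef

def cluster_t(x, y, inv, G):
    """Cluster-robust t-statistic for the slope of y on one residualized regressor x."""
    sxx = x @ x
    beta = (x @ y) / sxx
    u = y - beta * x
    score = np.bincount(inv, weights=x * u, minlength=G)   # per-cluster score sums
    se = np.sqrt(score @ score) / sxx
    return beta / se

def wcr_p_value_sketch(y, d, Z, clusters, B=2000, seed=0):
    """Two-sided wild-cluster restricted bootstrap p-value for H0: beta_d = 0."""
    rng = np.random.default_rng(seed)
    _, inv = np.unique(clusters, return_inverse=True)
    inv = inv.ravel()
    G = inv.max() + 1
    y_t, d_t = residualize(y, Z), residualize(d, Z)
    t_obs = cluster_t(d_t, y_t, inv, G)
    # With beta = 0 imposed, the restricted model is y on Z alone,
    # so the restricted residual is the residualized outcome itself.
    u_r = y_t
    exceed = 0
    for _ in range(B):
        w = rng.choice([-1.0, 1.0], size=G)        # one Rademacher sign per cluster
        y_star = residualize(w[inv] * u_r, Z)      # restricted fit lies in span(Z) and drops out
        t_star = cluster_t(d_t, y_star, inv, G)
        exceed += int(abs(t_star) >= abs(t_obs))
    return exceed / B, t_obs
```

A return value of 0.0 means no draw reached the observed magnitude, which is what the results table reports as < 1/B.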

Results

Spec	β	SE	p (CRVE)	p (WCR, B=2000)
Spec 2 (muni cluster)	+6.86	1.04	<0.001	—
Spec 2 (muni × institute)	+6.86	1.07	<0.001	—
Spec 2 (politico_id cluster)	+6.86	1.14	<0.001	—
Spec 2 (WCR)	—	—	—	<0.0005 (0/2000)
Spec 3c (muni cluster)	+8.05	2.60	0.003	—
Spec 3c (WCR)	—	—	—	<0.0005 (0/2000)

Both Spec 2 sensitivities move the SE by less than 10%, and the asymptotic p stays below 0.001. The WCR bootstrap p-values for both specs are below 1/B (the smallest nonzero value the bootstrap can return at this resolution); none of the 2,000 Rademacher draws produces a t-statistic at or above the observed magnitude.
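
The arithmetic behind these statements can be checked directly from the table values. The helper names below are hypothetical and exist only for this check.

```python
def rel_change(se, baseline):
    """Relative change of an SE against the baseline SE."""
    return (se - baseline) / baseline

def p_floor(B):
    """Smallest nonzero p-value a bootstrap with B draws can return."""
    return 1 / B

baseline_se = 1.04                      # Spec 2, muni cluster
alternatives = {"muni x institute": 1.07, "politico_id": 1.14}

for name, se in alternatives.items():
    print(f"{name}: SE change {rel_change(se, baseline_se):.1%}, t = {6.86 / se:.2f}")
# muni x institute: SE change 2.9%, t = 6.41
# politico_id: SE change 9.6%, t = 6.02

print(p_floor(2000))                    # 0.0005
```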

Interpretation

Multiway clustering is not load-bearing. The 1.04 → 1.07 change under muni × institute is small; the (politico_id) cluster gives a slightly wider SE but the magnitude of the change (1.04 → 1.14) does not affect the inferential conclusion.
Finite-sample inference confirms Spec 3c. The WCR p-value (< 0.0005) is smaller than the asymptotic CRVE p (0.003), so the bootstrap rejects at least as strongly as the asymptotic test. The 60 muni-week cells were the natural worry; the bootstrap addresses it directly.

The combined evidence supports reporting Spec 2 with muni-clustered SE and the WCR p-value in the footnote on Table 1 of the paper.

Follow-ups

With B = 2000, the smallest nonzero p-value the bootstrap can return is 1/B = 0.0005. If finer small-p resolution becomes important for revision rounds, increase B (cost: linear in B).
A score-bootstrap variant (Davidson-MacKinnon) would also be appropriate; not added because the CRVE asymptotic and the WCR bootstrap agree, so the marginal information is small.