Multiway (muni × pollster) clustering moves spec-2 SE 1.04 → 1.07; candidate-level clustering 1.04 → 1.14. Wild-cluster restricted bootstrap p-values are < 1/2000 for both spec 2 and spec 3c (the small-cluster spec where it matters most). Inference is robust to all three sensitivities.
Question
Two concerns flagged by GPT-5-pro's pre-submission review (2026-06-14):
- Multiway clustering. Spec 2 is estimated with cluster-robust SE at the muni level. With pollster effects partly correlated within muni, two-way clustering by muni × pollster is the right sensitivity. A candidate-level cluster is also worth checking.
- Finite-sample inference for Spec 3c. Spec 3c identifies β off 60 muni-week cells (28 in the media-only sensitivity AN-063). The asymptotic cluster-robust p-value may be unreliable in this regime. Wild-cluster restricted (WCR) bootstrap p-values are the standard finite-sample remedy.
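For readers unfamiliar with two-way clustering, the following is a minimal sketch of the inclusion-exclusion variance estimator (cluster on each dimension, subtract the intersection). It is not the code used for this note: the function names `cluster_meat` and `twoway_cluster_se` are hypothetical, it assumes plain OLS with a dense design matrix, and it applies no small-sample correction.

```python
import numpy as np

def cluster_meat(X, resid, groups):
    """Sum over clusters g of (X_g' u_g)(X_g' u_g)'."""
    scores = X * resid[:, None]                      # per-row score contributions, n x k
    _, inverse = np.unique(groups, return_inverse=True)
    inverse = inverse.ravel()
    sums = np.zeros((inverse.max() + 1, X.shape[1]))
    np.add.at(sums, inverse, scores)                 # cluster-level score sums
    return sums.T @ sums

def twoway_cluster_se(X, y, g1, g2):
    """OLS coefficients with two-way cluster-robust SE (no finite-sample correction)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Intersection clusters: one id per distinct (g1, g2) pair.
    pair = np.array([f"{a}|{b}" for a, b in zip(g1, g2)])
    meat = (cluster_meat(X, resid, g1)
            + cluster_meat(X, resid, g2)
            - cluster_meat(X, resid, pair))
    V = XtX_inv @ meat @ XtX_inv
    # Assumption: V has a positive diagonal; the two-way estimator is not
    # guaranteed to be positive semi-definite in small samples.
    return beta, np.sqrt(np.diag(V))
```

When both cluster vectors define the same partition, the three terms collapse to the one-way estimator, which is a convenient sanity check.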
Design
All three sensitivities run on the same samples as the headline specs:
| Sensitivity | Sample | Method |
|---|---|---|
| Spec 2 baseline | 22,665 rows / 2,669 munis | Cluster on muni |
| Spec 2 multiway | 22,665 rows | Cluster on (muni, institute) |
| Spec 2 candidate-cluster | 22,665 rows | cluster_entity=True (politico_id) |
| Spec 2 WCR p-value | 22,665 rows | FWL-residualize → Rademacher bootstrap at muni level, B = 2000 |
| Spec 3c WCR p-value | 288 rows / 60 cells | Same FWL bootstrap |
The WCR implementation in regressions.py:wcr_p_value generalizes the
AN-031 route-bootstrap scaffold to arbitrary panel models. It
residualizes both the target regressor (sponsored_by) and the dependent
variable (error) against the other controls + FE, then applies cluster-level
Rademacher draws under the H0: β = 0 restriction.
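The procedure described above can be sketched as follows. This is an illustrative simplification, not the contents of regressions.py: the names `cluster_t_stat` and `wcr_p_value_sketch` are hypothetical, the inputs are assumed to be already FWL-residualized, and the sign-flipped outcome is not re-projected on the controls within each draw.

```python
import numpy as np

def cluster_t_stat(x, y, cluster_idx, n_clusters):
    """t-statistic for the slope of y on x (no intercept), cluster-robust SE."""
    xx = x @ x
    beta = (x @ y) / xx
    resid = y - beta * x
    # Sum the scores x_i * u_i within each cluster.
    scores = np.bincount(cluster_idx, weights=x * resid, minlength=n_clusters)
    se = np.sqrt(scores @ scores) / xx
    return beta / se

def wcr_p_value_sketch(x_tilde, y_tilde, clusters, B=2000, seed=0):
    """Wild-cluster restricted bootstrap p-value for H0: beta = 0."""
    rng = np.random.default_rng(seed)
    _, idx = np.unique(clusters, return_inverse=True)
    idx = idx.ravel()
    G = idx.max() + 1
    t_obs = cluster_t_stat(x_tilde, y_tilde, idx, G)
    exceed = 0
    for _ in range(B):
        w = rng.choice([-1.0, 1.0], size=G)      # one Rademacher weight per cluster
        # Under H0 the restricted fit is zero, so the restricted residual
        # is y_tilde itself; flip its sign cluster by cluster.
        y_star = w[idx] * y_tilde
        t_star = cluster_t_stat(x_tilde, y_star, idx, G)
        exceed += int(abs(t_star) >= abs(t_obs))
    return exceed / B, t_obs
```

A result of 0 exceedances out of B draws is reported as p < 1/B rather than p = 0.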
Results
| Spec | β | SE | p (CRVE) | p (WCR, B=2000) |
|---|---|---|---|---|
| Spec 2 (muni cluster) | +6.86 | 1.04 | <0.001 | — |
| Spec 2 (muni × institute) | +6.86 | 1.07 | <0.001 | — |
| Spec 2 (politico_id cluster) | +6.86 | 1.14 | <0.001 | — |
| Spec 2 (WCR) | — | — | — | <0.0005 (0/2000) |
| Spec 3c (muni cluster) | +8.05 | 2.60 | 0.003 | — |
| Spec 3c (WCR) | — | — | — | <0.0005 (0/2000) |
Both Spec 2 sensitivities move the SE by less than 10%, and the asymptotic p stays below 0.001. The WCR bootstrap p-values for both specs are below 1/B (the smallest nonzero value the bootstrap can return at this resolution); none of the 2,000 Rademacher draws produces a t-statistic at or above the observed magnitude.
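The arithmetic behind these two statements can be checked directly from the table values. The variable names below are illustrative only.

```python
# SEs from the Results table (Spec 2) and the number of bootstrap draws.
se_base, se_twoway, se_candidate = 1.04, 1.07, 1.14
B = 2000

rel_twoway = se_twoway / se_base - 1        # relative SE change, muni x institute
rel_candidate = se_candidate / se_base - 1  # relative SE change, politico_id cluster
p_floor = 1 / B                             # smallest nonzero bootstrap p-value

print(f"{rel_twoway:.1%} {rel_candidate:.1%} {p_floor}")
# prints: 2.9% 9.6% 0.0005
```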
Interpretation
- Multiway clustering is not load-bearing. The 1.04 → 1.07 change under muni × institute is small; the (politico_id) cluster gives a slightly wider SE but the magnitude of the change (1.04 → 1.14) does not affect the inferential conclusion.
- Finite-sample inference confirms Spec 3c. The WCR p-value (<0.0005) is smaller than the asymptotic CRVE p (0.003), so the bootstrap rejects at least as strongly as the asymptotic test. The 60 muni-week cells were the natural worry; this bootstrap addresses it directly.
The combined evidence supports reporting Spec 2 with muni-clustered SE and the WCR p-value in the footnote on Table 1 of the paper.
Follow-ups
- With B = 2000, the smallest nonzero p-value the bootstrap can return is 0.0005. If exact small-p resolution becomes important for revision rounds, increase B (cost: linear in B).
- A score-bootstrap variant (Kline-Santos) would also be appropriate; not added because the CRVE asymptotic and the WCR bootstrap agree, so the marginal information is small.