Headline survives the five red-team substitutions. K1 (media-only comparator) preserves the spec-3c β at +7.59 with n→253, though its SE widens to 4.30; K3 (Route B vice-prefeito) is falsified upstream (0/429 vice matches); K4 (drop_absorbed) restores β at +8.00 under race-FE-only, so within-candidate-FE selection is not generating β; K5 (drop Route D) raises the spec-3c β to +9.30. K2 (raw percent, no renormalization) attenuates the spec-2 β to +5.10: the within-(protocol × scenario_label) renormalization contributes ~3 pp of the headline magnitude, and the residual +5.10 on raw percent is the conservative β to cite alongside the +7.98 renormalized number.
Question
An adverse-referee review of the +7.94 spec-3c sponsor coefficient flagged five "potential killer" substitutions, each of which — according to the referee — could mechanically generate β > 0 without any real sponsor slant. This analysis runs all five on the same panel as a single consolidated robustness battery.
The five checks:
- K1 — Media-only comparator. Drop the 7,961 pollster_self polls (26% of all polls) on suspicion they under-report frontrunners, mechanically inflating β.
- K2 — Raw percent (no renormalization). Refit with `poll_percent_raw`; sponsored polls list fewer candidates, so the within-(protocol × scenario_label) denominator is smaller and every share inflates mechanically.
- K3 — Route B vice-prefeito audit. Route B is 429/641 = 67% of treated rows. Descriptive count of `committee_office == VICE-PREFEITO` matches to test whether vice-committees are tagged as sponsoring the prefeito on the same ticket.
- K4 — drop_absorbed audit. Count the candidates with no within-variation in `sponsored_by`; refit with race FE replacing candidate FE to estimate on the full panel.
- K5 — Route D party-regex audit. Inspect the 152 party-name route sponsors by hand; refit dropping Route D entirely.
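The K2 concern can be made concrete with a minimal sketch, assuming the renormalization rescales the listed non-aggregate candidates' percentages so they sum to 100 within one (protocol × scenario_label) row. The function name `renormalize` and the toy rosters are hypothetical and are not taken from the pipeline.

```python
def renormalize(roster):
    """Rescale raw percentages so the listed candidates sum to 100.

    `roster` maps candidate -> raw percent within one
    (protocol x scenario_label) row. Aggregates such as undecided or
    blank/null are assumed to have been excluded already.
    """
    denom = sum(roster.values())
    return {name: 100.0 * pct / denom for name, pct in roster.items()}


# Toy rosters (illustrative numbers only). Candidate A polls 40% raw in both.
full_roster = {"A": 40.0, "B": 30.0, "C": 15.0, "D": 5.0}  # raw sum = 90
short_roster = {"A": 40.0, "B": 30.0}                      # raw sum = 70

full = renormalize(full_roster)
short = renormalize(short_roster)

# Same raw share, different renormalized share: the shorter roster has the
# smaller denominator, so A's share is pushed up further.
print(round(full["A"], 2))   # 44.44
print(round(short["A"], 2))  # 57.14
```

If sponsored polls do list fewer candidates, this gap would load onto the sponsor coefficient, which is why K2 refits on the raw percent.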
Design
K1/K2/K4/K5 each refit specs 2 and 3c on a modified sample (or variable); K3 is descriptive (no regression). One consolidated long-format table (build/table/robustness_redteam.csv) collects β / SE / p / n across check × spec; K3 and K5 also emit audit-style console output. Specs identical to source/analysis/robustness.py: spec 2 = candidate FE + pollster FE + log_sample_size + days_to_election + days²; spec 3c = race × week FE on the clean comparator, no candidate FE; cluster-robust SE at muni throughout.
Results
Table: Red-team K-checks vs baseline (β, SE, n across spec 2 and spec 3c)
| Check | Spec 2 β (SE) | Spec 3c β (SE) | n (3c) | Verdict |
|---|---|---|---|---|
| Baseline | +7.98 (1.32)*** | +7.94 (2.68)** | 448 | — |
| K1 — media-only comparator (drop pollster_self) | +7.98 (1.32)*** | +7.59 (4.30) | 253 | β stable; SE inflates with smaller n |
| K2 — raw percent (no within-protocol renormalization) | +5.10 (0.92)*** | +5.30 (2.08)** | 448 | β attenuates ~30%; renorm contributes ~3 pp |
| K3 — Route B vice-prefeito audit | n/a | n/a | n/a | Falsified upstream: 0/429 Route B matches are VICE-PREFEITO |
| K4 — race FE + pollster FE (no candidate FE) | +8.00 (1.04)*** | n/a | n/a | Within-candidate-FE selection is not driving β |
| K5 — drop Route D treated rows (party-name regex) | +8.82 (1.39)*** | +9.30 (3.61)** | 315 | β rises; Route D names are real party organs |
(Cluster-robust SE at muni throughout. *** p<0.001, ** p<0.01.)
(from build/table/robustness_redteam.csv)
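The significance pattern in the table can be sanity-checked from β and SE alone. The sketch below uses a normal approximation, which is an assumption: the p-values reported by the analysis script may come from a t reference distribution with cluster-based degrees of freedom and would then be somewhat larger. The numbers are transcribed from the table; the helper name `two_sided_p` is hypothetical.

```python
import math

# (beta, se) for spec 3c, transcribed from the table above.
spec3c = {
    "baseline": (7.94, 2.68),
    "K1": (7.59, 4.30),
    "K2": (5.30, 2.08),
    "K5": (9.30, 3.61),
}


def two_sided_p(beta, se):
    """Two-sided p-value for H0: beta = 0 under a normal approximation."""
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2.0))


for check, (beta, se) in spec3c.items():
    print(f"{check}: z = {beta / se:.2f}, p = {two_sided_p(beta, se):.3f}")

# Gap between renormalized and raw-percent coefficients (K2), in pp.
k2 = {"spec 2": (7.98, 5.10), "spec 3c": (7.94, 5.30)}
for spec, (renorm_beta, raw_beta) in k2.items():
    print(f"K2 {spec}: renormalization adds {renorm_beta - raw_beta:.2f} pp")
```

Under this approximation the K1 spec-3c estimate is the only row that fails the 5% threshold, in line with the missing stars in that cell.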
Table: Per-check audit diagnostics
- K1. Dropping the 7,961 pollster_self polls leaves spec 2 unchanged at +7.98 (spec 2 uses all rows) and moves spec 3c from +7.94 to +7.59; SE inflates from 2.68 to 4.30 as the (race × week) cells with both sponsored and media-only independent polls drop from 448 to 253 rows.
- K2. Recomputing `error = poll_percent_raw − 100·final_share` gives β = +5.10 (spec 2) and +5.30 (spec 3c), a ~30% attenuation; renormalization adds ~3 pp because sponsored polls list fewer non-aggregate candidates, so the denominator is smaller.
- K3. Zero of the 429 Route B treated rows are tagged `committee_office == VICE-PREFEITO`; the upstream join in `pipelines/politica/source/clean/poll_sponsor.py` already restricts Route B to PREFEITO committees. Final-rank distribution of the 429 matches: 215 rank 1; 126 rank 2; 41 rank 3; 47 rank ≥4, the expected "frontrunners commission their own polls" pattern.
- K4. 8,205 of 8,431 candidates (97.3%) have no within-variation in `sponsored_by`; only 226 candidates contribute, on 1,311 of 31,186 rows (4.2%). Refitting with race FE replacing candidate FE (full panel) gives β = +8.00 (SE 1.04), within 0.02 pp of the candidate-FE estimate.
- K5. The 152 Route D treated rows trace to 41 unique sponsor strings, all unambiguously party organs (PSD, PL, PSB, PMN, PP, PT, Republicanos, PMDB, etc.), with no "PLANEJAMENTO"/"PROPAGANDA" false positives among the top-20. Dropping Route D (n=634→482 in spec 3c; treated drops 23.7%) raises β to +8.82 (spec 2) and +9.30 (spec 3c).
(from source/analysis/robustness_redteam.py)
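The K4 count can be reproduced with a few lines. This is a minimal sketch of a drop_absorbed audit, assuming one row per poll-candidate observation with a binary `sponsored_by` indicator; the function name `absorbed_audit` and the toy panel are hypothetical and not the script's actual code.

```python
from collections import defaultdict


def absorbed_audit(rows):
    """Split candidates by whether `sponsored_by` varies across their rows.

    `rows` is an iterable of (candidate_id, sponsored_by) pairs. A candidate
    whose rows are all 0 or all 1 is absorbed by the candidate fixed effect
    and cannot contribute to identifying beta.
    """
    values = defaultdict(set)
    counts = defaultdict(int)
    for cand, sponsored in rows:
        values[cand].add(sponsored)
        counts[cand] += 1
    varying = {c for c, v in values.items() if len(v) > 1}
    return {
        "n_candidates": len(values),
        "n_absorbed": len(values) - len(varying),
        "n_varying": len(varying),
        "rows_varying": sum(counts[c] for c in varying),
        "rows_total": sum(counts.values()),
    }


# Toy panel (hypothetical): only "c2" has both sponsored and unsponsored
# polls, so only its three rows identify beta under candidate FE.
toy = [("c1", 0), ("c1", 0), ("c2", 0), ("c2", 1), ("c2", 1), ("c3", 1)]
audit = absorbed_audit(toy)
print(audit)
```

On the real panel the same tally would correspond to the 226 varying candidates and 1,311 contributing rows quoted in the K4 bullet.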
Interpretation
Four of the five attacks are cleared (K1, K3, K4, K5); the fifth (K2) partially attenuates β but does not flip the sign or kill significance.
- Within-candidate FE is not selecting on a small subset. K4 race-FE-only β = +8.00 on the full sample lands within 0.02 pp of the candidate-FE estimate — the 97.3% absorbed share is not driving the result.
- Route audits are clean. Route B committee CNPJ matches (K3) are all PREFEITO by upstream construction; Route D party-name regex matches (K5) are all real party organs and dropping them strengthens β.
- Pollster-self comparator contamination is not the source. K1 drops 7,961 pollster_self polls and preserves the point estimate at +7.59, with SE inflation reflecting the smaller race-week cell count.
- Renormalization is doing real work. ~3 pp of the headline magnitude comes from the within-(protocol × scenario_label) step. The conservative raw-percent specification gives β = +5.10 (spec 2) to +5.30 (spec 3c), still significant in both specs but materially smaller.
The paper note should report both numbers — the renormalized β as the primary specification (comparable scale to final_share) and the raw β as the robustness number that excludes any artifact-of-renormalization concern.
Confidence rationale (green). Four of five referee attacks are cleared with the point estimate intact or strengthened; the fifth attenuates β by ~30% but the residual +5.10 remains highly significant and within-candidate-identified. The sign and order of magnitude of the headline survive every substitution, and significance survives everywhere except spec 3c under K1 (p = 0.082), where the smaller cell count widens the SE.
Follow-ups
- K2 mechanism: why does renormalization inflate sponsored polls more? (puzzle). The renormalization denominator depends on the set of non-aggregate candidates listed in the (protocol × scenario_label) row. Sponsored polls plausibly list fewer candidates (a sender-side design lever already in [[design_levers]] §rosters). Test: per protocol, count `n_non_aggregate_candidates` and the sum of `percent` over them (the implicit denominator). Compare distributions for sponsored vs independent polls. If sponsored polls have systematically smaller `Σ percent`, then K2's ~3 pp attenuation is mechanistically explained, and the renormalization-inflated β has Channel-A content (sender chose to list few candidates → mechanical share bump on the listed sponsor) on top of the raw +5 pp. Suggested script: `denominator_audit.py`.
- Drop_absorbed disclosure for the paper (extension). The 97.3% absorbed share is a striking but reportable fact: candidate-FE identification rests on 226 candidates / 1,311 rows of the treated-side comparator structure. The paper note should disclose this explicitly with a footnote citing AN-010 §K4. The race-FE-only β = +8.00 belongs in the robustness table alongside K1/K2/K5.
- K1 cell-count fragility (blind spot, low priority). Spec 3c under K1 retains the point estimate but the p-value crosses 0.05 (p = 0.082). The paper note should not lean on spec-3c K1 as strong evidence: the test is underpowered by the time pollster_self polls are dropped from the strict race-week comparator. Spec 3a (race-month FE) under K1 may recover power; worth a single-line addition.
- Comparable on independent panel: media-only β (extension). If pollster_self polls themselves carry bias (in either direction), K1's β = +7.59 is partially diagnostic, but the pure media-only independent baseline should also be computed as the mean error for media-only polls (no candidates self-sponsored). Compare to the all-Brazil +0.93 figure for "independent" polls; if the media-only mean error is materially closer to 0 than +0.93, then pollster_self polls are biased and K1 understates the true sponsor effect. Suggested script: a 10-line addition to `source/sponsor_baseline.py` (if it exists) or a one-off cell.
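The denominator audit proposed in the first follow-up could start from something like the sketch below. It is illustrative only: the row layout, the field names `is_aggregate` and `sponsored`, and the rule that a roster counts as sponsored when any listed row is flagged are all assumptions, and the toy data are invented.

```python
from collections import defaultdict
from statistics import mean


def denominator_audit(rows):
    """Summarize roster size and implicit denominator by sponsorship status.

    `rows` is an iterable of dicts with keys protocol, scenario_label,
    percent, is_aggregate (bool) and sponsored (bool). The last two names
    are hypothetical stand-ins for whatever the panel actually uses.
    """
    rosters = defaultdict(lambda: {"n": 0, "sum_pct": 0.0, "sponsored": False})
    for r in rows:
        if r["is_aggregate"]:
            continue  # undecided / blank / null stay out of the denominator
        roster = rosters[(r["protocol"], r["scenario_label"])]
        roster["n"] += 1
        roster["sum_pct"] += r["percent"]
        # Assumption: a roster is sponsored if any listed row is flagged.
        roster["sponsored"] = roster["sponsored"] or r["sponsored"]
    summary = {}
    for flag in (True, False):
        group = [v for v in rosters.values() if v["sponsored"] == flag]
        if group:
            summary[flag] = {
                "n_rosters": len(group),
                "mean_n_candidates": mean(v["n"] for v in group),
                "mean_sum_percent": mean(v["sum_pct"] for v in group),
            }
    return summary


def row(protocol, percent, sponsored=False, is_aggregate=False):
    """Build one toy row; every toy poll uses a single 'stimulated' scenario."""
    return {"protocol": protocol, "scenario_label": "stimulated",
            "percent": percent, "sponsored": sponsored,
            "is_aggregate": is_aggregate}


toy = [
    row("P1", 40.0), row("P1", 30.0), row("P1", 15.0),
    row("P1", 15.0, is_aggregate=True),
    row("P2", 45.0, sponsored=True), row("P2", 30.0),
    row("P2", 25.0, is_aggregate=True),
]
result = denominator_audit(toy)
print(result[True])   # sponsored rosters
print(result[False])  # independent rosters
```

A systematically smaller `mean_sum_percent` for sponsored rosters on the real panel would support the mechanical explanation; comparing full distributions rather than means would be the natural next step.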