AN-067: Scenario-pick robustness

Structural bound on scenario-pick effect: 93.2% of mayoral protocols have only one estimulado scenario, so the canonical-pick choice is degenerate for them. Among the 6.8% multi-scenario subset (616 protocols, max 11 scenarios), the median within-candidate spread across scenarios is 3.15 pp. The aggregate scenario-pick effect on β is therefore bounded at roughly 0.2 pp (0.068 × 3.15 ≈ 0.21), well below the +7 pp headline.

Hypothesis: H1: Self-sponsored polls overstate the sponsoring candidate
Confidence: green
Type: robustness

Design

Sample: 2024 mayoral estimulado-only polls (poll_response_2024.parquet upstream of cand_poll.py's canonical-scenario filter).
Specification: Descriptive structural bound, not a regression rerun. Two facts: (i) share of protocols with >1 estimulado scenario; (ii) within-candidate vote-percent spread across scenarios on the multi-scenario subset.
Notes: Espontânea-only as a placebo (GPT's third request) would require rebuilding the full candidate-poll panel from poll_2024 with the espontânea filter; deferred as pipeline work.

Script: source/analysis/an-067-scenario-robustness.py
Target: build/table/an-067-scenario-robustness.csv
Status: interpreted · 2026-06-14
Created: 2026-06-14

Question

GPT-5-pro's 2026-06-14 pre-submission review asked for robustness on the choice of cued-recall (estimulado) scenario. cand_poll.py picks the canonical scenario per protocol as "most distinct candidates, tie-break alphabetical". GPT asked for:

(a) first listed estimulado scenario
(b) average across estimulado scenarios
(c) espontânea-only as a placebo

The cleanest defense is structural: how often is there even a choice?

Design

Direct counts on the upstream poll_response_2024.parquet (before cand_poll.py's canonical-scenario filter), restricted to estimulado:

(i) How many protocols have one vs multiple estimulado scenarios?
(ii) For the multi-scenario subset, how much does the per-candidate mean poll-percent shift across scenarios within the same protocol?
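
The two counts above can be sketched in plain Python. This is a minimal illustration on toy rows, not the contents of an-067-scenario-robustness.py. The field names (`protocol`, `scenario`, `candidate`, `pct`) are assumptions about the parquet schema, and the spread definition used here (max minus min of a candidate's percent across scenarios, averaged over candidates within a protocol) is one plausible reading of "within-protocol per-candidate spread", not a confirmed one.

```python
from collections import defaultdict
from statistics import median

# Toy rows standing in for poll_response_2024.parquet (estimulado only).
# Field names are hypothetical; one row per (protocol, scenario, candidate).
rows = [
    {"protocol": "P1", "scenario": 1, "candidate": "A", "pct": 40.0},
    {"protocol": "P1", "scenario": 1, "candidate": "B", "pct": 30.0},
    {"protocol": "P2", "scenario": 1, "candidate": "A", "pct": 35.0},
    {"protocol": "P2", "scenario": 2, "candidate": "A", "pct": 38.0},
    {"protocol": "P2", "scenario": 1, "candidate": "C", "pct": 20.0},
    {"protocol": "P2", "scenario": 2, "candidate": "C", "pct": 25.0},
]


def scenario_counts(rows):
    """Return {protocol: number of distinct scenarios}."""
    seen = defaultdict(set)
    for r in rows:
        seen[r["protocol"]].add(r["scenario"])
    return {p: len(s) for p, s in seen.items()}


def protocol_spreads(rows, counts):
    """Return {protocol: mean over candidates of (max - min) pct across scenarios}.

    Only multi-scenario protocols are considered, and only candidates
    that appear in at least two scenarios (an assumption of this sketch).
    """
    pcts = defaultdict(list)
    for r in rows:
        if counts[r["protocol"]] > 1:
            pcts[(r["protocol"], r["candidate"])].append(r["pct"])
    per_protocol = defaultdict(list)
    for (protocol, _candidate), values in pcts.items():
        if len(values) > 1:
            per_protocol[protocol].append(max(values) - min(values))
    return {p: sum(v) / len(v) for p, v in per_protocol.items()}


counts = scenario_counts(rows)
share_multi = sum(1 for n in counts.values() if n > 1) / len(counts)
spreads = protocol_spreads(rows, counts)
median_spread = median(spreads.values())

print(f"multi-scenario share: {share_multi:.1%}")
print(f"median within-protocol spread: {median_spread:.2f} pp")
```

On the real file, the same two quantities would correspond to the 6.8% share and the 3.15 pp median reported in the Results table, provided the spread definition matches the one the script actually uses.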

Results

Stat	Value
Mayoral estimulado protocols	9,045
With one estimulado scenario (no choice)	8,429 (93.2%)
With ≥2 estimulado scenarios	616 (6.8%)
Max scenarios per protocol	11
Within-protocol per-candidate spread (median across multi-scenario protocols)	3.15 pp
Within-protocol per-candidate spread (90th percentile)	13.05 pp

Interpretation

The scenario-pick choice is degenerate for 93.2% of protocols — there is only one estimulado scenario and the canonical-pick rule returns that scenario regardless of the tie-break logic. The pick can only matter for the 616 multi-scenario protocols (6.8% of the sample).

On that 6.8% subset, the within-candidate vote-percent spread across the available scenarios is 3.15 pp at the median. An alternative pick would therefore typically move the per-protocol mean error by about 3 pp, and only on 6.8% of the sample. Multiplied through, the aggregate effect on β is bounded at roughly 0.068 × 3.15 ≈ 0.2 pp, about one-thirtieth of the +7 pp headline. Even at the 90th-percentile spread of 13.05 pp, the same arithmetic gives 0.068 × 13.05 ≈ 0.9 pp.
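
The back-of-the-envelope bound can be reproduced directly from the Results table. The sketch below only restates that arithmetic: it multiplies the multi-scenario share by a spread figure and compares the product with the headline. Variable names are illustrative, and the +7 pp headline is taken as exactly 7.0 for the ratio.

```python
# Inputs copied from the Results table.
n_total = 9_045          # mayoral estimulado protocols
n_multi = 616            # protocols with >= 2 estimulado scenarios
median_spread_pp = 3.15  # median within-protocol per-candidate spread
p90_spread_pp = 13.05    # 90th-percentile spread
headline_pp = 7.0        # headline effect, taken as exactly +7 pp

share_multi = n_multi / n_total

# Aggregate bound = share of protocols where the pick matters x spread there.
bound_median = share_multi * median_spread_pp
bound_p90 = share_multi * p90_spread_pp

print(f"multi-scenario share:        {share_multi:.3f}")
print(f"bound at median spread:      {bound_median:.2f} pp")
print(f"bound at p90 spread:         {bound_p90:.2f} pp")
print(f"headline / bound at median:  {headline_pp / bound_median:.0f}x")
```

This treats every multi-scenario protocol as shifting by the same amount in the same direction, which is why it is a bound rather than an estimate of the actual change in β.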

This bounds the answer to GPT's (a) and (b) without rebuilding the panel under each alternative pick: β is robust to scenario choice on structural grounds, because most protocols offer no choice and the spread is small where they do.

Caveats

This is a structural bound, not an actual rerun. A direct alt-pick Spec 2 fit on the 6.8% subset would be tighter but requires modifying cand_poll.py to accept a pick-rule parameter and rebuilding from poll_2024.
(c) espontânea-only as a placebo is genuinely interesting and would test whether the bias appears in open-ended recall (where the sponsor's candidate doesn't appear on a card the respondent sees). Rebuild + re-run is pipeline work; flagged for follow-up.
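
The pick-rule parameter mentioned in the first caveat might look roughly like the sketch below. Everything here is hypothetical: `pick_scenario`, the `rule` argument, and the nested-dict input shape are not taken from cand_poll.py. Two assumptions go beyond what the note states: the alphabetical tie-break is applied to the scenario label, and the "mean" rule averages each candidate over only the scenarios in which that candidate appears.

```python
from collections import defaultdict


def pick_scenario(scenarios, rule="canonical"):
    """Collapse one protocol's scenarios to a single {candidate: pct} mapping.

    scenarios: dict of scenario label -> {candidate: pct}, in listed order.
    rule: "canonical" (most distinct candidates, alphabetical tie-break),
          "first" (first listed scenario), or "mean" (average across scenarios).
    Hypothetical helper; not the actual cand_poll.py interface.
    """
    if rule == "canonical":
        # Sort key: more candidates first, then alphabetical scenario label.
        label = min(scenarios, key=lambda s: (-len(scenarios[s]), s))
        return dict(scenarios[label])
    if rule == "first":
        # Relies on dicts preserving insertion (listed) order.
        label = next(iter(scenarios))
        return dict(scenarios[label])
    if rule == "mean":
        values = defaultdict(list)
        for candidates in scenarios.values():
            for candidate, pct in candidates.items():
                values[candidate].append(pct)
        return {c: sum(v) / len(v) for c, v in values.items()}
    raise ValueError(f"unknown rule: {rule!r}")


# Toy protocol with three scenarios, listed in the order B, A, C.
toy = {
    "B": {"Ana": 40.0, "Bruno": 30.0},
    "A": {"Ana": 42.0, "Bruno": 28.0},
    "C": {"Ana": 50.0},
}

canonical = pick_scenario(toy, "canonical")
first = pick_scenario(toy, "first")
mean = pick_scenario(toy, "mean")
print(canonical, first, mean, sep="\n")
```

With a helper of this kind, requests (a) and (b) would each reduce to rebuilding the panel with a different `rule` value and refitting on the multi-scenario subset.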

Follow-ups

Espontânea panel rebuild for the GPT placebo. Cost: ~1 day of cand_poll.py work + a single regressions.py rerun.