Structural bound on scenario-pick effect: 93.2% of mayoral protocols have only one estimulado scenario, so the canonical-pick choice is degenerate for them. Among the 6.8% multi-scenario subset (616 protocols, max 11 scenarios), the median within-candidate spread across scenarios is 3.15 pp. The aggregate scenario-pick effect on β is therefore bounded at roughly 0.2 pp (0.068 × 3.15), well below the +7 pp headline.
Question
GPT-5-pro's 2026-06-14 pre-submission review asked for robustness on
the choice of cued-recall (estimulado) scenario. cand_poll.py picks
the canonical scenario per protocol as "most distinct candidates,
tie-break alphabetical". GPT asked for:
- (a) first listed estimulado scenario
- (b) average across estimulado scenarios
- (c) espontânea-only as a placebo
The cleanest defense is structural: how often is there even a choice?
Design
Direct counts on the upstream poll_response_2024.parquet (before
cand_poll.py's canonical-scenario filter), restricted to estimulado:
- How many protocols have one vs multiple estimulado scenarios?
- For the multi-scenario subset, how much does the per-candidate mean poll-percent shift across scenarios within the same protocol?
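The two counts above can be sketched as follows. This is a minimal, dependency-free illustration on toy rows, not the script that produced the results. The record layout `(protocol, scenario, candidate, pct)` is hypothetical, the real column names in poll_response_2024.parquet may differ, and "spread" is taken here to mean max minus min across scenarios, which is one plausible definition rather than a confirmed one. It also assumes one row per (protocol, scenario, candidate).

```python
from collections import defaultdict
from statistics import median

# Hypothetical record layout: (protocol, scenario, candidate, pct).
# Toy data only; assumes one row per (protocol, scenario, candidate).
rows = [
    ("P1", "s1", "A", 40.0), ("P1", "s1", "B", 30.0),
    ("P2", "s1", "A", 35.0), ("P2", "s1", "B", 25.0),
    ("P2", "s2", "A", 38.0), ("P2", "s2", "C", 20.0),
]

def scenario_counts(rows):
    """Number of distinct estimulado scenarios per protocol."""
    scen = defaultdict(set)
    for protocol, scenario, _cand, _pct in rows:
        scen[protocol].add(scenario)
    return {p: len(s) for p, s in scen.items()}

def candidate_spreads(rows):
    """Max-minus-min pct per (protocol, candidate) seen in >= 2 scenarios.

    Assumed definition of 'spread'; the original analysis may use another.
    """
    pcts = defaultdict(list)
    for protocol, _scenario, cand, pct in rows:
        pcts[(protocol, cand)].append(pct)
    return {k: max(v) - min(v) for k, v in pcts.items() if len(v) >= 2}

counts = scenario_counts(rows)
single = sum(1 for n in counts.values() if n == 1)  # no choice to make
multi = sum(1 for n in counts.values() if n >= 2)   # pick rule can matter
spreads = candidate_spreads(rows)

print(f"single-scenario protocols: {single}")
print(f"multi-scenario protocols:  {multi}")
print(f"median spread (pp):        {median(spreads.values())}")
```

Candidates that appear in only one scenario of a protocol contribute no spread under this definition, so they are dropped rather than counted as zero.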
Results
| Stat | Value |
|---|---|
| Mayoral estimulado protocols | 9,045 |
| With one estimulado scenario (no choice) | 8,429 (93.2%) |
| With ≥2 estimulado scenarios | 616 (6.8%) |
| Max scenarios per protocol | 11 |
| Within-protocol per-candidate spread (median across multi-scenario protocols) | 3.15 pp |
| Within-protocol per-candidate spread (90th percentile) | 13.05 pp |
Interpretation
The scenario-pick choice is degenerate for 93.2% of protocols — there is only one estimulado scenario and the canonical-pick rule returns that scenario regardless of the tie-break logic. The pick can only matter for the 616 multi-scenario protocols (6.8% of the sample).
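On that 6.8% subset, the within-candidate vote-percent spread across the available scenarios is 3.15 pp at the median. An alternative pick could therefore move the per-protocol mean error by about 3 pp in a typical multi-scenario protocol, and only 6.8% of the sample is exposed to that shift. Multiplied through, the aggregate effect on β is bounded at roughly 0.068 × 3.15 ≈ 0.2 pp, about 30 times smaller than the +7 pp headline. The median is a typical-case figure rather than a strict maximum; substituting the 90th-percentile spread of 13.05 pp gives 0.068 × 13.05 ≈ 0.9 pp, still far below the headline.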
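The back-of-envelope arithmetic can be reproduced directly from the Results table. The sketch below uses only the figures reported there; the 90th-percentile variant is included as a more conservative sensitivity check, and the variable names are illustrative.

```python
# Figures taken from the Results table.
n_total = 9045           # mayoral estimulado protocols
n_multi = 616            # protocols with >= 2 estimulado scenarios
spread_median_pp = 3.15  # median within-candidate spread
spread_p90_pp = 13.05    # 90th-percentile within-candidate spread
headline_pp = 7.0        # headline effect on beta

share_multi = n_multi / n_total

# Share of the sample exposed to the pick, times the size of the shift.
bound_median = share_multi * spread_median_pp
bound_p90 = share_multi * spread_p90_pp

print(f"multi-scenario share:     {share_multi:.3f}")
print(f"bound at median spread:   {bound_median:.2f} pp")
print(f"bound at p90 spread:      {bound_p90:.2f} pp")
print(f"headline / median bound:  {headline_pp / bound_median:.0f}x")
```

Both variants treat the spread as if it passed one-for-one into β, which is itself a simplifying assumption of the bound.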
This bounds the answer to GPT's (a) and (b) without rebuilding the panel under each alternative pick: under this bound, no alternative pick rule can move β by more than a fraction of a percentage point.
Caveats
- This is a structural bound, not an actual rerun. A direct alt-pick Spec 2 fit on the 6.8% subset would be tighter but requires modifying cand_poll.py to accept a pick-rule parameter and rebuilding from poll_2024.
- (c) espontânea-only as a placebo is genuinely interesting and would test whether the bias appears in open-ended recall (where the sponsor's candidate doesn't appear on a card the respondent sees). Rebuild + re-run is pipeline work; flagged for follow-up.
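For the first caveat, a pick-rule parameter might look like the sketch below. This is a hypothetical illustration, not cand_poll.py's actual code: the function name `pick_scenario`, the rule names, and the input shape are all assumptions, and the alphabetical tie-break is assumed to apply to the scenario identifier.

```python
def pick_scenario(scenarios, rule="most_candidates"):
    """Return the id of the scenario selected under `rule`.

    scenarios: list of (scenario_id, [candidate names]) in listed order.
    Hypothetical interface; cand_poll.py may be structured differently.
    """
    if not scenarios:
        raise ValueError("protocol has no estimulado scenarios")
    if rule == "most_candidates":
        # Most distinct candidates; ties broken alphabetically by
        # scenario id (assumed reading of the canonical rule).
        return min(scenarios, key=lambda s: (-len(set(s[1])), s[0]))[0]
    if rule == "first_listed":
        # Reviewer's alternative (a): take the first scenario as listed.
        return scenarios[0][0]
    raise ValueError(f"unknown rule: {rule!r}")

# Toy multi-scenario protocol: the two rules can disagree.
multi = [
    ("cenario_b", ["A", "B", "C"]),
    ("cenario_a", ["A", "B", "C"]),
    ("cenario_c", ["A", "B"]),
]
canonical_pick = pick_scenario(multi, "most_candidates")
first_pick = pick_scenario(multi, "first_listed")

# Toy single-scenario protocol: every rule returns the same scenario.
single = [("cenario_a", ["A", "B"])]
single_picks = {r: pick_scenario(single, r) for r in ("most_candidates", "first_listed")}

print(canonical_pick, first_pick, single_picks)
```

Alternative (b), averaging across scenarios, is an aggregation rather than a pick and would need a separate code path.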
Follow-ups
- Espontânea panel rebuild for the GPT placebo. Cost: ~1 day of cand_poll.py work + a single regressions.py rerun.