Polls commissioned by a Brazilian 2024 mayoral candidate overstate that candidate by ~7 percentage points
🟢 Within-candidate FE gives β = +7.75 pp (SE 1.34, p < 0.001) on 568 self-sponsored candidate-poll rows across 793 polls, 8,431 candidates, 2,942 races. Sample: 2024 mayoral 1st round, estimulado scenarios, match_score ≥ 2.
The estimate replicates across three independent identification strategies:
- Within-candidate panel FE (AN-001): +7.75 pp
- Race × week FE with independent-comparator restriction (AN-002): +6.95 pp
- Pre-poll trajectory descriptive placebo (AN-003): +6.70 pp (t = 5.21, n = 132 candidates)
Three converging estimates from non-overlapping identifying variation upgrade this finding to 🟢 replicated.
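As a rough illustration of the within-candidate design (AN-001), the sketch below removes each candidate's mean from both the outcome and the sponsorship indicator, then takes the slope. It is not the analysis code: the function name `within_fe_beta`, the synthetic data, and the reading of the outcome as a poll error in pp are all assumptions made for the example.

```python
import numpy as np

def within_fe_beta(y, x, groups):
    """Slope of y on x after subtracting group (candidate) means from both."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    _, idx = np.unique(np.asarray(groups), return_inverse=True)
    counts = np.bincount(idx)
    y_dm = y - (np.bincount(idx, weights=y) / counts)[idx]
    x_dm = x - (np.bincount(idx, weights=x) / counts)[idx]
    return float(x_dm @ y_dm / (x_dm @ x_dm))

# Synthetic data (hypothetical): 200 candidates, 6 polls each.
rng = np.random.default_rng(0)
cand = np.repeat(np.arange(200), 6)
cand_effect = rng.normal(0.0, 10.0, 200)[cand]   # candidate-level level shift
sponsored = rng.integers(0, 2, cand.size)        # 1 if self-sponsored poll
true_beta = 7.0                                  # assumed bias in pp
error_pp = cand_effect + true_beta * sponsored + rng.normal(0.0, 5.0, cand.size)

beta_hat = within_fe_beta(error_pp, sponsored, cand)
print(f"within-candidate beta: {beta_hat:+.2f} pp")
```

Because the candidate effect is differenced out, the estimate uses only variation between sponsored and non-sponsored polls of the same candidate, which is what makes the design robust to level differences across candidates.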
SE robustness check (added 2026-06-19, AN-111). AN-110 surfaced a pooled empirical DEFF* = 12.59 at the race × week × candidate level: cross-poll dispersion within cells is much wider than the binomial + DEFF ≤ 3 model predicts. That raised the question of whether the headline β survives under SEs that reflect the empirical noise floor. AN-111 re-ran Spec 2 and Spec 3c under six SE choices: cluster at muni_id (baseline), cluster at race_week, cluster at politico_id, two-way (muni_id, field_period_week), two-way (muni_id, politico_id), and wild-cluster bootstrap at race_week (B = 2000). Spec 2 β = +6.86 pp survives at t ≥ 6.0 under every choice; Spec 3c β = +7.91 pp survives at t ≥ 3.0 under the analytic SEs and at t = 6.1 under the bootstrap. The aggregate β is identified by averaging over n = 450–568 sponsored candidate-poll rows, and that averaging is robust to the empirical noise floor. The noise floor matters for per-poll detectability / auditability comparators (per AN-108, AN-109, AN-110) but not for aggregate identification.
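To see why the clustering level can move the SE while leaving β unchanged, here is a minimal sketch of a CR0 cluster-robust standard error for a single demeaned regressor. It is an illustration under simplifying assumptions (one regressor, one-way clustering, no small-sample correction), not the AN-111 code; `cluster_se` and the toy arrays are hypothetical.

```python
import numpy as np

def cluster_se(x, resid, clusters):
    """CR0 cluster-robust SE for the slope on one already-demeaned regressor."""
    x = np.asarray(x, dtype=float)
    resid = np.asarray(resid, dtype=float)
    _, idx = np.unique(np.asarray(clusters), return_inverse=True)
    # Sum the score x_i * u_i within each cluster, then apply the sandwich formula.
    scores = np.bincount(idx, weights=x * resid)
    return float(np.sqrt(np.sum(scores ** 2)) / np.sum(x ** 2))

# Toy data: residuals are orthogonal to x, as OLS residuals would be.
x = np.array([1.0, -1.0, 1.0, -1.0])
resid = np.array([1.0, 1.0, -1.0, -1.0])

se_by_row = cluster_se(x, resid, [0, 1, 2, 3])             # every row its own cluster
se_by_cluster = cluster_se(x, resid, ["A", "B", "B", "A"])  # rows grouped in pairs
print(se_by_row, se_by_cluster)
```

When scores within a cluster share a sign, grouping them raises the SE; the point estimate is untouched, which is why the check above reports one β per spec and a range of SEs.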
Sources.
- Own analysis: build/table/regressions.csv (AN-001 Spec 2); AN-001; AN-002; AN-003; AN-111 (build/table/an-111-headline-robustness-empirical-noise.csv)
- Cross-refs: H1 headline-sponsor-bias; briefs/all_brazil_analysis.md; blind-audit-detectability (per-poll detectability / auditability, separate from aggregate identification)
Cited analyses
Within-candidate FE on 568 self-sponsored candidate-poll rows gives β = +7.75 pp (p<0.001): large, robust to spec, and robust to renormalization choice. The opponent-sponsored coefficient is -1.93 pp (p=0.030), so the bias is sender-specific, not a generic house effect.
Restricting comparators to media-or-pollster-self polls in the same race within the same week (Spec 3c) gives β = +6.95 pp (p=0.008) on 60 race-week cells. This is the tightest design and the most robust to the timing-of-commission alternative.
For 132 candidates whose self-sponsored poll was preceded by an independent poll in the same race (median gap 10 days), the within-candidate jump from preceding-independent to self-sponsored error is +6.70 pp (t=5.21). This is the most intuitive counter to the "self-sponsor when leading" alternative.
Headline β robust to all SE choices. Spec 2 β = +6.86 pp (SE 0.81–1.14 across cluster_muni/race_week/politico_id/twoway/wcr_race_week; t ≥ 6.0; all p<0.001). Spec 3c β = +7.91 pp (SE 2.30–2.64; t ≥ 3.0; all p ≤ 0.004; wcr-bootstrap t=6.1). AN-110's empirical noise floor is mechanism evidence, not a threat to the aggregate result. No decisions.md walk-back.
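The t-statistic bounds quoted above follow from dividing each β by the ends of its reported SE range. The short check below reproduces them from the numbers in the text; the helper name `t_range` is introduced only for this example.

```python
def t_range(beta, se_range):
    """Return (smallest t, largest t) implied by a range of standard errors."""
    lo_se, hi_se = se_range
    return beta / hi_se, beta / lo_se

# Values as reported for AN-111.
t_spec2 = t_range(6.86, (0.81, 1.14))
t_spec3c = t_range(7.91, (2.30, 2.64))

print("Spec 2  t range:", tuple(round(t, 2) for t in t_spec2))
print("Spec 3c t range:", tuple(round(t, 2) for t in t_spec3c))
```

The lower bound for Spec 3c comes out just under 3 before rounding, so "t ≥ 3.0" holds at one decimal place.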
Pooled empirical DEFF* = 12.59 across 673 race × week × candidate cells with ≥3 independent polls (median cell DEFF* = 3.93, P75 = 9.59); the realized cross-poll SD on a candidate's share is ≈ 5.0 pp, well above what the binomial + DEFF ≤ 3 model predicts. This explains AN-109's miscalibration and qualifies AN-108's per-poll-loud reading: against the empirical cross-firm noise floor (not the binomial benchmark), a +7 pp single-poll shift is ≈ 0.8 SDs, within the noise floor. The aggregate regression β survives because N = 568 averages the noise out. The result is informative about reputation / auditability comparators, not about channel decomposition.
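The note does not show how the ≈ 0.8 SD figure is derived. One reading that reproduces it, offered here as an assumption rather than the documented method, scales the median binomial SE on sponsored polls (2.49 pp) by the square root of the pooled DEFF*. For comparison, the same shift is also expressed against the ≈ 5.0 pp realized cross-poll SD.

```python
import math

median_binomial_se_pp = 2.49   # median binomial SE on sponsored polls (from the note)
deff_star = 12.59              # pooled empirical DEFF* (from the note)
realized_sd_pp = 5.0           # realized cross-poll SD (from the note)
shift_pp = 7.0                 # approximate sponsor bias

# Assumed derivation: inflate the binomial SE by sqrt(DEFF*).
inflated_se_pp = median_binomial_se_pp * math.sqrt(deff_star)
shift_in_inflated_sds = shift_pp / inflated_se_pp

# Alternative yardstick: the realized cross-poll SD.
shift_in_realized_sds = shift_pp / realized_sd_pp

print(f"inflated SE: {inflated_se_pp:.2f} pp, shift = {shift_in_inflated_sds:.2f} SDs")
print(f"vs realized SD: shift = {shift_in_realized_sds:.2f} SDs")
```

Under either yardstick the shift stays below 2 SDs, which is consistent with the reading that a single sponsored poll is hard to flag on its own.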
Binomial SE on the sponsor candidate's share is ≤ 3.5 pp on 99.3 % of sponsored polls (median 2.49 pp at median n=360); +7 pp is z ≈ 2.8 against the binomial SE on the median sponsored poll. Per-poll bias is statistically visible against a tight (binomial / unbiased) benchmark; this informs the reputation / auditability story, not the channel decomposition. AN-110 walks this back against the empirical noise floor.
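For reference, the binomial benchmark used here is the usual SE of a sample proportion, sqrt(p(1 - p) / n). The sketch below computes it in pp and reproduces the z ≈ 2.8 figure from the reported median SE. The share of 0.5 in the first call is a hypothetical worst case chosen for illustration, not a value from the data.

```python
import math

def binomial_se_pp(share, n):
    """Binomial standard error of a vote share, in percentage points."""
    return 100.0 * math.sqrt(share * (1.0 - share) / n)

# Worst-case SE at the median sponsored sample size (share = 0.5 is hypothetical).
worst_case_se = binomial_se_pp(0.5, 360)

# z-score of a +7 pp shift against the reported median binomial SE of 2.49 pp.
z_median_poll = 7.0 / 2.49

print(f"worst-case SE at n=360: {worst_case_se:.2f} pp")
print(f"z for +7 pp vs 2.49 pp: {z_median_poll:.2f}")
```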
**v2 (2026-06-19; supersedes):** calibrated at AN-110 pooled empirical DEFF\* = 12.59, the blind audit reaches nominal FP (independent protocols 4.8 % at Bonferroni vs nominal 5 %); sponsored excess detection persists at **+7.5 pp Bonferroni / +11.9 pp single-hypothesis**, so a calibrated regulator-facing blind audit distinguishes sponsored from independent at a 2.6× lift. The bias has tail concentration: only 4.3 % of sponsor's-candidate rows fail at the empirical DEFF (per-poll bias is buried for 95.7 %), but 12.3 % of sponsored protocols are caught via the 'at least one share extreme' rule. **v1 (binomial + DEFF ≤ 3):** the test was uncalibrated (FP 30–60 %); sponsored excess was +13 to +19 pp, but the gap was inflated by miscalibration.
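The v2 excess and lift can be checked from the two rates quoted above, assuming the 12.3 % figure is the Bonferroni-level flag rate for sponsored protocols. The `protocol_flagged` helper is a generic Bonferroni "at least one share extreme" rule added for illustration; the audit's actual per-share test is not described in this note, so the p-values in the example are hypothetical.

```python
def protocol_flagged(p_values, alpha=0.05):
    """Flag a poll if any candidate-share p-value beats the Bonferroni cutoff."""
    cutoff = alpha / len(p_values)
    return any(p < cutoff for p in p_values)

# Hypothetical per-candidate p-values for one poll with four candidates.
example_flag = protocol_flagged([0.30, 0.011, 0.62, 0.45])      # cutoff is 0.0125

# Rates reported in the note (percent of protocols flagged).
independent_rate = 4.8
sponsored_rate = 12.3   # assumed to be the Bonferroni-level sponsored rate

excess_pp = sponsored_rate - independent_rate
lift = sponsored_rate / independent_rate

print(example_flag, round(excess_pp, 1), round(lift, 1))
```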