H1: Self-sponsored polls overstate the sponsoring candidate

When a Brazilian mayoral candidate commissions an electoral poll, the poll's reported share for that candidate is systematically higher than what other polls of the same race report for the same candidate. The within-candidate gap is the quantity of interest: holding the candidate fixed, does sponsorship by that candidate's own campaign shift the reported number? If Channel A (sponsor as Bayesian persuader) is real, the coefficient on the SponsoredBy_c indicator should be positive and economically meaningful — on the order of 5–10 pp.

Evidence strength: Confirmed by AN-001, AN-002, AN-003 (2026-06-02). Within-candidate panel FE gives β = +7.75 pp (SE 1.34, p < 0.001) on 568 self-sponsored candidate-poll rows. Race × week FE with the independent-comparator restriction gives +6.95 pp; the pre-poll trajectory descriptive placebo gives +6.70 pp. Three non-overlapping identifying strategies converge on the same magnitude.

Theory

The framework is Polls as Bayesian persuasion (theory.md §"Polls as Bayesian persuasion (supply-side / Channel A)"). The sponsor is the sender; voters, donors, and coalition partners are the receivers. The sender chooses the signal structure — wording, sample frame, methodology disclosures — to maximize the posterior weight receivers place on the candidate's viability. In the canonical Kamenica & Gentzkow (2011) setup, commitment to the signal structure is the sender's only constraint; in the TSE poll registry that commitment is institutional (the methodology is filed before fielding), which is why the supply-side prediction is sharp here.

Prediction

A regression of reported candidate share on a SponsoredBy_c indicator, absorbing candidate fixed effects, yields a positive coefficient on the order of 5–10 pp. The mechanism is supply-side: the sponsor commissions slant; the candidate fixed effect strips out the obvious confounder that candidates with higher standing might attract more sponsorship.
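
A minimal sketch of the within-candidate estimator, assuming a single binary regressor and rows of (candidate, sponsored, share). The function name, tuple layout, and toy numbers are hypothetical, not the project's schema or data. With one regressor, absorbing candidate fixed effects is equivalent to demeaning both variables within candidate and running OLS through the origin, which is what the sketch does.

```python
from collections import defaultdict

def within_candidate_beta(rows):
    """OLS slope of share on sponsored after demeaning both within candidate.

    rows: iterable of (candidate_id, sponsored, share); sponsored is 0/1 and
    share is in percentage points. Layout is illustrative only.
    """
    by_cand = defaultdict(list)
    for cand, sponsored, share in rows:
        by_cand[cand].append((float(sponsored), float(share)))

    num = 0.0  # sum of (demeaned sponsored) * (demeaned share)
    den = 0.0  # sum of (demeaned sponsored) squared
    for obs in by_cand.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            num += (x - x_bar) * (y - y_bar)
            den += (x - x_bar) ** 2
    if den == 0.0:
        raise ValueError("no within-candidate variation in sponsorship")
    return num / den

# Toy panel (made-up numbers): candidate A polls far above B overall,
# but each is 6 pp higher in the self-sponsored poll than in the others.
toy = [
    ("A", 0, 40.0), ("A", 0, 42.0), ("A", 1, 47.0),
    ("B", 0, 10.0), ("B", 0, 12.0), ("B", 1, 17.0),
]
beta = within_candidate_beta(toy)
```

On the toy panel the slope is 6 pp even though candidate A polls about 30 pp above B, because the demeaning step removes level differences between candidates.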

Competing predictions

Reverse causation. Candidates commission polls when their own internal tracking already suggests they are leading, so the cross-section of self-sponsored polls is mechanically skewed toward leaders without any slant being inserted. If this is the whole story, the within-candidate FE coefficient should attenuate sharply toward zero, since the same candidate's polls would not move across sponsorship status conditional on their true standing.

Sample-frame selection. Self-sponsored polls might be drawn from strongholds (neighborhoods, voter demographics) that lean toward the sponsoring candidate. This would inflate the reported share without any active distortion of the signal structure. AN-072 (poll methodology paired) tests this directly: declared sample frames are flat across sponsor type, which refutes this alternative.

Prior research

The "encomendada" poll is a recognized phenomenon in Brazilian political discourse but has not been quantified with a registry-wide design. Folha (2022) documented the PL paying Paraná Pesquisas R$ 2.7M via Fundo Partidário in the 2022 pre-campaign while not appearing as TSE contratante on any of the firm's 63 presidential polls; the firm's polls consistently reported a Lula–Bolsonaro technical tie while Datafolha/Ipec/Quaest showed Lula leading [stories.csv #077]. The closest direct sponsor-bias predecessors are online opinion-survey experiments [cite:leeper2019sponsorship; cite:crabtree2020sponsorship], not pre-election polls. Brazilian work on non-random poll error documents systematic deviations but has not linked them to sponsor identity [cite:meireles2022pesquisas; cite:lloyd2016vote].

Evidence

Analysis	Bearing	Key takeaway
AN-001	Confirms	Within-candidate FE on the analysis panel: β = +7.75 pp (SE 1.34, p < 0.001), n_self = 568 across 793 polls, 8,431 candidates, 2,942 races.
AN-002	Confirms	Race × week FE with independent-comparator restriction: +6.95 pp. Identifying variation now lives strictly within a single race-week, independent of any pre-existing differences across races.
AN-003	Confirms (placebo passes)	Pre-poll trajectory descriptive placebo: self-sponsored polls exceed the trajectory implied by prior independent polls of the same candidate by +6.70 pp (t = 5.21, n = 132 candidates) — refutes the "candidates commission when leading" reverse-causation reading.

Open tests

Channel A vs Channel B mechanism decomposition

The +7 pp gap can in principle arise either from declared design choices (sample-frame skew, weighting, question wording — Channel A) or from post-fielding tampering of the reported numbers (Channel B). H10 (methodology-flexibility-a) is the natural decomposition: once the poll_methodology LLM extractor runs, we re-estimate Spec 2 absorbing the declared methodology bundle. The residual is the Channel B share. See H10 for the design.

Cross-state robustness on registered universe

AN-001 estimates on the analysis panel (1st-round mayoral, estimulado, match_score ≥ 2). Re-running the spec on the full 14.9k registered universe — including 2nd-round races and senatorial polls — would test whether the magnitude is specific to the headline sample. Blocked on sample-frame extension in cand_poll.py.

Supporting analyses

AN-001 green causal

Within-candidate FE on 568 self-sponsored candidate-poll rows gives β = +7.75 pp (p < 0.001): large, robust to specification, and robust to the renormalization choice. The opponent-sponsored coefficient is −1.93 pp (p = 0.030), so the bias is sender-specific rather than a generic house effect.

AN-002 green robustness

Restricting comparators to media-or-pollster-self polls in the same race within the same week (Spec 3c) gives β = +6.95 pp (p=0.008) on 60 race-week cells. The tightest design and the most robust to the timing-of-commission alternative.

AN-003 green placebo

For 132 candidates whose self-sponsored poll was preceded by an independent poll in the same race (median gap 10 days), the within-candidate jump from preceding-independent to self-sponsored error is +6.70 pp (t=5.21). The most intuitive counter to "self-sponsor when leading".

AN-009 green robustness

χ²(5) = 2.31, p = 0.80 on the joint party × sponsor interaction — by-party point-estimate dispersion in the subset regressions of robustness.md §3 is not statistically distinguishable from noise around the pooled β. The pooled +7.92 (OTHER baseline) is a sufficient summary; the apparent MDB null was a reduced-power artifact of subset regression, not a real party effect.

AN-010 green robustness

Headline survives the five red-team substitutions. K1 (media-only comparator) preserves β at +7.59 with n→253; K3 (Route B vice-prefeito) is falsified upstream (0/429 vice matches); K4 (drop_absorbed) restores β at +8.00 under race-FE-only — within-candidate-FE selection is not generating β; K5 (drop Route D) raises β to +9.30. K2 (raw percent, no renormalization) attenuates β to +5.10 — the within-(protocol × scenario_label) renormalization contributes ~3 pp of the headline magnitude; the residual +5.10 on raw percent is the conservative β to cite alongside the +7.98 renormalized number.

AN-011 green robustness

Permutation null (B=500) rejects at p < 0.002: 0/500 within-(race × week) random shuffles of sponsored_by produce β as extreme as the observed +4.68 (race-week FE only; the +7.94 candidate+race-week-FE headline is yet further out the same tail). Pollster jackknife (top-20): β range [+7.80, +8.13], sd 0.074 — no single firm drives the result. UF jackknife (26 states): β range [+7.42, +8.32], sd 0.180 — no single state drives the result. The +7.98 spec-2 baseline survives every leave-one-out.
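
A sketch of a within-cell permutation null of the kind described above, under stated assumptions: rows are (cell, sponsored, error) tuples, the estimator is the slope after demeaning within cell (race-week FE only), and the panel is synthetic with a planted +7 pp effect. All names and numbers are hypothetical; this is not the project's pipeline.

```python
import random
from collections import defaultdict

def cell_fe_beta(rows):
    """Slope of error on sponsored after demeaning both within cell."""
    cells = defaultdict(list)
    for cell, sponsored, err in rows:
        cells[cell].append((float(sponsored), float(err)))
    num = den = 0.0
    for obs in cells.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            num += (x - x_bar) * (y - y_bar)
            den += (x - x_bar) ** 2
    return num / den

def within_cell_permutation(rows, n_draws=500, seed=0):
    """Shuffle sponsorship labels inside each cell; count draws with
    |beta| at least as large as the observed |beta|."""
    rng = random.Random(seed)
    observed = cell_fe_beta(rows)
    idx_by_cell = defaultdict(list)
    for i, (cell, _, _) in enumerate(rows):
        idx_by_cell[cell].append(i)
    hits = 0
    for _ in range(n_draws):
        labels = [s for _, s, _ in rows]
        for idx in idx_by_cell.values():
            shuffled = [labels[i] for i in idx]
            rng.shuffle(shuffled)  # permute labels only within this cell
            for i, s in zip(idx, shuffled):
                labels[i] = s
        permuted = [(c, s, e) for (c, _, e), s in zip(rows, labels)]
        if abs(cell_fe_beta(permuted)) >= abs(observed):
            hits += 1
    return observed, hits

# Synthetic panel: 40 cells, 4 polls each, one sponsored poll per cell
# carrying a planted +7 pp shift on top of Gaussian noise.
data_rng = random.Random(42)
toy = [
    (cell, 1 if k == 0 else 0, data_rng.gauss(0.0, 2.0) + (7.0 if k == 0 else 0.0))
    for cell in range(40)
    for k in range(4)
]
observed, hits = within_cell_permutation(toy, n_draws=500)
```

With zero hits in B draws the result is reported as p < 1/B, which is how a 0/500 count maps to p < 0.002. Shuffling inside cells keeps the number of sponsored polls per cell fixed, so the null preserves the race-week structure.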

AN-012 green robustness

Wild-cluster bootstrap on the race-week-FE-only spec (β=+4.68, n=448, G=51 muni clusters): WCR two-sided p = 0.0175 (B=2000 Rademacher draws); WCU 95% percentile CI [+1.09, +8.29] and percentile-t CI [+0.88, +8.50]. WLS spec-2 weighted by sample_size: β=+8.15 (SE 1.42, p<0.001) — within 0.18 pp of the unweighted baseline. Week-window sensitivity (`%U` baseline vs `%V` ISO vs 10-day vs 14-day rolling): β range [+4.16, +5.31], all p < 0.02 — the race-week-FE-only spec is not brittle to the week-boundary definition. The proper spec-3c TWFE wild-cluster bootstrap (candidate FE + race-week FE) is parked as a follow-up; the rejection here on the more conservative spec extends to the headline a fortiori.

AN-013 green robustness

Crude per-candidate post-fielding tampering ruled out by within-sponsored-poll digit forensics; broader fabrication channels NOT ruled out. Round-number frequency indistinguishable between sponsor's own candidate (18.2 % integers) and other candidates in the same sponsored poll (17.7 %; A vs B z=+0.26, p=0.79). Mult-of-5 indistinguishable (4.1 % vs 3.7 %; z=+0.45, p=0.66). Tenths-digit distribution for values ≥ 5 has the same shape across A and B (digit-0 share 20.4 % vs 20.3 %). Group A's Benford first-digit shift toward 4–5–6 vs B's 1–2–3 reflects sponsors polling viable candidates (30–60 % range), not numerical manipulation: Group B inside the same sponsored polls follows Benford normally. **Blind spots of this test:** sophisticated manipulation that preserves digit distributions, proportional within-poll rescaling, and any pre-publication data work (quota reweighting, dropping strata, re-running) leave no digit signature and are NOT addressed here. Substantive read: the +7-8 pp average is not produced by crude per-candidate edits leaving digit fingerprints; the broader question of whether design-driven (Channel A) or sophisticated/pre-publication (Channel B) mechanisms dominate is the companion paper's job.
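
A sketch of the kind of comparison behind the round-number check, assuming it is a pooled two-proportion z-test (the note does not name the test). The counts below are hypothetical, chosen only to reproduce the quoted 18.2 % and 17.7 % shares; the real group sizes are not stated here.

```python
import math

def two_proportion_z(k_a, n_a, k_b, n_b):
    """Pooled two-proportion z statistic and two-sided normal p-value."""
    p_a, p_b = k_a / n_a, k_b / n_b
    pooled = (k_a + k_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value under the standard normal: P(|Z| >= |z|).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided

# Hypothetical counts: 91/500 integer values for the sponsor's own candidate
# (18.2 %) vs 354/2000 for other candidates in the same polls (17.7 %).
z, p = two_proportion_z(91, 500, 354, 2000)
```

A half-point difference in integer share needs far larger groups than these before it separates from sampling noise, which is the sense in which the two groups are indistinguishable.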

AN-013V2 green robustness

Three forensics targeting the AN-013 blind spots — design-respecting fabrication — return **two clean nulls and one ambiguous-cause positive**. (T1) Standardised-error variance test: var(z|sponsored, demeaned) = 79.7 vs var(z|indep, demeaned) = 315.3; F-test ratio 0.253 p<0.0001 but Levene's median-centred p = 0.122 — the F-test result is outlier-driven; sponsored polls are NOT 'too clean' (var of 80 σ² means SD ≈ 9 σ around the bias mean, plenty of spread). (T2) Bias-concentration test: sponsored polls have 11.5 % within ±2pp of own mean vs 19.0 % for indep (z = −4.75, p < 0.0001 in the *anti*-fabrication direction — sponsored polls are MORE spread, not less). (T3) Within-firm rounding TVD on tenths-digit of poll_percent_raw: 12 firms qualify with ≥10 sponsored and ≥10 indep each; mean TVD = 0.39 (vs ~0.15-0.20 expected under H0); 3 of 12 firms significant at chi-square p<0.05 (vs 0.6 expected). Read: T1+T2 argue *against* simple sample-design-consistent fabrication as the headline mechanism — the sponsored-error distribution is wider, not tighter, than the indep distribution. T3 picks up real within-firm processing differences (could be differential subcontracting, customer-specific reporting templates, or fabrication; cause is not separable from this test). Combined with AN-013 v1 (no per-row digit-tampering signature), the cumulative weight of evidence is that **the +7 pp is not concentrated in a single big fabrication lever**. The residual likely lives in a constellation of small effects across sample-frame contamination (1-4 pp prior), interviewer scripting (0-2 pp), and strategic timing (1-2 pp), each individually modest.
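
For reference, a sketch of the total variation distance (TVD) on tenths digits that T3 relies on. Only the distance itself is shown; the firm-qualification rule and the chi-square step are not reproduced, and the sample values are made up.

```python
from collections import Counter

def tenths_digit(value):
    """Tenths digit of a percentage, e.g. 23.4 -> 4."""
    return int(round(abs(value) * 10)) % 10

def tenths_tvd(sample_a, sample_b):
    """Total variation distance between two tenths-digit distributions."""
    count_a = Counter(tenths_digit(v) for v in sample_a)
    count_b = Counter(tenths_digit(v) for v in sample_b)
    n_a, n_b = len(sample_a), len(sample_b)
    return 0.5 * sum(
        abs(count_a[d] / n_a - count_b[d] / n_b) for d in range(10)
    )

# Made-up reported percentages for one firm.
sponsored = [23.0, 41.5, 12.0, 35.0]    # tenths digits 0, 5, 0, 0
independent = [22.3, 40.1, 11.7, 36.0]  # tenths digits 3, 1, 7, 0
distance = tenths_tvd(sponsored, independent)
```

TVD between two finite samples is positive even when both come from the same distribution, which is why the observed mean is compared against a nonzero expectation under H0 rather than against zero.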

AN-014 green robustness

Red-team K2 hypothesis falsified. **Sponsors do not list fewer candidates.** Within-candidate denominator gap (sponsored poll − non-sponsored polls, n = 226 candidates with within-variation): median +0.15, mean −2.61, 50 % of candidates have a positive gap. Cell-level sponsored cells in fact have *larger* denominators (91.74 vs 78.68; Welch p < 0.0001 in the *opposite* direction from the red-team conjecture). The K2 renormalized-vs-raw gap (+1.31 pp within-candidate, +2.88 pp in the regression spec) is the mechanical multiplicative scaling of renormalization on a non-zero raw effect: predicted gap under H₀ 'denominators identical within candidate' = +6.08 × 1.198 = +7.29 pp, observed +7.40 pp, residual +0.11 pp — **98.6 % of the renormalized gap is the multiplicative effect, 1.4 % is any denominator shift**. The Channel-A characterization should not include 'sponsors list fewer candidates'; that lever is empirically absent.
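
The scaling arithmetic can be replayed from the rounded figures quoted above. This sketch uses only those three numbers, reading +6.08 as the raw effect and 1.198 as the renormalization factor. Because the inputs are rounded, the outputs can differ from the quoted +7.29 pp and 98.6 % in the last digit; that the quoted values come from unrounded inputs is an assumption.

```python
raw_effect = 6.08   # pp, raw within-candidate effect as quoted
scale = 1.198       # multiplicative renormalization factor as quoted
observed = 7.40     # pp, observed renormalized gap as quoted

predicted = raw_effect * scale        # gap implied by scaling alone
residual = observed - predicted       # left over for any denominator shift
share_scaling = predicted / observed  # fraction explained by scaling
share_residual = residual / observed  # fraction left for denominator shift
```

From the rounded inputs the predicted gap lands near 7.28 pp and the scaling share near 98.4 %, so the conclusion that almost all of the renormalized gap is mechanical does not depend on the last digit.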

AN-015 green robustness

Data-quality concern dispatched. Three of four extraction-quality proxies (denom_dist, n_named, mean_match) are statistically indistinguishable between sponsored and independent cells; the fourth (denom) differs in the *opposite* direction from a 'sponsored polls have fewer extracted candidates' hypothesis (+13.06, p<10⁻¹⁵). Spec 2 augmented with denom_dist + mean_match controls leaves β unchanged at +8.00. The clean-denominator subset (denom ∈ [80, 110]) gives β=+5.39, close to the AN-010 K2 raw β of +5.10 and consistent with AN-014's mechanical-multiplicative-scaling story (scale factor ≈ 1 when denom ≈ 100) rather than any data-quality artifact. Outlier audit on the top-10 sponsored-row errors shows a mix of inflation and deflation (poll%=100 and poll%=0 cases), not a systematic upward bias. The proper within-firm test is parked as a follow-up: top-5 pollsters by row count don't sponsor enough polls to estimate within-firm β; need to refit on firms that DO sponsor (the customer-mix sorting set from AN-007).

AN-016 green robustness

Data-quality concern definitively dispatched **and** a substantively new finding surfaces: dramatic between-firm β heterogeneity. Within-firm β (15 firms ≥ 10 sponsored rows; 31 firms ≥ 5): mean +6.5 / +7.2, median +4.4 / +6.3, **range [−11, +35], sd 10.3**. 19 of 31 firms (61 %) are individually significant at p < 0.05; 22/31 positive. **Big-name firms (CENSUS, IIP, INSTITUTO PARANÁ, Verita) have β near zero or modestly negative within-firm; smaller firms (METHODUS +24.7, CAMARGO +23.6, INTENÇÃO +35.2, DATA SC +16.1) slant heavily.** Because PDF style and LLM-extraction pattern are held strictly fixed within firm, the cross-firm β dispersion (sd 10.3 pp) is *real cross-firm sponsor-behavior heterogeneity*, not data-quality artifact. The data-quality concern is closed.

AN-059 green robustness

**Variance decomposition of the headline +7 pp: 100 % within-firm, ~0 % between-firm sponsor selection.** Spec A (cand FE only, headline replicate): β = +7.85 pp (SE 1.24, p ≈ 2.6 × 10⁻¹⁰). Spec B (cand FE + firm FE, the within-firm sponsor effect averaged across all 426 firms in the analysis sample): β = +7.98 pp (SE 1.25, p ≈ 1.9 × 10⁻¹⁰). The between-firm component (A − B) is **−0.13 pp**. The +7 pp is not 'sponsors hire firms that systematically over-state' — it is 'firms slant when paid by candidates, regardless of which firm'. Same firm, same pollster style, different customer → +8 pp. The firm-level slant-for-hire selection hypothesis (2-4 pp prior in docs/thinking.md residual decomposition) is structurally ruled out at headline scale. AN-016's within-firm β dispersion (sd 10.3) is still real; AN-018's size-discipline (small firms slant more among the 31 firms ≥ 5 sponsored) still holds; AN-059's decomposition says sponsors do not preferentially load on the high-slant firms in the universe at large.

AN-063 green robustness

Restricting the Spec 3c clean comparator to media-only polls (dropping pollster-self) gives β = +8.15 pp (p = 0.036, n = 165 rows / 28 race-week cells) — if anything larger than the media-or-pollster-self spec. Pollster-self inclusion is not pulling the headline up.

AN-064 green robustness

Multiway (muni × pollster) clustering moves spec-2 SE 1.04 → 1.07; candidate-level clustering 1.04 → 1.14. Wild-cluster restricted bootstrap p-values are < 1/2000 for both spec 2 and spec 3c (the small-cluster spec where it matters most). Inference is robust to all three sensitivities.

AN-066 green robustness

Benjamini-Hochberg FDR correction on the 10 within-pair Channel-A directional tests. 6 of 10 survive at q < 0.05 and 7 of 10 at q < 0.10. The only 'small positive, marginal' lever (population-frame mismatch, p = 0.12) does NOT survive correction (q_BH = 0.15), sharpening the 'no single lever carries +7 pp' conclusion.
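
A minimal sketch of the Benjamini-Hochberg step-up adjustment. The ten p-values are hypothetical: only p = 0.12 comes from the text, and the rest are chosen so that it sits at rank 8 of 10, which is consistent with the quoted q_BH = 0.15 (0.12 × 10 / 8).

```python
def benjamini_hochberg(pvals):
    """Return BH-adjusted q-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone in rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        q[i] = running_min
    return q

# Hypothetical p-values for ten directional tests (not the project's values,
# except 0.12 for the marginal lever).
toy_p = [0.001, 0.004, 0.008, 0.012, 0.020, 0.028, 0.060, 0.12, 0.30, 0.70]
q = benjamini_hochberg(toy_p)
survive_05 = sum(v < 0.05 for v in q)
survive_10 = sum(v < 0.10 for v in q)
```

On these toy inputs six tests clear q < 0.05 and seven clear q < 0.10, and the p = 0.12 lever moves to q = 0.15, mirroring the pattern reported above without claiming to reproduce the underlying p-values.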

AN-067 green robustness

Structural bound on scenario-pick effect: 93.2% of mayoral protocols have only ONE estimulado scenario; the canonical-pick choice is degenerate for them. Among the 6.8% multi-scenario subset (616 protocols, max 11 scenarios), median within-candidate spread across scenarios is 3.15 pp. Aggregate scenario-pick effect on β is therefore bounded at <0.2 pp — well below the +7 pp headline.

AN-068 green robustness

Spec 2 sponsor-label permutation null: 500 random reassignments of sponsored_by across the FWL-residualized panel produce a null distribution centered on 0.005 pp (sd 0.62, max |β| = 2.07). The observed β = +6.86 pp is unreachable in 500 draws — permutation p < 1/B (= 0.002), about 11 null SDs away from the observed magnitude.

AN-069 green descriptive

Same-poll sponsor-vs-top-opponent DiD: +8.2 pp (t = 6.43, n = 239 pairs, 64% positive). The sponsor's candidate's error jumps MORE between matched independent and sponsored polls than the top opponent's error does. Large sponsor-specific shift consistent with either per-candidate slant or sample-design favoring the sponsor's voter base.

AN-070 green figure

Event-study figure of independent-poll polling error in ±4 weeks around each self-sponsored poll. Independent bins hover near zero error; the self-sponsored event-day point sits at +7.4 pp (SE 0.93). Visualizes the within-candidate trajectory test.

AN-111 green robustness

Headline β robust to all SE choices. Spec 2 β = +6.86 pp (SE 0.81–1.14 across cluster_muni/race_week/politico_id/twoway/wcr_race_week; t ≥ 6.0; all p<0.001). Spec 3c β = +7.91 pp (SE 2.30–2.64; t ≥ 3.0; all p ≤ 0.004; wcr-bootstrap t=6.1). AN-110's empirical noise floor is mechanism evidence, not a threat to the aggregate result. No decisions.md walk-back.

AN-008 yellow robustness

SP-only prototype (2026-06-01): within-candidate FE gave β = +7.24 pp on 15 self-sponsored rows. Superseded by AN-001's all-Brazil run but kept as the project's first-pass design validation.

AN-065 yellow descriptive

Firm-race coverage proxy for the publication-selection alternative. In the runoff-eligible 172-muni sample, sponsored polls of firms with NO media coverage show MORE bias (mean error +15.2 pp, n=10) than sponsored polls of firms WITH coverage (+7.2 pp, n=28) — the opposite of what publication selection predicts. n=10 is too thin for inference, but the direction does not support publication selection driving the headline.

AN-071 yellow descriptive

No cross-sectional correlation between per-firm polling accuracy (MAE on non-self-sponsored polls) and within-firm sponsor β. Pearson r = −0.09 (p = 0.68) on 22 firms. The reputation channel in §sec:policy operates through volume/visibility (AN-018), not through accuracy as a standalone signal.