Within race, 2024 mayoral candidates with ≥1 self-sponsored poll are 3.15 pp more likely to appear as a litigant in a fraud-flavored PESQUISA-eleitoral case (race-FE OLS, p=0.011, base rate 7.9 %). The direction matches the perceived-bias prediction; the magnitude is small in absolute terms but ~40 % relative to the base rate. The naive (no-race-FE) comparison runs the other way because self-sponsoring candidates cluster in races with lower overall suit rates.

Hypothesis
perceived-bias-validation
Confidence
yellow
Type
descriptive
Design
Sample
6,768 PREFEITO-2024 candidates with ≥1 poll in poll_2024 registry and a CPF in politico table; 2,871 unique races (muni_id).
Specification
candidate-level OLS of P(any fraud-flavored PESQUISA case involvement) on any_self ∈ {0,1}. Race FE via within-muni demeaning. Cluster-robust SE on muni_id. Controls in {[],[log(n_polls)],[log(n_polls), log1p(n_indep)]}. Continuous SponsoredBy_c = n_self / n_polls as alt treatment.
Notes
Fraud bucket via assunto_desc ∈ {DIVULGACAO DE PESQUISA ELEITORAL FRAUDULENTA, PESQUISA FRAUDULENTA, DIVULGACAO DE PESQUISA DE FRAUDULENTA, IRREGULARIDADES DOS DADOS PUBLICADOS EM PESQUISAS ELEITORAIS}. No LLM in v1 — the 2024 DataJud taxonomy is granular enough to do the cut from assunto alone. cand_proc_2024 matches CPF to TREdiarios parte; only ~12 % of PESQUISA cases match any candidate (most defendants are pollster CNPJs), so this is a candidate-side, not poll-side, test.
Script
source/analysis/an-072-fraud-suit-by-sponsor.py
Target
build/table/an-072-fraud-suit-by-sponsor.csv
Status
interpreted · 2026-06-16
Created
2026-06-16

Question

Use 1 of the EJ poll-lawsuits agenda (docs/todo.md § Complementary data). Do candidates with self-sponsored polls draw legal challenges framed as methodological fraud? The 50-case 2020 pilot showed the sued universe is overwhelmingly formal compliance, so the revised test restricts to fraud-flavored assuntos. If the perceived-bias prediction holds, candidates whose polls include self-sponsored ones should be over-represented as litigants in fraud cases.

Design

Universe: 6,768 PREFEITO-2024 candidates appearing in build/assemble/cand_poll.parquet. Treatment is any_self = 1{n_self ≥ 1} (5.5 % of the sample) or continuous SponsoredBy_c = n_self / n_polls. Outcomes are candidate-level indicators built by joining cand_proc_2024.csv (CPF → case role) to proc_2024.parquet (assunto_desc → fraud / compliance bucket): fraud_any, fraud_autor (plaintiff), fraud_reu (defendant), pesq_any and compl_any, as reported in the Findings table.
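
The outcome construction described above can be sketched as follows. This is a minimal illustration, not the code in an-072-fraud-suit-by-sponsor.py: the tuple layout (cpf, case_id, role), the role labels "autor" / "reu", and the accent-stripping normalization are all assumptions made for the example. The four subject strings are copied verbatim from the Notes field.

```python
import unicodedata

# Fraud-flavored subjects, copied verbatim from the Notes field.
FRAUD_ASSUNTOS = {
    "DIVULGACAO DE PESQUISA ELEITORAL FRAUDULENTA",
    "PESQUISA FRAUDULENTA",
    "DIVULGACAO DE PESQUISA DE FRAUDULENTA",
    "IRREGULARIDADES DOS DADOS PUBLICADOS EM PESQUISAS ELEITORAIS",
}


def normalize(text):
    """Upper-case, strip accents and collapse whitespace (assumed step)."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.upper().split())


def is_fraud(assunto_desc):
    """True when the case subject falls in the fraud bucket."""
    return normalize(assunto_desc) in FRAUD_ASSUNTOS


def candidate_outcomes(cand_cases, case_assunto):
    """Build candidate-level fraud indicators.

    cand_cases: iterable of (cpf, case_id, role) rows (hypothetical layout).
    case_assunto: dict mapping case_id -> assunto_desc.
    Returns dict cpf -> {"fraud_any", "fraud_autor", "fraud_reu"} as 0/1 flags.
    Candidates with no matched case do not appear and should be filled with
    zeros when merged back onto the candidate universe.
    """
    out = {}
    for cpf, case_id, role in cand_cases:
        flags = out.setdefault(
            cpf, {"fraud_any": 0, "fraud_autor": 0, "fraud_reu": 0}
        )
        if not is_fraud(case_assunto.get(case_id, "")):
            continue
        flags["fraud_any"] = 1
        if role == "autor":
            flags["fraud_autor"] = 1
        elif role == "reu":
            flags["fraud_reu"] = 1
    return out
```

A candidate who appears only in non-fraud cases keeps all three flags at zero, which is what separates the fraud bucket from the broader pesq_any outcome.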

Race FE absorbed by within-muni_id demeaning. Cluster-robust SE on muni_id. Spec ladder A–D varies controls and treatment scale.
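
For the no-controls case (spec A), within-race demeaning plus clustering on muni_id reduces to a few sums. The sketch below is a plain-Python illustration under stated assumptions: the function name is hypothetical, it handles a single regressor only, and it uses the CR0 sandwich without any small-sample correction, which may differ from the correction applied in the actual script.

```python
from collections import defaultdict
from math import sqrt


def within_ols_clustered(y, x, group):
    """Single-regressor OLS with group fixed effects and clustered SE.

    Demeans y and x within each group (here: race / muni_id), estimates the
    slope on the demeaned data, and returns (beta, se) where se is the CR0
    cluster-robust standard error clustered on the same group.
    """
    # Per-group running sums: [sum_y, sum_x, n].
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for yi, xi, g in zip(y, x, group):
        s = sums[g]
        s[0] += yi
        s[1] += xi
        s[2] += 1

    # Within-group demeaning absorbs the race fixed effects.
    y_dm = [yi - sums[g][0] / sums[g][2] for yi, g in zip(y, group)]
    x_dm = [xi - sums[g][1] / sums[g][2] for xi, g in zip(x, group)]

    sxx = sum(v * v for v in x_dm)
    if sxx < 1e-12:
        raise ValueError("no within-group variation in x")
    beta = sum(a * b for a, b in zip(x_dm, y_dm)) / sxx

    # Sum x * residual within each cluster, then square across clusters.
    score = defaultdict(float)
    for xd, yd, g in zip(x_dm, y_dm, group):
        score[g] += xd * (yd - beta * xd)
    var = sum(s * s for s in score.values()) / (sxx * sxx)
    return beta, sqrt(var)
```

Races in which every candidate has the same any_self value contribute nothing to the estimate, since their demeaned treatment is zero; only races with both self-sponsoring and non-self-sponsoring candidates identify the coefficient.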

Findings

Naive (no FE) comparison: treated rate 5.6 % vs. control 8.0 %, −2.3 pp, which taken at face value suggests self-sponsoring candidates are sued less. This is selection on race: self-sponsoring candidates are concentrated in disputed races where fraud suits are actually less common (e.g., mid-tier cities where pollsters register more polls and challenges are diluted).
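
How selection on race can reverse the pooled sign is easiest to see on invented numbers. The toy data below are purely hypothetical and are not drawn from the registry: one low-suit race holds most self-sponsoring candidates and one high-suit race holds few, yet within each race the self-sponsoring candidates are sued more often.

```python
from collections import defaultdict

# Hypothetical rows: (race, any_self, fraud_any). Invented for illustration.
rows = (
    # Low-suit race with many self-sponsoring candidates.
    [("low", 1, 1)] * 1 + [("low", 1, 0)] * 9       # treated: 10 % sued
    + [("low", 0, 0)] * 10                          # control:  0 % sued
    # High-suit race with few self-sponsoring candidates.
    + [("high", 1, 1)] * 1 + [("high", 1, 0)] * 1   # treated: 50 % sued
    + [("high", 0, 1)] * 8 + [("high", 0, 0)] * 12  # control: 40 % sued
)


def mean(values):
    return sum(values) / len(values)


def naive_gap(data):
    """Pooled treated-minus-control difference in suit rates."""
    treated = [y for _, d, y in data if d == 1]
    control = [y for _, d, y in data if d == 0]
    return mean(treated) - mean(control)


def within_race_gaps(data):
    """Treated-minus-control difference computed separately per race."""
    by_race = defaultdict(list)
    for row in data:
        by_race[row[0]].append(row)
    return {race: naive_gap(group) for race, group in by_race.items()}


print(f"naive gap: {naive_gap(rows):+.2f}")
for race, gap in within_race_gaps(rows).items():
    print(f"within {race}: {gap:+.2f}")
```

In this toy setup the pooled gap is negative while both within-race gaps are positive. A race-FE coefficient is a weighted average of such within-race gaps, which is why it can carry the opposite sign from the pooled comparison.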

Within race, the sign flips and matches the prediction:

Spec  y            coef (any_self)  SE     t     p
A     fraud_any    0.031            0.012  2.54  0.011
B     fraud_any    0.030            0.012  2.43  0.015
C     fraud_any    0.031            0.012  2.49  0.013
A     fraud_autor  0.016            0.012  1.33  0.183
A     fraud_reu    0.025            0.013  1.91  0.056
A     pesq_any     0.025            0.014  1.76  0.079
A     compl_any    0.020            0.013  1.59  0.112

Continuous SponsoredBy_c is small and not significant (coef 0.009, p = 0.40) — the action is at the extensive margin (any vs none), not the intensive margin (share).

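The role split has reu (defendant, coef 0.025, p = 0.056) slightly stronger than autor (plaintiff, coef 0.016, p = 0.183): being sued for fraud is what sponsorship predicts, more than suing about fraud, although neither role-specific coefficient is significant at the 5 % level.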

Interpretation

The direction of the headline result matches the Use-1 prediction: within race, self-sponsored exposure predicts fraud-suit involvement. The magnitude is 3.1 pp on a 7.9 % base, ≈ 40 % relative. The compliance bucket shows a similar but smaller and non-significant pattern, so the fraud-specific cut does the work, consistent with the framing that the methodologically suspicious sub-universe is what self-sponsorship draws.

Two caveats keep this at yellow confidence:

  1. Candidate-side, not poll-side, test. Only ~12 % of PESQUISA cases match any candidate via cand_proc_2024 (most defendants are pollster CNPJs). The 88 % unmatched cases are not necessarily about candidates whose CPF didn't fuzzy-match; they're about polls whose challengers and challengees are firms.
  2. Treatment is candidate exposure, not poll-level perception. A poll-level test would put SponsoredBy_c on the poll and ask whether that poll is named in a case. Doable only on the ~36 % of 2024 PESQUISA cases that have mov text in TREdiarios; needs an LLM/regex pass to extract poll-protocol references from the decision text. Done: AN-072v2 (2026-06-16). v2 finds the opposite sign at the poll unit: within race × week, candidate-sponsored polls are 3.9 pp less likely to be sued for fraud (p=0.010). The two findings are reconciled by selection: candidates with self-sponsored polls operate in lawsuit-heavy races, but the lawsuits target other polls, not theirs. See docs/analyses/an-072v2-poll-level-fraud-suit.md.

Files