title: SP-slice analysis (prototype)
status: prototype (2026-06-01)

SP-slice analysis — poll-sponsor-bias

(Originally written as build_audit.md on educloud; renamed to disambiguate from the laptop bulk-extraction quality audit, which is now build_audit.md.)

Prototype run of the within-candidate FE design on the SP slice only (SP is the UF whose LLM relatório extraction was done on educloud ahead of the laptop bulk run for the other 25 UFs). Code-pipeline sanity check, not the headline.

Data assembly

2026-06-14 update. The SP-specific assembly script was retired when the national cand_poll.parquet build matured (Routes A+B+C+D). The SP slice is now a filter on the national file:

import pandas as pd
# Load the national candidate-poll table, then keep only the SP rows.
df = pd.read_parquet("build/assemble/cand_poll.parquet")
df = df[df["uf"] == "SP"]

This is what source/analysis/sp_regressions.py does. Row counts shift slightly (~5.3k SP candidate-poll rows under the national pipeline vs ~4.8k under the dropped SP-specific within-muni name matcher); the coefficients keep their sign and remain strongly positive and significant.

The numbers below are the pre-retirement counts (from the original SP-only assembly, kept for context):

Per-race coverage:

Preliminary regressions

source/analysis/sp_regressions.py → build/table/sp_regressions.csv. All SEs are cluster-robust at the race (muni) level.

| Spec | β (sponsored_by) | SE | p | β (opp_sponsored) | SE | p |
|---|---|---|---|---|---|---|
| naive (no FE) | +2.17 | 2.49 | 0.38 | -0.99 | 1.68 | 0.55 |
| Spec 1 (pollster + candidate FE) | +7.64 | 2.56 | 0.003 | -3.11 | 2.79 | 0.27 |
| Spec 2 (+ structured methodology) | +7.24 | 2.46 | 0.003 | -3.14 | 2.63 | 0.23 |
| Spec 2 WLS (weighted by N) | +8.06 | 2.60 | 0.002 | -2.84 | 2.28 | 0.21 |
| Spec 3a (clean comparator + candidate FE) | +8.22 | 3.17 | 0.010 | | | |
| Spec 3b (clean + race × month FE) | +8.02 | 5.43 | 0.140 | | | |
| Spec 3c (clean + race × week FE, strict) | +15.73 | 3.41 | <0.001 | | | |
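A minimal sketch of why the naive and FE rows can differ this much, run on synthetic data rather than the SP slice. It is not the code in sp_regressions.py: every variable name, the data-generating process, and the true effect of 7.0 are assumptions made for illustration. Candidates with a negative candidate-specific error are made more likely to self-sponsor, pollster dummies and a within-candidate demeaning stand in for Spec 1, and the SEs use a plain cluster-robust sandwich at the race level with no small-sample correction.

```python
import numpy as np

def group_demean(a, groups):
    """Within transformation: subtract each group's column means."""
    _, inv = np.unique(groups, return_inverse=True)
    sums = np.zeros((inv.max() + 1, a.shape[1]))
    np.add.at(sums, inv, a)
    means = sums / np.bincount(inv)[:, None]
    return a - means[inv]

def ols_cluster(y, X, clusters):
    """OLS estimates with a plain cluster-robust sandwich
    (no small-sample correction, unlike most packaged estimators)."""
    bread = np.linalg.pinv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    meat = np.zeros_like(bread)
    for c in np.unique(clusters):
        mask = clusters == c
        score = X[mask].T @ resid[mask]
        meat += np.outer(score, score)
    return beta, np.sqrt(np.diag(bread @ meat @ bread))

# Synthetic candidate-poll rows (hypothetical): 120 candidates, 8 polls each,
# 3 candidates per race, 6 pollsters.
rng = np.random.default_rng(0)
cand = np.repeat(np.arange(120), 8)
race = cand // 3
cand_eff = rng.normal(0, 5, 120)               # candidate-specific mean error
p_self = np.where(cand_eff < 0, 0.5, 0.1)      # negative selection into sponsoring
sponsored = (rng.random(cand.size) < p_self[cand]).astype(float)
pollster = rng.integers(0, 6, cand.size)
error = cand_eff[cand] + 7.0 * sponsored + rng.normal(0, 2, cand.size)

# Naive: intercept + sponsored_by, no fixed effects.
X_naive = np.column_stack([np.ones_like(sponsored), sponsored])
b_naive, se_naive = ols_cluster(error, X_naive, race)

# FE: pollster dummies (one dropped) plus candidate FE via demeaning.
dummies = (pollster[:, None] == np.arange(1, 6)).astype(float)
X_fe = group_demean(np.column_stack([sponsored, dummies]), cand)
y_fe = group_demean(error[:, None], cand)[:, 0]
b_fe, se_fe = ols_cluster(y_fe, X_fe, race)

print(f"naive beta = {b_naive[1]:.2f} (SE {se_naive[1]:.2f})")
print(f"FE beta    = {b_fe[0]:.2f} (SE {se_fe[0]:.2f})")
```

In this toy setup the naive coefficient is pulled below the true effect because sponsoring candidates start from lower baseline errors, while the within-candidate estimate recovers it. That matches the direction of the naive → Spec 1 move in the table, but says nothing about its size in the real data.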

Symmetry test (Spec 2): β_self − β_opp = +7.24 − (−3.14) ≈ +10.4. The two coefficients have opposite signs, which points to a bias operating on the sponsor's own candidate rather than a generic pollster house effect.
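The Spec 2 row gives the difference directly, but a standard error for it also needs the covariance between the two estimates, which is not reported here. The sketch below shows the arithmetic; the function name is hypothetical, and the default `cov=0.0` is a placeholder assumption that should be replaced by the corresponding off-diagonal entry of the fitted model's cluster-robust covariance matrix.

```python
import math

def coef_difference_test(b_self, b_opp, se_self, se_opp, cov=0.0):
    """Wald-style z test for H0: b_self - b_opp = 0.

    cov is Cov(b_self, b_opp); 0.0 is an illustrative placeholder only.
    """
    diff = b_self - b_opp
    se_diff = math.sqrt(se_self**2 + se_opp**2 - 2.0 * cov)
    z = diff / se_diff
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))  # normal approximation
    return diff, se_diff, z, p_two_sided

# Spec 2 coefficients and SEs from the table; the covariance is NOT from the table.
diff, se_diff, z, p = coef_difference_test(7.24, -3.14, 2.46, 2.63)
print(f"diff = {diff:+.2f}, SE (cov assumed 0) = {se_diff:.2f}")
```

A negative covariance between the two estimates would widen this SE and a positive one would shrink it, so the zero-covariance figure is only a reference point.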

Spec 3 family: timing-controlled identification

Endogeneity concern (flagged 2026-06-01): candidates may commission polls when they privately believe they're leading — time-varying private momentum that within-candidate FE doesn't absorb. Three specs address it:

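Spec 3a (clean comparator only, no timing FE) gives β = +8.22, within one SE of Spec 2's +7.24: the within-candidate FE was already doing most of the work, and restricting to clean comparators nudges the point estimate up while widening the SE (3.17 vs 2.46). Spec 3b (race × month FE) keeps a similar point estimate (+8.02) but loses precision (SE 5.43, p = 0.14). Spec 3c (race × week FE) is identified off only 3 (race × week) cells in SP and gives β = +15.7; the estimate is high and, with so few identifying cells, fragile despite its small reported SE (3.41). The bulk laptop run will give a usable Spec 3c sample.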
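One plausible way to count the cells that a race × week specification can use is to keep only the cells containing both a candidate-sponsored poll and an independent one. The sketch below does that on toy poll-level rows. The column names and dates are hypothetical, and the actual definition of an identifying cell in sp_regressions.py may differ.

```python
import pandas as pd

# Toy poll-level rows; names and values are made up for illustration.
polls = pd.DataFrame({
    "race": ["A", "A", "A", "A", "B", "B", "B"],
    "field_date": pd.to_datetime([
        "2024-09-02", "2024-09-04",   # race A, same week, mixed sponsorship
        "2024-09-10", "2024-09-18",   # race A, two different weeks
        "2024-09-03", "2024-09-05",   # race B, same week, both independent
        "2024-09-12",                 # race B, lone sponsored poll
    ]),
    "candidate_sponsored": [1, 0, 0, 1, 0, 0, 1],
})

# Calendar week of fieldwork (pandas default: weeks ending on Sunday).
polls["week"] = polls["field_date"].dt.to_period("W")

# A cell helps identification only if sponsorship varies inside it.
varies = polls.groupby(["race", "week"])["candidate_sponsored"].nunique() > 1
n_identifying = int(varies.sum())
print(f"{n_identifying} of {len(varies)} race-week cells have within-cell variation")
```

A count this small on the real data is the reason the Spec 3c estimate is treated as provisional.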

Pre-poll trajectory placebo

A direct test of the "self-sponsor when leading" hypothesis. For each candidate with a self-sponsored poll, look at their poll number in the most recent INDEPENDENT poll fielded before the self-sponsored one in the same race. If "candidate commissions when leading" is the explanation, the preceding independent poll should already be high (both polls are measuring the same private peak).

7 SP candidates qualify (a self-sponsored poll preceded by an independent poll in the same race). Median time gap: 4 days.

| Metric | Value (pp) |
|---|---|
| Mean error in self-sponsored polls | −1.28 (close to truth) |
| Mean error in preceding independent polls | −12.55 (large negative: they understated) |
| Mean within-candidate jump (self − pre-indep) | +11.27 |

In other words, for the same candidate, the self-sponsored poll lands ~11 pp higher than the immediately preceding independent poll. The time gap is too short (median 4 days) for genuine momentum to plausibly explain that magnitude.

Caveat: only 7 candidates contribute to the placebo, and the bulk run will firm this up. Still, both the direction and the magnitude support the slant interpretation over the timing-of-commission alternative.
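The matching step behind the placebo (for each self-sponsored poll, find the most recent independent poll fielded strictly before it for the same candidate in the same race) can be sketched with `pandas.merge_asof`. The rows, column names, and the shortcut of treating every non-self-sponsored poll as independent are assumptions for illustration, not the project's actual schema or numbers.

```python
import pandas as pd

# Toy candidate-poll rows; all names and values are hypothetical.
polls = pd.DataFrame({
    "race": ["A", "A", "A", "B", "B"],
    "candidate": ["x", "x", "x", "y", "y"],
    "field_date": pd.to_datetime(
        ["2024-09-01", "2024-09-10", "2024-09-14", "2024-09-05", "2024-09-20"]
    ),
    "self_sponsored": [False, False, True, True, False],
    "error": [-6.0, -9.0, -2.0, 0.5, -8.0],   # poll share minus actual result, pp
})

cols = ["race", "candidate", "field_date", "error"]
self_polls = polls.loc[polls["self_sponsored"], cols].sort_values("field_date")
indep = (
    polls.loc[~polls["self_sponsored"], cols]
    .assign(indep_date=lambda d: d["field_date"])   # keep the matched poll's date
    .sort_values("field_date")
)

# Backward as-of join: latest independent poll strictly before each self-sponsored one.
pairs = pd.merge_asof(
    self_polls, indep,
    on="field_date", by=["race", "candidate"],
    direction="backward", allow_exact_matches=False,
    suffixes=("_self", "_indep"),
).dropna(subset=["error_indep"])   # drop self-sponsored polls with no predecessor

pairs["gap_days"] = (pairs["field_date"] - pairs["indep_date"]).dt.days
pairs["jump"] = pairs["error_self"] - pairs["error_indep"]
print(pairs[["race", "candidate", "gap_days", "error_self", "error_indep", "jump"]])
```

Here candidate y has no independent poll before the self-sponsored one and drops out, which is the same qualification rule that leaves 7 candidates in the SP slice.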

Interpretation (caveats apply)

  1. Naive → FE-adjusted β jumps from +2 to +7. Self-sponsoring candidates are negatively selected on true standing — the within-candidate FE strips this confound.
  2. ~7 pp self-sponsor bias is large but plausible given Brazil's "encomendada" stereotype. The TSE registration regime means we observe every registered poll, including unpublicized ones, so this estimate isn't contaminated by selection-into-release.
  3. The LLM methodology-controls spec (labelled Spec 3 here, distinct from the Spec 3a-c timing family above) is deferred: the poll_methodology extractor isn't built yet (queued in pipelines/politica/docs/todo.md). Without it, we can't decompose Channel A (Bayesian persuasion via design) vs Channel B (residual / fabrication). The stability of β from Spec 1 to Spec 2 (+7.64 → +7.24) hints that structured methodology controls don't absorb much of it, leaving room for both channels.

Caveats

Reproduce

# On educloud, in /workspace/pipelines/politica:
BASE_DIR=$PWD DATA_DIR=/workspace/data PYTHONPATH=/workspace/packages/llmkit \
  python source/llm/poll_extract.py --year 2024 --states SP --validate-cached
BASE_DIR=$PWD DATA_DIR=/workspace/data PYTHONPATH=/workspace/packages/llmkit \
  python source/clean/poll_sponsor.py

# Then in /workspace/projects/poll-sponsor-bias:
python source/assemble/cand_poll.py
python source/analysis/sp_regressions.py