The cheap Tier-2 structural test on coverage × candidate base finds **no triple-interaction signal**. On the universe-scale within-candidate FE sample (n=20,393 cand-poll rows, 3,524 candidates, 1,665 muni clusters), the headline sponsor effect lands at **+7.70 pp (SE 1.44, p<10⁻⁷)**, but the flat `sponsored × narrow_coverage` interaction is null (β = +1.61, SE 4.93, p=0.74) and the wash-out-breaking triple `sponsored × narrow_coverage × base_lv_size_weighted` is also null (β = −0.54, SE 1.35, p=0.69). The 95 % CI on the triple is [−3.2, +2.1] pp per unit of base_lv_dm, which rules out large triples but not modest ones. Read: coverage class is unlikely to be the dominant Channel-A lever for the headline +7 pp, consistent with AN-032's reversed-sign bairro test. This directs attention to weighting / income-quota features (next AN).

Confidence
yellow
Type
descriptive
Design
Sample
20,393 candidate-poll rows from build/assemble/cand_poll.parquet after dropping (a) candidates whose 2020 base profile is unavailable (no own 2020 prefeito candidacy AND no party 2020 prefeito / vereador run in muni — 33 % of the panel), (b) protocols without a poll_coverage extraction (0 % loss; universe-scale 14,876 protocols all have coverage_class), and (c) candidates appearing in only one poll (within-cand FE requires ≥ 2). 3,524 candidates, 1,665 muni clusters.
Specification
Within-candidate FE (PanelOLS, entity_effects=True, clusters=muni_id). Three nested specs: (Spec 1) `error ~ sponsored`, the headline replicate on the analysis slice; (Spec 2) `error ~ sponsored + narrow_coverage + sponsored×narrow_coverage`, the flat AN-019-style test; (Spec 3) `error ~ sponsored + narrow_coverage + sponsored×narrow + sponsored×base_lv_dm + narrow×base_lv_dm + sponsored×narrow×base_lv_dm`, the triple-interaction test (`narrow` abbreviates `narrow_coverage`). `narrow_coverage = 1[coverage_class ∈ {urban_only, specific_neighborhoods}]`. `base_lv_dm` is the candidate's base_lv_size_weighted demeaned across the analysis sample (pre-demeaning mean 21.6). The base_lv_dm main effect is candidate-level and therefore absorbed by the candidate FE; the two two-way interactions involving base_lv_dm and the triple are identified from within-candidate variation in sponsored and narrow_coverage.
Comparator
independent-media or pollster-self polls of the same candidate (the within-cand FE design defines the comparator implicitly)
Cluster
muni_id
Weights
unweighted
Script
source/analysis/an-055-coverage-by-cand-base.py
Target
build/table/an-055-coverage-by-cand-base.csv
Status
interpreted · 2026-06-14
Created
2026-06-14

Question

The blinded LLM-judge pilot (blinded Channel-A discovery brief) flagged coverage and weighting as the dominant high-plausibility mechanism domains, with 14/16 high-plausibility hypotheses agreeing with the actual sponsored side (87.5 %, p ≈ 0.004). The flat structural test sponsored × coverage_class (AN-019) was underpowered and direction-ambiguous: rural-base candidates' sponsors might choose rural-friendly coverage and urban-base candidates' sponsors might choose urban-only coverage, so a directional coverage effect can wash out in the aggregate.
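The quoted p ≈ 0.004 is consistent with an exact two-sided binomial (sign) test of 14 agreements out of 16 against a 50 % chance rate. The brief does not say which test was used, so the following is only a consistency check under that assumption; `sign_test_p` is a hypothetical helper name.

```python
from math import comb

def sign_test_p(successes: int, trials: int) -> float:
    """Exact two-sided binomial p-value against a null success rate of 0.5."""
    # Under p = 0.5 the distribution is symmetric, so the two-sided p-value
    # is twice the tail at least as extreme as the observed count.
    k = max(successes, trials - successes)
    upper_tail = sum(comb(trials, j) for j in range(k, trials + 1)) / 2 ** trials
    return min(1.0, 2 * upper_tail)

p_value = sign_test_p(14, 16)
print(f"{14 / 16:.1%} agreement, two-sided p = {p_value:.4f}")
```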

This analysis breaks the wash-out by interacting the coverage choice with each candidate's natural base concentration. The cheap proxy is base_lv_size_weighted, the vote-weighted average of "seções per local_votacao" (polling sections per voting location) across the candidate's 2020 base, so higher values proxy a base concentrated in larger, denser voting locations.

If sponsored polls choose narrow (urban-only / specific-neighborhood) coverage when their candidate's base concentrates in dense urban LVs, the triple interaction sponsored × narrow_coverage × base_lv_size_weighted should be positive.
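A minimal sketch of the proxy, assuming each voting location in the candidate's 2020 base contributes a vote count and a number of seções. The function mirrors the column name but is illustrative only, and the input layout (a list of `(votes, n_secoes)` pairs) is an assumption, not the layout used in the base-profile build.

```python
def base_lv_size_weighted(rows):
    """Vote-weighted mean of seções per voting location.

    rows: iterable of (votes_at_lv, n_secoes_at_lv) pairs (hypothetical layout).
    Returns None when the candidate has no votes to weight by.
    """
    rows = list(rows)
    total_votes = sum(votes for votes, _ in rows)
    if total_votes <= 0:
        return None
    return sum(votes * n_secoes for votes, n_secoes in rows) / total_votes

# Same two voting locations, opposite vote concentration.
urban_base = [(900, 30), (100, 2)]   # most votes in the large location
rural_base = [(100, 30), (900, 2)]   # most votes in the small location
print(base_lv_size_weighted(urban_base), base_lv_size_weighted(rural_base))
```

Two candidates running in the same locations get very different scores depending on where their votes concentrate, which is the variation the triple interaction relies on.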

Design

source/analysis/an-055-coverage-by-cand-base.py:

  1. Load build/assemble/cand_poll.parquet (with the base profile columns piped through from build/assemble/cand.parquet).
  2. Load coverage_class per protocol from the cached poll_coverage LLM extractions (14,876 protocols at universe scale — both LLM-extracted and deterministic short-circuits).
  3. Define narrow_coverage = 1[coverage_class ∈ {urban_only, specific_neighborhoods}].
  4. Restrict to candidates with a non-unavailable base_source and to candidates appearing in ≥ 2 polls. Demean base_lv_size_weighted for interaction interpretability.
  5. Fit three nested PanelOLS specs with within-candidate fixed effects and muni-clustered SEs.
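Steps 2 to 4 can be sketched in pandas as follows. This is not the script itself: `protocol_id` is an assumed key name, the demeaning is assumed to use the row-level mean of the analysis sample, and `prepare_sample` is a hypothetical helper.

```python
import pandas as pd

NARROW_CLASSES = {"urban_only", "specific_neighborhoods"}

def prepare_sample(cand_poll: pd.DataFrame, coverage: pd.DataFrame) -> pd.DataFrame:
    """Attach coverage, flag narrow polls, apply the sample filters, demean."""
    # Step 2: an inner merge keeps only protocols with a coverage extraction.
    df = cand_poll.merge(
        coverage[["protocol_id", "coverage_class"]], on="protocol_id", how="inner"
    )
    # Step 3: narrow-coverage indicator.
    df["narrow_coverage"] = df["coverage_class"].isin(NARROW_CLASSES).astype(int)
    # Step 4a: drop candidates without a usable 2020 base profile.
    df = df[df["base_source"] != "unavailable"]
    # Step 4b: within-candidate FE needs at least two polls per candidate.
    polls_per_cand = df.groupby("politico_id")["protocol_id"].transform("nunique")
    df = df[polls_per_cand >= 2].copy()
    # Step 4c: demean over the rows that survive the filters (assumption).
    df["base_lv_dm"] = df["base_lv_size_weighted"] - df["base_lv_size_weighted"].mean()
    return df
```

The filters run before demeaning so that `base_lv_dm` is centered on the analysis sample rather than on the full panel.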

The base-profile build is documented in source/intermediate/cand__base_profile.py and follows the fallback ladder: own 2020 prefeito vote → party 2020 prefeito vote in same muni → party 2020 vereador vote → unavailable.
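The fallback ladder amounts to a first-match rule. The sketch below assumes each rung is represented by a vote total that is `None` or zero when that source does not exist; the actual inputs and logic in the base-profile script are not shown here, and `resolve_base_source` is a hypothetical name.

```python
from typing import Optional

def resolve_base_source(
    own_prefeito_votes: Optional[float],
    party_prefeito_votes: Optional[float],
    party_vereador_votes: Optional[float],
) -> str:
    """Return the first rung of the ladder that has usable 2020 votes."""
    ladder = [
        ("own_2020_prefeito", own_prefeito_votes),
        ("party_2020_prefeito", party_prefeito_votes),
        ("party_2020_vereador", party_vereador_votes),
    ]
    for label, votes in ladder:
        if votes is not None and votes > 0:
            return label
    # No rung available: the candidate is dropped from the analysis sample.
    return "unavailable"

print(resolve_base_source(None, 1200.0, 300.0))
```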

Results

Headline (Spec 1) — sponsor effect survives on the analysis slice

| Statistic | Value |
| --- | --- |
| Sample (cand-poll rows) | 20,393 |
| Candidates | 3,524 |
| Muni clusters | 1,665 |
| β_sponsored | +7.70 pp |
| SE (muni-clustered) | 1.44 |
| t | 5.35 |
| p | 9.0 × 10⁻⁸ |

The +7 to +8 pp sponsor effect from the headline analysis lands here at +7.70 pp on a strict slice (within-cand FE + base profile available + coverage extraction available). No drift.
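The Spec 1 coefficient is a within-candidate slope with muni-clustered standard errors. The sketch below is a hand-rolled NumPy version of that estimator on toy data, not the PanelOLS call the script uses; it applies no small-sample correction, so its standard errors would differ slightly from a packaged estimator. `within_fe_clustered` and the toy arrays are hypothetical.

```python
import numpy as np

def within_fe_clustered(y, X, entity, cluster):
    """Entity-demeaned OLS with a cluster-robust sandwich covariance."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    _, ent = np.unique(np.asarray(entity).ravel(), return_inverse=True)
    _, clu = np.unique(np.asarray(cluster).ravel(), return_inverse=True)
    counts = np.bincount(ent)

    def demean(a):
        # Subtract each entity's mean from its rows (the "within" transform).
        sums = np.zeros((counts.size, a.shape[1]))
        np.add.at(sums, ent, a)
        return a - (sums / counts[:, None])[ent]

    y_w = demean(y[:, None])[:, 0]
    X_w = demean(X)
    bread = np.linalg.inv(X_w.T @ X_w)
    beta = bread @ (X_w.T @ y_w)
    resid = y_w - X_w @ beta
    # Sum score contributions within each cluster, then form the "meat".
    scores = np.zeros((clu.max() + 1, X_w.shape[1]))
    np.add.at(scores, clu, X_w * resid[:, None])
    cov = bread @ (scores.T @ scores) @ bread
    return beta, np.sqrt(np.diag(cov))

# Toy data: three candidates in two munis, true within-candidate slope 7.7.
politico_id = np.array([0, 0, 0, 1, 1, 1, 2, 2])
muni_id = np.array([0, 0, 0, 0, 0, 0, 1, 1])
sponsored = np.array([1, 0, 1, 0, 1, 0, 1, 0], dtype=float)
cand_effect = np.array([10.0, -5.0, 3.0])[politico_id]
error = cand_effect + 7.7 * sponsored

beta, se = within_fe_clustered(error, sponsored, politico_id, muni_id)
print(beta, se)
```

Because the toy outcome has no noise, the candidate intercepts are removed exactly and the slope is recovered regardless of how different the candidates' baseline errors are.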

Spec 2 — flat sponsor × narrow coverage interaction is null

| Coefficient | β | SE | t | p |
| --- | --- | --- | --- | --- |
| sponsored | +7.56 | 1.44 | 5.25 | 1.5 × 10⁻⁷ |
| narrow_coverage | −0.86 | 0.58 | −1.47 | 0.140 |
| sponsored × narrow_coverage | +1.61 | 4.93 | 0.33 | 0.743 |

The differential sponsor effect inside narrow-coverage polls is not statistically distinguishable from zero. This flat null is exactly what the wash-out argument anticipates, so it is direction-ambiguous on its own: it can mean (a) no coverage channel exists, or (b) the coverage channel exists but opposing directional alignments cancel in the aggregate.

Spec 3 — triple interaction (the wash-out-breaking test)

| Coefficient | β | SE | t | p |
| --- | --- | --- | --- | --- |
| sponsored | +6.29 | 1.69 | 3.72 | 2.0 × 10⁻⁴ |
| narrow_coverage | −0.91 | 0.58 | −1.57 | 0.117 |
| sponsored × narrow_coverage | −7.98 | 22.44 | −0.36 | 0.722 |
| sponsored × base_lv_dm | −0.097 | 0.11 | −0.92 | 0.357 |
| narrow_coverage × base_lv_dm | −0.0073 | 0.0080 | −0.92 | 0.359 |
| sponsored × narrow × base_lv_dm | −0.54 | 1.35 | −0.40 | 0.689 |

The triple is also null. Point estimate is small and slightly negative; 95 % CI on the triple is approximately [−3.2, +2.1] pp per unit of base_lv_dm.
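The interval can be reproduced from the Spec 3 row with a normal-approximation (Wald) interval, which is an assumption about how it was computed but is reasonable with 1,665 clusters. The scaling step shows what the per-unit interval implies across a chosen swing in base_lv_dm; the 25-unit swing is the approximate inter-quartile width quoted in the Interpretation section.

```python
def wald_ci(beta: float, se: float, z: float = 1.96):
    """Normal-approximation 95 % confidence interval."""
    return beta - z * se, beta + z * se

ci_low, ci_high = wald_ci(-0.54, 1.35)
half_width = (ci_high - ci_low) / 2   # pp per unit of base_lv_dm

swing = 25                            # units of base_lv_dm (approximate IQR width)
band = half_width * swing             # implied +/- band in pp across that swing

print(f"CI = [{ci_low:.1f}, {ci_high:.1f}], half-width = {half_width:.2f}")
print(f"implied band over {swing} units: +/- {band:.0f} pp")
```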

The candidate-level main effect of base_lv_dm is absorbed by the within-cand FE (it is constant per politico_id by construction). The three interactions involving base_lv_dm (sponsored × base_lv_dm, narrow_coverage × base_lv_dm, and the triple) are identified from within-candidate variation in sponsored and narrow_coverage.
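The absorption argument can be checked directly: after subtracting candidate means, a candidate-constant regressor is identically zero, while its product with a poll-varying indicator keeps within-candidate variation. The toy values below are made up for illustration.

```python
from collections import defaultdict

def within_demean(entity_ids, values):
    """Subtract each entity's mean from its own observations."""
    sums, counts = defaultdict(float), defaultdict(int)
    for e, v in zip(entity_ids, values):
        sums[e] += v
        counts[e] += 1
    return [v - sums[e] / counts[e] for e, v in zip(entity_ids, values)]

politico_id = [1, 1, 1, 2, 2]
base_lv_dm = [5.0, 5.0, 5.0, -3.0, -3.0]   # constant within candidate
sponsored = [1, 0, 0, 0, 1]                # varies across a candidate's polls
interaction = [s * b for s, b in zip(sponsored, base_lv_dm)]

main_within = within_demean(politico_id, base_lv_dm)
inter_within = within_demean(politico_id, interaction)
print(main_within)    # all zeros: the main effect is not identified
print(inter_within)   # non-zero: the interaction is identified
```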

Interpretation

What the triple null does and does not rule out

The 95 % CI on the triple has a half-width of about 2.6 pp per unit of base_lv_dm (1.96 × SE 1.35, matching the [−3.2, +2.1] interval reported in Results). Over the inter-quartile range of base_lv_dm (roughly 4 to 30 in the analysis sample, a swing of about 25 units), that implies a confidence band on the within-pair coverage-bias effect of approximately ±66 pp around the point estimate. This is very wide. We can rule out only very large coverage × base alignments; a modest one (say, a 5-pp differential between urban-base and rural-base candidates in narrow-coverage polls) sits comfortably inside the CI. This is not a precise null in the strict sense.

What it does say: at this proxy granularity (base_lv_size_weighted from 2020 seção votes), there is no evidence the structural Channel A mechanism runs through coverage × candidate-base alignment at universe scale.

Two readings survive

(R1) Coverage isn't the dominant Channel A lever. Consistent with AN-032 (bairro partisan composition reversed sign), AN-019 (small noisy positive), AN-024 (deferral wrong-signed). The +7.7 pp sponsor effect then runs through some other mechanism, most likely weighting, income-quota distributions, or scenario rotation (AN-051's already-flagged 26× under-documentation of name rotation in sponsored polls). The LLM-judge brief flagged cotas de renda (income quotas) / ponderação por renda (income weighting) / cobertura geográfica detalhada (detailed geographic coverage) at similar rates as cobertura apenas urbana (urban-only coverage); only that last category is what AN-055 tests.

(R2) Coverage IS the lever but the cheap proxy is too coarse. base_lv_size_weighted collapses the candidate's geographic base into one scalar (urban-rural-ish). It cannot distinguish "rural-far-from-center" from "low-income peri-urban" or "ethnically distinct neighborhood". Full Tier 2 (IBGE setor socioeconomic crosstabs) could sharpen, at the cost documented in docs/todo.md (3-5 days sandbox quick-and-dirty, 1.5-2 weeks pipeline-grade).

Both readings agree on the next move: test weighting features structurally before sharpening the geographic proxy. If weighting interactions show signal, the mechanism story tightens without spending the IBGE-setor infrastructure week. If weighting is also null, the case for the IBGE-setor sharpening strengthens.

Refined mechanism inventory (post-AN-055)

| Lever | Status | Evidence |
| --- | --- | --- |
| Bairro partisan composition | Reversed sign | AN-032 |
| Coverage class (flat) | Underpowered + 0 | AN-019 |
| Coverage class × candidate base (cheap Tier 2) | Null (this AN) | AN-055 |
| Coverage deferral | Wrong-signed | AN-024 |
| Audit pct | Heavy overlap, small right-tail gap | AN-021 |
| Methodology completeness | Wrong-signed | AN-022 |
| Interviewer training | Wrong-signed (sponsors describe MORE) | AN-042 |
| Mode (phone / in-person) | Wrong-signed | AN-041 |
| Nonresponse handling | Null-by-data-design | AN-043 |
| Name / scenario rotation | Working hypothesis: sponsored under-document rotation 5× | AN-051 |
| Weighting / income-quota distributions | Not yet tested structurally | none yet (LLM-judge flagged) |

The pattern: nearly every structural lever tested has come back null or wrong-signed against the Channel A "candidates hide methodology" prediction. The two open frontiers are scenario rotation (AN-051's robust positive finding) and weighting/income-quota features (not yet structured).

Follow-ups

  1. Next-up: weighting / income-quota structural extraction (highest paper-value extension). The LLM-judge brief's recurring features include cotas de renda (n=5), ponderação de renda / ponderação por renda (3+2), cotas por nível econômico (n=2), ponderação por nível econômico (n=1). None of the current poll_sampling.py / poll_operations.py schema fields capture quota DISTRIBUTION mismatches between sponsored and independent polls of the same race. Build a structured extractor for the income-quota vector and a "quota deviation from muni baseline" metric, then run a per-protocol panel test. This should take about one LLM extraction sprint at universe scale (after the queued sampling/operations batch resubmission).
  2. Full Tier 2 only if weighting is also null. With AN-055 null and AN-019/021/022/024 null/wrong, the case for spending the IBGE-setor week strengthens only if weighting also fails. Keep the Tier 2 todo entry as-is.
  3. Re-examine the blinded brief's cobertura geográfica and cobertura urbana e rural themes. These are finer than coverage_class's 6-bucket categorization. The LLM may be picking up substantive coverage differences inside the existing categories (e.g., within urban_plus_selected_rural, which rural districts are included can vary). A follow-up extractor for "list of bairros explicitly excluded" could sharpen.
  4. Sensitivity: base_source quality. Split the sample by base_source ∈ {own_2020_prefeito, party_2020_prefeito, party_2020_vereador} and re-fit Spec 3. If the own-2020-prefeito subset (sharpest base measurement) shows even a noisy positive triple, the proxy-coarseness reading (R2) gains; if it doesn't, the "coverage isn't the lever" reading (R1) gains.