Universe-scale n = 14,876 (11,430 in the headline 2×2 after dropping "other"). Candidate-touched polls defer at 35.5% vs 38.3% for independent polls: candidate-touched polls are LESS likely to defer (odds ratio 0.89, 95% CI [0.80, 0.98], chi-square p = 0.021). The "candidates hide coverage via deferral" hypothesis is rejected at universe scale: the difference is statistically significant but wrong-signed.

Confidence
green
Type
descriptive
Design
Sample
14,876 mayoral protocols (universe-scale via cov_bucket classifier)
Specification
2×2 table of cov_bucket (deferred_complement vs everything else) × sponsor_bucket (candidate_touched vs independent). Chi-square + odds ratio.
Notes
D6 of six. Promotes the deferral question (AN-019/020 saw 55% deferred in independent and 83% in committees) from the n=200 LLM subset to the full universe. cov_bucket is computed deterministically for the whole universe and does not need the LLM.
Script
source/analysis/an-024-coverage-deferral-by-sponsor.py
Target
build/table/an-024-coverage-deferral-by-sponsor.csv
Status
interpreted · 2026-06-02
Created
2026-06-02

Question

AN-019 found ~55% deferred in independent polls and 56% in candidate-touched polls (n=200 subset; essentially identical rates). AN-020 found committees are 83% deferred (n=6). Deferral could be either:

  1. Industry-wide boilerplate — pollsters submit a separate complementary methodology document by convention, regardless of sponsor. Then deferral rate is constant across sponsor types.
  2. Sponsor-specific tell — candidate-touched polls disproportionately defer to hide specific coverage choices behind a less-scrutinized complementary doc.

The universe-scale 2×2 test pins this down with proper power (n_candidate ≈ 800 expected).

Design

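Per-protocol classification from the cov_bucket scan + sponsor parquet: cov_bucket is collapsed to deferred_complement vs everything else, and sponsor_bucket takes the values candidate_touched, independent, or other.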

Two-by-two chi-square + odds ratio on the candidate-vs-independent cells. Drop "other" for the headline test.
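
As a minimal sketch of this tallying step, assume each protocol arrives as a record with cov_bucket and sponsor_bucket fields. The toy records and the non-deferral bucket names ("full_text", "partial") are made up for illustration; the actual script may load and count the data differently.

```python
from collections import Counter

# Toy records for illustration only.
# "full_text" and "partial" are hypothetical cov_bucket values.
protocols = [
    {"cov_bucket": "deferred_complement", "sponsor_bucket": "candidate_touched"},
    {"cov_bucket": "full_text", "sponsor_bucket": "candidate_touched"},
    {"cov_bucket": "deferred_complement", "sponsor_bucket": "independent"},
    {"cov_bucket": "partial", "sponsor_bucket": "independent"},
    {"cov_bucket": "deferred_complement", "sponsor_bucket": "other"},
]


def two_by_two(records):
    """Count (sponsor, deferred?) cells, dropping the "other" sponsor bucket."""
    cells = Counter()
    for rec in records:
        sponsor = rec["sponsor_bucket"]
        if sponsor not in ("candidate_touched", "independent"):
            continue  # "other" is excluded from the headline test
        deferred = rec["cov_bucket"] == "deferred_complement"
        cells[(sponsor, deferred)] += 1
    return cells


table = two_by_two(protocols)
for sponsor in ("candidate_touched", "independent"):
    # Columns: sponsor, n deferred, n not deferred
    print(sponsor, table[(sponsor, True)], table[(sponsor, False)])
```

Collapsing cov_bucket to a boolean before counting keeps the table 2×2 however many non-deferral buckets the classifier emits.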

Results

Universe-scale 14,876 mayoral protocols:

sponsor_bucket       n        n_defer   defer_rate
candidate_touched    1,928    684       35.5%
independent          9,502    3,637     38.3%
other                3,446    1,163     33.7%

2×2 chi-square (candidate-touched vs independent): odds ratio 0.89, 95% CI [0.80, 0.98], p = 0.021.
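
The headline statistics can be re-derived from the candidate_touched and independent rows of the table. The sketch below uses only the standard library and makes two assumptions not stated in the source: the chi-square is Pearson's without continuity correction, and the interval is a Wald 95% CI on the log odds ratio.

```python
import math

# Counts from the results table (candidate_touched and independent rows).
a, n1 = 684, 1928       # candidate-touched: deferred, total
c, n2 = 3637, 9502      # independent: deferred, total
b, d = n1 - a, n2 - c   # non-deferred counts

odds_ratio = (a * d) / (b * c)

# Wald 95% CI on the log odds ratio (assumed interval type).
se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
ci_low = math.exp(math.log(odds_ratio) - 1.96 * se)
ci_high = math.exp(math.log(odds_ratio) + 1.96 * se)

# Pearson chi-square without continuity correction (assumed variant).
n = n1 + n2
chi2 = n * (a * d - b * c) ** 2 / (n1 * n2 * (a + c) * (b + d))

# With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"OR={odds_ratio:.2f} CI=[{ci_low:.2f}, {ci_high:.2f}] "
      f"chi2={chi2:.2f} p={p_value:.3f}")
```

Under these assumptions the sketch reproduces the reported OR of 0.89, CI of [0.80, 0.98], and p of about 0.021. A continuity-corrected test would give a slightly larger p-value.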

Interpretation

With proper power at universe scale, candidate-touched polls are LESS likely to defer than independent polls, by 2.8 percentage points (35.5% vs 38.3%; OR = 0.89, 95% CI [0.80, 0.98], p = 0.021). The simple Channel A subprediction "candidate-touched polls hide coverage by deferring to a complementary document" is rejected: the difference is statistically significant but wrong-signed.

This sharpens the cumulative finding from D1-D6:

Across every measured methodology lever, the Channel A "candidates minimize/hide methodology" prediction is either null or wrong-signed. The +7 pp sponsor bias estimated in AN-001 must be operating through something the LLM methodology extraction did not capture:

  1. Channel B (residual / fabrication) — interviewer-level shading that wouldn't show up in any disclosed methodology field. The day-to-election decay (AN-005) hints at this — slant shrinks toward the verification event.
  2. Channel A via levers not measured — quota distributions, actual rural sub-district selection, weighting choices in the complement document. These are below the resolution of the current LLM schema.
  3. Quota distribution slant inside a constant menu — 88% of pollsters use {sex, age, education, income} quotas, but the bin shares within those quotas vary. A poll that quotas to a demographically-favorable distribution (e.g., young voters over-sampled when the candidate skews young) slants without touching coverage class or audit.

This sets the agenda for the Spec 3 regression on the universe LLM extract: the β shrinkage will likely be small when the methodology features land. The interesting follow-up will be Channel B diagnostics, not Channel A.

Follow-ups

  1. Quota-distribution slant test (extension): once the LLM extract is available at universe scale, parse its sampling__quota_distributions JSON and compare each poll's bin shares to the IBGE Census 2022 reference for the municipality. A poll that quotas to a demographically shifted distribution is doing Channel A via a lever that AN-019 through AN-024 don't capture.
  2. Day-to-election decay × sponsor type (extension): AN-005 showed β decays toward election day. Is that decay larger for the audit-pct/coverage-deferral-controlled subset? That would tighten the Channel B story.
  3. Funding-source disclosure × β (blind spot): only 9% of polls mention a funding source (AN-022 fields), whereas the DS_ORIGEM_RECURSO flag in the registry is universal. Do the polls that voluntarily mention funding have smaller β? That would indicate self-selection on transparency.
  4. Update theory.md § Channel A vs B framing (blind spot): the project's docs/theory.md § "Polls as Bayesian persuasion" currently treats Channel A as the leading hypothesis. The universe-scale D1-D6 results justify pre-emptively weakening that framing — Channel B should at least be co-equal.
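
For follow-up 1, one possible slant metric is the total variation distance between a poll's quota shares and the census reference shares, computed per quota dimension. The sketch below is illustrative only. The JSON layout assumed for sampling__quota_distributions is hypothetical, the helper name total_variation is invented here, and the reference shares are made-up numbers, not IBGE Census 2022 figures.

```python
import json


def total_variation(poll_shares, ref_shares):
    """Half the L1 distance between two share dicts over the union of bins."""
    bins = set(poll_shares) | set(ref_shares)
    return 0.5 * sum(
        abs(poll_shares.get(k, 0.0) - ref_shares.get(k, 0.0)) for k in bins
    )


# Hypothetical payload; the real field layout is an assumption.
raw = '{"age": {"16-24": 0.30, "25-44": 0.40, "45+": 0.30}}'
poll = json.loads(raw)

# Made-up reference shares for illustration, NOT real census data.
reference = {"age": {"16-24": 0.20, "25-44": 0.40, "45+": 0.40}}

# One distance per quota dimension; 0 means the quota matches the reference.
slant = {dim: total_variation(poll[dim], reference[dim]) for dim in poll}
print(slant)
```

Total variation distance is unsigned, so it measures how far a quota departs from the reference but not whether the shift favors the sponsoring candidate. A signed version would need candidate-by-demographic support data that this note does not describe.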