On the 200-poll methodology subset, slant-permissive coverage classes (specific_neighborhoods + urban_only) appear in 12% of candidate-touched polls vs 10% of independent polls — direction matches Channel A but n_candidate=25 makes the gap noisy. Opaque coverage (deferred + not_specified) is 72% vs 80%, weakly contradicting the "candidates hide scope" prediction.
Question
Of the 200 polls in the methodology LLM subset, which coverage_class
values (specific_neighborhoods, urban_only, deferred_complement, etc.)
appear disproportionately in candidate-touched sponsorships (any
sponsor with route ∈ {cpf, committee, party, party_name}) vs
independent sponsorships (all sponsors are media or
pollster-self)? Coverage class is the load-bearing Channel A
methodology lever — a poll that defers coverage to a complement
document or restricts to specific neighborhoods has the most slant
room without violating disclosure rules. If the slant mechanism is
Channel A, we expect the more permissive coverage classes
(deferred_complement, specific_neighborhoods) to cluster in the
candidate-touched cell.
Design
Per-protocol classification:
- candidate_touched: protocol has ≥1 sponsor row whose sponsor_route is one of {cpf, committee, party, party_name}.
- independent: protocol's sponsor types are a subset of {media, pollster_self} and at least one such type is present (i.e., the poll_is_independent definition used elsewhere).
- other: residual, i.e. protocols with other_firm or mixed sponsors.
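
The three-way rule above can be sketched as a small function. This is a minimal sketch, not the pipeline's actual code: the name `classify_protocol` and the two-list input (one list of sponsor routes, one of sponsor types per protocol) are assumptions made for illustration, as is the choice to test the candidate-touched condition first.

```python
# Route and type vocabularies taken from the Design section.
CANDIDATE_ROUTES = {"cpf", "committee", "party", "party_name"}
INDEPENDENT_TYPES = {"media", "pollster_self"}


def classify_protocol(sponsor_routes, sponsor_types):
    """Return 'candidate_touched', 'independent', or 'other' for one protocol.

    sponsor_routes / sponsor_types: iterables of strings, one entry per
    sponsor row (hypothetical input shape).
    """
    routes = set(sponsor_routes)
    types = set(sponsor_types)
    # Any single candidate-route sponsor is enough (assumed to take precedence).
    if routes & CANDIDATE_ROUTES:
        return "candidate_touched"
    # Non-empty and entirely within {media, pollster_self}.
    if types and types <= INDEPENDENT_TYPES:
        return "independent"
    # Residual: other_firm, mixed, or no usable sponsor information.
    return "other"
```

Under this sketch a protocol with one media sponsor and one other_firm sponsor falls into "other", which matches the "mixed sponsors" wording of the residual class.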
Coverage class taxonomy from the LLM extraction
(coverage__coverage_class field):
full_municipality, urban_plus_selected_rural, urban_only,
specific_neighborhoods, deferred_complement, not_specified.
Cross-tab + row shares + column shares + a stacked bar chart.
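
For reference, the column-share computation can be sketched with the standard library alone. This is an illustrative sketch, not the script used for this AN: the function name `column_shares` and the (coverage_class, sponsor_group) pair input are assumptions, and the toy rows are made up rather than taken from the data.

```python
from collections import Counter


def column_shares(rows):
    """Percent of each sponsor group falling in each coverage class.

    rows: iterable of (coverage_class, sponsor_group) pairs, one per protocol.
    Returns {(coverage_class, sponsor_group): percent_of_group}.
    """
    rows = list(rows)  # materialise so we can count twice
    cell_counts = Counter(rows)
    group_totals = Counter(group for _, group in rows)
    return {
        (cls, group): 100.0 * n / group_totals[group]
        for (cls, group), n in cell_counts.items()
    }


# Toy input (made up), only to show the shape of the result.
toy_rows = [
    ("urban_only", "independent"),
    ("deferred_complement", "independent"),
    ("deferred_complement", "candidate_touched"),
]
toy_shares = column_shares(toy_rows)
```

Row shares follow from the same counts by dividing by per-class totals instead of per-group totals.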
Results

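Cross-tab on the n=200 methodology subset (n_independent = 141, n_candidate_touched = 25, n_other = 34). Cells are column shares, i.e. the percentage of each sponsor group that falls in the given coverage class:
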
| coverage_class | independent | candidate_touched | other |
|---|---|---|---|
| full_municipality | 2.8% | 4.0% | 17.6% |
| urban_plus_selected_rural | 7.1% | 12.0% | 5.9% |
| urban_only | 2.8% | 0.0% | 0.0% |
| specific_neighborhoods | 7.1% | 12.0% | 14.7% |
| deferred_complement | 55.3% | 56.0% | 38.2% |
| not_specified | 24.8% | 16.0% | 23.5% |

Slant-permissive (specific_neighborhoods + urban_only): 12% vs 10% (candidate vs independent). The direction matches Channel A, but a 2 pp gap on an n=25 candidate cell is not decisive.
Opaque (deferred_complement + not_specified): 72% vs 80% (candidate vs independent). Candidate-touched polls are slightly less opaque, weakly contradicting the simple "candidates hide scope" prediction.
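
To make "not decisive" concrete, one option is a Wilson score interval on each slant-permissive share. The sketch below is an illustration, not part of the original analysis. The counts are back-computed from the rounded percentages in the table (3 of 25 candidate-touched; 10 + 4 = 14 of 141 independent), so they are an assumption, and `wilson_interval` is a hypothetical helper name.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (default ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Counts reconstructed from rounded table percentages (assumption).
cand_lo, cand_hi = wilson_interval(3, 25)     # candidate-touched
ind_lo, ind_hi = wilson_interval(14, 141)     # independent
```

With these counts the candidate-touched interval is much wider than the independent one and fully contains it, which is consistent with reading the 2 pp gap as noise at this sample size.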
Interpretation
The simple Channel A reading — "candidate-touched polls disproportionately use selective-coverage classes" — gets weak support. The specific_neighborhoods cell does have candidate-touched at 12% vs independent at 7%, but on n=25 candidate-touched the absolute count is 3 protocols (out of 18 total specific_neighborhoods polls). That's suggestive, not decisive.
The deferred_complement rate (≈55% in both groups) suggests deferral itself is industry-wide boilerplate rather than a candidate-specific hiding tactic. AN-024 (D6) will probe deferral specifically.
The "other" residual (n=34) sits oddly — 18% full_municipality, much higher than either main group. These look like ad-hoc / one-off pollsters who don't follow the deferral convention. Worth flagging for the sponsor-type classifier LLM refinement.
The Channel A signal — if there is one — is probably not concentrated
in coverage_class on this thin subset. Audit_pct, quota distributions,
or operations levers (D3, D4) may carry more.
Follow-ups
- Refit on the universe extraction (extension): when the LLM methodology pass runs on all 14,876 protocols, re-run this cross-tab. With ~800 candidate-touched polls (Routes A+B+C+D), the 12% vs 10% gap should either sharpen into a clear difference or wash out entirely. Suggested script: same as this AN, swap input parquet.
- Specific-neighborhoods deep-dive (puzzle): the 3 candidate-touched protocols in this cell are the most direct Channel A evidence on the subset. Pull the actual coverage__coverage_class_evidence and coverage__excluded_areas_listed text to see whether the excluded neighborhoods correlate with the sponsoring candidate's expected weakness (working-class districts for an incumbent, etc.).
- Reclassify the "other" bucket (extension): 34 protocols land in "other" because their sponsor types don't cleanly fit the candidate-touched/independent split (mostly other_firm or mixed sponsors). The LLM sponsor-classifier refinement queued in docs/todo.md would split this cell; many "other_firm" sponsors are probably political consultancies / shells (would shift to candidate-touched).
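
On the first follow-up, a rough sense of how much the universe refit would tighten the candidate-touched estimate comes from the binomial standard error. This is a back-of-envelope sketch under two assumptions: the slant-permissive share stays near 12%, and the candidate-touched cell grows to about 800 as estimated above. It covers only the candidate-side uncertainty; the independent cell contributes its own term to the gap.

```python
import math


def proportion_se(p, n):
    """Standard error of a sample proportion p on n observations."""
    return math.sqrt(p * (1 - p) / n)


se_subset = proportion_se(0.12, 25)     # current subset, about 6.5 pp
se_universe = proportion_se(0.12, 800)  # hypothetical universe refit, about 1.1 pp
```

At n=25 the standard error is several times the 2 pp gap; at n≈800 it is smaller than the gap, so the refit could plausibly resolve the question one way or the other.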