On the 200-poll methodology subset, slant-permissive coverage classes (specific_neighborhoods + urban_only) appear in 12% of candidate-touched polls vs 10% of independent polls. The direction matches Channel A, but with n_candidate=25 the 2 pp gap is noisy. Opaque coverage (deferred_complement + not_specified) is 72% vs 80%, weakly contradicting the "candidates hide scope" prediction.

Confidence
yellow
Type
descriptive
Design
Sample
200-poll methodology LLM subset (poll_methodology_2024__subset_n200.parquet) joined to sponsor parquet
Specification
cross-tab of coverage_class × sponsor_type (candidate_touched / independent / other), with row and column shares
Notes
First of six descriptives (D1-D6) on the methodology LLM subset before the universe extraction lands. coverage_class is the load-bearing Channel A variable; sponsor_type comes from poll_sponsor_2024_candidate's route classification.
Script
source/analysis/an-019-coverage-class-by-sponsor-type.py
Target
build/table/an-019-coverage-class-by-sponsor-type.csv
Status
interpreted · 2026-06-02
Created
2026-06-02

Question

Of the 200 polls in the methodology LLM subset, which coverage_class values (specific_neighborhoods, urban_only, deferred_complement, etc.) appear disproportionately in candidate-touched sponsorships (any sponsor with route ∈ {cpf, committee, party, party_name}) vs independent sponsorships (all sponsors are media or pollster-self)? Coverage class is the load-bearing Channel A methodology lever: a poll that defers coverage to a complement document or restricts to specific neighborhoods has the most slant room without violating disclosure rules. If the slant mechanism is Channel A, we expect the worse coverage classes (deferred_complement, specific_neighborhoods) to cluster in the candidate-touched cell.
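The sponsor-type rule in the question can be sketched as a small function. This is a minimal illustration, not the project's classifier: the four candidate routes come from the text above, while the labels "media" and "pollster_self" for the independent routes and the function name sponsor_type_of are assumptions.

```python
# Routes that mark a sponsor as candidate-linked (taken from the question text).
CANDIDATE_ROUTES = {"cpf", "committee", "party", "party_name"}
# Hypothetical labels for "media or pollster-self"; the real route names may differ.
INDEPENDENT_ROUTES = {"media", "pollster_self"}


def sponsor_type_of(routes):
    """Classify one poll from the routes of all its sponsors."""
    routes = set(routes)
    if routes & CANDIDATE_ROUTES:
        # Any candidate-linked sponsor is enough.
        return "candidate_touched"
    if routes and routes <= INDEPENDENT_ROUTES:
        # Every sponsor is media or the pollster itself.
        return "independent"
    # Everything else (e.g. other_firm or mixed sponsors) is residual.
    return "other"
```

Under this reading, a poll co-sponsored by a media outlet and a party counts as candidate_touched, and a poll with no recognized sponsor route falls into "other".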

Design

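Per-protocol classification: each protocol gets one sponsor_type (candidate_touched if any sponsor has route ∈ {cpf, committee, party, party_name}; independent if all sponsors are media or pollster-self; other otherwise) and one coverage_class.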

Coverage class taxonomy from the LLM extraction (coverage__coverage_class field): full_municipality, urban_plus_selected_rural, urban_only, specific_neighborhoods, deferred_complement, not_specified.

Cross-tab + row shares + column shares + a stacked bar chart.
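The cross-tab and the two share tables could be produced along these lines. This is a sketch on toy rows, assuming the joined table exposes columns named coverage_class and sponsor_type; it is not the contents of the actual script, and the toy rows are not real data.

```python
import pandas as pd

# Toy rows standing in for the joined methodology + sponsor table (not real data).
df = pd.DataFrame({
    "coverage_class": [
        "deferred_complement", "deferred_complement", "not_specified",
        "specific_neighborhoods", "deferred_complement", "full_municipality",
    ],
    "sponsor_type": [
        "independent", "independent", "independent",
        "candidate_touched", "candidate_touched", "other",
    ],
})

# Raw counts: one row per coverage class, one column per sponsor type.
counts = pd.crosstab(df["coverage_class"], df["sponsor_type"])

# Column shares: distribution of coverage classes within each sponsor type.
col_shares = pd.crosstab(df["coverage_class"], df["sponsor_type"], normalize="columns")

# Row shares: distribution of sponsor types within each coverage class.
row_shares = pd.crosstab(df["coverage_class"], df["sponsor_type"], normalize="index")
```

Column shares answer "what fraction of candidate-touched polls use this class", which is the comparison the question asks for; row shares answer the reverse and are sensitive to the unequal group sizes.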

Results

Coverage class by sponsor type

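Cross-tab on the n=200 methodology subset (n_independent = 141, n_candidate_touched = 25, n_other = 34). Cells are column shares: the percentage of each sponsor type's polls that falls in each coverage class.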

coverage_class              independent   candidate_touched   other
full_municipality                  2.8%                4.0%   17.6%
urban_plus_selected_rural          7.1%               12.0%    5.9%
urban_only                         2.8%                0.0%    0.0%
specific_neighborhoods             7.1%               12.0%   14.7%
deferred_complement               55.3%               56.0%   38.2%
not_specified                     24.8%               16.0%   23.5%

Slant-permissive (specific_neighborhoods + urban_only): 12% vs 10% (candidate vs independent). Direction matches Channel A, but the gap is 2 pp on an n=25 candidate cell, so it is not decisive.

Opaque (deferred_complement + not_specified): 72% vs 80% (candidate vs independent). Candidate-touched polls are slightly less opaque, weakly contradicting the simple "candidates hide scope" prediction.
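To make "noisy" concrete, the two grouped shares and a rough standard error for each gap can be computed as below. This is a sketch under stated assumptions: the cell counts are back-calculated from the rounded percentages and group sizes reported above (they are not read from the CSV), and the unpooled Wald standard error is only a crude guide at n=25.

```python
from math import sqrt

# Group sizes as reported in the cross-tab header.
N = {"independent": 141, "candidate_touched": 25}

# ASSUMPTION: counts reconstructed from the rounded shares (e.g. 7.1% of 141 -> 10).
COUNTS = {
    "independent": {"urban_only": 4, "specific_neighborhoods": 10,
                    "deferred_complement": 78, "not_specified": 35},
    "candidate_touched": {"urban_only": 0, "specific_neighborhoods": 3,
                          "deferred_complement": 14, "not_specified": 4},
}

SLANT_PERMISSIVE = ("specific_neighborhoods", "urban_only")
OPAQUE = ("deferred_complement", "not_specified")


def group_share(group, classes):
    """Share of a sponsor group's polls that fall in any of the given classes."""
    return sum(COUNTS[group][c] for c in classes) / N[group]


def gap_and_se(classes):
    """Candidate-minus-independent gap and its unpooled Wald standard error."""
    p_c = group_share("candidate_touched", classes)
    p_i = group_share("independent", classes)
    se = sqrt(p_c * (1 - p_c) / N["candidate_touched"]
              + p_i * (1 - p_i) / N["independent"])
    return p_c - p_i, se


slant_gap, slant_se = gap_and_se(SLANT_PERMISSIVE)
opaque_gap, opaque_se = gap_and_se(OPAQUE)
```

With these assumed counts, both gaps come out smaller than one standard error, which is consistent with reading the differences as direction only. An exact test would be a safer choice than the Wald approximation for cells this small.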

Interpretation

The simple Channel A reading ("candidate-touched polls disproportionately use selective-coverage classes") gets weak support. The specific_neighborhoods cell does have candidate-touched at 12% vs independent at 7%, but with n=25 candidate-touched polls that 12% is only 3 protocols (out of 18 specific_neighborhoods polls in total). That's suggestive, not decisive.

The deferred_complement rate (≈55% in both groups) suggests deferral itself is industry-wide boilerplate rather than a candidate-specific hiding tactic. AN-024 (D6) will probe deferral specifically.

The "other" residual (n=34) sits oddly: 18% full_municipality, much higher than either main group. These look like ad-hoc / one-off pollsters who don't follow the deferral convention. Worth flagging for the LLM refinement of the sponsor-type classifier.

The Channel A signal — if there is one — is probably not concentrated in coverage_class on this thin subset. Audit_pct, quota distributions, or operations levers (D3, D4) may carry more.

Follow-ups

  1. Refit on the universe extraction (extension): when the LLM methodology pass runs on all 14,876 protocols, re-run this cross-tab. With ~800 candidate-touched polls (Routes A+B+C+D), the 12% vs 10% gap should either sharpen into a clear difference or wash out entirely. Suggested script: same as this AN, swap input parquet.
  2. Specific-neighborhoods deep-dive (puzzle): the 3 candidate-touched protocols in this cell are the most direct Channel A evidence on the subset. Pull the actual coverage__coverage_class_evidence and coverage__excluded_areas_listed text to see whether the excluded neighborhoods correlate with the sponsoring candidate's expected weakness (working-class districts for an incumbent, etc.).
  3. Reclassify the "other" bucket (extension): 34 protocols land in "other" because their sponsor types don't cleanly fit the candidate-touched/independent split (mostly other_firm or mixed sponsors). The LLM sponsor-classifier refinement queued in docs/todo.md would split this cell — many "other_firm" sponsors are probably political consultancies / shells (would shift to candidate-touched).
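For the specific-neighborhoods deep-dive (follow-up 2), pulling the evidence text could look like the sketch below. The field names coverage__coverage_class, coverage__coverage_class_evidence and coverage__excluded_areas_listed come from this note; the protocol_id and sponsor_type column names, the helper name, and the toy rows with placeholder text are assumptions for illustration only.

```python
import pandas as pd

# Evidence fields named in follow-up 2.
EVIDENCE_COLS = [
    "coverage__coverage_class_evidence",
    "coverage__excluded_areas_listed",
]


def candidate_neighborhood_cases(df):
    """Return evidence text for candidate-touched specific_neighborhoods polls."""
    mask = (
        (df["coverage__coverage_class"] == "specific_neighborhoods")
        & (df["sponsor_type"] == "candidate_touched")
    )
    # protocol_id is a hypothetical key column used to identify each poll.
    return df.loc[mask, ["protocol_id", *EVIDENCE_COLS]]


# Toy rows with placeholder text (not real extraction output).
toy = pd.DataFrame({
    "protocol_id": ["p1", "p2", "p3"],
    "sponsor_type": ["independent", "candidate_touched", "candidate_touched"],
    "coverage__coverage_class": [
        "specific_neighborhoods", "specific_neighborhoods", "deferred_complement",
    ],
    "coverage__coverage_class_evidence": ["placeholder A", "placeholder B", "placeholder C"],
    "coverage__excluded_areas_listed": ["placeholder X", "placeholder Y", "placeholder Z"],
})

cases = candidate_neighborhood_cases(toy)
```

On the real subset this filter should return the 3 protocols discussed above, which is few enough to read by hand.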