Sponsored polls never use phone (0/244) and use in-person at 95%; independent polls use phone at 10%. The χ² test on the joint mode table gives p = 0.0003, but the dependence runs in the *opposite* direction of the cheap-mode-slant hypothesis. The mode-substitution lever is refuted on this sample.
Question
Quick-win probe from the source-of-bias agenda: does mode of data collection (in-person / phone / online / mixed) vary systematically between sponsored and matched independent polls? A sponsor that substitutes cheaper modes (phone-only, online) for in-person fieldwork could mechanically tilt the realized sample without violating any disclosure requirement — making mode a concrete Channel-A design lever.
Design
Source data: the 244 curated sponsored × independent pairs in
build/llm/curated_pairs/pairs_with_extractions.parquet. Each pair is
same muni × same candidate × ±14 days. The s_operations__mode and
i_operations__mode fields are already extracted as a controlled
vocabulary {in_person, phone, online, mixed, not_specified} from the
LLM methodology pass.
Tests:
- Marginals. sp mode marginal vs ind mode marginal across the 244 pairs.
- Contingency. 5×5 joint distribution of (s_mode, i_mode) with a chi-squared test on the full table.
- Sign test on differing pairs. Among pairs where the two sides disagree on mode, is the bias contrast sponsored_error - indep_error systematically signed? Tests whether the mode disagreement actually travels with the bias.
Power note: this is a thin contrast (the priors below show ~95% of
sponsored polls use in_person mode), so the regression
error ~ sponsored × mode + race × week FE from the original brief is
not run here — the categorical contrast has near-zero variance on the
sponsored side. Reported as a margin comparison plus the within-pair
sign test; full-power regression is deferred to when the full-universe
LLM methodology extractor lands (>200 protocols off in_person).
Results

Marginals (n=244 pairs):
| Mode | Sponsored | Independent |
|---|---|---|
| In-person | 232 (95.1%) | 216 (88.5%) |
| Phone | 0 (0.0%) | 24 (9.8%) |
| Online | 0 | 0 |
| Mixed | 0 | 2 (0.8%) |
| Not specified | 12 (4.9%) | 2 (0.8%) |
Two facts jump out:
- No sponsored poll in this 244-pair sample uses phone mode. Zero out of 244. Independent polls use phone in 10% of the sample. This is the opposite direction of the cheap-mode-substitution prior — sponsors are not picking phone polls to slant via reach.
- Sponsored polls are more "in-person" but also more "not specified". The 6.6 pp difference in in_person share (95% vs 89%) is mirrored by a 4.1 pp difference in not_specified share (5% vs 1%): sponsored polls advertise the gold-standard mode when they advertise at all, and decline to document otherwise.
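The percentage-point gaps quoted above can be rechecked directly from the marginal counts. The following is a minimal sketch, not the script used for this analysis; the variable and function names are illustrative assumptions, and only the counts come from the table.

```python
# Marginal mode counts copied from the table above (n = 244 pairs per side).
N_PAIRS = 244
sponsored = {"in_person": 232, "phone": 0, "online": 0, "mixed": 0, "not_specified": 12}
independent = {"in_person": 216, "phone": 24, "online": 0, "mixed": 2, "not_specified": 2}


def share_pct(count, total=N_PAIRS):
    """Return a count as a percentage of the total."""
    return 100.0 * count / total


# Percentage-point gap (sponsored minus independent) for each mode.
pp_gap = {mode: share_pct(sponsored[mode]) - share_pct(independent[mode]) for mode in sponsored}

for mode, gap in pp_gap.items():
    print(
        f"{mode:14s} {share_pct(sponsored[mode]):5.1f}% vs "
        f"{share_pct(independent[mode]):5.1f}%  gap {gap:+.1f} pp"
    )
```

Working from the raw counts rather than the rounded percentages avoids rounding drift: the in_person gap is 16/244 and the not_specified gap is 10/244.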
Joint contingency (5×5, sparsified to 2×4 non-zero after dropping all-zero rows/cols):
| | i:in_person | i:phone | i:mixed | i:not_spec |
|---|---|---|---|---|
| s:in_person | 208 | 22 | 1 | 1 |
| s:not_spec | 8 | 2 | 1 | 1 |
χ² = 18.66 (dof=3), p = 0.0003. Modes are strongly non-independent across the pair, but the structure of the dependence is "sponsored side forces in_person or stays silent" — not a substitution toward cheaper modes.
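The test statistic can be recomputed from the 2×4 table with the standard library alone. This is a sketch under stated assumptions rather than the original analysis code: the helper name `chi2_sf_dof3` is made up for illustration, and it uses the closed-form upper tail of the chi-squared distribution that holds only for 3 degrees of freedom.

```python
import math

# Observed counts from the 2×4 table above: rows are sponsored mode,
# columns are independent mode (in_person, phone, mixed, not_spec).
observed = [
    [208, 22, 1, 1],  # s:in_person
    [8, 2, 1, 1],     # s:not_spec
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total × column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
dof = (len(row_totals) - 1) * (len(col_totals) - 1)


def chi2_sf_dof3(x):
    """Upper-tail probability of a chi-squared variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_sf_dof3(chi2)
min_expected = min(min(row) for row in expected)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.4f}, min expected = {min_expected:.2f}")
```

The sketch reproduces χ² = 18.66 on 3 degrees of freedom. It also reports the smallest expected cell count, which falls below 1 in the s:not_spec row, so an exact or permutation test may be worth running as a cross-check on the asymptotic p-value.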
Bias contrast on differing-mode pairs (n=35):
- 22 pairs sponsored more biased, 13 pairs independent more biased
- Sign test p = 0.18 (two-sided); Wilcoxon signed-rank p = 0.16
- Mean contrast = +2.23 pp (sponsored higher) — consistent with the overall headline direction but underpowered
Among the 35 differing-mode pairs, the bias contrast does not significantly differ from zero. There is no evidence that mode disagreement carries the within-pair bias.
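The sign-test p-value follows from the 22/13 split alone, so it can be checked without the pair-level data. Below is a minimal sketch of an exact two-sided sign test; the function name is an illustrative assumption, and the Wilcoxon result is not reproduced because it needs the individual contrasts.

```python
from math import comb


def sign_test_two_sided(n_positive, n_negative):
    """Exact two-sided sign test p-value under H0: P(positive) = 0.5 (ties excluded)."""
    n = n_positive + n_negative
    k = min(n_positive, n_negative)
    # Probability of a split at least this lopsided in one direction.
    one_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Doubling is valid because the null distribution is symmetric; cap at 1.
    return min(1.0, 2 * one_tail)


# 22 pairs with sponsored more biased, 13 with independent more biased.
p_sign = sign_test_two_sided(22, 13)
print(f"sign test p = {p_sign:.2f}")
```

With 35 pairs, a 22/13 split sits well within what a fair coin produces, which is why the +2.23 pp mean contrast is read as underpowered rather than as evidence of no effect.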
Interpretation
The mode-substitution channel for sponsor bias is refuted on this sample. Sponsored polls do not lean on cheaper modes (phone, online, mixed) to mechanically tilt the sample; if anything, they overrepresent the gold-standard in-person mode while sometimes hiding the mode entirely.
Mode is not a Channel-A lever carrying the +7 pp slant. The χ² significance reflects opacity (the sponsored not_specified rate, 12/244, is six times the independent rate, 2/244), not design substitution.
This strengthens the opacity-as-default reading in source-of-bias.md § Opacity differences and rules out one of the six concrete-design candidates listed in that doc's probe agenda (item 5).
Follow-ups
- Why is phone mode entirely absent on the sponsored side? (puzzle): zero phone-mode polls across 244 sponsored polls is a sharp selection signal. Phone mode might be cheaper but also stigmatized as low-quality, so candidate-sponsors may avoid it for reputational reasons. Worth checking on the full-universe extractor (when it lands) whether 0% holds at scale or shifts. Suggested script: mode-by-sponsor-universe.py after the methodology extractor completes.
- Pollster fixed effects on the 35 differing-mode pairs (extension): are the 22 pairs with in-person on the sponsored side and phone on the independent side concentrated in a few specific pollster firms (e.g., AtlasIntel phone-IVR polls)? If so, the bias contrast may reflect a firm-tier story, not a sponsor-side mode choice. Suggested script: tabulate pollster_cnpj × differing-pair flag on the AN-041 detail CSV.
- Item 6: document the AN-041 size-mismatch finding in source-of-bias.md (extension): move "Mode × sponsor" from the open-questions table to the ruled-out table; flag mode-substitution as a refuted concrete-design mechanism. Source-of-bias edit only, no new script.
- Run AN-042 (interviewer training × sponsor, quick win 2/3) (extension): the natural next probe in the source-of-bias agenda; uses the already-extracted s_operations__interviewer_training_described and supervisor_role_described fields on the same 244-pair sample.
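For the pollster tabulation follow-up, one possible shape of the script is sketched below. Everything about the data layout is an assumption: the AN-041 detail CSV is taken to yield one dict per pair (for example via `csv.DictReader`) with a single `pollster_cnpj` column and mode columns named `s_mode` and `i_mode`. The example rows are placeholders for illustration only, not values from the data.

```python
from collections import Counter


def differing_pairs_by_pollster(rows, firm_key="pollster_cnpj", s_key="s_mode", i_key="i_mode"):
    """Count differing-mode pairs per pollster firm, most frequent first."""
    counts = Counter(row[firm_key] for row in rows if row[s_key] != row[i_key])
    return counts.most_common()


# Placeholder rows for illustration only; these are not values from the AN-041 data.
example_rows = [
    {"pollster_cnpj": "firm_a", "s_mode": "in_person", "i_mode": "phone"},
    {"pollster_cnpj": "firm_a", "s_mode": "in_person", "i_mode": "phone"},
    {"pollster_cnpj": "firm_b", "s_mode": "in_person", "i_mode": "in_person"},
    {"pollster_cnpj": "firm_c", "s_mode": "not_specified", "i_mode": "mixed"},
]
print(differing_pairs_by_pollster(example_rows))
```

If a handful of firms account for most of the differing pairs, that would point toward the firm-tier reading raised in the follow-up; the column names would need adjusting to whatever the detail CSV actually uses.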