id: an-043
hypothesis: methodology-flexibility-a
type: descriptive
status: done
status_date: 2026-06-02
confidence: yellow
created: 2026-06-02
script: source/analysis/an-043-nonresponse-handling-by-sponsor.py
target: build/table/an-043-nonresponse-handling-by-sponsor.csv
headline: Structured `nonresponse_handling` is 100 % `not_specified` on every pair. Diagnostic regex grep confirms the substantive null; undecided/refusal vocabulary appears in only 5 sponsored and 2 independent pair-sides out of 488. The probe is null-by-data-design: registration PDFs are pre-fielding planning documents and do not describe a post-fielding analytical choice. Testing the nonresponse-handling × sponsor lever requires the post-fielding relatório PDFs (currently outside the methodology extraction pipeline).
design:
  sample: 244 sponsored × independent curated pairs (same muni, same candidate, ±14 d)
  specification: marginal probe on structured `nonresponse_handling`; diagnostic regex grep across other free-text methodology fields (sample_design_evidence, stage_descriptions, quota_distributions, extraction_notes) for Portuguese nonresponse keywords; sponsor contrast on the grep-hit indicator
  comparator: independent poll within the matched pair
notes: tests probe item 3 (originally item 1) in source-of-bias.md, "Nonresponse handling × sponsor". Third of three source-of-bias quick wins. Closes the quick-win batch.

AN-043: Does nonresponse / undecided handling vary between sponsored and matched independent polls?

Question

Third quick-win probe from the source-of-bias agenda. Different nonresponse-handling rules — redistribute to leaders, redistribute proportionally, exclude, or leave undecideds in — produce mechanically different headline shares for the same realized data. If sponsored polls disproportionately use a leader-redistribute rule, that is a direct Channel-A lever: same field data, biased reported headline.
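
To make the mechanical effect concrete, here is a minimal sketch that applies the four rules to one made-up poll. The numbers, the `headline_shares` helper, and the reading of "redistribute to leaders" as "give every undecided to the single front-runner" are illustrative assumptions, not part of the AN-043 pipeline.

```python
from typing import Dict

UNDECIDED = "undecided"  # hypothetical key for the undecided share


def headline_shares(raw: Dict[str, float], rule: str) -> Dict[str, float]:
    """Reported shares (percent) for one poll under a given handling rule."""
    undecided = raw.get(UNDECIDED, 0.0)
    candidates = {k: v for k, v in raw.items() if k != UNDECIDED}
    decided_total = sum(candidates.values())
    if rule == "leave_in":
        # Report the raw shares, undecideds included.
        return dict(raw)
    if rule == "exclude":
        # Drop undecideds and renormalize over decided respondents.
        return {k: 100.0 * v / decided_total for k, v in candidates.items()}
    if rule == "proportional":
        # Split undecideds in proportion to each candidate's decided share.
        return {k: v + undecided * v / decided_total for k, v in candidates.items()}
    if rule == "leader":
        # Assumption: all undecideds go to the single front-runner.
        leader = max(candidates, key=candidates.get)
        return {k: v + (undecided if k == leader else 0.0) for k, v in candidates.items()}
    raise ValueError(f"unknown rule: {rule}")


# Made-up field data: same realized responses, four possible headlines.
raw = {"A": 40.0, "B": 35.0, UNDECIDED: 25.0}
for rule in ("leave_in", "exclude", "proportional", "leader"):
    shares = headline_shares(raw, rule)
    print(rule, {k: round(v, 1) for k, v in shares.items()})
```

With this toy input the front-runner's reported share ranges from 40.0 (undecideds left in) to 65.0 (all undecideds assigned to the leader). Exclusion and proportional redistribution coincide at 53.3 here, since both renormalize over decided respondents when the raw shares sum to 100.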

Design

Source data: the 244 curated sponsored × independent pairs in build/llm/curated_pairs/pairs_with_extractions.parquet. The probe runs in two stages:

Structured probe. Inspect the existing s_operations__nonresponse_handling / i_operations__nonresponse_handling fields (controlled vocab from the original LLM pass).
Diagnostic grep. If the structured field has no variation, grep the other free-text methodology fields (sample_design_evidence, stage_descriptions, quota_distributions, extraction_notes — both s_ and i_ sides) for Portuguese nonresponse-handling keywords: não/nao respond, recus, descart, redistribu, proporcional, indecis, branco, nulo, etc. The grep-hit indicator becomes a binary feature; we then run McNemar on it as in AN-042.

The grep step is diagnostic — it answers "is there signal in the raw text that the controlled-vocab extractor missed, or do the PDFs genuinely fail to describe nonresponse handling?"
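
The diagnostic grep and the resulting binary indicator could be sketched roughly as follows. This is not the code in the an-043 script: the column names are simplified stand-ins (prefix plus field name), the pattern is assembled from the keyword stems listed above, and the convention that b counts sponsored-only hits and c independent-only hits is an assumption.

```python
import re
import unicodedata

# Keyword stems from the design; accents are stripped before matching.
NONRESPONSE_PATTERN = re.compile(
    r"nao\s+respond|recus|descart|redistribu|proporcional|indecis|branco|nulo"
)

# Simplified, hypothetical field names (real columns carry longer prefixes).
TEXT_FIELDS = (
    "sample_design_evidence",
    "stage_descriptions",
    "quota_distributions",
    "extraction_notes",
)


def normalize(text: str) -> str:
    """Lowercase and strip accents so that 'não' also matches 'nao'."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def has_hit(row: dict, prefix: str) -> bool:
    """True if any free-text field on one pair-side ('s_' or 'i_') matches."""
    blob = " ".join(normalize(str(row.get(prefix + f) or "")) for f in TEXT_FIELDS)
    return bool(NONRESPONSE_PATTERN.search(blob))


def discordant_counts(rows: list) -> tuple:
    """Count pairs where only one side hits: (sponsored-only, independent-only)."""
    b = c = 0
    for row in rows:
        s_hit, i_hit = has_hit(row, "s_"), has_hit(row, "i_")
        if s_hit and not i_hit:
            b += 1
        elif i_hit and not s_hit:
            c += 1
    return b, c


# Made-up pair for illustration only.
example = {
    "s_sample_design_evidence": "Eleitores indecisos e que não responderam serão registrados.",
    "i_sample_design_evidence": "Amostragem por cotas de sexo e idade.",
}
print(has_hit(example, "s_"), has_hit(example, "i_"), discordant_counts([example]))
```

Only the discordant counts feed McNemar; pairs where both sides hit or both miss carry no information about the sponsor contrast.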

Results

AN-043: Nonresponse-handling regex hits — sponsored vs matched independent

Structured field. 100 % not_specified on every pair-side (244/244 sponsored AND 244/244 independent). No variation to test.

Diagnostic regex grep across the surrounding free-text fields (sample_design_evidence, stage_descriptions, population_reference_evidence, quota_distributions, extraction_notes, mode_details, etc.):

Hit family	Sponsored	Independent	Δ	b	c	McNemar p
A — undecided / refusal vocabulary (`indeciso`, `recusa`, `branco`, `nulo`, `nao respondeu`, etc.)	2.0 %	0.8 %	+1.2 pp	3	0	0.25
B — redistribution / treatment vocabulary (`redistribu`, `descart`, `proporcional`, `imput`, `pondera`, `cristaliz`)	66.0 %	61.5 %	+4.5 pp	62	51	0.35
Joint A AND B (strict hit)	2.0 %	0.0 %	+2.0 pp	5	0	0.06

Caveat on Family B. The high marginal hit rate (66 %) is contaminated by proporcional, which is canonical Portuguese sampling jargon ("amostragem por cotas proporcionais") appearing in almost every PDF. It does not signal redistribution of undecideds. Family A is the cleaner gauge: 5 sponsored + 2 independent pair-sides out of 488 (~1.4 %) mention any nonresponse / refusal vocabulary at all.
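
The McNemar p column can be checked from the discordant counts b and c alone. The note does not say which McNemar variant the script uses; the sketch below assumes the exact two-sided binomial version, which reproduces the reported 0.25 for Family A (b=3, c=0) and gives 0.0625 for the joint row (b=5, c=0), matching the reported 0.06 after rounding.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Exact two-sided McNemar p-value from discordant pair counts.

    Under the null, each discordant pair is equally likely to fall in b or c,
    so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, nothing to test
    k = min(b, c)
    tail = sum(comb(n, j) for j in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# (b, c) taken from the table above.
for label, b, c in (("A", 3, 0), ("B", 62, 51), ("Joint", 5, 0)):
    print(label, round(mcnemar_exact(b, c), 4))
```

With c = 0 the p-value is simply 2 × 0.5^b, which is why five one-sided discordant pairs cannot get below roughly 0.06 however clean the direction looks.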

Bias contrast on differing-hit pairs. In Family B (the larger sample), pairs where only the sponsored side hits show a mean bias contrast of +6.07 pp, versus +2.49 pp for pairs where only the independent side hits (Mann–Whitney p = 0.21). This is directionally consistent with the headline but not significant, and likely an artifact of the proporcional contamination.

Interpretation

The probe is null-by-data-design. Both the structured nonresponse_handling field and the diagnostic free-text grep indicate that, with rare exceptions (7 of 488 pair-sides in Family A), neither sponsored nor matched independent registration PDFs describe how the pollster intends to handle undecideds, refusals, or blank/null responses. The reason is structural: TSE registration documents are pre-fielding planning documents, while nonresponse handling is a post-fielding analytical choice. It is naturally absent because the pollster has not yet observed the nonresponse it would have to treat.

This closes the registration-PDF methodology extraction's reach on this lever. Testing the nonresponse-handling × sponsor lever requires the post-fielding relatório PDFs (filed after the poll closes), which describe undecided-redistribution rules. The 1,608 2024 relatório PDFs are already mirrored at pipelines/politica/build/scrape/tse_relatorio/2024/ and the existing poll_relatorio.py extractor reads them, but only for candidate-percent rows, not methodology metadata. A separate LLM extractor pass with a methodology schema (nonresponse rule, response rate, weighting variables applied, post-stratification) over the same PDFs would unblock this probe. The blocker is the extractor, not the data.
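
One possible shape for that methodology schema, as a starting point for discussion only. Every class and field name here is hypothetical; the rule vocabulary reuses the four rules from the Question section plus the existing not_specified value, and nothing below reflects an existing extractor.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class NonresponseRule(str, Enum):
    """Hypothetical controlled vocabulary for the nonresponse rule."""
    REDISTRIBUTE_TO_LEADERS = "redistribute_to_leaders"
    REDISTRIBUTE_PROPORTIONALLY = "redistribute_proportionally"
    EXCLUDE = "exclude"
    LEAVE_IN = "leave_in"
    NOT_SPECIFIED = "not_specified"


@dataclass
class RelatorioMethodology:
    """One extracted methodology record per relatório PDF (sketch)."""
    protocol_id: str
    nonresponse_rule: NonresponseRule = NonresponseRule.NOT_SPECIFIED
    response_rate: Optional[float] = None  # fraction in [0, 1], if reported
    weighting_variables: List[str] = field(default_factory=list)
    post_stratification: Optional[str] = None  # free-text description

    def __post_init__(self) -> None:
        # Reject rates that are clearly percentages or otherwise out of range.
        if self.response_rate is not None and not 0.0 <= self.response_rate <= 1.0:
            raise ValueError("response_rate must be a fraction between 0 and 1")


# Made-up record for illustration only.
record = RelatorioMethodology(
    protocol_id="EXAMPLE-0001",
    nonresponse_rule=NonresponseRule.EXCLUDE,
    weighting_variables=["sexo", "idade"],
)
print(record.nonresponse_rule.value, record.response_rate)
```

Defaulting the rule to not_specified keeps the new field directly comparable with the registration-PDF extraction, where that value currently covers every pair-side.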

The marginal Joint A∧B signal (5 sponsored vs 0 independent PDFs mentioning both vocabularies, McNemar p = 0.06) is in the same direction as AN-042's "sponsored polls describe more" finding — but n=5 makes it suggestive, not load-bearing.

This closes the third and final quick-win probe from the source-of-bias agenda. Combined verdict across the three:

AN-041 (mode) — refuted, opposite direction (sp over-uses gold-standard mode)
AN-042 (interviewer training) — refuted, opposite direction; reframes opacity → selective disclosure
AN-043 (nonresponse handling) — null-by-data-design; needs relatório PDF extraction to test

None of the three quick wins surfaces a Channel-A lever that quantitatively closes the +7 pp size-mismatch problem. source-of-bias.md § candidate mechanisms agenda items 1-3 are now resolved or escalated; item 4 (weighting structured extraction) and the question-order priming probe (blocked on questionnaire PDF mirror) remain the load-bearing concrete-lever candidates.

Follow-ups

Spot-read the 5 Joint A∧B sponsored polls (diagnostic puzzle, low cost). McNemar p = 0.06 with b=5 / c=0 is the only signal to fall out of AN-043 in a non-trivial direction. Are these 5 genuine mentions of undecided-redistribution, or false-positive keyword collisions (e.g. recusa in a different context + proporcional from PPS sampling)? Open the s_sampling__sample_design_evidence and adjacent fields for the 5 protocol ids and check by hand. ~10 min.
Relatório-PDF methodology extractor for nonresponse handling (blind spot, queued). The data is on disk: the 1,608 2024 relatório PDFs at pipelines/politica/build/scrape/tse_relatorio/2024/ are already being read by pipelines/politica/source/llm/poll_relatorio.py for candidate-percent rows. The missing piece is a methodology schema on the same PDFs covering nonresponse rule (redistribute / proportional / exclude / cristalizar), response rate, refusal rate, weighting variables applied, and post-stratification description. Pilot on the intersection of curated_pairs with the 1,608-PDF universe (~150-200 pairs likely covered). Queued under source-of-bias.md probe agenda item 1.
Same diagnostic grep approach for question-order priming (extension). While the proper probe is blocked on the questionario_pesquisa_2024.zip host-side mirror (sandbox CDN-403), AN-043's regex-on-free-text approach could give a preliminary read on whether question-order or name-rotation vocabulary (ordem, rotacionado, aleatoriz, randomiz) appears differently across the 244 pairs. Cheap to add.
Promote AN-043 finding into source-of-bias.md (writeup, done inline). Probe item 3 (nonresponse handling) moves from "needs classifier pass" to "null-by-data-design — needs relatório extraction pipeline". Already updated.