AN-016: Within-firm β — does the headline survive when PDF style is held fixed?

The data-quality concern is dispatched, **and** a substantively new finding surfaces: large between-firm heterogeneity in β. Within-firm β for the primary cut (15 firms with ≥ 10 sponsored rows) and the supplementary cut (31 firms with usable β, out of 33 with ≥ 5 sponsored rows): mean +6.51 / +7.16, median +4.45 / +6.26, sd 10.3 in both cuts, **range [−11, +35]** in the supplementary cut. In that cut, 19 of 31 firms (61 %) are individually significant at p < 0.05 and 22 of 31 are positive. **Big-name firms (CENSUS, IIP, INSTITUTO PARANÁ, Verita) have within-firm β that is statistically indistinguishable from zero (point estimates from −10.95 to +0.55); smaller firms (METHODUS +24.7, CAMARGO +23.6, INTENÇÃO +35.2, DATA SC +16.1) slant heavily.** Because PDF style and LLM-extraction pattern are held fixed within firm, the cross-firm β dispersion (sd 10.3 pp) reflects *real cross-firm heterogeneity in sponsor behavior*, not a data-quality artifact. The data-quality concern is closed.

Hypothesis: H1: Self-sponsored polls overstate the sponsoring candidate
Confidence: green
Type: robustness

Design

Sample: estimulado-non-aggregate-match2 (31,186 rows, 641 sponsored, 8,431 candidates). Firms ranked by sponsored-row count; primary cut at ≥ 10 sponsored (15 firms), supplementary at ≥ 5 (33 firms).
Specification: Spec 2 within firm: error ~ sponsored_by + opponent_sponsored + log_sample_size + days_to_election + days² | candidate FE; cluster-robust SE at muni. Race-FE-only fallback for firms where the candidate FE absorbs too much.
Comparator: each firm's own non-sponsored polls — PDF style and house methodology held strictly fixed
Cluster: muni
Weights: none

Script: source/analysis/an-016-within-firm-beta.py
Target: build/table/within_firm_beta.csv
Status: interpreted · 2026-06-02
Created: 2026-06-02

Question

AN-015 ran a within-firm β test on the top-5 pollsters by total row count but found that those firms sponsor too few polls themselves (NaN coefficients). The natural set for the test is the firms that do sponsor polls: AN-007's customer-mix-sorting cohort of institutes with ≥ 5 self-sponsored polls each. Refitting spec 2 within each of those firms is the strongest available "PDF style held fixed" test of the headline: if extraction differences across firms were driving the +7 to +8 pp result, restricting to within-firm identification should kill or shrink β.

Design

source/analysis/an-016-within-firm-beta.py refits spec 2 on each firm's polls separately. Two cuts:

Primary: firms with ≥ 10 self-sponsored rows (15 firms; cleanest within-firm power).
Supplementary: firms with ≥ 5 self-sponsored rows (33 firms; the AN-007 set).
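The two cuts amount to counting sponsored rows per firm and thresholding. A minimal sketch, assuming each row is a dict with hypothetical keys `pollster` and `sponsored_by` (the actual column names in the script are not shown in this note):

```python
from collections import Counter

def firms_by_cut(rows, thresholds=(10, 5)):
    """Return {threshold: [firm, ...]}, firms ranked by sponsored-row count.

    `rows` is an iterable of dicts with hypothetical keys
    'pollster' (str) and 'sponsored_by' (0/1).
    """
    counts = Counter(r["pollster"] for r in rows if r["sponsored_by"])
    ranked = [firm for firm, _ in counts.most_common()]
    return {t: [f for f in ranked if counts[f] >= t] for t in thresholds}

# Toy rows, invented for illustration only (not real data).
toy = (
    [{"pollster": "A", "sponsored_by": 1}] * 12
    + [{"pollster": "B", "sponsored_by": 1}] * 6
    + [{"pollster": "B", "sponsored_by": 0}] * 40
    + [{"pollster": "C", "sponsored_by": 1}] * 2
)
cuts = firms_by_cut(toy)
print(cuts)
```

In the toy input, firm A clears both thresholds, firm B clears only the ≥ 5 cut, and firm C is excluded from both.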

For each firm: spec 2 = error ~ sponsored_by + opponent_sponsored + log_sample_size + days_to_election + days² | candidate FE + pollster FE (degenerate — only 1 firm in subset), cluster-robust SE at muni. If candidate FE absorbs everything (firm with few candidates appearing in both sponsored and non-sponsored polls), fall back to race FE only.
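The fallback rule can be illustrated as a check on how many candidates appear in both sponsored and non-sponsored polls of the firm, since only those candidates identify the sponsor coefficient under candidate FE. This is a sketch under assumptions: the keys `candidate` and `sponsored_by` and the `min_switchers` threshold are hypothetical, and the script may detect absorption differently (for example from a NaN coefficient).

```python
def choose_fe(rows, min_switchers=1):
    """Pick the FE structure for one firm's rows.

    Under candidate FE, the sponsor coefficient is identified only by
    candidates observed in BOTH sponsored and non-sponsored polls of
    this firm ("switchers"). With fewer than `min_switchers` of them
    (a hypothetical threshold), fall back to race FE.
    """
    seen = {}
    for r in rows:
        seen.setdefault(r["candidate"], set()).add(bool(r["sponsored_by"]))
    switchers = sum(1 for states in seen.values() if len(states) == 2)
    return "candidate" if switchers >= min_switchers else "race"

# Toy inputs, invented for illustration only.
identified = [
    {"candidate": "X", "sponsored_by": 1},
    {"candidate": "X", "sponsored_by": 0},
]
absorbed = [
    {"candidate": "X", "sponsored_by": 1},
    {"candidate": "Y", "sponsored_by": 0},
]
print(choose_fe(identified), choose_fe(absorbed))
```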

Tabulate per-firm β, SE, p, n. Compute distribution range and the share of firms with significant β. Forest plot of the per-firm coefficients.

Results

Figure: forest plot of the within-firm sponsor coefficient β with 95 % CI.

Primary cut (≥ 10 sponsored rows, 15 firms)

Firm	n_self	β	SE	p
CENSUS INSTITUTO DE PESQUISAS	72	−2.76	5.71	0.63
IIP INSTITUTO DE PESQUISAS	66	−1.64	2.24	0.47
INSTITUTO PARANÁ	34	−10.95	10.43	0.29
W J MENDES PESQUISAS	28	+4.05	3.91	0.30
EVA FRANCIELI DE SOUZA	20	−8.68	3.54	0.016
INSTITUTO VERITA	17	+0.55	0.75	0.46
NEXXUS MAIS	16	+7.69	3.23	0.020
INSTITUTO DATA SC	13	+16.07	4.34	0.001
INSTITUTO METHODUS	12	+24.72	3.52	<0.001
VISÃO PESQUISAS	12	+13.92	4.92	0.008
RADAR INTELIGÊNCIA	10	+11.56	2.37	<0.001
INSTITUTO CAMARGO E MEDINA	10	+23.59	3.83	<0.001
PROMÍDIA PESQUISA	10	+2.29	0.54	<0.001
INSTITUTO LJM	10	+4.45	2.77	0.12
3S CONSULTORIA	10	+12.86	1.79	<0.001

Summary: mean β +6.51, median +4.45, range [−10.95, +24.72], sd 10.29. 11 of 15 positive; 9 of 15 significant at p < 0.05.
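The summary line can be recomputed directly from the table above. A minimal sketch using only the standard library; the one assumption is that table entries of "<0.001" are entered as 0.0009 so they can be compared numerically:

```python
from statistics import mean, median, pstdev, stdev

# (beta, p) per firm, copied from the primary-cut table.
# Assumption: "<0.001" is entered as 0.0009 for numeric comparison.
primary = [
    (-2.76, 0.63), (-1.64, 0.47), (-10.95, 0.29), (4.05, 0.30),
    (-8.68, 0.016), (0.55, 0.46), (7.69, 0.020), (16.07, 0.001),
    (24.72, 0.0009), (13.92, 0.008), (11.56, 0.0009), (23.59, 0.0009),
    (2.29, 0.0009), (4.45, 0.12), (12.86, 0.0009),
]
betas = [b for b, _ in primary]

summary = {
    "mean": round(mean(betas), 2),
    "median": median(betas),
    "sd_population": round(pstdev(betas), 2),  # divides by n
    "sd_sample": round(stdev(betas), 2),       # divides by n - 1
    "n_positive": sum(b > 0 for b in betas),
    "n_significant": sum(p < 0.05 for _, p in primary),
}
print(summary)
```

With these inputs the population form (`pstdev`) reproduces the reported sd of 10.29, while the sample form (`stdev`) gives about 10.65, so the reported sd appears to use the n denominator.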

Supplementary cut (≥ 5 sponsored rows, 31 firms with usable β)

Mean β +7.16, median +6.26, range [−11, +35], sd 10.34. 22 of 31 positive; 19 of 31 significant at p < 0.05.

Notable additions from the supplementary cut:

Firm	n_self	β	SE	p
INTENÇÃO INSTITUTO	5	+35.20	3.80	<0.001
OPINIÃO ESTATÍSTICA	6	+19.57	—	<0.001
TRIÂNGULO MULTIPROJETOS	7	+14.66	3.76	0.001
AR7 PESQUISAS	9	−4.00	1.12	0.001
OPINAR PESQUISAS	9	+7.85	1.47	<0.001

(Some firms had candidate FE fully absorb the within-variation; for those, the table shows the race-FE-only fallback. The race-FE fallback gives slightly larger absolute β by AN-014's multiplicative-scaling argument, but the qualitative cross-firm pattern is robust.)

Interpretation

Two main findings, one negative (the question we set out to answer) and one positive (a substantively new observation):

The data-quality concern is dispatched

Within each firm, PDF layout, extraction template, scenario-label conventions, and the LLM's prompt-response behavior are all constant by construction. If sponsored-poll-vs-independent-poll extraction differences were driving the headline, every firm's within-firm β should be similar (since each firm's PDFs are self-consistent). Instead β ranges from −11 to +35 across firms — a 46-pp spread that cannot come from a single uniform LLM bias. Combined with AN-013 / AN-014 / AN-015, this closes the data-quality flank.
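One way to make the "cannot come from a single uniform bias" step explicit is a homogeneity test. The sketch below applies Cochran's Q to the primary-cut estimates. This test is not part of AN-016 and is swapped in here purely for illustration; it treats the per-firm cluster-robust SEs as known and the firm estimates as independent, both of which are assumptions.

```python
# (beta, se) per firm, copied from the primary-cut table.
estimates = [
    (-2.76, 5.71), (-1.64, 2.24), (-10.95, 10.43), (4.05, 3.91),
    (-8.68, 3.54), (0.55, 0.75), (7.69, 3.23), (16.07, 4.34),
    (24.72, 3.52), (13.92, 4.92), (11.56, 2.37), (23.59, 3.83),
    (2.29, 0.54), (4.45, 2.77), (12.86, 1.79),
]

def cochran_q(pairs):
    """Return (Q, inverse-variance pooled beta, degrees of freedom)."""
    weights = [1.0 / se ** 2 for _, se in pairs]
    pooled = sum(w * b for w, (b, _) in zip(weights, pairs)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, pairs))
    return q, pooled, len(pairs) - 1

q, pooled, df = cochran_q(estimates)

# 95th percentile of the chi-square distribution with 14 df.
CHI2_CRIT_DF14 = 23.685
print(f"Q = {q:.1f} on {df} df; reject homogeneity: {q > CHI2_CRIT_DF14}")
```

Under the null of one common β, Q follows a chi-square distribution with 14 degrees of freedom. On these inputs Q exceeds the 5 % critical value, which is the formal counterpart of the spread argument above.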

A new substantive finding: firm-level β heterogeneity

The headline +7.85 pp is a cross-firm average masking dramatic firm-level heterogeneity:

Big-name, high-volume firms show no detectable slant. CENSUS (72 sponsored rows), IIP (66), INSTITUTO PARANÁ (34), and VERITA (17) have point estimates between −10.95 and +0.55, none significant at p < 0.05 in the within-firm spec (INSTITUTO PARANÁ's −10.95 carries an SE of 10.43). Their sponsored polls of their own candidates are not systematically rosier than their independent polls of the same candidates.
Smaller, niche firms slant heavily. METHODUS (+24.7), CAMARGO (+23.6), INTENÇÃO (+35.2), DATA SC (+16.1), VISÃO (+13.9), RADAR (+11.6), 3S (+12.9), TRIÂNGULO (+14.7), SEND (+14.3), BRASLOPES (+16.5), I. M. MENDONÇA (+10.7), and several others sit at +10 to +35.
Two firms have negative β at p < 0.05 (EVA FRANCIELI at −8.68; AR7 at −4.00) — their sponsored polls of their own candidates understate relative to other polls of the same candidates.

This pattern is consistent with the AN-007 customer-mix-sorting prediction: firms whose business is mostly candidate-sponsored work face lower reputational cost from slant; firms with mixed business (media + sponsored) discipline themselves to maintain publication-side credibility. The headline number is a weighted average across an industry that is sharply segmented.

Combined with the rest of the battery

AN	What's ruled out
AN-010	comparator contamination, FE-selection, route false positives, renormalization-as-design-lever (partially)
AN-011	FE-structure permutation, single-firm dominance, single-state dominance
AN-012	thin-cluster CRVE under-coverage, sample-size weighting, week-window brittleness
AN-013	Channel B per-row fabrication
AN-014	K2 "sponsors list fewer candidates" mechanism
AN-015	differential LLM extraction quality (script proxies)
AN-016	differential LLM extraction quality (within-firm holding PDF style fixed)

Together, the data-quality concern is now closed on six independent fronts:

Within-poll digit symmetry (AN-013)
Within-candidate denominator stability (AN-014)
Clean independent baseline (+0.93 pp; not zero-shifted)
Negative opponent coefficient (zero-sum sponsor effect, not noise)
Quality-proxy distribution + clean-subset β (AN-015)
Within-firm β heterogeneity (AN-016) — the 46-pp cross-firm spread is incompatible with a uniform extraction bias

Follow-ups

Customer-mix-sorting refresh (extension, high paper-value). AN-007 ran a cross-section regression of per-pollster β on candidate-share of customer mix using 11 institutes. AN-016 expands the per-firm β table to 31 firms with usable estimates. Refresh AN-007's slope with the larger sample; the relationship should be sharper and the SE tighter. Suggested edit: rerun source/analysis/pollster_customer_mix.py reading the AN-016 β table.
Industry-segmentation framing for the paper note (blind spot, high paper-value). The cross-firm dispersion in β (sd 10.3 pp) is a substantive finding the headline number obscures. The paper note's framing should explicitly distinguish the average effect (+7.85 pp) from the large between-firm dispersion. The implied policy reading shifts from "polling industry has a bias problem" to "polling industry is segmented; the bias problem lives in a subset of niche firms." Suggested footnote: cite AN-016 §Results and the forest plot.
Hand-validation prioritization (refinement of the queued TODO). The hand-validation TODO should prioritize sampling polls from the high-β firms (METHODUS, CAMARGO, INTENÇÃO, RADAR) — if those firms' sponsored-poll PDFs do look qualitatively different, that would be an interesting finding. Conversely, sampling from CENSUS / IIP / Verita where β ≈ 0 would confirm extraction is working as expected on the big firms.
Heterogeneity by firm size (extension). AN-016 shows large firms have low β. Formal test: regress per-firm β on log(n_polls_total). If a clean monotone relationship exists, it's a reportable "size as discipline" finding.
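The proposed size test is a one-regressor OLS of per-firm β on log(n_polls_total). A minimal sketch, with toy inputs constructed to lie exactly on a line (they are not real firm data; `n_polls_total` per firm is not reported in this note):

```python
import math

def slope_on_log_size(betas, n_polls_total):
    """OLS of per-firm beta on log(total polls); returns (slope, intercept)."""
    x = [math.log(n) for n in n_polls_total]
    xbar = sum(x) / len(x)
    ybar = sum(betas) / len(betas)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, betas))
    slope = sxy / sxx
    return slope, ybar - slope * xbar

# Toy inputs: beta = 30 - 4 * log(n), invented for illustration only.
toy_n = [50, 100, 200, 400, 800]
toy_beta = [30 - 4 * math.log(n) for n in toy_n]
slope, intercept = slope_on_log_size(toy_beta, toy_n)
print(round(slope, 6), round(intercept, 6))
```

A negative slope would correspond to the "size as discipline" reading. The sketch omits inference; with 15 to 31 noisy per-firm estimates, weighting by 1/SE² and reporting a standard error for the slope would likely be needed before treating the relationship as reportable.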