Data-quality concern dispatched **and** a substantively new finding surfaces: dramatic between-firm β heterogeneity. Within-firm β across the two cuts (15 firms with ≥ 10 sponsored rows / 31 firms with ≥ 5): mean +6.5 / +7.2, median +4.4 / +6.3, sd 10.3 in both; in the 31-firm cut the **range is [−11, +35]**. In that cut, 19 of 31 firms (61 %) are individually significant at p < 0.05 and 22 of 31 are positive. **Big-name firms (CENSUS, IIP, INSTITUTO PARANÁ, Verita) have within-firm β statistically indistinguishable from zero (point estimates from −11.0 to +0.6); smaller firms (METHODUS +24.7, CAMARGO +23.6, INTENÇÃO +35.2, DATA SC +16.1) slant heavily.** Because PDF style and LLM-extraction pattern are held strictly fixed within firm, the cross-firm β dispersion (sd 10.3 pp) is *real cross-firm sponsor-behavior heterogeneity*, not a data-quality artifact. The data-quality concern is closed.
Question
AN-015 ran a within-firm β test on the top-5 pollsters by total row count but found those firms sponsor too few polls themselves (NaN coefficients). The natural set for the test is the firms that do sponsor polls — AN-007's customer-mix-sorting cohort of institutes with ≥ 5 self-sponsored polls each. Refitting spec 2 within each of those firms is the strongest available "PDF style held fixed" test of the headline: if extraction differences across firms were driving the +7-8 pp result, restricting to within-firm identification should kill or shrink β.
Design
source/analysis/an-016-within-firm-beta.py refits spec 2 on each firm's polls separately. Two cuts:
- Primary: firms with ≥ 10 self-sponsored rows (15 firms; cleanest within-firm power).
- Supplementary: firms with ≥ 5 self-sponsored rows (33 firms, the AN-007 set; 31 of them yield a usable β).
For each firm: spec 2 = error ~ sponsored_by + opponent_sponsored + log_sample_size + days_to_election + days² | candidate FE + pollster FE, with cluster-robust SE at the municipality level. The pollster FE is degenerate here, since each subset contains only one firm. If candidate FE absorbs everything (a firm with few candidates appearing in both sponsored and non-sponsored polls), fall back to race FE only.
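Written out as an equation, the per-firm version of spec 2 might look as follows. The index notation (poll i, candidate c, firm f) and the symbols γ, δ, θ₁, θ₂, α are assumptions introduced here for readability; only the regressors and the fixed-effect structure come from the spec described above. The sketch also assumes that days² means the square of days_to_election, and it drops the pollster FE because it is constant within a single firm.

```latex
\[
\text{error}_{ic}^{(f)}
  = \beta^{(f)}\,\text{sponsored\_by}_{ic}
  + \gamma^{(f)}\,\text{opponent\_sponsored}_{ic}
  + \delta^{(f)}\,\text{log\_sample\_size}_{i}
  + \theta_1^{(f)}\,\text{days\_to\_election}_{i}
  + \theta_2^{(f)}\,\text{days\_to\_election}_{i}^{2}
  + \alpha_{c}^{(f)}
  + \varepsilon_{ic}
\]
```

In the fallback case the candidate effect \(\alpha_{c}^{(f)}\) would be replaced by a race effect, with everything else unchanged.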
Tabulate per-firm β, SE, p, n. Compute distribution range and the share of firms with significant β. Forest plot of the per-firm coefficients.
Results

Primary cut (≥ 10 sponsored rows, 15 firms)
| Firm | n_self | β | SE | p |
|---|---|---|---|---|
| CENSUS INSTITUTO DE PESQUISAS | 72 | −2.76 | 5.71 | 0.63 |
| IIP INSTITUTO DE PESQUISAS | 66 | −1.64 | 2.24 | 0.47 |
| INSTITUTO PARANÁ | 34 | −10.95 | 10.43 | 0.29 |
| W J MENDES PESQUISAS | 28 | +4.05 | 3.91 | 0.30 |
| EVA FRANCIELI DE SOUZA | 20 | −8.68 | 3.54 | 0.016 |
| INSTITUTO VERITA | 17 | +0.55 | 0.75 | 0.46 |
| NEXXUS MAIS | 16 | +7.69 | 3.23 | 0.020 |
| INSTITUTO DATA SC | 13 | +16.07 | 4.34 | 0.001 |
| INSTITUTO METHODUS | 12 | +24.72 | 3.52 | <0.001 |
| VISÃO PESQUISAS | 12 | +13.92 | 4.92 | 0.008 |
| RADAR INTELIGÊNCIA | 10 | +11.56 | 2.37 | <0.001 |
| INSTITUTO CAMARGO E MEDINA | 10 | +23.59 | 3.83 | <0.001 |
| PROMÍDIA PESQUISA | 10 | +2.29 | 0.54 | <0.001 |
| INSTITUTO LJM | 10 | +4.45 | 2.77 | 0.12 |
| 3S CONSULTORIA | 10 | +12.86 | 1.79 | <0.001 |
Summary: mean β +6.51, median +4.45, range [−10.95, +24.72], sd 10.29. 11 of 15 positive; 9 of 15 significant at p < 0.05.
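The summary line can be re-derived from the table itself. The following is a minimal sketch using only the Python standard library; the tuple layout and the function name `summarize` are illustrative choices, not part of an-016-within-firm-beta.py. p-values printed as "<0.001" are entered as 0.001, which is an upper bound and does not affect the significance count.

```python
from statistics import mean, median, pstdev

# (firm, beta, p) transcribed from the primary-cut table above.
# "<0.001" is entered as 0.001 (an upper bound, still below 0.05).
PRIMARY = [
    ("CENSUS INSTITUTO DE PESQUISAS", -2.76, 0.63),
    ("IIP INSTITUTO DE PESQUISAS", -1.64, 0.47),
    ("INSTITUTO PARANÁ", -10.95, 0.29),
    ("W J MENDES PESQUISAS", 4.05, 0.30),
    ("EVA FRANCIELI DE SOUZA", -8.68, 0.016),
    ("INSTITUTO VERITA", 0.55, 0.46),
    ("NEXXUS MAIS", 7.69, 0.020),
    ("INSTITUTO DATA SC", 16.07, 0.001),
    ("INSTITUTO METHODUS", 24.72, 0.001),
    ("VISÃO PESQUISAS", 13.92, 0.008),
    ("RADAR INTELIGÊNCIA", 11.56, 0.001),
    ("INSTITUTO CAMARGO E MEDINA", 23.59, 0.001),
    ("PROMÍDIA PESQUISA", 2.29, 0.001),
    ("INSTITUTO LJM", 4.45, 0.12),
    ("3S CONSULTORIA", 12.86, 0.001),
]


def summarize(rows, alpha=0.05):
    """Distribution summary of per-firm betas plus sign and significance counts."""
    betas = [beta for _, beta, _ in rows]
    return {
        "n": len(betas),
        "mean": round(mean(betas), 2),
        "median": median(betas),
        "min": min(betas),
        "max": max(betas),
        # Population sd (divide by n), which is what reproduces the reported 10.29.
        "sd_pop": round(pstdev(betas), 2),
        "n_positive": sum(beta > 0 for beta in betas),
        "n_significant": sum(p < alpha for _, _, p in rows),
    }


summary = summarize(PRIMARY)
print(summary)
```

The reported sd of 10.29 appears to correspond to the population formula (divide by n); the sample formula (divide by n − 1) would give a slightly larger value, about 10.65, on the same 15 estimates.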
Supplementary cut (≥ 5 sponsored rows, 31 firms with usable β)
Mean β +7.16, median +6.26, range [−11, +35], sd 10.34. 22 of 31 positive; 19 of 31 significant at p < 0.05.
Notable additions from the supplementary cut:
| Firm | n_self | β | SE | p |
|---|---|---|---|---|
| INTENÇÃO INSTITUTO | 5 | +35.20 | 3.80 | <0.001 |
| OPINIÃO ESTATÍSTICA | 6 | +19.57 | — | <0.001 |
| TRIÂNGULO MULTIPROJETOS | 7 | +14.66 | 3.76 | 0.001 |
| AR7 PESQUISAS | 9 | −4.00 | 1.12 | 0.001 |
| OPINAR PESQUISAS | 9 | +7.85 | 1.47 | <0.001 |
(Some firms had candidate FE fully absorb the within-variation; for those, the table shows the race-FE-only fallback. The race-FE fallback gives slightly larger absolute β by AN-014's multiplicative-scaling argument, but the qualitative cross-firm pattern is robust.)
Interpretation
Two main findings, one negative (the question we set out to answer) and one positive (a substantively new observation):
The data-quality concern is dispatched
Within each firm, PDF layout, extraction template, scenario-label conventions, and the LLM's prompt-response behavior are all constant by construction. If sponsored-poll-vs-independent-poll extraction differences were driving the headline, every firm's within-firm β should be similar (since each firm's PDFs are self-consistent). Instead β ranges from −11 to +35 across firms — a 46-pp spread that cannot come from a single uniform LLM bias. Combined with AN-013 / AN-014 / AN-015, this closes the data-quality flank.
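One way to check that the spread exceeds what sampling noise alone would produce is a Cochran's Q heterogeneity statistic on the per-firm estimates. This is not something AN-016 ran; it is a sketch of a standard meta-analysis check, applied here to the 15 primary-cut (β, SE) pairs and assuming the per-firm estimates are independent of each other. The function name `cochran_q` is illustrative.

```python
# (beta, SE) pairs transcribed from the primary-cut table.
ESTIMATES = [
    (-2.76, 5.71), (-1.64, 2.24), (-10.95, 10.43), (4.05, 3.91),
    (-8.68, 3.54), (0.55, 0.75), (7.69, 3.23), (16.07, 4.34),
    (24.72, 3.52), (13.92, 4.92), (11.56, 2.37), (23.59, 3.83),
    (2.29, 0.54), (4.45, 2.77), (12.86, 1.79),
]

# 0.95 quantile of the chi-square distribution with 14 degrees of freedom.
CHI2_CRIT_DF14 = 23.685


def cochran_q(estimates):
    """Return (inverse-variance pooled beta, Cochran's Q statistic)."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    return pooled, q


pooled_beta, q_stat = cochran_q(ESTIMATES)
print(f"pooled beta = {pooled_beta:.2f}, Q = {q_stat:.1f}, df = {len(ESTIMATES) - 1}")
print("heterogeneous at 5%:", q_stat > CHI2_CRIT_DF14)
```

Under the null of one common β, Q would be roughly chi-square distributed with 14 degrees of freedom. A Q far above the critical value says the firms do not share one β; it does not by itself say why they differ.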
A new substantive finding: firm-level β heterogeneity
The headline +7.85 pp is a cross-firm average masking dramatic firm-level heterogeneity:
- Big-name, high-volume firms show no detectable slant. CENSUS (n=72 sponsored polls), IIP (n=66), INSTITUTO PARANÁ (n=34), and VERITA (n=17) have point estimates between −10.95 and +0.55, none significant at p < 0.05 in the within-firm spec; INSTITUTO PARANÁ's −10.95 is large in magnitude but imprecise (SE 10.43). Their sponsored polls of their own candidates are not systematically more rosy than their independent polls of the same candidates.
- Smaller, niche firms slant heavily. METHODUS (+24.7), CAMARGO (+23.6), INTENÇÃO (+35.2), DATA SC (+16.1), VISÃO (+13.9), RADAR (+11.6), 3S (+12.9), TRIÂNGULO (+14.7), SEND (+14.3), BRASLOPES (+16.5), I. M. MENDONÇA (+10.7), and several others sit at +10 to +35.
- Two firms have negative β at p < 0.05 (EVA FRANCIELI at −8.68; AR7 at −4.00) — their sponsored polls of their own candidates understate relative to other polls of the same candidates.
This is a clean instance of the AN-007 customer-mix-sorting prediction: firms whose business is mostly candidate-sponsored work face lower reputational cost from slant; firms with mixed business (media + sponsored) discipline themselves to maintain publication-side credibility. The headline number is a weighted average across an industry that is sharply segmented.
Combined with the rest of the battery
| AN | What's ruled out |
|---|---|
| AN-010 | comparator contamination, FE-selection, route false positives, renormalization-as-design-lever (partially) |
| AN-011 | FE-structure permutation, single-firm dominance, single-state dominance |
| AN-012 | thin-cluster CRVE under-coverage, sample-size weighting, week-window brittleness |
| AN-013 | Channel B per-row fabrication |
| AN-014 | K2 "sponsors list fewer candidates" mechanism |
| AN-015 | differential LLM extraction quality (script proxies) |
| AN-016 | differential LLM extraction quality (within-firm holding PDF style fixed) |
Together, the data-quality concern is now closed on six independent fronts:
- Within-poll digit symmetry (AN-013)
- Within-candidate denominator stability (AN-014)
- Clean independent baseline (+0.93 pp; not zero-shifted)
- Negative opponent coefficient (zero-sum sponsor effect, not noise)
- Quality-proxy distribution + clean-subset β (AN-015)
- Within-firm β heterogeneity (AN-016) — the 46-pp cross-firm spread is incompatible with a uniform extraction bias
Follow-ups
- Customer-mix-sorting refresh (extension, high paper-value). AN-007 ran a cross-section regression of per-pollster β on candidate-share of customer mix using 11 institutes. AN-016 expands the per-firm β table to 31 firms with usable estimates. Refresh AN-007's slope with the larger sample; the relationship should be sharper and the SE tighter. Suggested edit: rerun source/analysis/pollster_customer_mix.py reading the AN-016 β table.
- Industry-segmentation framing for the paper note (blind spot, high paper-value). The cross-firm β dispersion (sd 10.3 pp) is a substantive finding the headline number obscures. The paper note's framing should explicitly distinguish the average effect (+7.85) from the dramatic between-firm dispersion. The implied policy reading shifts from "polling industry has a bias problem" to "polling industry is segmented; the bias problem lives in a subset of niche firms." Suggested footnote: cite AN-016 §Results and the forest plot.
- Hand-validation prioritization (refinement of the queued TODO). The hand-validation TODO should prioritize sampling polls from the high-β firms (METHODUS, CAMARGO, INTENÇÃO, RADAR) — if those firms' sponsored-poll PDFs do look qualitatively different, that would be an interesting finding. Conversely, sampling from CENSUS / IIP / Verita where β ≈ 0 would confirm extraction is working as expected on the big firms.
- Heterogeneity by firm size (extension). AN-016 shows large firms have low β. Formal test: regress per-firm β on log(n_polls_total). If a clean monotone relationship exists, it's a reportable "size as discipline" finding.
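The size regression in the last follow-up could be prototyped as below. This is only a sketch under a loud assumption: n_polls_total is not reported in this note, so the code uses n_self from the primary-cut table as a stand-in regressor. It is unweighted, ignores the per-firm SEs, and covers only 15 firms, so the slope is illustrative rather than a result. The function name `ols_slope` is hypothetical.

```python
import math

# (n_self, beta) from the primary-cut table.
# ASSUMPTION: n_self stands in for n_polls_total, which this note does not report.
ROWS = [
    (72, -2.76), (66, -1.64), (34, -10.95), (28, 4.05), (20, -8.68),
    (17, 0.55), (16, 7.69), (13, 16.07), (12, 24.72), (12, 13.92),
    (10, 11.56), (10, 23.59), (10, 2.29), (10, 4.45), (10, 12.86),
]


def ols_slope(xs, ys):
    """Closed-form simple OLS of y on x; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


log_n = [math.log(n) for n, _ in ROWS]
betas = [beta for _, beta in ROWS]
slope, intercept = ols_slope(log_n, betas)
print(f"beta = {intercept:.2f} + {slope:.2f} * log(n_self)")
```

A negative slope here would be consistent with the "size as discipline" reading, but the formal test should use total poll counts and account for the estimation error in each firm's β.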