**Variance decomposition of the headline +7 pp: ≈ 100 % within-firm, ≈ 0 % between-firm sponsor selection.** Spec A (cand FE only, headline replicate): β = +7.85 pp (SE 1.24, p ≈ 2.6 × 10⁻¹⁰). Spec B (cand FE + firm FE, the within-firm sponsor effect averaged across all 426 firms in the analysis sample): β = +7.98 pp (SE 1.25, p ≈ 1.9 × 10⁻¹⁰). The between-firm component (A − B) is **−0.13 pp**, so the point-estimate shares are 102 % within-firm and −2 % between-firm. The +7 pp is not 'sponsors hire firms that systematically over-state'; it is 'firms slant when paid by candidates, regardless of which firm'. Same firm, same pollster style, different customer → +8 pp. The firm-level slant-for-hire selection hypothesis (2-4 pp prior in the docs/thinking.md residual decomposition) is structurally ruled out at headline scale. AN-016's within-firm β dispersion (sd 10.3) is still real; AN-018's size-discipline (small firms slant more among the 31 firms with ≥ 5 sponsored polls) still holds; AN-059's decomposition says sponsors do not preferentially load on the high-slant firms in the universe at large.

Confidence: green
Type: robustness
Design:
  • Sample: 27,919 candidate-poll rows from build/assemble/cand_poll.parquet after dropping rows missing error / log_sample_size / days_to_election / pollster_cnpj / muni_id and restricting to candidates appearing in ≥ 2 polls. 5,164 candidates, 426 firms (pollster_cnpj), 490 sponsored_by==1.
  • Specification: Spec A: error ~ sponsored + opponent_sponsored + log_sample_size + days_to_election + days² | candidate FE; cluster-robust SE at muni. Spec B: same regressors, adds firm FE (pollster_cnpj) as a second absorbed effect via PanelOLS.other_effects. The within-firm sponsor coefficient (Spec B) is what remains after firm-level baselines are absorbed; the difference β_A − β_B is the between-firm (selection) contribution.
  • Comparator: each firm's own non-sponsored polls (firm FE constructs this implicitly)
  • Cluster: muni
  • Weights: none
Script: source/analysis/an-059-firm-fe-decomp.py
Target: build/table/an-059-firm-fe-decomp.csv
Status: interpreted · 2026-06-14
Created: 2026-06-14
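
As a rough illustration of the sample filter described under Sample, the sketch below drops incomplete rows and then keeps candidates with at least two polls. It is not the code in an-059-firm-fe-decomp.py: the `cand_id` and `poll_id` keys are hypothetical names, missing values are represented as `None`, and counting polls after the NA drop (rather than before) is an assumption.

```python
from collections import defaultdict

# Columns named in the Sample description; rows missing any of them are dropped.
REQUIRED = ("error", "log_sample_size", "days_to_election", "pollster_cnpj", "muni_id")

def headline_sample(rows, min_polls=2):
    """Keep complete rows whose candidate appears in at least `min_polls` polls.

    `rows` is a list of dicts; `cand_id` and `poll_id` are hypothetical keys.
    """
    complete = [r for r in rows if all(r.get(k) is not None for k in REQUIRED)]
    polls_per_cand = defaultdict(set)
    for r in complete:
        polls_per_cand[r["cand_id"]].add(r["poll_id"])
    return [r for r in complete if len(polls_per_cand[r["cand_id"]]) >= min_polls]

def _row(cand_id, poll_id, error=1.0):
    # Toy row builder used only for the example below.
    return {"cand_id": cand_id, "poll_id": poll_id, "error": error,
            "log_sample_size": 6.0, "days_to_election": 30,
            "pollster_cnpj": "00", "muni_id": 1}

toy = [_row(1, 10), _row(1, 11), _row(2, 10), _row(3, 12, error=None), _row(3, 13)]
kept = headline_sample(toy)
```

In the toy data, candidate 2 has a single poll and candidate 3 has only one complete row, so only candidate 1's two rows survive.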

Question

AN-016 / AN-017 / AN-018 established that within-firm β varies wildly across firms (sd 10.3 pp, range [−11, +35], 31 firms with ≥ 5 sponsored polls). AN-018 found that firm SIZE explains most of that cross-firm dispersion among those 31 firms.

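What was still missing, and what the residual-decomposition entry added to docs/thinking.md on 2026-06-14 flagged as the load-bearing diagnostic, is the explicit decomposition of the HEADLINE +7 pp into a within-firm component (the same firm slanting when a candidate pays) and a between-firm component (sponsors selecting firms that systematically over-state).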

If between-firm is large, the mechanism story is sponsor selection of firms — methodology choices are a downstream story but the load-bearing lever is which firm gets hired. If between-firm is small, the +7 pp is genuine within-firm methodology slant — the same firm produces honest polls for media and tilted polls for sponsors.

Design

source/analysis/an-059-firm-fe-decomp.py:

  1. Load build/assemble/cand_poll.parquet (31,186 cand-poll rows).
  2. Apply the headline sample filter (drop NA on the core controls, restrict to candidates with ≥ 2 polls). N = 27,919.
  3. Two nested PanelOLS specs with the same control set (sponsored, opponent_sponsored, log_sample_size, days_to_election, days²) and muni-clustered SE:
    • Spec A — within-candidate FE only (the headline structure).
    • Spec B — within-candidate FE + firm FE (pollster_cnpj).
  4. Read β_sponsored from each spec. The between-firm component is β_A − β_B.
  5. Tertile sensitivity: sort firms by row count and split them into three size buckets (small, medium, large) holding roughly equal numbers of rows, run the headline cand-FE spec separately within each bucket, and compare against AN-018.
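
Written out, the two nested specs in step 3 and the decomposition in step 4 take roughly the following form. The notation is an assumption introduced here for illustration: c indexes candidates, p polls, f(p) the firm that ran poll p, and x_{cp} collects the shared controls (opponent_sponsored, log_sample_size, days_to_election, days²).

```latex
\begin{align*}
\text{Spec A:}\quad \mathrm{error}_{cp} &= \beta_A\,\mathrm{sponsored}_{cp} + \gamma_A^{\top} x_{cp} + \alpha_c + \varepsilon_{cp} \\
\text{Spec B:}\quad \mathrm{error}_{cp} &= \beta_B\,\mathrm{sponsored}_{cp} + \gamma_B^{\top} x_{cp} + \alpha_c + \phi_{f(p)} + u_{cp} \\
\text{between-firm component} &= \beta_A - \beta_B
\end{align*}
```

Here α_c is the candidate fixed effect and φ_{f(p)} the firm fixed effect; β_B is identified only from sponsored versus non-sponsored polls run by the same firm.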

Results

Headline decomposition

| Spec | β_sponsored | SE | t | p |
|---|---|---|---|---|
| A: cand FE only (headline) | +7.85 pp | 1.24 | +6.32 | 2.6 × 10⁻¹⁰ |
| B: cand FE + firm FE | +7.98 pp | 1.25 | +6.37 | 1.9 × 10⁻¹⁰ |
| Between-firm (A − B) | −0.13 pp | | | |
| Within-firm share | 102 % | | | |
| Between-firm share | −2 % | | | |
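
The last three rows of the table follow from the two coefficients by simple arithmetic. A minimal sketch, assuming the shares are defined relative to β_A (the helper name is hypothetical):

```python
def decomposition_shares(beta_a, beta_b):
    """Split the Spec A coefficient into within-firm and between-firm parts."""
    between = beta_a - beta_b  # selection component, in pp
    return {
        "between_pp": round(between, 2),
        "within_share_pct": round(100 * beta_b / beta_a),
        "between_share_pct": round(100 * between / beta_a),
    }

shares = decomposition_shares(beta_a=7.85, beta_b=7.98)
print(shares)
# {'between_pp': -0.13, 'within_share_pct': 102, 'between_share_pct': -2}
```

The within-firm share exceeds 100 % only because β_B is marginally larger than β_A; the two shares still sum to 100 %.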

The headline replicate (Spec A) lands at +7.85 pp on this strict sample, consistent with the +7-8 pp range across specs in the headline analysis. Adding firm FE BARELY MOVES the coefficient — if anything, the within-firm effect is slightly larger than the headline. The sponsor effect is essentially all within-firm.

Tertile sensitivity (all 426 firms, not AN-018's 31)

| firm-size tertile | n firms | n rows | β_sponsored | SE | p |
|---|---|---|---|---|---|
| small | 342 | 8,301 | +6.10 | 2.43 | 0.012 |
| medium | 58 | 8,487 | +8.76 | 2.76 | 0.001 |
| large | 13 | 8,548 | +2.82 | 2.16 | 0.192 |

The medium tertile shows the largest point estimate (+8.76 pp); the large tertile shows the smallest (+2.82 pp, not significant at p = 0.192), consistent with AN-018's reputation-discipline story. The pattern is compatible with AN-018: that analysis ranked only the 31 firms with ≥ 5 sponsored polls and found that small firms slant more, whereas AN-059's tertiles use all 426 firms by total row count, so the "small" bucket includes many firms with only 1-2 polls. The two analyses slice the firm universe differently; both findings hold.
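
One way to build size buckets like these is to walk firms from fewest rows to most and cut at one third and two thirds of cumulative rows. The sketch below assumes that rule, which is inferred from the near-equal row counts in the table rather than read from the script; the function name is hypothetical.

```python
from collections import Counter

def size_tertiles(firm_ids):
    """Label each firm small / medium / large by cumulative share of rows.

    `firm_ids` holds one entry per candidate-poll row (e.g. pollster_cnpj).
    """
    rows_per_firm = Counter(firm_ids)
    total = sum(rows_per_firm.values())
    labels, cumulative = {}, 0
    # Ascending by row count; ties broken by firm id for determinism.
    for firm, n in sorted(rows_per_firm.items(), key=lambda kv: (kv[1], kv[0])):
        cumulative += n
        if 3 * cumulative <= total:          # first third of rows
            labels[firm] = "small"
        elif 3 * cumulative <= 2 * total:    # second third of rows
            labels[firm] = "medium"
        else:
            labels[firm] = "large"
    return labels

toy_rows = ["a"] + ["b"] + ["c"] + ["d"] * 3 + ["e"] * 3
labels = size_tertiles(toy_rows)
```

In the toy input, three one-row firms fill the small bucket while a single three-row firm fills each of the other two, which mirrors why the small tertile can hold hundreds of firms and the large tertile only a handful.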

Interpretation

The +7 pp is within-firm, not between-firm

The clean reading: pollsters slant when paid by candidates and don't slant (or slant less) when paid by media or by themselves. Same firm, same methodology PDF style, same back-office. Customer identity flips the output.

This rules out the firm-level slant-for-hire SELECTION hypothesis at headline scale. Sponsors don't disproportionately concentrate on firms with systematically higher baseline error rates. If they did, β_A would substantially exceed β_B; instead, β_A ≈ β_B (within 0.13 pp).
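
The logic of this test can be checked on synthetic data. The sketch below is a toy simulation, not the AN-059 script: it uses only the standard library, a single regressor, and alternating group demeaning to absorb fixed effects, with every parameter (true effect of 8 pp, 20 firms, sponsorship probabilities) chosen for illustration. When sponsors favour high-baseline firms, the cand-FE-only coefficient overshoots the firm-FE coefficient; when sponsorship is unrelated to firm baselines, the two agree.

```python
import random
from collections import defaultdict

def demean(values, groups):
    """Subtract each group's mean from its members (absorbs one fixed effect)."""
    total, count = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        total[g] += v
        count[g] += 1
    return [v - total[g] / count[g] for v, g in zip(values, groups)]

def absorb(values, group_lists, sweeps=30):
    """Alternating projections: demean by each grouping in turn, repeatedly."""
    out = list(values)
    for _ in range(sweeps):
        for groups in group_lists:
            out = demean(out, groups)
    return out

def beta_sponsored(rows, effects):
    """Slope of error on sponsored after absorbing the named fixed effects."""
    group_lists = [[r[name] for r in rows] for name in effects]
    y = absorb([r["error"] for r in rows], group_lists)
    x = absorb([float(r["sponsored"]) for r in rows], group_lists)
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def simulate(selection, n_rows=6000, n_cand=300, n_firm=20, seed=0):
    """Synthetic candidate-poll rows with a true within-firm effect of 8 pp."""
    rng = random.Random(seed)
    cand_fe = [rng.gauss(0, 3) for _ in range(n_cand)]
    firm_fe = [0.4 * (f - (n_firm - 1) / 2) for f in range(n_firm)]  # firm baselines
    rows = []
    for _ in range(n_rows):
        c, f = rng.randrange(n_cand), rng.randrange(n_firm)
        if selection:  # sponsors concentrate on high-baseline firms
            p = 0.40 if firm_fe[f] > 0 else 0.05
        else:          # sponsorship unrelated to firm baseline
            p = 0.20
        s = 1 if rng.random() < p else 0
        error = cand_fe[c] + firm_fe[f] + 8.0 * s + rng.gauss(0, 1)
        rows.append({"cand": c, "firm": f, "sponsored": s, "error": error})
    return rows

def decompose(rows):
    beta_a = beta_sponsored(rows, ["cand"])          # Spec A analogue
    beta_b = beta_sponsored(rows, ["cand", "firm"])  # Spec B analogue
    return beta_a, beta_b, beta_a - beta_b

a0, b0, gap_no_selection = decompose(simulate(selection=False))
a1, b1, gap_selection = decompose(simulate(selection=True))
```

Under these assumptions `gap_selection` should come out clearly positive while `gap_no_selection` stays near zero; the AN-059 result (β_A − β_B = −0.13 pp) looks like the second case.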

It does NOT rule out sponsor selection on which firms do candidate work. Sponsors clearly cluster on a subset of firms: only 31 of the 426 firms have ≥ 5 sponsored polls, out of 490 sponsored rows in total. But the firms they pick aren't systematically "high-baseline-error" firms; they're just firms willing to take candidate contracts.

Reconciling with AN-016 and AN-018

AN-016's within-firm β dispersion (sd 10.3 pp across 31 firms) is real. AN-018's size-discipline story (small firms +12, large firms −1 within the 31) is real. These describe heterogeneity in the within-firm slant across firms, not the variance decomposition of the headline.

AN-059 averages across all 426 firms, weighted by sample size. The weighted average happens to land at +7.98 pp ≈ +7.85 pp because medium-tier firms (which carry most of the sample) slant near that average. Small firms slant more but contribute fewer polls; large firms slant less and contribute proportionally fewer than their size would suggest.

Implication for the residual decomposition

The docs/thinking.md 2026-06-14 residual decomposition (after AN-051, AN-056, AN-057) attributed:

AN-059 zeroes out hypothesis #2, the firm-level slant-for-hire selection hypothesis (2-4 pp prior). The residual is now even more firmly in the within-firm-methodology zone. The five remaining untested categories (sophisticated fabrication, wave selection, sample frame contamination, interviewer scripting, strategic timing) must collectively explain a larger share than the prior allocation suggested.

The next test priority shifts accordingly:

  1. Firm-level decomposition — done; result is "no selection".
  2. Wave-selection test: candidate-sponsored polls within firm × muni × month vs pollster-self polls in the same firm × muni × month.
  3. Sophisticated fabrication forensics: AN-013 v2 with stronger tests.
  4. Strategic timing × news events: needs event database.

Follow-ups

  1. Wave-selection test as AN-060. For each (firm × muni × month) bucket, count polls filed by sponsorship type. Within a firm's filing calendar in a race, do candidate-sponsored waves cluster on specific weeks (e.g. post-rally), or are they distributed uniformly within the firm's filing dates? This is the natural next read on the residual.
  2. Universe-scale weighting extraction is even less attractive now. With firm selection zeroed out, the residual 2-6 pp is structurally inaccessible to extraction-from-registration-text. Scaling poll_weighting to universe gives tighter CIs on the AN-057 (+0.04 p) signal but doesn't address the residual.
  3. Update paper's mechanism narrative. The headline is now cleaner: "the +7 pp is within-firm methodology slant, not sponsor firm selection". This is a sharper claim than the inventory in §5 currently makes.
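
For the wave-selection follow-up (item 1), the bucketing step might look like the sketch below. AN-060 does not exist yet, so everything here is hypothetical: the record keys (`firm`, `muni`, `filed`, `sponsor_type`), the ISO date format, and the `"candidate"` / `"self"` labels are illustrative assumptions.

```python
from collections import Counter, defaultdict

def bucket_counts(polls):
    """Count polls per (firm, muni, month) bucket, split by sponsorship type."""
    buckets = defaultdict(Counter)
    for p in polls:
        month = p["filed"][:7]  # "YYYY-MM" prefix of an ISO date string
        buckets[(p["firm"], p["muni"], month)][p["sponsor_type"]] += 1
    return buckets

def mixed_buckets(buckets):
    """Buckets holding both candidate-sponsored and pollster-self polls."""
    return {k: dict(c) for k, c in buckets.items() if c["candidate"] and c["self"]}

toy_polls = [
    {"firm": "F1", "muni": 10, "filed": "2024-09-03", "sponsor_type": "candidate"},
    {"firm": "F1", "muni": 10, "filed": "2024-09-17", "sponsor_type": "self"},
    {"firm": "F1", "muni": 10, "filed": "2024-09-24", "sponsor_type": "self"},
    {"firm": "F2", "muni": 10, "filed": "2024-09-05", "sponsor_type": "candidate"},
]
comparable = mixed_buckets(bucket_counts(toy_polls))
```

Only buckets that contain both sponsorship types can support the within-bucket comparison, so the count of such buckets is a quick feasibility check before running the test itself.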