AN-073: Firm specialization by party in candidate-sponsored polls

State-HHI is above chance for 37/38 firms (regional specialization is near-universal in this segment); party-HHI is above chance for only 10/38 (party concentration is uncommon). The relational-reputation prediction, that high party-HHI tracks high within-firm β, is not supported: full-sample r(party-HHI, β) = −0.456 (n=22), but the negative correlation is driven entirely by CENSUS (HHI 0.86, β −8.2) and EVA FRANCIELI (HHI 0.85, β −8.4), the two firms thinking.md already flagged as suspected-selection cases. Excluding them, r = −0.155 (n=20) and the party-HHI tercile means of β flatten to 7.38 / 8.52 / 8.22. M1/M3 relational reputation is not supported at point estimate; M4 single-shot / menu-pricing gains weight by elimination.

Confidence: yellow
Type: descriptive

Design

Sample: firms with >=5 candidate-sponsored mayoral protocols in 2024 (sponsor_candidate_party non-null on poll_sponsor_2024.parquet, routes A/B/C/D, deduplicated to (protocol, party))
Specification: per-firm party-HHI, ideological-bloc-HHI, and state-HHI on the candidate-sponsored subsample; permutation null draws n_i parties per firm with replacement from the universe candidate-sponsored party distribution (2,000 reps); cross-cut against AN-016 within-firm β
Comparator: multinomial permutation null from universe sponsor-party share
Notes: Bloc mapping per docs/thinking/enforcement-puzzle.md proposal — left: PT/PSB/PCdoB/PV/REDE; center: MDB/PSDB/PSD/CIDADANIA/PDT; right: PL/PP/REPUBLICANOS/UNIÃO/NOVO/PRTB/PSC; other: everything else. PCdoB normalized from raw 'PC do B'. State-HHI lets us separate regional vs partisan specialization.

Script: source/analysis/an-073-firm-party-specialization.py
Target: build/table/an-073-firm-party-specialization.csv
Status: interpreted · 2026-06-16
Created: 2026-06-16

Question

Do candidate-sponsored polls from a given firm concentrate on one party (or ideological bloc, or state) more than would be expected if the firm drew its candidate clientele at random from the universe candidate-sponsored party distribution? And does the observed concentration co-vary with the firm's within-candidate sponsor bias (AN-016)?

The follow-up motivation is the enforcement-puzzle thinking note: relational-reputation enforcement (M1, M3) predicts firms develop durable ties with specific parties / political families, and that durability gets priced into the bias (high party-HHI ↔ high β). The competing single-shot-pricing reading (M4 quid-pro-quo via post-election contracts, or pure menu pricing à la Goiás 2020) predicts firms take whoever pays, so party-HHI is universally near the chance baseline.

Design

source/analysis/an-073-firm-party-specialization.py:

Load pipelines/politica/build/clean/poll_sponsor_2024.parquet, keep rows with sponsor_candidate_party non-null (the candidate-sponsored subsample). Deduplicate to (protocol, institute, sponsor_candidate_party): a single protocol with two coalition sponsors of different parties contributes once per distinct party.
Apply the party-name normalization (PC do B → PCdoB).
Restrict to firms with n_sponsored ≥ 5.
For each firm, compute party-HHI, bloc-HHI (using the bloc map in the frontmatter notes), and state-HHI (on uf).
Permutation null: for each firm independently, draw n_i parties with replacement from the universe candidate-sponsored party distribution (empirical multinomial). 2,000 reps. Empirical one-sided p = P(HHI_null ≥ HHI_observed). Run separately for party, bloc, state.
Join the AN-016 within_firm_beta.csv β estimate per firm (preferring the candidate-FE spec when available, falling back to the race-FE spec otherwise).
Report: per-firm table, scatter of party-HHI × β, histogram of the three HHI distributions overlaid with the universe HHI mean.
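
The HHI and null steps above can be sketched as follows. This is a minimal stdlib illustration, not the contents of an-073-firm-party-specialization.py: the function names (`hhi`, `to_bloc`, `null_p_value`) and the toy party lists are assumptions made for the example, and only the bloc map and the PC do B normalization are taken from the frontmatter notes. Note that the draw is with replacement from the universe distribution, so it is a Monte Carlo multinomial null rather than a label permutation in the strict sense.

```python
import random
from collections import Counter

# Bloc map as listed in the frontmatter notes; anything unlisted is "other".
BLOCS = {
    "left": ["PT", "PSB", "PCdoB", "PV", "REDE"],
    "center": ["MDB", "PSDB", "PSD", "CIDADANIA", "PDT"],
    "right": ["PL", "PP", "REPUBLICANOS", "UNIÃO", "NOVO", "PRTB", "PSC"],
}
PARTY_TO_BLOC = {p: bloc for bloc, parties in BLOCS.items() for p in parties}


def to_bloc(party):
    """Normalize the raw party label, then map it to an ideological bloc."""
    party = "PCdoB" if party == "PC do B" else party
    return PARTY_TO_BLOC.get(party, "other")


def hhi(labels):
    """Herfindahl-Hirschman index: sum of squared category shares."""
    n = len(labels)
    return sum((count / n) ** 2 for count in Counter(labels).values())


def null_p_value(firm_labels, universe_labels, reps=2000, seed=0):
    """Return (observed HHI, one-sided p = P(HHI_null >= HHI_observed))."""
    rng = random.Random(seed)
    observed = hhi(firm_labels)
    n_i = len(firm_labels)
    hits = 0
    for _ in range(reps):
        # n_i draws with replacement from the empirical universe distribution
        draw = rng.choices(universe_labels, k=n_i)
        if hhi(draw) >= observed:
            hits += 1
    return observed, hits / reps


# Toy data (hypothetical, NOT taken from poll_sponsor_2024.parquet)
universe = ["PL"] * 30 + ["PSD"] * 25 + ["MDB"] * 20 + ["PT"] * 15 + ["PP"] * 10
concentrated = ["PSD"] * 11 + ["PL"]          # one dominant party
spread = ["PL", "PSD", "MDB", "PT", "PP", "PL", "PSD", "MDB"]

hhi_c, p_c = null_p_value(concentrated, universe)
hhi_s, p_s = null_p_value(spread, universe)
bloc_hhi_c = hhi([to_bloc(p) for p in concentrated])

print(f"concentrated: party-HHI={hhi_c:.3f} p={p_c:.4f} bloc-HHI={bloc_hhi_c:.3f}")
print(f"spread:       party-HHI={hhi_s:.3f} p={p_s:.4f}")
```

Drawing at each firm's own n_i matters because a small portfolio has a mechanically higher HHI even under random assignment; comparing every firm to a single pooled threshold would over-flag the smallest firms.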

Results

Party-HHI vs within-firm β

Sample

793 candidate-sponsored deduplicated (protocol × party) observations on 204 firms; 38 firms clear the n_sponsored ≥ 5 cut. 22 of the 38 have a within-firm β estimate from AN-016.
Universe baselines (mean of the multinomial-null HHI): party 0.133, ideological-bloc 0.377, state 0.124.
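
As a cross-check on simulated baselines of this kind, the mean HHI of n multinomial draws has a closed form, E[HHI] = Σp² + (1 − Σp²)/n. The sketch below is illustrative only: `expected_null_hhi` is a hypothetical helper, the shares are toy values rather than the universe party distribution, and how the note aggregates across firms with different n_i is not stated in the source, so this covers a single n at a time.

```python
def expected_null_hhi(shares, n):
    """Mean HHI of n draws with replacement from category shares `shares`.

    With counts X_k ~ Multinomial(n, p):
        E[sum_k (X_k / n)^2] = sum_k p_k^2 + (1 - sum_k p_k^2) / n
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    s2 = sum(p * p for p in shares)
    return s2 + (1.0 - s2) / n


# Toy shares (hypothetical); they sum to 1.
toy_shares = [0.30, 0.25, 0.20, 0.15, 0.10]
for n in (5, 12, 52):
    print(n, round(expected_null_hhi(toy_shares, n), 3))
```

The (1 − Σp²)/n term is the small-sample inflation: it shrinks toward zero as n grows, so the null mean approaches the population HHI Σp² only for firms with many sponsored polls.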

Concentration tests (firms with permutation p < 0.05)

Dimension	n significant / 38	Reading
Party-HHI	10 / 38 (26 %)	Uncommon — most firms take work across many parties even within their footprint.
Bloc-HHI	6 / 38 (16 %)	Even less common at the bloc level.
State-HHI	37 / 38 (97 %)	Near-universal regional specialization: all but one firm in this segment is a state-local operator.

Party-HHI × within-firm β

Sample	n	Pearson r	Reading
All firms with β	22	−0.456	Apparent negative correlation.
Drop CENSUS + EVA FRANCIELI (the two β ≈ −8 firms `thinking.md` already flagged as suspected selection)	20	−0.155	Collapses to near zero.
Restrict to n_sponsored ≥ 10	16	−0.514	Driven by the same two firms.
Both filters (drop both + n ≥ 10)	14	−0.104	Null.

Trimmed (no CENSUS / EVA FRANCIELI) mean β by party-HHI tercile: low 7.38, mid 8.52, high 8.22, i.e. flat.
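
The drop-two-firms sensitivity check can be sketched as below. The firm names and (HHI, β) pairs are hypothetical toy values chosen to mimic the pattern (two high-HHI, negative-β points against an otherwise flat cloud); they are not the AN-016 estimates and will not reproduce the r values in the table. `pearson_r` and `r_without` are illustrative helper names.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def r_without(rows, drop=()):
    """Correlation of HHI and beta after excluding the named firms."""
    kept = [row for row in rows if row["firm"] not in drop]
    r = pearson_r([row["hhi"] for row in kept], [row["beta"] for row in kept])
    return r, len(kept)


# Hypothetical toy firms, NOT the AN-073 sample
rows = [
    {"firm": "OUTLIER_1", "hhi": 0.85, "beta": -8.0},
    {"firm": "OUTLIER_2", "hhi": 0.86, "beta": -8.5},
    {"firm": "F3", "hhi": 0.15, "beta": 10.0},
    {"firm": "F4", "hhi": 0.20, "beta": 6.0},
    {"firm": "F5", "hhi": 0.30, "beta": 9.0},
    {"firm": "F6", "hhi": 0.40, "beta": 7.0},
    {"firm": "F7", "hhi": 0.50, "beta": 8.0},
    {"firm": "F8", "hhi": 0.60, "beta": 8.0},
]

r_full, n_full = r_without(rows)
r_trim, n_trim = r_without(rows, drop={"OUTLIER_1", "OUTLIER_2"})
print(f"full:    r={r_full:+.3f} (n={n_full})")
print(f"trimmed: r={r_trim:+.3f} (n={n_trim})")
```

With so few points, two observations at the corner of the scatter are enough to set the sign and size of r, which is why the trimmed figure is the one to read.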

Diagnostic — what's happening at the extremes

High-HHI, negative-β (the suspected-selection cell, n=2): CENSUS (52 polls, 92 % PSD, β −8.2); EVA FRANCIELI (12 polls, 92 % PL, β −8.4). thinking.md § "Leads from an-071" already proposes a selection-mix diagnostic for these — they take sponsors who are losing and so the sponsored side captures the candidate's true (low) standing.
High-HHI, modest-β (the partisan-specialist cell that survives the trim, n=2): INSTITUTO PARANA (28 polls, 100 % PL, β +1.30); INSTITUTO VERITA (20 polls, 45 % PL, β +1.13). Both are well-known firms in the AN-016 sample with low β. The partisan-specialists that do appear are not the high-β tail.
Low-HHI, high-β (the menu-pricing cell, ~6 firms): METHODUS (HHI 0.36, β +23), CAMARGO E MEDINA (HHI 0.19, β +22), TRIANGULO (HHI 0.14, β +15), BRASLOPES (HHI 0.33, β +16), VISAO (HHI 0.25, β +14), DATA SC (HHI 0.22, β +13). These are the high-β firms — they take candidate work across many parties.

Interpretation

The M1/M3 relational-reputation prediction stated in docs/thinking/enforcement-puzzle.md, that firms with durable party ties (high party-HHI) deliver larger β because the relationship sustains a credible slant-commitment, is not supported at point estimate. The naïve full-sample r = −0.456 is driven by the two suspected-selection negative-β firms; trimming them collapses the correlation to −0.155 and leaves the high-HHI tercile mean β at 8.22 pp, close to the low-HHI tercile (7.38). High-β firms cluster at low party-HHI: they take candidate work across many parties. The high-HHI cell, which M1/M3 needs to be high-β, instead contains the two negative-β CENSUS / EVA FRANCIELI cases (selection artifact) plus INSTITUTO PARANA and INSTITUTO VERITA (both β ≈ +1, the low end of the AN-016 β distribution).

The M4 / single-shot pricing reading gains relative weight by elimination: bias is independent of the firm's party portfolio concentration, consistent with firms slanting on a per-poll basis for whichever client pays rather than via durable party-level relationships. The Goiás 2020 / IPOP existence proof in docs/thinking/enforcement-puzzle.md (R$ 6 k per "first-place" poll, posted-menu pricing) fits this reading better than relational-trust storytelling.

The near-universality of state-HHI concentration (37/38 firms) is the cleanest positive finding here: all but one firm in the candidate-sponsored segment is a state-local operator. Combined with low party-HHI, the modal firm is "small local pollster takes whoever pays in its state," not "small local pollster with durable ties to one party." This is structural background for the M1-via-reputation story: the reputation network in the candidate-sponsor segment, if it exists, is geographic, not partisan.

Caveat: n = 22 on the β cross-cut. The point estimates are robust to the obvious sensitivity checks (the correlation is near zero in both specifications that drop the two firms: r = −0.155 at n = 20 and r = −0.104 at n = 14), but a published version of this test would benefit from broader β coverage (lower the n_sponsored threshold in AN-016, or estimate β on the AN-073 ≥ 5 sample directly).

Follow-ups

CENSUS / EVA FRANCIELI selection diagnostic (puzzle): confirm the suspected-selection story by computing the per-firm rank/competitiveness mix of sponsored races vs all polled races. Already on todo.md as a lead from AN-071. AN-073 raises the priority — these two firms are also the sole driver of the apparent negative HHI × β correlation, so the diagnostic simultaneously cleans the AN-071 and AN-073 readings. Suggested script: source/analysis/an-NNN-negative-beta-firm-rank-mix.py.
Extend β estimation to the AN-073 ≥ 5 sample (extension): AN-016 used n_sponsored ≥ 10. With n_sponsored ≥ 5 we have 38 firms here; estimating β on the same cut would give us 38 instead of 22 firms for the HHI × β cross-cut and would push the correlation test out of the underpowered range. Direct re-run of AN-016's per-firm within-firm estimator with the lower cut. ~30 min. Suggested script: source/analysis/an-NNN-within-firm-beta-min5.py.
Two-axis specialization map (extension): scatter party-HHI × state-HHI (with β as color). The unified picture of "regional only", "regional + partisan", and "national" specialists, plus how β varies across the four cells, is more informative than either dimension alone. The natural visual companion to hhi_distributions.pdf. ~1 hour. Suggested script: source/analysis/an-NNN-firm-specialization-2d.py.
Cross-cycle dyad persistence (blind spot): a firm-party ratio is a single-cycle snapshot. The true relational test is whether the same firm × party (or firm × political family) dyads recur in 2020 and 2022. Currently parked — needs harmonized 2020 sponsor data.
Post-election municipal-contract follow-up (M4 direct test) (blind spot): the AN-073 null on the M1/M3 prediction strengthens the case for the M4 / quid-pro-quo test the enforcement-puzzle doc proposed. Portal da Transparência CNPJ match of pollster firms against 2025–26 municipal contracts in the munis where their candidate-clients won, vs lost. Out of scope for this iteration; lives on todo.md as a Leads block.