State-HHI is above chance for 37/38 firms (regional specialization is universal in this segment); party-HHI is above chance for only 10/38 (party concentration is uncommon). The relational-reputation prediction, that high party-HHI tracks high within-firm β, is not supported: full-sample r(party-HHI, β) = −0.456 (n=22), but the negative correlation is driven entirely by CENSUS (HHI 0.86, β −8.2) and EVA FRANCIELI (HHI 0.85, β −8.4), the two firms thinking.md already flagged as suspected-selection cases. Excluding them, r = −0.155 (n=20) and the party-HHI tercile means in β flatten to 7.38 / 8.52 / 8.22. M1/M3 relational reputation is not supported at point estimate; M4 single-shot / menu-pricing gains by elimination.
Question
Do candidate-sponsored polls from a given firm concentrate on one party (or ideological bloc, or state) more than would be expected if the firm drew its candidate clientele at random from the universe candidate-sponsored party distribution? And does the observed concentration co-vary with the firm's within-candidate sponsor bias (AN-016)?
The follow-up motivation is the enforcement-puzzle thinking note: relational-reputation enforcement (M1, M3) predicts firms develop durable ties with specific parties / political families, and that durability gets priced into the bias (high party-HHI ↔ high β). The competing single-shot-pricing reading (M4 quid-pro-quo via post-election contracts, or pure menu pricing à la Goiás 2020) predicts firms take whoever pays, so party-HHI is universally near the chance baseline.
Design
`source/analysis/an-073-firm-party-specialization.py`:
- Load `pipelines/politica/build/clean/poll_sponsor_2024.parquet`, keep rows with `sponsor_candidate_party` non-null (the candidate-sponsored subsample).
- Deduplicate to (protocol, institute, sponsor_candidate_party): a single protocol with two coalition sponsors of different parties contributes once per distinct party.
- Apply the party-name normalization (`PC do B` → `PCdoB`).
- Restrict to firms with n_sponsored ≥ 5.
- For each firm, compute party-HHI, bloc-HHI (using the bloc map in the frontmatter `notes`), and state-HHI (on `uf`).
- Permutation null: for each firm independently, draw n_i parties with replacement from the universe candidate-sponsored party distribution (empirical multinomial). 2,000 reps. Empirical one-sided p = P(HHI_null ≥ HHI_observed). Run separately for party, bloc, state.
- Join the AN-016 `within_firm_beta.csv` β estimate per firm (preferring the `candidate FE` spec when available, else `race FE (fallback)`).
- Report: per-firm table, scatter of party-HHI × β, histogram of the three HHI distributions overlaid with the universe HHI mean.
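The HHI and permutation-null steps in the list above can be sketched as follows. This is a minimal illustration, not the contents of the AN-073 script: the function names `hhi` and `permutation_p`, the seed, and the toy counts are all assumptions made for the example.

```python
import numpy as np


def hhi(counts):
    """Herfindahl-Hirschman index: sum of squared category shares."""
    counts = np.asarray(counts, dtype=float)
    shares = counts / counts.sum()
    return float(np.sum(shares ** 2))


def permutation_p(firm_counts, universe_counts, reps=2000, seed=0):
    """Return (observed HHI, one-sided empirical p, mean null HHI).

    The null draws n_i categories with replacement from the universe
    distribution (an empirical multinomial), n_i being the firm's own size.
    """
    rng = np.random.default_rng(seed)
    firm_counts = np.asarray(firm_counts)
    universe_counts = np.asarray(universe_counts, dtype=float)
    n_i = int(firm_counts.sum())
    p = universe_counts / universe_counts.sum()
    draws = rng.multinomial(n_i, p, size=reps)      # shape (reps, n_categories)
    null_hhi = np.sum((draws / n_i) ** 2, axis=1)
    observed = hhi(firm_counts)
    # Small tolerance so exact ties are counted as ">=" despite float rounding.
    p_value = float(np.mean(null_hhi >= observed - 1e-12))
    return observed, p_value, float(null_hhi.mean())


# Toy example (made-up counts, not data from the note):
# four equally common parties in the universe, one firm with 11 of 12 polls on one party.
universe = [25, 25, 25, 25]
firm = [11, 1, 0, 0]
observed, p_value, null_mean = permutation_p(firm, universe)
print(f"HHI={observed:.3f}  p={p_value:.4f}  null mean={null_mean:.3f}")
```

Drawing the null at each firm's own n_i matters because small samples inflate HHI mechanically; comparing a 5-poll firm against the raw universe concentration would overstate its specialization.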
Results

Sample
- 793 candidate-sponsored deduplicated (protocol × party) observations on 204 firms; 38 firms clear the n_sponsored ≥ 5 cut. 22 of the 38 have a within-firm β estimate from AN-016.
- Universe baselines (mean of the multinomial-null HHI): party 0.133, ideological-bloc 0.377, state 0.124.
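As a cross-check on these baselines, the expected null HHI for a single firm has a closed form under the multinomial draw described in Design. The notation is an assumption for this sketch: $p_k$ is the universe share of category $k$, $n$ is the firm's number of sponsored observations, and $N_k$ is the number of draws landing in category $k$.

$$
\mathbb{E}\!\left[\mathrm{HHI}_{\text{null}}\right]
= \sum_k \mathbb{E}\!\left[\left(\frac{N_k}{n}\right)^{2}\right]
= \sum_k \left( p_k^{2} + \frac{p_k (1 - p_k)}{n} \right)
= \sum_k p_k^{2} + \frac{1 - \sum_k p_k^{2}}{n}
$$

The second term is the small-sample inflation: it shrinks as $n$ grows, so the null mean sits above the universe's own concentration $\sum_k p_k^2$, and more so for firms near the n_sponsored ≥ 5 floor.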
Concentration tests (firms with permutation p < 0.05)
| Dimension | n significant / 38 | Reading |
|---|---|---|
| Party-HHI | 10 / 38 (26 %) | Uncommon — most firms take work across many parties even within their footprint. |
| Bloc-HHI | 6 / 38 (16 %) | Even less common at the bloc level. |
| State-HHI | 37 / 38 (97 %) | Universal regional specialization — every firm in this segment is a state-local operator. |
Party-HHI × within-firm β
| Sample | n | Pearson r | Reading |
|---|---|---|---|
| All firms with β | 22 | −0.456 | Apparent negative correlation. |
| Drop CENSUS + EVA FRANCIELI (the two β ≈ −8 firms thinking.md already flagged as suspected selection) | 20 | −0.155 | Collapses to near zero. |
| Restrict to n_sponsored ≥ 10 | 16 | −0.514 | Driven by same two firms. |
| Both filters (drop both + n ≥ 10) | 14 | −0.104 | Null. |
Trimmed (no CENSUS / EVA FRANCIELI) mean β by party-HHI tercile: low 7.38, mid 8.52, high 8.22 (flat).
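The sensitivity rows in the table amount to recomputing Pearson r after dropping named firms or raising the n_sponsored floor. A minimal sketch of that check follows; the helper name `sensitivity`, the record layout, and the six toy firms are assumptions for illustration, not the AN-073 data.

```python
import numpy as np


def sensitivity(firms, drop=(), min_n=0):
    """Pearson r of party-HHI vs beta after optional filters.

    firms: list of dicts with keys 'name', 'hhi', 'beta', 'n_sponsored'.
    Returns (number of firms kept, r).
    """
    kept = [f for f in firms
            if f["name"] not in drop and f["n_sponsored"] >= min_n]
    x = np.array([f["hhi"] for f in kept], dtype=float)
    y = np.array([f["beta"] for f in kept], dtype=float)
    return len(kept), float(np.corrcoef(x, y)[0, 1])


# Toy data (made up): two high-HHI, negative-beta outliers plus four firms
# whose HHI and beta are uncorrelated by construction.
toy = [
    {"name": "A", "hhi": 0.90, "beta": -8.0, "n_sponsored": 50},
    {"name": "B", "hhi": 0.85, "beta": -8.0, "n_sponsored": 12},
    {"name": "C", "hhi": 0.20, "beta": 10.0, "n_sponsored": 20},
    {"name": "D", "hhi": 0.30, "beta": 5.0, "n_sponsored": 8},
    {"name": "E", "hhi": 0.40, "beta": 5.0, "n_sponsored": 15},
    {"name": "F", "hhi": 0.50, "beta": 10.0, "n_sponsored": 6},
]

n_all, r_all = sensitivity(toy)
n_trim, r_trim = sensitivity(toy, drop={"A", "B"})
print(f"all firms: n={n_all}, r={r_all:.3f}")
print(f"trimmed:   n={n_trim}, r={r_trim:.3f}")
```

In the toy data the full-sample correlation is strongly negative while the trimmed one is zero by construction, which is the same qualitative pattern the table reports: two leverage points can manufacture a correlation that the remaining firms do not carry.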
Diagnostic — what's happening at the extremes
- High-HHI, negative-β (the suspected-selection cell, n=2): CENSUS (52 polls, 92 % PSD, β −8.2); EVA FRANCIELI (12 polls, 92 % PL, β −8.4). `thinking.md` § "Leads from an-071" already proposes a selection-mix diagnostic for these: they take sponsors who are losing, and so the sponsored side captures the candidate's true (low) standing.
- High-HHI, modest-β (the partisan-specialist cell that survives the trim, n=2): INSTITUTO PARANA (28 polls, 100 % PL, β +1.30); INSTITUTO VERITA (20 polls, 45 % PL, β +1.13). Both are well-known firms in the AN-016 sample with low β. The partisan specialists that do appear are not in the high-β tail.
- Low-HHI, high-β (the menu-pricing cell, ~6 firms): METHODUS (HHI 0.36, β +23), CAMARGO E MEDINA (HHI 0.19, β +22), TRIANGULO (HHI 0.14, β +15), BRASLOPES (HHI 0.33, β +16), VISAO (HHI 0.25, β +14), DATA SC (HHI 0.22, β +13). These are the high-β firms — they take candidate work across many parties.
Interpretation
The M1/M3 relational-reputation prediction stated in
docs/thinking/enforcement-puzzle.md
— that firms with durable party ties (high party-HHI) deliver
larger β because the relationship sustains a credible
slant-commitment — is not supported at point estimate. The
naïve full-sample r = −0.456 is driven by the two
suspected-selection negative-β firms; trimming them collapses the
correlation to −0.155 and the high-HHI tercile mean β to 8.22 pp,
essentially identical to the low-HHI tercile (7.38). High-β firms
cluster at low party-HHI: they take candidate work across many
parties. The cell that would empirically validate M1/M3 (high-HHI,
high-β) is not populated by high-β firms: the high-HHI end contains the two negative-β CENSUS / EVA FRANCIELI cases (selection artifact) plus INSTITUTO PARANA and INSTITUTO VERITA (both β ≈ +1, the low end of the AN-016 β distribution).
The M4 / single-shot pricing reading gains relative weight by
elimination: bias is independent of the firm's party portfolio
concentration, consistent with firms slanting on a per-poll basis
for whichever client pays rather than via durable party-level
relationships. The Goiás 2020 / IPOP existence proof in
docs/thinking/enforcement-puzzle.md (R$ 6 k per "first-place"
poll, posted-menu pricing) fits this reading better than
relational-trust storytelling.
The state-HHI universality (37/38 firms) is the cleanest positive finding here: almost every firm in the candidate-sponsored segment is a state-local operator. Combined with low party-HHI, the modal firm is "small local pollster takes whoever pays in its state," not "small local pollster with durable ties to one party." This is structural background for the M1-via-reputation story: the reputation network in the candidate-sponsor segment, if it exists, is geographic, not partisan.
Caveat: n = 22 on the β cross-cut. The point estimates are robust to the obvious sensitivity (the trimmed correlation is near zero under three independent filters), but a published version of this test would benefit from broader β coverage (lower the n_sponsored threshold in AN-016, or estimate β on the AN-073 ≥ 5 sample directly).
Follow-ups
- CENSUS / EVA FRANCIELI selection diagnostic (puzzle): confirm the suspected-selection story by computing the per-firm rank/competitiveness mix of sponsored races vs all polled races. Already on `todo.md` as a lead from AN-071. AN-073 raises the priority: these two firms are also the sole driver of the apparent negative HHI × β correlation, so the diagnostic simultaneously cleans the AN-071 and AN-073 readings. Suggested script: `source/analysis/an-NNN-negative-beta-firm-rank-mix.py`.
- Extend β estimation to the AN-073 ≥ 5 sample (extension): AN-016 used n_sponsored ≥ 10. With n_sponsored ≥ 5 we have 38 firms here; estimating β on the same cut would give us 38 instead of 22 firms for the HHI × β cross-cut and would push the correlation test out of the underpowered range. Direct re-run of AN-016's per-firm within-firm estimator with the lower cut. ~30 min. Suggested script: `source/analysis/an-NNN-within-firm-beta-min5.py`.
- Two-axis specialization map (extension): scatter party-HHI × state-HHI (with β as color). The unified picture of "regional only", "regional + partisan", and "national" specialists, plus how β varies across the four cells, is more informative than either dimension alone. The natural visual companion to `hhi_distributions.pdf`. ~1 hour. Suggested script: `source/analysis/an-NNN-firm-specialization-2d.py`.
- Cross-cycle dyad persistence (blind spot): a firm-party ratio is a single-cycle snapshot. The true relational test is whether the same firm × party (or firm × political family) dyads recur in 2020 and 2022. Currently parked; needs harmonized 2020 sponsor data.
- Post-election municipal-contract follow-up (M4 direct test) (blind spot): the AN-073 null on the M1/M3 prediction strengthens the case for the M4 / quid-pro-quo test the enforcement-puzzle doc proposed. Portal da Transparência CNPJ match of pollster firms against 2025–26 municipal contracts in the munis where their candidate-clients won, vs lost. Out of scope for this iteration; lives on `todo.md` as a Leads block.