Trusted-firm advantage survives 3 of 4 alternative definitions for the two simplest outcomes (calls_winner_first +10 pp, mean |error| −0.9 to −1.7 pp) — robust to hand-picked, top-10-UF-spread, or low-|β| definitions. Volume-based (top-10 by
User concern (2026-06-17): the AN-085 "trusted firm" list was a hand-picked name-recognition heuristic. This script tests robustness under four alternative definitions, plus checks whether the bucket-dummy findings (other_firm, candidate) survive each.
Four definitions
| Definition | Selection rule | n firms | n polls |
|---|---|---|---|
| D1 hand-picked | DATAFOLHA, QUAEST, REAL TIME MIDIA, PARANÁ PESQUISAS, VERITA (AN-085 / AN-086 cut) | 5 | 558 |
| D2 top-10 volume | 10 firms with most polls in matched sample | 10 | 1,442 |
| D3 top-10 UF spread | 10 firms in most distinct UFs | 10 | 958 |
| D4 bottom-10 \|β\| | 10 firms in AN-016 with smallest \|within-firm β\|, n≥30 | 10 | 656 |
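The three rule-based definitions (D2, D3, D4) can be sketched as simple ranking functions. This is a minimal illustration, not the AN-087 script: the record layout (`firm`, `uf` keys), the `betas` mapping of firm to `(beta, n)`, and the alphabetical tie-break are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def top_k_by_volume(polls, k=10):
    """D2-style rule: the k firms with the most polls."""
    counts = Counter(p["firm"] for p in polls)
    # Ties are broken alphabetically (an assumption; the source does not say).
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {firm for firm, _ in ranked[:k]}

def top_k_by_uf_spread(polls, k=10):
    """D3-style rule: the k firms polling in the most distinct UFs."""
    ufs = defaultdict(set)
    for p in polls:
        ufs[p["firm"]].add(p["uf"])
    ranked = sorted(ufs.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return {firm for firm, _ in ranked[:k]}

def bottom_k_by_abs_beta(betas, k=10, min_n=30):
    """D4-style rule: the k firms with smallest |beta| among those with n >= min_n."""
    eligible = [(abs(b), firm) for firm, (b, n) in betas.items() if n >= min_n]
    return {firm for _, firm in sorted(eligible)[:k]}

# Toy data (hypothetical firms A, B, C).
polls = [
    {"firm": "A", "uf": "SP"}, {"firm": "A", "uf": "SP"}, {"firm": "A", "uf": "SP"},
    {"firm": "B", "uf": "SP"}, {"firm": "B", "uf": "RJ"},
    {"firm": "C", "uf": "MG"},
]
betas = {"A": (0.8, 40), "B": (-0.1, 35), "C": (0.0, 5)}

print(top_k_by_volume(polls, k=1))       # firm with most polls
print(top_k_by_uf_spread(polls, k=1))    # firm in most UFs
print(bottom_k_by_abs_beta(betas, k=1))  # C is excluded by the n >= 30 filter
```

The toy data is built so that the three rules pick different firms, which is the same mechanism that makes D2, D3, and D4 diverge in the real sample.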
Overlap
D1 ⊂ D3 — all five hand-picked firms appear in the top-10 by geographic spread. D1 ∩ D2 = 3 (Paraná, Veritá, Real Time); only 3 hand-picked firms are also top-10 by volume because DATAFOLHA and QUAEST focus on large cities with fewer protocols. D1 ∩ D4 = 2 (Paraná, Veritá) — DATAFOLHA, QUAEST, REAL TIME don't have enough self-sponsored polls to identify a within-firm β at all, so they can't be ranked by |β|. D4 is dominated by smaller regional firms with mixed candidate/media books and low differential slant (AGILI, AR7, IIP, INSTITUTO GERAIS, NEXXUS MAIS, PROMIDIA, W J MENDES, ROBERTO LORENZZON).
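The overlap arithmetic is plain set algebra. The sketch below checks D1 ∩ D4 only, because those are the two sets that can be read off this note. The D4 membership is an assumption: it is reconstructed as the eight regional firms named above plus the two overlapping hand-picked firms, and the authoritative list is the per-definition sets file.

```python
# D1 as listed in the definitions table.
D1 = {"DATAFOLHA", "QUAEST", "REAL TIME MIDIA", "PARANÁ PESQUISAS", "VERITA"}

# D4 reconstructed from the text (assumption): eight named regional firms
# plus the two hand-picked firms reported as overlapping.
D4 = {
    "AGILI", "AR7", "IIP", "INSTITUTO GERAIS", "NEXXUS MAIS",
    "PROMIDIA", "W J MENDES", "ROBERTO LORENZZON",
    "PARANÁ PESQUISAS", "VERITA",
}

overlap = D1 & D4        # firms trusted under both definitions
only_d1 = D1 - D4        # hand-picked firms with no rankable |beta|

print(len(D4))           # 10
print(sorted(overlap))   # ['PARANÁ PESQUISAS', 'VERITA']
print(sorted(only_d1))   # ['DATAFOLHA', 'QUAEST', 'REAL TIME MIDIA']
```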
Trusted-firm coefficient — robustness table
Universe spec, race FE; the dependent variable is each accuracy measure in turn. The reported coefficient is on is_trusted, controlling for is_candidate, is_pollster_self, and is_other_firm.
| Definition | n_trusted | calls_winner_first | margin_error | mean \|error\| |
|---|---|---|---|---|
| D1 hand-picked | 558 | +0.100 (p<0.001) | −2.02 (p=0.001) | −1.15 (p<0.001) |
| D2 top-10 volume | 1,442 | +0.046 (p=0.059) | +1.09 (p=0.08) | −0.18 (p=0.49) |
| D3 top-10 UF spread | 950 | +0.103 (p=0.001) | −1.15 (p=0.08) | −0.87 (p=0.006) |
| D4 low \|β\| | 656 | +0.101 (p<0.001) | −0.98 (p=0.20) | −1.74 (p<0.001) |
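For readers unfamiliar with the race-FE specification, here is a minimal sketch of the point estimate using the within (demeaning) transform. It is not the AN-087 script: the function name `within_ols`, the toy data, and the `true_beta` values are hypothetical, and the sketch omits standard errors and p-values, which the real analysis reports.

```python
import numpy as np

def within_ols(y, X, groups):
    """OLS with group fixed effects via the within (demeaning) transform."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    y_dm, X_dm = y.copy(), X.copy()
    for g in np.unique(groups):
        rows = groups == g
        # Subtracting race means removes the race fixed effect.
        y_dm[rows] -= y[rows].mean()
        X_dm[rows] -= X[rows].mean(axis=0)
    coef, *_ = np.linalg.lstsq(X_dm, y_dm, rcond=None)
    return coef

# Columns: is_trusted, is_candidate, is_pollster_self, is_other_firm.
# The omitted sponsor class (all three dummies zero) is the baseline.
sponsor_rows = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
design = [(t, *s) for t in (0, 1) for s in sponsor_rows]  # 8 polls per race

true_beta = np.array([1.0, -2.0, 0.5, -1.0])  # toy values, not estimates
X, y, race = [], [], []
for r in range(3):
    for row in design:
        X.append(row)
        # Noise-free outcome: race effect plus the linear index.
        y.append(10.0 * r + float(np.dot(row, true_beta)))
        race.append(r)

coef = within_ols(y, X, race)
print(np.round(coef, 6))  # matches true_beta up to floating-point error
```

Because the toy outcome has no noise, the recovered coefficients equal `true_beta`, which confirms that the race effects are absorbed by the demeaning and do not leak into the is_trusted estimate.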
Reading:
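The +10 pp advantage on poll_calls_winner_first is robust to 3 of 4 definitions. D1, D3, and D4 all give roughly +0.10 (+0.100 to +0.103, p ≤ 0.001); D2 gives only +0.046 (p = 0.06). The mean |error| advantage is also robust to 3 of 4: D1, D3, and D4 are negative and significant (−0.87 to −1.74, p ≤ 0.006), while D2 is null (−0.18, p = 0.49).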
The margin_error advantage is fragile. Only D1 shows a clear effect (−2.02, p = 0.001). D2 actually REVERSES the sign (+1.09, p = 0.08); D3 is borderline (−1.15, p = 0.08) and D4 is null (−0.98, p = 0.20). The AN-085 headline "trusted firms reduce margin error by 2 pp" is sensitive to the firm list: DATAFOLHA / QUAEST are doing the work, not Veritá / Paraná. Without DATAFOLHA + QUAEST in the set (D2, D4), the margin_error advantage shrinks or disappears.
D2 (top-10 by volume) is the OUTLIER and a bad trust marker. It returns the WORST trusted-firm coefficients on every outcome — calls_winner_first marginal, margin_error positive, mean |error| null. The reason: high-volume firms in the matched sample are state-level specialist firms (INSTITUTO DATATRENDS n=203, VOX BRASIL n=167, 100% CIDADES n=158, RANKING BRASIL n=128, MOREIRA & NOLETO n=119) doing lots of candidate work, not the national-tier firms. Volume conflates "produces many polls" with "produces good polls."
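The "3 of 4" reading can be reproduced mechanically from the robustness table. The sketch below assumes a 5% significance threshold and counts a definition as supporting the advantage only when the coefficient has the expected sign; the dictionary keys and the helper names `supports` and `tally` are illustrative.

```python
ALPHA = 0.05  # assumed significance threshold

# (coefficient, p-value) transcribed from the robustness table.
# Entries reported as p<0.001 are stored as the upper bound 0.001.
results = {
    "D1": {"calls_winner_first": (0.100, 0.001),
           "margin_error": (-2.02, 0.001),
           "mean_abs_error": (-1.15, 0.001)},
    "D2": {"calls_winner_first": (0.046, 0.059),
           "margin_error": (1.09, 0.08),
           "mean_abs_error": (-0.18, 0.49)},
    "D3": {"calls_winner_first": (0.103, 0.001),
           "margin_error": (-1.15, 0.08),
           "mean_abs_error": (-0.87, 0.006)},
    "D4": {"calls_winner_first": (0.101, 0.001),
           "margin_error": (-0.98, 0.20),
           "mean_abs_error": (-1.74, 0.001)},
}

# A trusted-firm advantage means more correct calls and less error.
EXPECTED_SIGN = {"calls_winner_first": 1, "margin_error": -1, "mean_abs_error": -1}

def supports(coef, p, sign, alpha=ALPHA):
    """True when the coefficient has the expected sign and p < alpha."""
    return coef * sign > 0 and p < alpha

def tally(results):
    """Map each outcome to the definitions that support the advantage."""
    return {
        outcome: [d for d, r in results.items() if supports(*r[outcome], sign)]
        for outcome, sign in EXPECTED_SIGN.items()
    }

for outcome, defs in tally(results).items():
    print(outcome, defs)
# calls_winner_first ['D1', 'D3', 'D4']
# margin_error ['D1']
# mean_abs_error ['D1', 'D3', 'D4']
```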
Bucket dummies under each definition
Critical robustness check: does the other_firm finding (AN-082, AN-084, AN-085) depend on the trusted-firm definition?
Race FE, dependent variable = margin_error:
| Definition | is_candidate | is_pollster_self | is_other_firm |
|---|---|---|---|
| D1 hand-picked | −2.53 (p=0.007) | −0.91 (p=0.14) | −1.95 (p=0.002) |
| D2 top-10 volume | −2.61 (p=0.005) | −1.08 (p=0.09) | −1.83 (p=0.003) |
| D3 top-10 UF spread | −2.48 (p=0.008) | −0.75 (p=0.23) | −1.84 (p=0.003) |
| D4 low \|β\| | −2.42 (p=0.011) | −0.74 (p=0.24) | −1.81 (p=0.004) |
All three bucket coefficients keep the same sign, a similar magnitude, and the same significance verdict at the 5% level across all 4 trusted-firm definitions.
- is_candidate: −2.42 to −2.61 (all p ≤ 0.011)
- is_other_firm: −1.81 to −1.95 (all p ≤ 0.004)
- is_pollster_self: −0.74 to −1.08 (none sig)
The shell-bucket finding is independent of how we define trust. This is the cleanest robustness story we have for the H13 prediction.
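The stability of the bucket dummies can be summarized directly from the margin_error table. This sketch only re-computes ranges from the reported values; the helper name `summarize` and the output keys are illustrative.

```python
# (coefficient, p-value) per definition, transcribed from the bucket table.
bucket = {
    "is_candidate": {"D1": (-2.53, 0.007), "D2": (-2.61, 0.005),
                     "D3": (-2.48, 0.008), "D4": (-2.42, 0.011)},
    "is_pollster_self": {"D1": (-0.91, 0.14), "D2": (-1.08, 0.09),
                         "D3": (-0.75, 0.23), "D4": (-0.74, 0.24)},
    "is_other_firm": {"D1": (-1.95, 0.002), "D2": (-1.83, 0.003),
                      "D3": (-1.84, 0.003), "D4": (-1.81, 0.004)},
}

def summarize(rows):
    """Range, sign consistency, and largest p-value across definitions."""
    coefs = [c for c, _ in rows.values()]
    ps = [p for _, p in rows.values()]
    return {
        "min": min(coefs),
        "max": max(coefs),
        "spread": round(max(coefs) - min(coefs), 2),
        "same_sign": all(c < 0 for c in coefs) or all(c > 0 for c in coefs),
        "max_p": max(ps),
    }

for name, rows in bucket.items():
    print(name, summarize(rows))
```

The spread across definitions is 0.19 pp for is_candidate and 0.14 pp for is_other_firm, small relative to the coefficients themselves, which is what "independent of how we define trust" amounts to numerically.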
Implications for the headline
The headline AN-082 / AN-085 / AN-086 sponsor-class contrasts are robust to trust definition. The shell finding stands.
The trusted-firm rank-concordance advantage is robust to 3 of 4 definitions. The +10 pp on calls_winner_first reproduces under hand-picked, UF-spread, and low-|β| definitions. The paper can claim this confidently.
The trusted-firm margin-error advantage should be downgraded. It is −2.02 pp in D1 but weakens to borderline under D3 (−1.15, p = 0.08), is null under D4 (−0.98, p = 0.20), and reverses sign under D2 (+1.09, p = 0.08). The AN-085 statement that trusted-firm media polls "reduce margin error by 3.16 pp within race" is driven mostly by DATAFOLHA + QUAEST in the cleanest cells; it does not generalize.
β-based "trust" (D4) gives the strongest mean |error| advantage (−1.74, p<0.001). This is conceptually clean: firms that don't differentially slant for sponsors also produce lower-error polls overall. The two trust dimensions (rank-concordance and magnitude accuracy) align with low-β firms — but rank-concordance is the more robust headline.
Updates to the canonical sponsor-group table (AN-086)
The 9-cell AN-086 table used D1 throughout. The qualitative ranking holds under D3 and D4 (trusted-firm rows at the top, non-trusted firm × small-media-sponsor near the bottom). The specific magnitudes shift — under D4, trusted-firm rows lose their margin-error edge but keep their mean |error| edge. The calls_winner_first ordering is essentially invariant.
For the paper, I'd recommend reporting:
- AN-086 table using D1 (clearest narrative)
- AN-087 robustness table as an appendix or footnote showing the +10 pp on calls_winner_first holds under D3/D4
- A caveat that margin_error effects are firm-list-sensitive
Caveats
- D4 sample is small. AN-016 only identifies β for 22 firms total; the bottom-10 |β| set is dominated by smaller regional firms with mixed books. A larger β-identifiable universe would require a more lenient cutoff (e.g. lowering the n≥5 self-sponsored requirement).
- D2 / D3 cutoffs at top-10 are arbitrary. Top-15 or top-20 would change set composition. AN-085's caveat about the diluting effect of adding more firms applies.
- The "true trusted" set is unobservable. Each definition proxies for a different aspect: D1 ≈ industry name recognition, D2 ≈ market share, D3 ≈ national operations, D4 ≈ differential-slant track record. They give different but partially overlapping sets and partially overlapping results. The aggregate signal is "trust matters; the exact list is fuzzy."
- β-based circularity. D4 firms have low |β| by construction. Using them to predict accuracy is conceptually separate (β measures sponsor differential, accuracy measures level), but a careful reader could argue the two share unobserved firm-quality variance. The β-based definition is cleanest as a robustness layer to D1, not as the primary cut.
Follow-ups
- Pool D1 ∪ D3 ∪ D4 as an "any-definition trusted" set and re-run the AN-086 table. Probably the strongest defensible cut for the paper.
- External-source list. ABEP (Associação Brasileira de Empresas de Pesquisa) maintains a membership directory. Cross-reference would give an independent industry-association trust signal — the cleanest external anchor possible.
- 2020-cycle β. Compute β on 2020 mayoral data, label trusted as bottom-quartile by 2020 |β|, then use that label on 2024 polls. Removes the post-hoc circularity in D4.
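The 2020-cycle follow-up could look roughly like the sketch below: rank firms by prior-cycle |β|, take the bottom quartile as trusted, and carry that label forward to 2024 polls. Everything here is hypothetical, including the firm names, the β values, the floor-based quartile size, and the choice to leave firms with no 2020 β unlabeled rather than treating them as non-trusted.

```python
def bottom_quartile_firms(beta_prev):
    """Firms in the bottom quartile of |beta| from the earlier cycle."""
    ranked = sorted(beta_prev, key=lambda f: (abs(beta_prev[f]), f))
    k = max(1, len(ranked) // 4)  # floor-based quartile size (assumption)
    return set(ranked[:k])

def label_polls(polls, trusted, known_firms):
    """Attach is_trusted to each poll; None when the firm has no prior beta."""
    labeled = []
    for p in polls:
        if p["firm"] not in known_firms:
            flag = None  # cannot be classified out of sample
        else:
            flag = p["firm"] in trusted
        labeled.append({**p, "is_trusted": flag})
    return labeled

# Hypothetical 2020 within-firm betas.
beta_2020 = {"A": 0.05, "B": -0.40, "C": 0.30, "D": -0.02,
             "E": 0.90, "F": 0.20, "G": -0.60, "H": 0.10}
trusted_2020 = bottom_quartile_firms(beta_2020)

# Hypothetical 2024 polls; firm Z did not exist in the 2020 data.
polls_2024 = [{"firm": "A"}, {"firm": "B"}, {"firm": "Z"}]
labeled = label_polls(polls_2024, trusted_2020, set(beta_2020))

print(sorted(trusted_2020))
print([p["is_trusted"] for p in labeled])
```

Because the label is fixed before any 2024 outcome is observed, the trust definition cannot be tuned to the accuracy results it is later used to explain.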
Artifacts
- Script: source/analysis/an-087-trusted-firm-robustness.py
- Spec-level coefficients: build/table/an-087-trusted-firm-robustness.csv
- Firm sets per definition: build/table/an-087-trusted-firm-robustness__sets.csv
- Headline JSON: build/table/an-087-trusted-firm-robustness.json