Firm is the slant-control unit, not the statistician. Within multi-statistician firms, adding statistician FE on top of firm FE raises R² by only 0.016 (0.174→0.190), and the statistician × sponsored interaction is null (F(16, 1249)=1.19, p=0.27). Within-firm sponsor routing is also null (permutation p=0.73 on 19 multi-stat firms). This is consistent with bias being either uniformly accepted by all signers of a firm or induced at a margin the statistician does not supervise; the test cannot discriminate between the two.

Hypothesis
statistician-as-slant-unit
Confidence
yellow
Type
descriptive
Design
Sample
cand_poll.parquet matched_share=1 universe joined to the cleaned statistician map (build/intermediate/statistician_map_2024.tsv). 21,425 (protocol × candidate) rows, 282 statisticians, 421 firms; 448 self-sponsored rows. FE-ladder subframe restricts to firms with ≥3 sponsored, ≥3 unsponsored, ≥2 statisticians — 14 firms, 24 statisticians, n=1,295.
Specification
Spec ladder: A (no FE) / B (firm FE) / C (statistician FE) / D (firm + statistician FE) / E (D + statistician × sponsored interaction). Cluster-robust SEs are not applied, so the interaction F-test stays directly comparable to the classical nested-OLS F-test. Per-statistician β = mean(bias | sponsored) − mean(bias | unsponsored) on the headline universe, restricted to signers with ≥5 sponsored and ≥5 unsponsored observations.
Notes
Bias = poll_percent_raw − 100·final_share, in percentage points. Per-statistician β is unconditional and is confounded with firm mix. The FE ladder is the identified test.
Script
source/analysis/an-080-slant-unit-firm-vs-statistician.py
Target
build/table/an-080-slant-unit-firm-vs-statistician.csv
Status
interpreted · 2026-06-16
Created
2026-06-16

Question

The CONRE statistician thread in docs/thinking/conre-statistician-lever.md proposed that personally-on-record signatories (NM_ESTATISTICO_RESP + CD_CONRE) could be a sleeping legal lever for policing bias. The empirical version of that question: is the statistician a separate slant-control unit from the firm? Two complementary tests.

Q1. Within firms, do sponsored polls cluster on specific signers?

Permutation test on within-firm sponsor-rate spread across statisticians, on 19 multi-stat firms with ≥2 sponsored, ≥2 unsponsored polls. Per-firm chi-squared on (statistician × sponsored) crosstabs as a secondary descriptive.

Q2. Within firms, does β vary by statistician?

Four-spec FE ladder + F-test of statistician × sponsored interaction over firm + stat FE base. Subframe restricted to firms with ≥3 sp, ≥3 un, ≥2 stats (14 firms, 24 statisticians, n=1,295).

Findings

Per-statistician β (unconditional, 23 signers)

Statistic      Value
Mean           5.37
Std            5.06
Range          −4.43 to +13.46
Share β > 0    82.6 %
Share β > 5    60.9 %
Share β > 10   17.4 %

Top by β (with n_sp ≥ 5 & n_un ≥ 5):

CONRE   Name                               n_sp   n_un    n_firms   β
11248   ANGELA MARIA DA SILVA              5      74      1         +13.46
9019    UBIRAJARA ALVES TRINDADE SAMPAIO   5      10      1         +12.89
8151    JULIANE SILVEIRA FREIRE DA SILVA   33     203     3         +12.39
9443    MARCELO HIDEMI UEMURA              8      376     4         +11.34
9356    LAÉRCIO DE SOUSA ARAÚJO            54     619     8         +6.04
9063    LINIANE GAZOLA                     56     1,722   30        +3.29

Cross-section heterogeneity is real. But it is unconditional — each statistician sits in their own firm portfolio, and β is being attributed to the signer rather than to the firms they sign for. The FE ladder tests whether this attribution holds up.
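
The unconditional per-statistician β defined in the Design block can be sketched as follows. This is a minimal standard-library illustration, not the code in the analysis script: `poll_percent_raw` and `final_share` are the names used in the Notes, while the `conre` and `sponsored` keys, the function names, and the toy rows are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

def bias(poll_percent_raw, final_share):
    """Bias in percentage points: poll estimate minus 100 x final vote share."""
    return poll_percent_raw - 100.0 * final_share

def per_signer_beta(rows, min_n=5):
    """Unconditional beta per signer: mean(bias | sponsored) - mean(bias | unsponsored).

    rows: iterable of dicts with keys conre, sponsored (hypothetical names),
    poll_percent_raw and final_share. Signers with fewer than min_n
    observations in either arm are dropped.
    """
    arms = defaultdict(lambda: {True: [], False: []})
    for r in rows:
        arms[r["conre"]][bool(r["sponsored"])].append(
            bias(r["poll_percent_raw"], r["final_share"])
        )
    return {
        signer: mean(a[True]) - mean(a[False])
        for signer, a in arms.items()
        if len(a[True]) >= min_n and len(a[False]) >= min_n
    }

# Toy rows (invented): signer X passes the >=5 / >=5 filter, signer Y does not.
toy = (
    [{"conre": "X", "sponsored": True, "poll_percent_raw": 40.0, "final_share": 0.30}] * 5
    + [{"conre": "X", "sponsored": False, "poll_percent_raw": 32.0, "final_share": 0.30}] * 5
    + [{"conre": "Y", "sponsored": True, "poll_percent_raw": 50.0, "final_share": 0.45}] * 2
    + [{"conre": "Y", "sponsored": False, "poll_percent_raw": 46.0, "final_share": 0.45}] * 9
)
betas = per_signer_beta(toy)
```

Nothing in this computation conditions on the firm, so a signer who works mostly for high-β firms inherits their β. That is the confound the FE ladder is meant to remove.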

FE ladder + interaction (n=1,295; 14 firms; 24 statisticians)

Spec                  β_sponsored   SE     p        R²
A: no FE              6.31          1.29   <0.001   0.018
B: firm FE            4.90          1.24   <0.001   0.174
C: statistician FE    4.59          1.22   <0.001   0.187
D: firm + stat FE     4.99          1.24   <0.001   0.190
E: D + (stat × sp)    (F-test below)                0.202

Interaction F-test (E vs D): F(16, 1249)=1.19, p=0.27.
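
As a rough consistency check, the interaction F statistic can be approximated from the R² values of specs D and E using the classical nested-OLS formula. The sketch below assumes the 16 numerator and 1249 denominator degrees of freedom reported in the summary, and uses the rounded R² values from the table, so it will not reproduce the reported statistic exactly. The function name is hypothetical.

```python
def nested_f(r2_restricted, r2_full, q, df_resid_full):
    """Classical F statistic for q added regressors, from the R^2 of two nested OLS fits."""
    return ((r2_full - r2_restricted) / q) / ((1.0 - r2_full) / df_resid_full)

# Spec D -> spec E, using the rounded R^2 values from the ladder table.
f_stat = nested_f(r2_restricted=0.190, r2_full=0.202, q=16, df_resid_full=1249)
print(round(f_stat, 2))  # 1.17
```

The result, about 1.17, is close to the reported 1.19. A gap of that size is what rounding R² to three decimals would be expected to produce.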

Within-firm sponsor routing (19 firms)

                                       Observed   Null (perm, n=500)
Mean within-firm sponsor-rate spread   0.140      0.157 (sd 0.026)
Permutation p                          0.726
Share of firms with chi² p < 0.05      5.3 % (≈ chance)

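Observed spread is below the null mean (0.140 vs 0.157, about 0.65 null SDs), which is well inside the permutation distribution and, if anything, points away from routing. There is no evidence that sponsored polls are directed to specific statisticians within firms; the pattern is consistent with the signer being whoever is available.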
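
A within-firm permutation test of this kind can be sketched with the standard library alone. The note does not define the spread statistic or the tail convention, so the choices here are assumptions for illustration: spread is the population SD of per-statistician sponsor rates within a firm, sponsored flags are shuffled within each firm (holding each signer's poll count fixed), and the p-value is the upper-tail share of permuted means at or above the observed mean. All function names and the toy firm are hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean, pstdev

def firm_spread(polls):
    """polls: list of (statistician, sponsored) pairs for one firm.
    Spread = population SD of per-statistician sponsor rates (assumed definition)."""
    by_stat = defaultdict(list)
    for stat, sponsored in polls:
        by_stat[stat].append(1.0 if sponsored else 0.0)
    return pstdev([mean(v) for v in by_stat.values()])

def mean_spread(firms):
    """firms: {firm_id: [(statistician, sponsored), ...]}."""
    return mean(firm_spread(p) for p in firms.values())

def permutation_test(firms, n_perm=500, seed=0):
    """Shuffle sponsored flags within each firm and recompute the mean spread."""
    rng = random.Random(seed)
    observed = mean_spread(firms)
    null = []
    for _ in range(n_perm):
        shuffled = {}
        for firm, polls in firms.items():
            flags = [s for _, s in polls]
            rng.shuffle(flags)
            shuffled[firm] = [(stat, f) for (stat, _), f in zip(polls, flags)]
        null.append(mean_spread(shuffled))
    # Upper-tail p: share of permuted spreads at least as large as the observed one.
    p = sum(x >= observed for x in null) / n_perm
    return observed, mean(null), p

# Toy firm (invented) with perfect routing: every sponsored poll goes to signer "a".
toy_firms = {"F1": [("a", True)] * 4 + [("b", False)] * 4}
observed, null_mean, p = permutation_test(toy_firms)
```

Under the upper-tail convention, an observed spread below the null mean gives a p-value above 0.5. That matches the direction of the reported p=0.726, although the script may define both the statistic and the tail differently.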

Interpretation

The firm is the slant-control unit. The statistician is a passive signatory whose unconditional β reflects which firms they happened to sign for, not what they did at any one firm.

The user's reading of the null (Henrik, 2026-06-16, while discussing this run): if statisticians knew about and disagreed with the bias, we would expect some to refuse to sign biased polls. The within-firm homogeneity is then evidence that bias is induced at a point the statistician does not see. That is consistent with two distinct mechanisms — neither of which this test can discriminate between:

  1. Bias at a margin outside the statistician's purview. The plano amostral the statistician signs is the declared methodology. The actual fielding — which substrata get over-quotaed, which interviewers ask what, where the door-knocks land, how the post-stratification weights resolve borderline cases — happens at the firm's operational level, often without statistician supervision. This is especially likely for the rent-a-signature signers (LINIANE GAZOLA signing for 39 firms across 19 UFs cannot be supervising any of them). Channel A as executed may diverge from Channel A as declared without the statistician's awareness or consent.
  2. Channel B fabrication after data collection. The statistician signs the registration with the methodology declaration; the published numbers are edited at the commercial / management layer after the data come in. This bypasses the statistician entirely.

Both predict (i) within-firm statistician homogeneity (the FE test), (ii) null within-firm sorting (the permutation test), and (iii) the absence of large declared-design differences between sponsored and unsponsored polls (the AN-024 / AN-033 / AN-041 / AN-042 / AN-043 rule-out series). The §sec:policy "size-mismatch problem" — that documented design levers do not add up to the +7 pp headline — finds a possible explanation here: the slant is induced at margins outside what gets disclosed.

A test that would discriminate

Mechanisms 1 and 2 differ in what they predict about the audit trail (LE.34 §1). Under mechanism 1, the planilhas-individuais audit recovers the actual fielding pattern (if collected) and the slant should be visible against the declared plano amostral. Under mechanism 2, the planilhas back the published numbers (because the published numbers were never the data) and the audit fails to detect anything.

The LE.34 §1 audit right is barely exercised in practice. If our sample of audit cases (Use 2 of the EJ agenda, future work) ever materializes, the conditional distribution of outcomes (audit finds operational deviation vs audit comes up empty) is the test.

Limits

Files