id: an-093
hypothesis: shell-contratante
headline: "Final paper appendix table. Five buckets (major_media reference, small_media, candidate, pollster_self, other_firm) × four FE specs × three accuracy outcomes, all with log_sample_size control. Under race FE (S1): candidate is 2.69 pp more accurate on margin than major-media (p<0.05); other_firm 2.00 pp more accurate (p<0.10). Under firm + race FE (S3): all four non-major-media buckets show 1.8 to 3.8 pp lower margin error than major-media within firm × race, but the major-media reference is thin (n=304, 4.3% of sample), so S3 selection is sharp and the magnitudes should be read with caution. S1 is the more interpretable spec."
type: synthesis-table-paper-ready
question: "Single consolidated appendix table for the paper, with major-media as the cleanest possible reference and sample size controlled."
tags: ["hyp:shell-contratante", "paper-ready", "spec-ladder", "synthesis", "headline"]
status: interpreted
status_date: 2026-06-17
confidence: green
created: 2026-06-17
script: source/analysis/an-093-paper-spec-ladder-final.py
target: build/table/an-093-paper-spec-ladder-final.csv

AN-093: Final paper appendix table

Supersedes AN-090 for paper appendix purposes. Five buckets with major-media as the cleanest possible reference, log_sample_size control in every spec, four FE levels.
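As a reading aid, the four specs can be written as one equation with the fixed effects switched on or off. The notation is an assumption made for exposition and is not taken from the script: $y_i$ is one of the three accuracy outcomes for poll $i$, $r(i)$ is its race, $f(i)$ is its firm, $\mathcal{B}$ is the set of four non-reference buckets, and the base of the logarithm is not stated in this note.

```latex
y_i = \mu
      + \sum_{b \in \mathcal{B}} \beta_b \,\mathbf{1}[\text{bucket}_i = b]
      + \gamma \,\log(\text{sample\_size}_i)
      + \alpha_{r(i)} + \delta_{f(i)} + \varepsilon_i
% S0: neither \alpha nor \delta;  S1: \alpha only;  S2: \delta only;  S3: both.
% major_media is the omitted category, so each \beta_b is a gap relative to it.
% With fixed effects included, the intercept \mu is absorbed by them.
```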

Bucket counts (n = 6,991)

| Bucket | n | % | Description |
|---|---|---|---|
| major_media (reference) | 304 | 4.3 % | GLOBO / FOLHA / ESTADÃO / large regional outlets |
| small_media | 2,644 | 37.8 % | Small digital outlets / blogs |
| candidate | 1,026 | 14.7 % | CPF / committee / party / party-name |
| pollster_self | 1,464 | 20.9 % | Firm self-contracts |
| other_firm | 1,553 | 22.2 % | Third-party / shell-suspect |
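The shares above follow directly from the counts. A small check, using only the numbers in this note, also confirms that the two media buckets add up to the 2,948-poll "media" reference that AN-090 uses:

```python
# Bucket counts copied from the table above.
counts = {
    "major_media": 304,
    "small_media": 2644,
    "candidate": 1026,
    "pollster_self": 1464,
    "other_firm": 1553,
}

total = sum(counts.values())

# Share of each bucket in percent, rounded to one decimal as in the table.
shares = {bucket: round(100 * n / total, 1) for bucket, n in counts.items()}

# major_media + small_media is the broad "media" reference of AN-090.
media_total = counts["major_media"] + counts["small_media"]

for bucket, n in counts.items():
    print(f"{bucket:<14} {n:>5}  {shares[bucket]:>5.1f} %")
print(f"total = {total}, media (major + small) = {media_total}")
```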

Table

Standard errors clustered at the race level in parentheses. * p<0.10, ** p<0.05, *** p<0.01.
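To sanity-check a star against its coefficient and standard error, a rough sketch is to compute a two-sided p-value from the ratio. This assumes a standard normal reference distribution, which is an assumption of the sketch: the script may use a t distribution with cluster-based degrees of freedom, and the table values are rounded, so borderline cases can differ. The helper names `two_sided_p` and `stars` are hypothetical.

```python
import math


def two_sided_p(coef: float, se: float) -> float:
    """Two-sided p-value for coef / se under a standard normal approximation."""
    z = abs(coef / se)
    # P(|Z| > z) for a standard normal equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))


def stars(p: float) -> str:
    """Map a p-value to the star convention used in the tables."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""


# S1 (race FE) margin-error coefficients and SEs, as reported in this note.
s1_margin = {
    "small_media": (-0.14, 0.99),
    "candidate": (-2.69, 1.25),
    "pollster_self": (-0.99, 1.03),
    "other_firm": (-2.00, 1.09),
}

for bucket, (coef, se) in s1_margin.items():
    p = two_sided_p(coef, se)
    print(f"{bucket:<14} {coef:+.2f} ({se:.2f})  p≈{p:.3f} {stars(p)}")
```

Under this approximation the candidate coefficient comes out at two stars and other_firm at one, matching the table.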

Margin error (the headline outcome)

| Bucket (ref = major_media) | S0 No FE | S1 Race FE | S2 Firm FE | S3 Firm + Race FE |
|---|---|---|---|---|
| small_media | −1.46 (2.48) | −0.14 (0.99) | −4.23 (3.33) | −2.23* (1.20) |
| candidate | −2.94 (2.53) | −2.69** (1.25) | −4.07 (3.35) | −3.81** (1.50) |
| pollster_self | −1.85 (2.43) | −0.99 (1.03) | −2.45 (3.29) | −1.82 (1.32) |
| other_firm | −5.08** (2.57) | −2.00* (1.09) | −6.97** (3.38) | −2.94** (1.27) |
| log_sample | −6.45*** (1.45) | −1.31 (0.90) | −7.70*** (1.35) | −0.37 (1.07) |
| n | 6,693 | 5,521 | 6,633 | 5,424 |

Calls winner first

| Bucket (ref = major_media) | S0 No FE | S1 Race FE | S2 Firm FE | S3 Firm + Race FE |
|---|---|---|---|---|
| small_media | −0.012 (0.038) | −0.056 (0.040) | +0.033 (0.054) | −0.002 (0.065) |
| candidate | −0.032 (0.039) | −0.021 (0.048) | +0.069 (0.058) | +0.056 (0.074) |
| pollster_self | −0.036 (0.039) | −0.026 (0.042) | +0.073 (0.058) | +0.045 (0.070) |
| other_firm | −0.056 (0.038) | −0.071* (0.041) | +0.020 (0.053) | +0.008 (0.063) |
| log_sample | −0.061** (0.030) | −0.014 (0.036) | −0.119*** (0.034) | −0.025 (0.041) |
| n | 6,991 | 5,750 | 6,929 | 5,655 |

Mean |error|

| Bucket (ref = major_media) | S0 No FE | S1 Race FE | S2 Firm FE | S3 Firm + Race FE |
|---|---|---|---|---|
| small_media | +1.35*** (0.50) | +0.90** (0.39) | −0.25 (0.66) | −1.03 (0.65) |
| candidate | +2.10*** (0.58) | +1.05* (0.54) | −0.18 (0.70) | −0.84 (0.66) |
| pollster_self | +1.42*** (0.51) | +1.01** (0.46) | +0.05 (0.69) | −0.94 (0.66) |
| other_firm | +1.15** (0.51) | +0.91** (0.44) | −0.49 (0.67) | −0.82 (0.67) |
| log_sample | −0.76** (0.34) | −0.56 (0.45) | −0.28 (0.40) | −0.62 (0.54) |
| n | 6,991 | 5,750 | 6,929 | 5,655 |

Reading

Race FE (S1) — the headline spec

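This is the cleanest interpretable comparison. Relative to major-media polls in the same race, candidate polls have 2.69 pp lower margin error (p<0.05) and other_firm polls 2.00 pp lower (p<0.10), while small_media (−0.14) and pollster_self (−0.99) are statistically indistinguishable from major-media. On mean |error| the sign flips: all four buckets are 0.90 to 1.05 pp less accurate than major-media.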

Firm + Race FE (S3) — the most demanding spec, but interpret carefully

Under S3, every non-major-media bucket has 1.8 to 3.8 pp lower margin error than major-media within firm × race (small_media −2.23*, candidate −3.81**, pollster_self −1.82, other_firm −2.94**); the pollster_self gap is not statistically significant. On mean |error|, all four coefficients turn negative (−0.82 to −1.03, i.e. more accurate than major-media within firm × race), though none is significant.

This counterintuitive direction needs care. Two readings:

The two readings can't be separated in this data. S1 (race FE only) is the more reliably interpretable spec for the paper appendix.
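A rough way to see how thin the major-media reference becomes under S3 is to count how many major_media polls share a firm × race cell with at least one poll from another bucket. The sketch below takes the "within firm × race" phrasing literally. It is not the estimator's exact identification condition under additive firm and race FE, and it is not taken from the script. The records and the helper `overlap_counts` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy records: (firm, race, bucket). Not real data.
toy_polls = [
    ("firm_a", "race_1", "major_media"),
    ("firm_a", "race_1", "candidate"),
    ("firm_a", "race_2", "major_media"),
    ("firm_b", "race_1", "small_media"),
    ("firm_b", "race_1", "other_firm"),
    ("firm_c", "race_3", "major_media"),
    ("firm_c", "race_3", "major_media"),
]


def overlap_counts(records, reference="major_media"):
    """Return (reference polls in mixed firm x race cells, all reference polls)."""
    cells = defaultdict(list)
    for firm, race, bucket in records:
        cells[(firm, race)].append(bucket)

    total_ref = sum(bucket == reference for _, _, bucket in records)
    # A cell is "mixed" if it holds the reference bucket and some other bucket.
    overlapping_ref = sum(
        buckets.count(reference)
        for buckets in cells.values()
        if reference in buckets and any(b != reference for b in buckets)
    )
    return overlapping_ref, total_ref


overlapping_ref, total_ref = overlap_counts(toy_polls)
print(f"{overlapping_ref} of {total_ref} major_media polls sit in a mixed firm x race cell")
```

In the toy data only one of the four major_media polls sits in a mixed cell. Running the same count on the real sample would show how much of the n=304 reference actually supports the within-cell comparison.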

log_sample_size

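Negative on all three outcomes under S0. For margin error and mean |error| that means bigger samples are more accurate cross-sectionally; for calls-winner the negative sign points the other way (bigger samples call the winner less often). Race FE absorbs most of it: the S1 and S3 coefficients shrink and lose significance on every outcome. The sample-size correlation is therefore mostly race-difficulty composition (big samples cover important races where the polls are inherently harder to call). Under S2 (firm FE alone) the coefficient stays significant for margin error (−7.70***) and calls-winner (−0.119***) but not for mean |error| (−0.28), so within firm, larger samples go with lower margin error.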
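The composition argument can be illustrated with a toy example in which sample size has no effect inside any race, yet the pooled slope is strongly negative because large samples cluster in the low-error race. The data are hypothetical and the plain demeaning below is a minimal stand-in for race fixed effects, not the script's estimator.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy polls: (race, log_sample_size, margin_error). Not real data.
toy_polls = [
    ("race_a", 7.0, 2.0),
    ("race_a", 8.0, 2.0),
    ("race_b", 5.0, 8.0),
    ("race_b", 6.0, 8.0),
]


def ols_slope(xs, ys):
    """Slope of a one-regressor OLS fit with an intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def demean_by_group(groups, values):
    """Subtract each group's mean (the within transformation)."""
    by_group = defaultdict(list)
    for g, v in zip(groups, values):
        by_group[g].append(v)
    group_means = {g: mean(vs) for g, vs in by_group.items()}
    return [v - group_means[g] for g, v in zip(groups, values)]


races = [p[0] for p in toy_polls]
log_n = [p[1] for p in toy_polls]
err = [p[2] for p in toy_polls]

pooled_slope = ols_slope(log_n, err)  # analogue of S0
within_slope = ols_slope(                # analogue of S1 (race FE)
    demean_by_group(races, log_n), demean_by_group(races, err)
)

print(f"pooled slope: {pooled_slope:.2f}")
print(f"within-race slope: {within_slope:.2f}")
```

Here the pooled slope is −2.4 while the within-race slope is 0: the whole cross-sectional relationship comes from which races get large samples.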

What the paper should report

  1. Lead with the 4-bucket S1 result from AN-090 in the §Results headline. That's the most familiar framing with the broad "media" reference.
  2. In the §Identification or §Robustness section, present AN-093 (this table) as the appendix. Show that the findings survive (a) major-media as reference, (b) sample-size control. Highlight the small_media coefficient as evidence the media bucket is heterogeneous.
  3. Caveat AN-093 S3 carefully. The thin major-media reference may inflate the S3 magnitudes via selection. The clean interpretation is S1; S3 is a robustness layer.

Tradeoffs vs AN-090

| | AN-090 (4 buckets) | AN-093 (5 buckets) |
|---|---|---|
| Reference | media (all 2,948) | major_media (304) |
| Reference interpretation | mostly small-blog-sponsored | high-rep national outlets |
| Sample-size control | no | yes |
| Cleanest single spec | S1 race FE | S1 race FE |
| Reference cleanliness | weaker | stronger |
| Reference power | strong (n=2,948) | weaker (n=304) |
| S3 reliability | reasonable | suspicious (thin ref) |
| Paper role | headline §Results | appendix §Robustness |

Both have value. Recommend keeping both: AN-090 for §Results, AN-093 for §Appendix. Each speaks to a different audience concern.

Caveats

Artifacts