id: an-102
hypothesis: shell-contratante
headline: "Corrected headline table with 4 sponsor buckets + shell as separate category (per AN-094 audit of 14 probable shells, 229 polls / 620 cand-poll rows in analysis sample). |error| at cand-poll level: shell coefficient is null across all FE specs (+0.65 raw, ns; collapses under any FE). Predicted bias (GBM, OOS) at poll level: shell coefficient also null (~0.00 across specs). The non-identified other_firm tier (other_firm excluding 14 shells) shows persistent positive predicted bias (+0.024 raw p<0.01, +0.013 firm FE p<0.10), suggesting the within-poll fingerprint of shell-style slant lives in the UNAUDITED residual rather than in the top-25-identified shells. Two readings: (i) AN-094 shells are professional operators producing media-pattern polls to evade detection; (ii) 229 polls is underpowered for shell-bucket inference. GBM AUC = 0.673 OOS."
type: headline-corrected
question: "With media as the single reference category and the AN-094 identified shells as a separate bucket, what do the |error| and predicted-bias spec ladders show?"
tags: ["hyp:shell-contratante", "headline", "shell-bucket", "ml", "synthesis"]
status: interpreted
status_date: 2026-06-17
confidence: yellow
created: 2026-06-17
script: source/analysis/an-102-headline-with-shell-bucket.py
target: build/table/an-102-headline-with-shell-bucket.csv

AN-102: Headline tables with shell bucket

User direction (2026-06-17):

Revert the headline to 4 buckets (all media pooled as the reference, with no major/small media split; that split is an appendix robustness check).
Add the AN-094 identified shells (14 CNPJs, 668 polls in total, 229 of them in the analysis sample) as a separate sponsor bucket.

New sponsor classification

Bucket	Polls (analysis sample)	Cand-poll rows	Description
media (reference)	3,140	10,350	All media-sponsored, no split
pollster_self	1,464	5,307	Firm self-contracts
candidate	1,026	2,841	CPF/committee/party-linked
other_firm	851	2,542	Third-party CNPJs, EXCLUDING AN-094 shells
shell	229	620	AN-094 PROBABLE_SHELL CNPJs

The 14 shell CNPJs (from AN-094 of the other session) include VS Publicidade (254 polls in MS+SP), FacUnicamps (80 polls in GO — paper §2 case), ESTACAO I ESTUDIO CRIATIVO (52), G S NEGREIROS (51), ABC PUBLICIDADE (29), TRES MARIAS (27), GLEDSON LOPES SANTIAGO MEI (27), NIVALDO GALINDO (26), RAMON MARGIOLLE MEI (25), SX EMPREENDIMENTOS (24), HYAGO CAVALCANTE MEI (20), JOSE VANDERLUCIO MEI (18), ASSOC MARKETING MG (18), DDD91 LTDA (17). See AN-094 doc for classification evidence per entity.

Table A — |error| at cand-poll level

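Reference category: media. Standard errors (in parentheses) are clustered by race. log_sample enters as a continuous control. Significance: * p<0.10, ** p<0.05, *** p<0.01.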

Bucket	S0 No FE	S1 Race FE	S2 Race + Cand FE	S3 Race + Cand + Firm FE
is_candidate	+0.98*** (0.32)	+0.07 (0.40)	+0.05 (0.39)	−0.30 (0.34)
is_pollster_self	−0.14 (0.22)	−0.23 (0.23)	−0.06 (0.21)	−0.19 (0.25)
is_other_firm	+0.03 (0.28)	−0.32 (0.32)	−0.32 (0.28)	−0.29 (0.28)
is_shell	+0.65 (0.53)	−0.14 (0.55)	+0.06 (0.53)	−0.10 (0.57)
log_sample	−1.75*** (0.30)	−0.66* (0.36)	−0.51 (0.31)	−0.62* (0.35)
n (cand-poll obs)	21,660	21,653	18,205	18,205
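
One way to see why a raw coefficient can collapse once race FE is added (as is_candidate does between S0 and S1) is a toy within-transformation. This is a sketch under stated assumptions, not the AN-102 script: the data are simulated, the sponsor effect is zero by construction, and the helper names `ols_slope` and `demean_by_group` are hypothetical. It only illustrates that if candidate-sponsored polls cluster in harder-to-poll races, the pooled slope is positive while the within-race slope is close to zero.

```python
import random
from collections import defaultdict


def ols_slope(x, y):
    """Slope of a one-regressor OLS fit with intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(values, groups):
    """Subtract each group's mean (the within / fixed-effects transform)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


rng = random.Random(0)
races, is_candidate, abs_error = [], [], []
for race_id in range(400):
    race_difficulty = rng.gauss(0.0, 2.0)  # race-level shift in |error|
    # Assumption: candidate sponsors are more common in hard-to-poll races.
    p_candidate = 0.6 if race_difficulty > 0 else 0.2
    for _ in range(10):
        races.append(race_id)
        is_candidate.append(1.0 if rng.random() < p_candidate else 0.0)
        # No sponsor term here: the true sponsor effect is zero.
        abs_error.append(5.0 + race_difficulty + rng.gauss(0.0, 1.0))

raw = ols_slope(is_candidate, abs_error)
within = ols_slope(
    demean_by_group(is_candidate, races),
    demean_by_group(abs_error, races),
)
print(f"pooled slope (no FE): {raw:+.2f}")
print(f"within-race slope (race FE): {within:+.2f}")
```

The real specs also include the other sponsor dummies, log_sample, and clustered standard errors, none of which this toy reproduces.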

Reading: the shell |error| coefficient is null in every spec. This is the same noise-floor pattern that AN-095 and AN-098 showed for candidate-sponsored polls: the slant does not translate into measurable |error| degradation once candidate FE absorbs the candidate-specific variance. With only 229 shell polls the SE is wide (0.53 in S0; 0.57 in S3), so a true effect of ±0.5 pp would not be detectable.
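
The detectability claim can be made concrete with a standard back-of-the-envelope minimum detectable effect (MDE): under a normal approximation, MDE = (z_{1-α/2} + z_{power}) × SE. The 5% two-sided level and 80% power below are assumptions chosen for illustration, not settings taken from the analysis; the SEs are the shell values from Table A.

```python
from statistics import NormalDist


def minimum_detectable_effect(se, alpha=0.05, power=0.80):
    """Smallest true effect detectable at the given two-sided level and power."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * se


shell_se = {"S0": 0.53, "S3": 0.57}  # is_shell standard errors, Table A
mde = {spec: minimum_detectable_effect(se) for spec, se in shell_se.items()}
for spec, value in mde.items():
    print(f"{spec}: MDE is about {value:.2f} pp")
```

Under these assumptions the MDE is roughly 1.5 pp in S0 and 1.6 pp in S3, about three times the ±0.5 pp effect discussed above.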

Table B — Predicted bias (GBM, OOS) at poll level

The predicted bias is the out-of-sample probability from a GBM fit with 5-fold CV on sponsor-blind features (max_signed_dev, signed_spike, poll_std_dev, mean_abs_dev, skew_dev, max_abs_dev, log_sample), with poll_has_candidate_sponsor as the label. OOS AUC = 0.673.

Bucket	S0 No FE	S1 Race FE	S2 Firm FE	S3 Firm + Race FE
is_candidate	+0.0277*** (0.007)	−0.0107 (0.010)	+0.0134 (0.009)	−0.0200 (0.013)
is_pollster_self	−0.0001 (0.004)	−0.0030 (0.004)	−0.0045 (0.006)	−0.0048 (0.007)
is_other_firm	+0.0242*** (0.007)	+0.0122 (0.008)	+0.0129* (0.008)	+0.0019 (0.010)
is_shell	−0.0054 (0.009)	+0.0056 (0.008)	−0.0043 (0.015)	−0.0045 (0.013)
log_sample	−0.0805*** (0.005)	−0.0460*** (0.009)	−0.0784*** (0.006)	−0.0521*** (0.010)
n (polls)	2,400	2,388	2,331	2,279
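
A minimal sketch of how an out-of-sample predicted-bias score of this kind could be computed. It assumes scikit-learn's `GradientBoostingClassifier` stands in for the GBM; the source does not say which implementation the AN-102 script uses. The feature names and the label come from the text above. The data are simulated and `oos_predicted_bias` is a hypothetical helper.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

FEATURES = [
    "max_signed_dev", "signed_spike", "poll_std_dev", "mean_abs_dev",
    "skew_dev", "max_abs_dev", "log_sample",
]


def oos_predicted_bias(X, y, seed=0):
    """Return each poll's probability from a model that never saw that poll."""
    model = GradientBoostingClassifier(random_state=seed)
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    proba = cross_val_predict(model, X, y, cv=cv, method="predict_proba")
    return proba[:, 1]  # column 1 = P(poll_has_candidate_sponsor = 1)


# Simulated stand-in for the poll-level feature matrix (not real data).
rng = np.random.default_rng(0)
n_polls = 600
X = rng.normal(size=(n_polls, len(FEATURES)))
spike = X[:, FEATURES.index("signed_spike")]
log_sample = X[:, FEATURES.index("log_sample")]
logit = 1.0 * spike - 0.8 * log_sample - 1.5  # arbitrary illustrative signal
y = (rng.random(n_polls) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

predicted_bias = oos_predicted_bias(X, y)
auc = roc_auc_score(y, predicted_bias)
print(f"OOS AUC on simulated data: {auc:.3f}")
```

Because every score comes from a fold in which the poll was held out, the predicted bias can be used as a regression outcome without the model having memorized that poll's label.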

The surprise: shell coefficient is null, but other_firm-residual is positive

The most interesting finding: other_firm (now excluding the 14 audited shells) still shows positive predicted-bias coefficients (+0.024 raw, p<0.01; +0.013 firm FE, p<0.10), while shell is null across every spec.

Two readings, neither verifiable from this analysis alone:

Reading 1, professional operators: AN-094 shells deliberately produce polls that look like media polls (clean within-poll patterns, typical sample sizes) to evade pattern-based detection. The within-poll "slant fingerprint" the classifier learned (a spike for one candidate, others near zero) is absent because professional shell operators distribute their slant across multiple cand-rows or use sampling/methodology tricks instead of crude spikes. The non-audited other_firm population includes less sophisticated operators whose polls still ping the classifier.
Reading 2, power constraint: 229 shell polls is too thin. The shell coefficient SE is 0.008–0.015 in probability points (0.013–0.015 in the firm-FE specs), so the true effect could be +0.01 to +0.02 (similar to other_firm) yet invisible at this sample size. In the FE specs the shell confidence intervals cover both zero and the other_firm coefficient, so the test cannot tell the two apart.
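
The interval comparison can be checked directly from the Table B entries. The sketch below uses a normal approximation (95% CI = β ± 1.96 × SE) and the rounded values as printed, so it is an approximation to whatever intervals the original script produces.

```python
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96

# (coefficient, standard error) for is_shell, copied from Table B
shell = {
    "S0": (-0.0054, 0.009),
    "S1": (0.0056, 0.008),
    "S2": (-0.0043, 0.015),
    "S3": (-0.0045, 0.013),
}
# is_other_firm point estimates, copied from Table B
other_firm = {"S0": 0.0242, "S1": 0.0122, "S2": 0.0129, "S3": 0.0019}

coverage = {}
for spec, (beta, se) in shell.items():
    low, high = beta - Z95 * se, beta + Z95 * se
    coverage[spec] = {
        "covers_zero": low <= 0.0 <= high,
        "covers_other_firm": low <= other_firm[spec] <= high,
    }
    print(f"{spec}: [{low:+.4f}, {high:+.4f}] {coverage[spec]}")
```

With these inputs every shell interval contains zero, and the intervals in S1 to S3 also contain the other_firm point estimate. In S0 the upper bound (about +0.012) falls below the other_firm estimate of +0.0242, so the overlap argument applies to the FE specs rather than to the raw spec.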

The two readings can't be disambiguated here. They have very different policy implications:

Reading 1 → AN-094 shells operate at a sophistication level that defeats AN-100's blind detection, but the identification of them via CNAE / capital / web-presence audit (AN-094) succeeds. Detection requires CNAE-based rather than statistical-pattern-based audit.
Reading 2 → shells DO show the slant pattern, but n is too thin to confirm it. Expanding the AN-094 audit to the top-50 or top-100 sponsors (classified by the same audit method) would tighten the shell-bucket estimates.

Comparison to the pre-shell-bucket version (AN-101)

AN-101 used 5 buckets (major_media / small_media / candidate / pollster_self / other_firm), with the shells INSIDE the other_firm bucket. Other_firm coefficient on predicted bias was +0.0162 (race FE, p<0.05).

AN-102 splits shells OUT of other_firm. Other_firm residual is now +0.0122 (race FE, ns) — closer to zero. Shell is null (+0.0056, ns).

Reading the change: removing the 14 identified shells from other_firm modestly reduces the residual signal (+0.0162 to +0.0122 under race FE). The shells in the original AN-101 other_firm bucket were contributing slightly to the positive predicted-bias coefficient, but the remaining residual is still positive in direction. This is consistent with the shell pattern being spread across many sponsors in the universe rather than concentrated in these 14. The comparison is approximate, because AN-101 also split media into major_media and small_media, so the reference category is not identical.

log_sample dominance

In Table B, log_sample is the strongest predictor in all four specs (β ≈ −0.05 to −0.08, p<0.001). Candidate-sponsored polls run systematically smaller samples than media polls, and the classifier learned this. Because log_sample is itself one of the classifier's features, part of this association is mechanical.

Shells, by contrast, appear to run typical sample sizes, which is part of why the shell predicted-bias coefficient is null. They seem to mimic media-poll sample sizes (250–500 respondents) rather than the smaller samples typical of candidate-paid polls (~360). This has not yet been checked directly (see Follow-ups).

If it holds up, this is itself a finding: shells optimize to blend in. They hide not only the sponsor but also the structural signatures that would otherwise reveal them.

Caveats

AN-094 audit is top-25 only: the 14 shells identified are the most visible. Many less-prolific shells in the tail are still classified as other_firm, so the other_firm residual category in AN-102 still contains an unknown fraction of unaudited shells.
229 polls is statistically thin: with SEs of 0.008 to 0.015, shell coefficients have 95% CI half-widths of roughly ±1.6 to ±2.9 percentage points on the predicted-bias outcome, so many substantive effect sizes would not be detected.
GBM AUC dropped from 0.692 (AN-101) to 0.673. The classifier and its features are the same; only the sample is slightly different, because major_media is no longer a separate label. The drop is attributed to losing the major-vs-small-media split, which reduces discriminative power.
Reading 1 is speculative — "shells are professional" is consistent with the data but cannot be proven from observational analysis.

Follow-ups

Extend AN-094 audit to top-50 or top-100 shells. Would roughly double the shell-bucket n and test whether the null shell-bucket coefficient is power or pattern.
Decompose log_sample within shells: do shell polls actually run media-typical sample sizes, or candidate-typical? If shells run media-sized samples, that's structural evidence for the "professional blend-in" reading.
Cross-pollster pooling within shells: AN-096 (other session) found shells route to 1 pollster predominantly. Could pool shells under each captive pollster and test firm-level signals.
Train classifier on the explicit shell label instead of poll_has_candidate_sponsor. If trained on shell labels, does the classifier identify other_firm tail entries as shell-like? Multi-class label is the natural extension.
Wire to the paper: AN-102 is the corrected headline table. AN-101 (5-bucket) goes to appendix as a media-heterogeneity robustness check.

Artifacts

Script: source/analysis/an-102-headline-with-shell-bucket.py
CSV: build/table/an-102-headline-with-shell-bucket.csv
JSON: build/table/an-102-headline-with-shell-bucket.json

Related

AN-094 (other session) shell audit, source of the shell CNPJ list
AN-095 (other session) VS Publicidade deep dive
AN-096 (other session) bipartite shell-pollster structure
AN-101 (this session) predicted-bias as outcome — superseded as headline
AN-100 sponsor-blind detection
AN-098 noise floor
