AN-105: Theoretical slant signals + within-race demeaning

User's concern (race-proxy leak in AN-104) confirmed. Mode A (7 theoretical slant signals, no race-attention features) gives OOS AUC 0.614 — barely above chance. Mode B (28 AN-104 features demeaned within race) gives 0.693 — basically the AN-101 baseline. The 'genuine within-poll slant signature' detection ceiling is ~0.69 (within-race signal only); AN-104's 0.742 included ~0.05 AUC worth of race-attention proxy leak. Critical change: under within-race demeaning, is_other_firm survives only weakly (+0.006 ns at S3) — the 'shell-like tail signal' in AN-104 was partly race-proxy. Shells light up POSITIVELY under pure theoretical features (+0.016*** at S0 of Mode A) — opposite sign from AN-104's negative — because race-attention features no longer absorb shell variation. Honest §Policy framing: blind detection ceiling is 0.69, well below earlier claims; the genuine within-poll slant fingerprint is weak (~0.61 with theoretical features only).

Hypothesis: H13: Shell-contratante polls show larger residual β
Confidence: green
Type: ml-purification

Script: source/analysis/an-105-theoretical-slant-detection.py
Target: build/table/an-105-theoretical-slant-detection.csv
Status: interpreted · 2026-06-17
Created: 2026-06-17

User's concern (2026-06-17): the AN-104 0.742 AUC might be detecting "race characteristics that attract sponsorship" rather than slant signatures. Two purifications:

Mode A: hand-curated theoretical slant signals only
Mode B: within-race demeaning of the AN-104 feature set

Both purify against race-attention proxy leak.

Mode A: theoretical slant signals (7 features)

Hand-curated features, each with a direct slant interpretation:

Feature	Theoretical interpretation
signed_spike_z	L-shape spike scaled by within-poll noise
l_shape_ratio	max-deviation concentration (max|dev| / mean|dev|)
top1_signed_inflation	how much top cand boosted vs consensus
max_dev_is_top_ranked	is the max-deviation cand the poll's #1?
skew_dev	positive skew = one outlier high (slant signature)
max_pos_dev	magnitude of biggest positive deviation
n_cands_at_zero	count of minor cands at near-zero (honest pattern)
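
A minimal sketch of how six of these features could be computed from one poll's candidate shares. The input layout (`shares`, `consensus`), the near-zero cutoff `zero_tol`, and the exact formulas are assumptions made for illustration; the AN-105 script may define them differently. `signed_spike_z` is left out because its noise scaling is not specified in this note.

```python
import numpy as np

def slant_features(shares, consensus, zero_tol=0.01):
    """Illustrative Mode A features for one poll (hypothetical definitions).

    shares, consensus: per-candidate vote shares in the same order, as fractions.
    zero_tol: assumed cutoff below which a candidate counts as "near zero".
    """
    shares = np.asarray(shares, dtype=float)
    consensus = np.asarray(consensus, dtype=float)
    dev = shares - consensus            # signed deviation per candidate
    abs_dev = np.abs(dev)
    top = int(np.argmax(shares))        # the poll's top-ranked candidate
    mean_abs = abs_dev.mean()
    std = dev.std()
    # Population skewness; set to 0 when the deviations have no spread.
    skew = float(((dev - dev.mean()) ** 3).mean() / std**3) if std > 0 else 0.0
    return {
        "l_shape_ratio": float(abs_dev.max() / mean_abs) if mean_abs > 0 else 0.0,
        "top1_signed_inflation": float(dev[top]),
        "max_dev_is_top_ranked": int(np.argmax(abs_dev) == top),
        "skew_dev": skew,
        "max_pos_dev": float(max(dev.max(), 0.0)),
        "n_cands_at_zero": int((shares <= zero_tol).sum()),
    }

# Toy poll: the leader is 7 pp above consensus, everyone else slightly below.
feats = slant_features(
    shares=[0.45, 0.30, 0.20, 0.05, 0.00],
    consensus=[0.38, 0.32, 0.22, 0.06, 0.02],
)
```

In the toy poll the largest deviation belongs to the top-ranked candidate and the deviation distribution is right-skewed, which is the pattern the table describes as a slant signature.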

Results

Model	AUC	Log-loss	AP
Logistic L2	0.602	0.416	0.198
XGBoost	0.614	0.419	0.201
LightGBM	0.610	0.434	0.198

AUC 0.614 at best (XGBoost), only about 0.11 above the 0.50 chance level. Pure theoretical signals alone are weak detectors. The slant fingerprint exists (the +7 pp slant is precisely identified in AN-096), but it does not show up as a sharply detectable pattern in individual polls.
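
For reference, the three columns of the results table (AUC, log-loss, AP) can be computed from out-of-fold predictions along these lines. This is a sketch on synthetic data, not the AN-105 pipeline: the data-generating process, the choice of `GroupKFold` with whole races held out, and all hyperparameters are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, log_loss, roc_auc_score
from sklearn.model_selection import GroupKFold, cross_val_predict

rng = np.random.default_rng(0)
n_polls, n_features, n_races = 600, 7, 60
X = rng.normal(size=(n_polls, n_features))     # synthetic stand-in features
race = rng.integers(0, n_races, size=n_polls)  # synthetic race ids
# Synthetic labels with a weak dependence on the first feature only.
p_true = 1.0 / (1.0 + np.exp(-(-1.7 + 0.5 * X[:, 0])))
y = rng.binomial(1, p_true)

# Out-of-fold probabilities; all polls of a race fall in the same fold (assumed design).
model = LogisticRegression(C=1.0, max_iter=1000)  # L2 penalty is the default
p_hat = cross_val_predict(
    model, X, y, groups=race, cv=GroupKFold(n_splits=5), method="predict_proba"
)[:, 1]

metrics = {
    "AUC": roc_auc_score(y, p_hat),
    "Log-loss": log_loss(y, p_hat),
    "AP": average_precision_score(y, p_hat),
}
```

AP should be read against the positive base rate, which is what a random ranking achieves in expectation, whereas the chance level for AUC is 0.50 regardless of class balance.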

Mode B: within-race demeaning (28 features)

Take the AN-104 strict-blind feature set. For each feature f: f'(p) = f(p) − mean(f for polls in same race(p))

Forces the classifier to use ONLY within-race variation. Race-level signal is absorbed before training.
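
The demeaning step f'(p) = f(p) − mean(f over polls in race(p)) can be sketched with a pandas group transform. The column names (`race_id`, the two feature columns) and the toy values are hypothetical.

```python
import pandas as pd

def demean_within_race(df, feature_cols, race_col="race_id"):
    """Return a copy of df with each feature centered on its race mean."""
    out = df.copy()
    # transform("mean") broadcasts each race's mean back to its own rows.
    race_means = df.groupby(race_col)[feature_cols].transform("mean")
    out[feature_cols] = df[feature_cols] - race_means
    return out

polls = pd.DataFrame({
    "race_id": ["A", "A", "A", "B", "B"],
    "l_shape_ratio": [2.0, 3.0, 4.0, 10.0, 12.0],
    "max_pos_dev": [0.02, 0.04, 0.06, 0.10, 0.20],
})
demeaned = demean_within_race(polls, ["l_shape_ratio", "max_pos_dev"])
```

After the transform, race B's higher level of `l_shape_ratio` no longer separates it from race A; only each poll's position relative to its own race remains. A race with a single poll becomes all zeros and so contributes no within-race variation.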

Results

Model	AUC	Log-loss	AP
Logistic L2	0.526	0.429	0.164
XGBoost	0.692	0.399	0.277
LightGBM	0.693	0.417	0.276

AUC 0.693 — back to the AN-101 baseline. Within-race demeaning removes ~0.05 AUC of the AN-104 advantage over the basic feature set. That's the race-attention proxy leak.

Detection AUC progression — honest version

Analysis	Features	AUC
AN-100 / AN-101	7 basic within-poll features	0.69
AN-105 Mode A	7 theoretical slant signals	0.614
AN-105 Mode B	28 AN-104 features, race-demeaned	0.693
AN-104 (race-proxy leak present)	28 raw features	0.742
AN-103 (firm-aggregate leak)	+ firm features	0.911

The honest "blind detection ceiling from public data" is AUC 0.69, not 0.74. AN-104's 0.05 advantage was race-proxy signal — popular races have more peer polls, late polls, more cands → those features correlate with sponsor mix.
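
The size of each leak follows directly from the table above, taking the Mode B AUC as the purified reference. A short arithmetic check, using only the AUC values reported in this note (the dictionary keys are labels chosen for illustration):

```python
auc = {
    "AN-105 Mode A": 0.614,
    "AN-105 Mode B": 0.693,
    "AN-104 raw": 0.742,
    "AN-103 firm-augmented": 0.911,
}

reference = auc["AN-105 Mode B"]
# AUC attributable to race-attention proxies in AN-104.
race_proxy_gap = round(auc["AN-104 raw"] - reference, 3)
# AUC attributable to firm-identity proxies in AN-103.
firm_proxy_gap = round(auc["AN-103 firm-augmented"] - reference, 3)

print(race_proxy_gap, firm_proxy_gap)  # 0.049 0.218
```

These are the gaps quoted as roughly 0.05 and 0.22 elsewhere in this note.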

The honest "pure within-poll slant signal" is AUC 0.61. The slant fingerprint exists but is not strongly identifiable in a small interpretable feature set.

Spec ladder — Mode B (within-race demeaned predictions)

This is the methodologically clean version. Reference = media.

Bucket	S0 No FE	S1 Race FE	S2 Firm FE	S3 Firm + Race FE
is_candidate	+0.066***	+0.020**	+0.044***	+0.014*
is_pollster_self	−0.007*	−0.003	+0.006	+0.006
is_other_firm	+0.011**	+0.007	+0.001	+0.006
is_shell	+0.020***	−0.011	+0.001	−0.000
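
A sketch of how a ladder of this shape could be produced, assuming each cell is the OLS coefficient from regressing the classifier's predicted probability on sponsor-bucket dummies (media omitted as the reference) with the indicated fixed effects. The column names (`p_hat`, `race`, `firm`) and the synthetic data are hypothetical, and the sketch returns point estimates only; the significance stars in the table would additionally need standard errors.

```python
import numpy as np
import pandas as pd

BUCKETS = ["is_candidate", "is_pollster_self", "is_other_firm", "is_shell"]

def bucket_coefs(df, fe_cols=()):
    """OLS of p_hat on bucket dummies (media = reference) plus optional FE dummies."""
    parts = [pd.Series(1.0, index=df.index, name="const"), df[BUCKETS].astype(float)]
    for col in fe_cols:
        # One dummy per level, dropping the first to avoid collinearity with const.
        parts.append(pd.get_dummies(df[col], prefix=col, drop_first=True, dtype=float))
    X = pd.concat(parts, axis=1)
    beta, *_ = np.linalg.lstsq(X.to_numpy(), df["p_hat"].to_numpy(), rcond=None)
    return pd.Series(beta[1:1 + len(BUCKETS)], index=BUCKETS)

# Synthetic polls: only candidate-sponsored polls get a higher predicted score.
rng = np.random.default_rng(0)
n = 2000
sponsor = rng.choice(["media", "candidate", "pollster_self", "other_firm", "shell"], size=n)
df = pd.DataFrame({"race": rng.integers(0, 40, size=n), "firm": rng.integers(0, 25, size=n)})
for b in BUCKETS:
    df[b] = (sponsor == b[3:]).astype(int)  # strip the "is_" prefix
df["p_hat"] = 0.15 + 0.05 * df["is_candidate"] + rng.normal(0, 0.05, size=n)

ladder = pd.DataFrame({
    "S0 No FE": bucket_coefs(df),
    "S1 Race FE": bucket_coefs(df, ["race"]),
    "S2 Firm FE": bucket_coefs(df, ["firm"]),
    "S3 Firm + Race FE": bucket_coefs(df, ["firm", "race"]),
})
```

In the synthetic data sponsor type is independent of race and firm, so the coefficients barely move across columns. In the real ladder the movement from S0 to S3 is the quantity of interest: it shows how much of each bucket's raw gap is explained by which races and firms the bucket polls in.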

Changes vs AN-104 (raw 28-feature spec):

is_other_firm at S3: AN-104 +0.013** → AN-105 Mode B +0.006 (ns). The AN-104 "shell-tail signal survives joint FE" finding was partly race-proxy. Under within-race demeaning, the within-firm-within-race other_firm signal weakens substantially.
is_shell flips sign: AN-104 S0 −0.055*** → AN-105 Mode B S0 +0.020***. With race-level variation demeaned out of the features, shells do show up weakly positive on within-poll signals at S0. The coefficient is still null under every FE specification (S1 to S3).
is_candidate survives joint FE at +0.014* (p<0.10). Real but marginal signal.

Spec ladder — Mode A (theoretical signals predictions)

Bucket	S0	S1 Race FE	S2 Firm FE	S3 Firm + Race FE
is_candidate	+0.022***	−0.002	+0.019***	+0.002
is_pollster_self	−0.004	+0.002	+0.003	+0.001
is_other_firm	+0.008**	−0.000	+0.007*	+0.002
is_shell	+0.016***	+0.003	+0.015*	+0.008

Pure theoretical-feature predictions give modest positive coefficients for candidate, other_firm, AND shell at S0 and S2. Under theoretically-justified slant signals alone, shells DO light up weakly positive (+0.016*** at S0). The opposite sign from AN-104 confirms that AN-104's negative shell coefficient was driven by race-attention features absorbing shell variation. With those removed, shells show a small positive slant-signature signal that survives firm FE only marginally (+0.015* at S2) and disappears once race FE enter (+0.003 at S1, +0.008 at S3): the same noise-floor pattern.

Substantive synthesis: what the purifications tell us

The genuine within-poll slant signal is weak at the individual-poll level. Mode A AUC 0.61, Mode B AUC 0.69 — both modest. The slant of +7 pp is real (AN-096) but spread across natural noise in a way that defeats individual-poll classification (AN-098 noise floor).
Race-attention features were doing real work in AN-104. The 0.05 AUC gap between AN-104 raw and AN-105 Mode B is genuine signal — but it's signal about WHICH RACE you're in, not about whether the poll is slanted. For policy purposes that's a distinction with a difference.
The shell signal is unstable across detector regimes. AN-104 raw: shells look LESS sponsored (−0.055***) at S0. AN-105 Mode A (theory): shells look MORE sponsored (+0.016***) at S0. AN-105 Mode B (demeaned): shells look slightly more sponsored (+0.020***). The flip reveals that AN-104's negative shell signal was a race-feature absorption artifact, not a "shells evade detection" effect.
The 'professional evasion' interpretation needs refinement. AN-104's hardened "shells don't ping" story was overstated. Under purified detectors, shells show a small positive but marginal slant signal. The structural mimicry is real but not absolute. Shells are weakly detectable above noise; the AN-094 CNPJ-side audit remains the higher-power identification tool.
is_other_firm signal weakens substantially. AN-104 trumpeted "other_firm tail survives joint FE" but that was partly race-proxy. The genuine within-race signal is +0.006 at S3 (ns): directional, but not statistically distinguishable from zero. Any within-poll fingerprint of unaudited shell-style polls is therefore small at best.

Honest framing for the paper §Policy

Four detection regimes, with corrected AUC numbers:

Regime	Features	AUC	Interpretation
Theoretical signatures alone	7 hand-curated	0.61	Slant fingerprint is real but small in individual polls
Strict-blind within-race	28 demeaned features	0.69	Genuine within-poll detection from public data
Strict-blind raw (AN-104)	28 features	0.74	Inflated 0.05 by race-attention proxy
Firm-augmented (AN-103)	+ firm aggregates	0.91	Inflated 0.22 by firm-identity proxy

The defensible "what public-data blind detection achieves" number for the paper is 0.69, not 0.74. The 0.69 ceiling is tight: it represents the within-race within-poll signal that isn't a race-type proxy. Above-poll-level aggregation (firm-level, AN-103) buys more AUC but at the cost of conflating "identifying slanted polls" with "identifying firms that take candidate work."

The blind detector at AUC 0.69 is still in "fair triage" territory. Useful for flagging polls for closer audit, not for declaring individual polls slanted.
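
To make the triage reading concrete: AUC equals the probability that a randomly chosen positive poll receives a higher score than a randomly chosen negative one, so 0.69 means roughly 69% of such pairs are ordered correctly. The sketch below computes AUC from that pairwise definition on a toy example; the scores and labels are made up for illustration.

```python
def pairwise_auc(scores, labels):
    """Share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy detector output: 3 positive polls, 5 negative polls.
scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.6, 0.1, 0.5]
labels = [1, 1, 0, 0, 0, 1, 0, 0]
auc = pairwise_auc(scores, labels)  # 12 of 15 pairs ordered correctly -> 0.8
```

A ranking with this property is enough to prioritise which polls to audit first, but it says little about any single poll, since a meaningful share of pairs is still misordered.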

Implication for the AN-094 CNPJ-side approach

The fact that purified blind classifiers struggle (AUC 0.61–0.69) reinforces the CNPJ-side audit's necessity, not its sufficiency. Two complementary detection mechanisms:

Blind statistical: 0.61–0.69 AUC, useful for triage
CNPJ-side audit (AN-094): high-precision identification of professional shells; lower coverage but higher confidence

For a §Policy proposal, both should be deployed together. The 0.69 blind detector flags candidates for audit; the CNPJ-side audit (capital social, CNAE, web presence) provides high-confidence shell flagging for the polls the statistical detector cannot distinguish.

Caveats

Mode A's 7 features may be sub-optimal. Hand-curation is not exhaustive. Better-engineered theoretical features could push AUC up modestly (maybe to 0.65). But the qualitative point — theoretical slant signals are weak detectors — holds.
Within-race demeaning loses signal too. Some race-level features (e.g. log_sample shape, days_to_election) carry real information that's also race-correlated. The Mode B AUC of 0.69 is a lower bound; the true "genuine within-poll signal" is probably between 0.69 and 0.74. The exact apportionment between race-proxy and within-poll signal is not fully identifiable.
The shell-coefficient sign flips are striking but follow from the feature-set design, not data artifacts. AN-104's negative was real (driven by sample-size + race-attention features); AN-105's positive is real (driven by theoretical signals). Both are honest views.

Follow-ups

Cross-validate Mode A and Mode B AUCs with cross-cycle data (2020 + 2022) to confirm the 0.61 / 0.69 ceilings.
Better-engineered theoretical features. SHAP analysis of AN-104 LightGBM to find which features are doing the work, then design hand-curated versions of those signals.
A "double-machine-learning" version: predict poll-features from race characteristics first (a separate model), use the residuals as input to the bias classifier. Methodologically cleaner than within-race demeaning for non-linear race effects.
Within-firm demeaning as a parallel test. Removes firm-level signal in the same way Mode B removes race signal.
Update the §Policy framing in paper with the corrected 0.69 ceiling and the "blind statistical + CNPJ-side audit" complementary mechanism story.
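
The "double-machine-learning" follow-up listed above could look roughly like this: fit a flexible model of each poll feature on race characteristics with cross-fitting, and pass the out-of-fold residuals to the bias classifier. This is a sketch under assumptions: the random-forest nuisance model, its hyperparameters, plain `KFold`, and the synthetic data are all illustrative choices, not a specification of the planned analysis.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_predict

def residualize_on_race(features, race_chars, n_splits=5, seed=0):
    """Cross-fitted residuals of each poll feature given race characteristics."""
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    resid = np.empty_like(features, dtype=float)
    for j in range(features.shape[1]):
        nuisance = RandomForestRegressor(
            n_estimators=100, min_samples_leaf=5, random_state=seed
        )
        # Out-of-fold predictions: no poll is predicted by a model that saw it.
        fitted = cross_val_predict(nuisance, race_chars, features[:, j], cv=cv)
        resid[:, j] = features[:, j] - fitted
    return resid

# Synthetic check: feature 0 depends non-linearly on the first race characteristic.
rng = np.random.default_rng(0)
n = 500
race_chars = rng.normal(size=(n, 2))
features = np.column_stack([
    race_chars[:, 0] ** 2 + rng.normal(0, 0.5, size=n),
    rng.normal(size=n),
])
resid = residualize_on_race(features, race_chars)

nonlinear_term = race_chars[:, 0] ** 2
corr_raw = abs(np.corrcoef(features[:, 0], nonlinear_term)[0, 1])
corr_resid = abs(np.corrcoef(resid[:, 0], nonlinear_term)[0, 1])
```

Within-race demeaning removes only a race-specific level shift; the residualized feature here also loses its non-linear dependence on the race characteristic. With real data, where race characteristics are constant within a race, the folds would probably need to hold out whole races (for example with `GroupKFold`) so that a poll's residual is not fitted from other polls in its own race.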

Artifacts

Script: source/analysis/an-105-theoretical-slant-detection.py
Model comparison: build/table/an-105-theoretical-slant-detection.csv
Headline JSON: build/table/an-105-theoretical-slant-detection.json

Related

AN-104 strict-blind detection (race-proxy leak) — corrected here
AN-103 full ML pipeline — firm-leak ceiling
AN-098 noise floor — why individual-poll detection is hard
AN-094 (other session) shell audit — CNPJ-side method
