AN-061: Same-firm × same-race customer-discrimination test

Within firm × race cells where the same firm filed BOTH a candidate-sponsored poll and an independent comparator (24 cells, 242 rows), the sponsor coefficient attenuates to −2.18 pp (SE 2.30, p = 0.345); the 95 % CI of roughly [−6.7, +2.3] pp rules out the +7.4 pp headline on this overlap subsample. The headline itself survives on the universe sample (Spec A: +7.43; Spec B with firm FE: +7.94). The customer-segment discrimination claim of §6 therefore needs refining: the +7 pp does NOT live in within-firm-within-race differentiation. Of 4,510 firm-race cells in the sample, only 263 have a sponsored poll and only 24 have BOTH a sponsored AND an independent poll. Firms keep customer-type allocations almost entirely non-overlapping by race: a firm with a public media client in a race rarely also takes candidate money for that race. The reputational constraint appears to operate by avoiding the within-race comparison rather than by erasing the bias within it. Per-cell descriptive: mean within-cell (sponsored − indep) = +2.40 pp (sd 9.64), paired-t = 1.22 (n.s.), 15/24 (62.5 %) cells positive.

Hypothesis: customer-segment-reputation
Confidence: green
Type: descriptive

Design

Sample: cells defined by (institute × muni_id) within the 24,868-row analysis sample (cand FE-identifiable subset of build/assemble/cand_poll.parquet). Of 4,510 firm-race cells, 263 have a sponsored poll; 2,965 have an independent poll; 24 have BOTH. The 24-cell sample contains 242 candidate-poll rows (30 sponsored) across 73 candidates.
Specification: Four nested specifications. Spec A: error ~ sponsored | cand FE (headline). Spec B: + firm FE (AN-059, identifies off within-firm variation across races). Spec C: error ~ sponsored | cand FE + (firm × race) FE on the 242-row mixed-cell subsample. Spec D: naive OLS on the 242-row subsample with muni-clustered SE. Spec C is the load-bearing test: a firm × race fixed effect absorbs everything constant within a (firm, race) cell, so the sponsor coefficient is identified only from the contrast between sponsored and independent polls filed by the same firm in the same race.
Comparator: independent polls filed by the same firm in the same race
Cluster: muni_id

Script: source/analysis/an-061-firm-race-customer-discrim.py
Target: build/table/an-061-firm-race-customer-discrim.csv
Status: interpreted · 2026-06-14
Created: 2026-06-14

Question

Section 6's synthesis claims "the same firm produces both the unbiased polls it sells to media customers and the tilted polls it sells to candidates." AN-059 supports this by elimination: adding firm FE to the headline cand-FE specification barely moves the sponsor coefficient (+7.43 to +7.94 in Specs A and B here), so firm selection (sponsors hiring high-baseline-error firms) is not the mechanism. But AN-059's "within-firm" identification still uses variation across races within each firm. The most direct customer-segment test would restrict to comparisons WITHIN a single (firm, race) cell.

Design

source/analysis/an-061-firm-race-customer-discrim.py:

Identify (firm × race) cells with both ≥ 1 sponsored poll and ≥ 1 independent comparator. Define firm by institute.
Spec A: headline replicate on full sample (cand FE).
Spec B: full sample, cand FE + firm FE (≡ AN-059).
Spec C: mixed-cell subsample, cand FE + (firm × race) FE.
Spec D: same subsample, naive OLS with muni-clustered SE.
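
What the (firm × race) fixed effect in Spec C does can be illustrated with a within-cell demeaning sketch. This is not the code in an-061-firm-race-customer-discrim.py: it drops the cand FE and the clustered SE, the toy rows are invented, and the function names are hypothetical. It only shows that cell-level shifts and single-type cells contribute nothing to the within-cell estimate, while they do move a naive difference in means.

```python
from collections import defaultdict

def within_cell_slope(rows):
    """OLS slope of error on sponsored after demeaning both within each cell.

    rows: iterable of (cell, sponsored, error), where cell is a
    (firm, race) key and sponsored is 0 or 1 (assumed layout).
    """
    groups = defaultdict(list)
    for cell, sponsored, error in rows:
        groups[cell].append((float(sponsored), float(error)))
    sxy = sxx = 0.0
    for obs in groups.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - mx) * (y - my)
            sxx += (x - mx) ** 2
    # sxx is zero when no cell mixes both poll types: nothing is identified.
    return sxy / sxx

def naive_gap(rows):
    """Pooled mean error of sponsored rows minus that of the other rows."""
    spons = [e for _, s, e in rows if s]
    other = [e for _, s, e in rows if not s]
    return sum(spons) / len(spons) - sum(other) / len(other)

# Invented toy rows: two mixed cells with a within-cell gap of 1.0,
# plus one sponsored-only cell with a high error level.
toy = [
    (("F1", 1), 1, 11.0), (("F1", 1), 0, 10.0),
    (("F2", 2), 1, 4.0), (("F2", 2), 0, 3.0), (("F2", 2), 0, 3.0),
    (("F3", 3), 1, 20.0),
]
within = within_cell_slope(toy)
naive = naive_gap(toy)
```

In the toy data the within-cell slope recovers the gap of 1.0, while the naive gap is pulled upward by the sponsored-only cell.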

Results

Sample sizes

Total firm × race cells in the analysis sample: 4,510
Cells with a sponsored poll: 263
Cells with an independent poll: 2,965
Cells with both: 24 (0.5 % of all cells)

The mixed-cell subsample is structurally small. Firms that file sponsored polls in a race rarely also file independent comparators for that race.
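
The cell counts above can be tabulated along the following lines. This is a minimal sketch, not the project script: the `poll_type` field and its values are hypothetical stand-ins for however the sample encodes customer type. Note that 263 + 2,965 − 24 = 3,204 is less than 4,510, so some cells apparently contain neither a sponsored nor an independent poll; the sketch therefore does not treat "independent" as the complement of "sponsored".

```python
from collections import defaultdict

def classify_cells(rows):
    """Count (institute, muni_id) cells by the poll types they contain.

    rows: iterable of dicts with keys 'institute', 'muni_id' and
    'poll_type' ('candidate', 'independent', or anything else).
    The 'poll_type' field is an assumption made for illustration.
    """
    types_by_cell = defaultdict(set)
    for r in rows:
        types_by_cell[(r["institute"], r["muni_id"])].add(r["poll_type"])
    mixed = {
        cell for cell, types in types_by_cell.items()
        if {"candidate", "independent"} <= types
    }
    counts = {
        "total": len(types_by_cell),
        "sponsored": sum("candidate" in t for t in types_by_cell.values()),
        "independent": sum("independent" in t for t in types_by_cell.values()),
        "both": len(mixed),
    }
    return counts, mixed

# Invented toy rows, not drawn from the analysis sample.
toy = [
    {"institute": "F1", "muni_id": 1, "poll_type": "candidate"},
    {"institute": "F1", "muni_id": 1, "poll_type": "independent"},
    {"institute": "F1", "muni_id": 2, "poll_type": "independent"},
    {"institute": "F2", "muni_id": 1, "poll_type": "candidate"},
    {"institute": "F2", "muni_id": 3, "poll_type": "other"},
]
counts, mixed_cells = classify_cells(toy)
```

The returned `mixed_cells` set is what a Spec C style subsample would filter on.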

Estimates

Spec	β_sponsored (pp)	SE (pp)	p	n
A: full, cand FE	+7.43	1.12	3 × 10⁻¹¹	24,868
B: full, cand FE + firm FE	+7.94	1.15	4 × 10⁻¹²	24,868
C: mixed cells, cand + (firm × race) FE	−2.18	2.30	0.34	242
D: mixed cells, naive	+3.42	1.85	0.06	242

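On the rare cells where the same firm files for both customer types in the same race, the sponsor coefficient is statistically indistinguishable from zero, and its point estimate is negative. The 95 % CI on Spec C is approximately [−6.7, +2.3] pp (−2.18 ± 1.96 × 2.30), which excludes the +7.4 pp headline on this subsample. The naive Spec D estimate is positive (+3.42) but also falls short of conventional significance (p = 0.06).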
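
The interval quoted above follows from the reported point estimate and SE. The sketch below redoes that arithmetic with a normal approximation; the function name is hypothetical, and the note's p = 0.345 presumably comes from a t reference distribution, so the normal-based p-value here is only expected to be close to it.

```python
import math

def wald_summary(beta, se, z_crit=1.96):
    """Return (ci_low, ci_high, two_sided_p) under a normal approximation."""
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return beta - z_crit * se, beta + z_crit * se, p

# Spec C estimate and SE as reported in the table.
ci_low, ci_high, p_value = wald_summary(-2.18, 2.30)

# The headline coefficient lies outside the Spec C interval.
headline_excluded = not (ci_low <= 7.43 <= ci_high)
```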

Per-cell descriptives

Cells with usable means on both sides: 24
Mean within-cell (sponsored − indep): +2.40 pp (sd 9.64)
Paired-t: 1.22 (n.s.)
Cells with sponsored > indep: 15 / 24 (62.5 %)

The point estimate is positive (sponsored polls run hotter than indep polls by the same firm in the same race) but the within-cell spread is large (sd 9.64 across 24 cells); the paired-t doesn't reach significance.
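
The paired-t can be reproduced from the reported summary statistics, assuming the sd of 9.64 is the sample standard deviation of the 24 within-cell differences. The helper name is hypothetical, and the critical value is the standard two-sided 5 % cutoff for 23 degrees of freedom taken from t tables.

```python
import math

def paired_t_from_summary(mean_diff, sd_diff, n):
    """t statistic for H0: mean within-cell difference = 0."""
    return mean_diff / (sd_diff / math.sqrt(n))

t_stat = paired_t_from_summary(2.40, 9.64, 24)

T_CRIT_23DF = 2.069  # two-sided 5 % critical value, 23 df (from t tables)
significant = abs(t_stat) > T_CRIT_23DF
```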

Interpretation

What the result rules out

The customer-discrimination claim as originally written ("the same firm produces both the unbiased polls it sells to media customers and the tilted polls it sells to candidates") needs refining. Within a single firm × race cell, the same firm does NOT produce a sharply differentiated output for the two customer types: the Spec C estimate is −2.18 pp and the per-cell mean difference of +2.40 pp is not significant. The +7 pp does not survive when the comparison is held to within-firm-within-race variation.

Where the +7 pp actually lives

The headline survives with firm FE (Spec B: +7.94) but collapses with firm × race FE on the mixed-cell subsample (Spec C: −2.18). The relevant identifying variation must therefore be:

Across-race within-firm: firms differentiate their output by which customer type they serve in which race. A firm serving a candidate in race X rarely serves a public media client in race X at the same time; it serves media in some other race Y. The customer-segment discrimination is preserved at the firm portfolio level but not within any single race in which the firm is publicly serving multiple sides.

This refines the §6 synthesis. The reputational constraint operates by avoiding the within-race comparison rather than by erasing the bias within it. Firms keep customer-type allocations non-overlapping by race; when overlap does occur (0.5 % of cells), the firm's output looks roughly the same across the two customer types.

Two readings

Selection on the overlap cells. Firms that DO file both types in the same race may be specifically the firms that maintain consistency, because their reputational stakes in the media business require it. Under this reading the 24-cell sample is selected in a direction that attenuates any discrimination effect.
Within-race consistency as a credibility floor. Even firms that systematically tilt for candidates may be unable to sustain that tilt when the firm's other-customer poll in the same race contradicts it. The within-race comparison is the binding credibility constraint that disciplines the firm.

Both readings sharpen the reputation mechanism rather than contradicting it. The substantive policy implication --- that public visibility of pollster output is the constraint, and that mandatory disclosure of each firm's history would amplify the constraint --- survives.

Follow-ups

Universe-scale visibility × sponsor interaction (AN-060) complements this finding with a continuous, race-level test.
Customer-type-by-firm portfolio analysis. Test whether the distribution of customer types across a firm's races is non-random: do firms specialize in candidate races for specific muni types (low-visibility, low-media-coverage)?
Identify the 24 mixed-cell firms. Their willingness to file both types in the same race may itself be a reputational signal that future work could exploit.