Literature
Foundational positioning
Schelling (1960) — "The Strategy of Conflict"
- topic: focal points and coordination
- relevance: foundational idea that law serves as a focal point for judicial coordination
McAdams
- topic: expressive law and focal points
- relevance: develops Schelling's insight for legal settings; this paper aims to go further by explaining how the focal point is created, sustained, and improved
Segal & Spaeth (attitudinal model)
- topic: ideology as the primary driver of judicial decisions
- relevance: our model nests the attitudinal model as a special case when the feasible set is large (underdetermined law)
Epstein & Knight (strategic model)
- topic: judicial strategy and interdependence among judges
- relevance: our model adds holdings as strategic instruments shaping future doctrine
Hart, H.L.A. — "The Concept of Law" / "no vehicles in the park"
- topic: open texture of legal language; core of settled meaning plus penumbra of uncertainty
- relevance: our feasible set of admissible rules corresponds to Hart's penumbra; multiple outcomes can be justified by the letter of prior law; foundational justification for why judges must trade off precedent-fit with other factors (discussed Oct 20)
Dworkin — "Law's Empire" / chain novel theory
- topic: adjudication as continuation of a chain novel; each judge writes the next chapter but must be coherent with what came before; multiple principled interpretations can fit the same past decisions
- relevance: our admissible set is the set of all interpretations that achieve sufficient "fit"; judges choose from it by injecting ideology or simplicity preference, as Dworkin's judge would choose the best substantive justification (discussed Oct 20)
Kennedy, Duncan — "A Critique of Adjudication"
- topic: legal indeterminacy and the work required to deviate from accepted meaning
- relevance: "you need to do work to get away from the accepted meaning; you cannot just cite another judge" (discussed Oct 20)
Formal models of precedent and case law
Gennaioli & Shleifer (JPE 2007)
- topic: evolution of common law
- relevance: postulates law is known but judges can change it at a cost; distinguishing leads to gradual convergence, overruling causes oscillation; our model differs by building on inherent ambiguity of law (feasible set rather than known rule)
Fernandez & Ponzetto (JLEO 2012)
- topic: stare decisis — rhetoric and substance
- relevance: intermediate adherence to precedent leads to gradual legal shifts (not wild swings); aligns with our premise that precedent has binding force yet leaves a range of choice; contrasted with Holger's ambiguity-based approach
Niblett, Posner & Shleifer (2010) — "The Evolution of a Legal Rule," J. Legal Studies
- link: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1665184
- topic: model of case law evolution; judges choose broader or narrower holdings affecting pace of legal change
- relevance: Holger noted our base model is essentially Niblett's; one-dimensional case law evolution with judge decisions and precedent; holding breadth maps to our linear constraint breadth (broad = stronger constraint, narrow = more future discretion)
Cameron, Kornhauser & Parameswaran (2019) — "Stare Decisis and Judicial Log-Rolls," RAND J. Econ.
- topic: how horizontal stare decisis can be sustained among ideologically diverse judges through implicit cross-case trades
- relevance: explains why polarized benches might still follow precedent — it is a mutually beneficial "truce"; judges choose dispositions, not rules — a rule emerges when heterogeneous judges each individually choose the same rule; "the legal literature disagrees about what in the prior decision 'binds' the judge: the disposition, the announced rule, or the articulated reasons"
Baker & Mezzetti (2012) — "A Theory of Rational Jurisprudence," J. Political Economy
- topic: dynamic model of judge-made law under uncertainty; court follows precedent and decides new cases by analogy
- relevance: endogenously produces analogical reasoning as optimal; connects to our idea that judges adhere to precedent as a constraint while exercising discretion in novel fact scenarios
Callander & Clark (2017) — "Precedent and Doctrine in a Complicated World," APSR
- topic: game-theoretic model of judicial learning in multi-dimensional fact space; higher court develops doctrine from limited cases
- relevance: closest competitor/complement — models precedent as progressively shrinking a feasible set in a complex fact space, leading to path-dependent evolution; our model distinguishes itself by formalizing holdings as linear constraints and explicitly accounting for judges' ideological utility
Fox & Vanberg (2014) — "Narrow versus Broad Judicial Decisions," J. Theoretical Politics
- topic: whether judges should craft narrow or broad rulings; challenges "judicial minimalism" by showing broad rules can be optimal under ignorance
- relevance: directly relevant to our citation-incentives / breadth tradeoff; judges influence which cases arise later through holding breadth; connects to endogenous bright-line emergence in EP application
Cameron & Kornhauser (2005) — "Modeling Collegial Courts I & II"
- topic: formal models of how multi-judge courts produce decisions and doctrines; majority coalitions and opinion-writing strategy
- relevance: need to keep a majority can lead to narrower holdings than any judge individually prefers; explains why doctrine usually evolves incrementally; median-driven dynamics reinforce path dependence; regime shifts occur when medians shift
AI & Law: case-space models with explicit outcome/rationale separation
This tradition is the closest formal predecessor to our model. Cases are vectors of factors/dimensions, doctrine is a classifier or decision boundary, and — crucially — the binding content is in the rationale (which factors/dimensions are treated as decisive), not just the outcome label. Our contribution relative to this tradition: continuous geometry (linear constraints on affine rules in R^k) rather than discrete factor orderings; explicit judicial utility with ideology, sanctions, and citation incentives; dynamic feasible-set evolution with overruling.
Ashley & Rissland (HYPO, 1980s–90s); Aleven & Ashley (CATO, 2000s)
- topic: represent cases as vectors of legally salient factors; HYPO builds argumentation about which factors control outcomes; CATO models why a precedent applies or can be distinguished by identifying relevant factors and higher-level reasons
- relevance: doctrine-as-classifier in factor space with explicit "why" beyond labels; closest AI & Law precursor to our holdings-as-constraints idea; CATO's distinction/extension reasoning parallels our feasible-set narrowing/widening
Bench-Capon & Sartor (2003) — theory construction from cases
- topic: reasoning with cases as constructing and evaluating a theory (rules + values) that explains outcomes; cases described by factors/dimensions, doctrinal content is the theory that justifies and generalizes
- relevance: directly separates outcome from holding/rationale; "case outcomes constrain acceptable theories" matches our "holdings constrain admissible rules"; the constructed theory generalizes beyond the instant case, as our linear constraints do
Bench-Capon, Atkinson et al. — "Dimensions" programs
- topic: replace discrete factors with ordered dimensions (more/less of something legally relevant); yields a more geometrically natural boundary; explicitly about which regions of the space are controlled by which precedents
- relevance: closest to our geometric approach within AI & Law; ordered dimensions are a step toward our continuous R^k fact space; "intermediate factors" that matter for explanation even if binding constraint is defined at a lower level
Prakken & Sartor — logical reconstructions of case-based reasoning
- topic: reconstruct HYPO/CATO-style reasoning as generating defeasible rules ("if these pro factors and not these con factors, then outcome"); subsequent cases attack, distinguish, or defeat rules
- relevance: our "holding = linear constraint on rule parameters" is a continuous analogue of "case ⇒ defeasible rule"; subsequent holdings refine which rules remain acceptable; the defeasibility maps to our overruling mechanism
Clark & Lauderdale (2010) — "Locating Supreme Court Opinions in Policy Space"
- topic: locate opinions and disputes in a spatial "doctrine" dimension using citation patterns and treatments of precedent
- relevance: doctrine as structured geometry of positions; the rationale signal comes through how opinions relate to prior precedents (affirm/criticize); provides empirical methods for estimating our model's doctrine space from real data
Branting (1993) — "A Computational Model of Ratio Decidendi," AI & Law 2(1): 1–31; also (1991) "Reasoning with Portions of Precedents," ICAIL; (2003) "A Reduction-Graph Model of Precedent," AI
- topic: explicit computational representation of ratio decidendi as a justification structure (chain/graph of reasoning steps linking abstract predicates and case-specific facts); a precedent's effect depends on the theory/justification under which it was decided, not merely the outcome; "portions of precedents" shows later cases combine selective parts of prior justifications
- relevance: most direct predecessor to our "outcome ≠ holding; holding constrains future rule choice"; our linear-constraint holding is the geometric cousin of Branting's "theory controls precedential effect"; his reduction-graph (multiple interpretive theories per case) maps to our feasible set containing multiple admissible rules consistent with the same outcome
Horty & Bench-Capon (2012) — "A Factor-Based Definition of Precedential Constraint"
- topic: precise formal account of when precedent forces outcomes using factors/reasons pro/con each side; distinguishes minimalist "result-model" constraint from richer "reason-model" constraint; the ratio corresponds to a subset/structure of reasons that must be preserved
- relevance: closest formal cousin to our admissible-set idea in reason-set rather than parameter-set space; "binding content is in the reasons, not just the outcome" parallels our "holding constrains admissible (w,c), not just d"; key difference: they use discrete reason sets, we use continuous linear constraints
Rigoni (2015; 2018) — "An Improved Factor-Based Approach to Precedential Constraint," AI & Law; "Representing Dimensions within the Reason Model," AI & Law
- topic: refines reason-model to address edge cases; extends to ordered dimensions (magnitudes) showing how naive treatments collapse important distinctions; addresses how to keep ratio-level constraint from collapsing into mere outcome matching
- relevance: dimensions with ordered magnitudes are where our hyperplane/threshold geometry is most naturally compared; Rigoni is working on the same "what exactly binds?" problem with different formal objects
Prakken (2021) — "A Formal Analysis of Some Factor- and Precedent-Based Accounts of Precedential Constraint," AI & Law 29(4): 559–585
- topic: systematic comparison of leading formal accounts (result vs reason models, factor vs dimension extensions); highlights when they succeed/fail at capturing ratio vs dicta distinctions
- relevance: useful for positioning our "holding as constraint set" as (i) addressing known collapse/identification issues in the literature and (ii) offering a cleaner geometric object (polytope over rule parameters) than some reason/dimension constructions
van Woerkom, Verheij & Prakken (2023) — "Hierarchical Precedential Constraint," ICAIL '23; Prakken & van Woerkom (2025) — "Defending the Hierarchical Result Models," arXiv
- topic: generalizes precedential constraint to factor hierarchies with multi-step reasoning from base factors to intermediate factors to outcomes; holding can bind at different abstraction levels than the raw outcome
- relevance: directly relevant to our entailment constraint — which projection of the fact space should the holding constrain?; hierarchies formalize the idea that "what binds" may live at a different explanatory level; our affine-rule constraints are implicitly a choice of abstraction level
Underdetermination, admissible sets, and legal indeterminacy
Re (2021) — "Precedent as Permission," Texas Law Review
- topic: reconceptualizes precedent not primarily as constraint but as permission; distinguishes "permissive aspect" (enabling certain outcomes) from "prohibitory aspect" (forbidding departures)
- relevance: directly informs our admissible-set concept; prior cases carve out a range of permissible rulings within which judges exercise choice; our notion that precedent "shrinks" the feasible set without collapsing it to a point is an application of Re's insight
Horty (2004; 2019) — "The Result Model of Precedent," Legal Theory; and related AI & Law work
- topic: uses logic to capture precedent's indeterminacy; a precedent constrains future cases only when new facts are "at least as strong" for the winning side (a fortiori); defines rigorous ordering of case "strength"; hierarchical extensions formalize multi-level factor structures where the explanatory rationale may differ from the minimal constraint needed for correct precedential control
- relevance: the result model's zones of determined outcomes vs. zones of indeterminacy mirror our feasible set approach; past case sets a boundary in fact-space beyond which outcomes are constrained, with freedom elsewhere; provides formal logic counterpart to our geometric framework; the hierarchical extensions connect to our entailment constraint (holding ≠ outcome)
Alexander & Sherwin (2008) — "Demystifying Legal Reasoning"
- topic: underdeterminacy — legal rules and past decisions often do not dictate a unique answer; "constraint by precedent as a spectrum" from heavily restrictive to merely suggestive
- relevance: underscores that binding force is a matter of degree; each new holding tightens the hypothesis space but rarely to a singleton; our intersecting linear constraints formalize this spectrum of partial constraint
Rules versus standards
Kaplow (1992) — "Rules Versus Standards: An Economic Analysis," Duke L.J.
- topic: rules (precisely defined ex ante) vs. standards (open-ended, applied ex post); rules costlier to promulgate but cheaper to apply
- relevance: connects to endogenous bright-line emergence in our model; judges may convert standards into rules when uniformity benefits outweigh flexibility loss; EP application's shift from multi-factor balancing to tiered scrutiny can be understood through this lens
Cross, Jacobi & Tiller (2012) — "A Positive Political Theory of Rules and Standards," U. Ill. L. Rev.
- topic: positive political theory of how courts choose between rules and standards; key factors: ideological alignment, case-fact distributions, institutional heterogeneity
- relevance: in polarized environments, judges issue rigid rules to bind ideological opponents; supports our model's prediction that judges with strong ideology prefer bright-line holdings to minimize dilution by differently inclined colleagues
Lax (2007, 2011) — series on "political constraints on legal doctrine"
- topic: Supreme Court strategically crafts doctrine anticipating lower court application; bright-line rule when fearing noncompliance, standard when trusting lower courts
- relevance: our horizontal model is the single-court analog — a judge chooses broad holdings (rules) to constrain future panels that might be ideologically different; parallels our breadth tradeoff
Schauer (1991) — "Playing by the Rules"
- topic: why legal systems employ rules despite over- and under-inclusiveness; rules cabin discretion and lead to consistency; but strict rules produce hard cases and pressure for exceptions
- relevance: captures the interplay our model generates — judges oscillate between creating rules for clarity and carving out exceptions for fairness, producing doctrinal drift
Judicial coordination and norm enforcement
Kornhauser (1989) — "An Economic Perspective on Stare Decisis," Chi.-Kent L. Rev.
- topic: why judges follow precedent absent external enforcement; coordination value of precedent as a Schelling focal point; precedents create a framework within which judges operate
- relevance: recommended by Holger (Dec 9); reinforces that precedent is a self-enforcing equilibrium constraint allowing strategic maneuvering; our judges' "precedent-fit" objective captures this coordination value
Landes & Posner (1976) — "Legal Precedent: A Theoretical and Empirical Analysis," J. Law & Econ.
- topic: body of precedents as capital stock yielding benefits (predictability) that depreciates over time; judges "invest" by creating clear precedents
- relevance: even self-interested judges have long-term interest in upholding precedent to preserve legal capital; our judges adding constraints invest in clarity; distinguishing/overruling acknowledges depreciation; regime shifts are capital replacement
Bueno de Mesquita & Stephenson (2002) — "Informative Precedent and Intrajudicial Communication," APSR
- topic: how courts use opinions and precedent to communicate information and instructions
- relevance: a judge's choice of bright-line holding vs. ambiguous one is a communication device to peers; connects to our breadth choice as signaling to future panels
Sugaya & Wolitzky (JPE 2018)
- topic: first attempt to investigate whether transparency facilitates collusion
- relevance: private monitoring in cartels; possible connection to judicial monitoring and norm enforcement (discussed Oct 16)
Ideology and judicial behavior (empirical)
Brown & Epstein (2023)
- topic: US Supreme Court voting patterns and presidential loyalty
- relevance: Roberts Court as most "anti-president" court; partisan and loyalty bias; judicial supremacy
Epstein & Posner (2016)
- topic: Supreme Court justices' loyalty to the appointing president
- relevance: original empirical study this project grew out of; loyalty replication now archived
Sunstein, Schkade & Ellman (2004) — "Ideological Voting on Federal Courts of Appeals," Va. L. Rev.
- topic: judges' political ideologies strongly influence outcomes; panel effects (mixed panels moderate, unified panels amplify ideology)
- relevance: motivates our assumption that judges have ideal points they pursue; precedent channels ideology without eliminating it; panel composition changes produce different collective outcomes, which our path dependence captures
Lindquist & Cross (2005) — "Empirically Testing Dworkin's Chain Novel," NYU L. Rev.
- topic: how precedent's constraining force evolves as case law grows; in early cases ideology dominates, then precedent constrains more, but past a threshold more precedent makes it easier for judges to find support for preferred outcomes
- relevance: striking confirmation of our model — constraint peaks then diminishes as the feasible set grows rich enough for judges to selectively cite; too much precedent creates ambiguity; directly supports our notions of doctrine drift and selective distinguishing
Empirical measurement of doctrinal constraint (proxies for admissible-set size)
No existing work measures a literal admissible set, but several traditions approximate it. The model predicts: large admissible set → more disagreement, lower predictability, wider semantic dispersion; small set → convergence, predictability, rigidity. Key proxy families:
- Disagreement-based: dissent rates, vote splits (2-1 vs 3-0), en banc fragmentation. Dissent rate ≈ width of feasible rule set.
- Predictability-based: ML accuracy, classification entropy, outcome entropy conditional on facts. Prediction difficulty ≈ diameter of admissible set.
- Variance decomposition: random judge assignment isolates how much judge identity explains outcomes after controlling for facts/precedent. Large judge effects → doctrine leaves room for choice.
- Citation structure: precedent centrality (PageRank), citation dispersion, negative treatment frequency, citation half-life. Dense reliance on few precedents → narrow set; frequent distinguishing → constraint relaxation.
- Semantic: embedding-based dispersion of opinions on similar issues; tight clusters → narrow set. Drift of key terms ("undue burden," "strict scrutiny") over time.
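A minimal sketch of two of these proxies on toy data (all function names and data are invented for illustration): outcome entropy and dissent rate as stand-ins for admissible-set width.

```python
import math
from collections import Counter

def outcome_entropy(outcomes):
    """Shannon entropy (bits) of case outcomes; higher entropy is read as a
    wider admissible set (doctrine leaves more defensible outcomes)."""
    counts = Counter(outcomes)
    n = len(outcomes)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def dissent_rate(panel_votes):
    """Share of panels with at least one dissenting vote (e.g. 2-1 splits)."""
    return sum(1 for votes in panel_votes if len(set(votes)) > 1) / len(panel_votes)

# Toy data: a settled doctrine vs. an unsettled one.
settled = ["affirm"] * 9 + ["reverse"]
unsettled = ["affirm"] * 5 + ["reverse"] * 5
print(outcome_entropy(settled))    # ~0.47 bits: mostly determinate
print(outcome_entropy(unsettled))  # 1.0 bit: maximal indeterminacy
print(dissent_rate([("A", "A", "A"), ("A", "A", "R"), ("R", "R", "R")]))
```

Real measurement would of course condition on case facts; the point is only that both proxies are cheap to compute from standard docket data.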
Epstein, Landes & Posner (2011) — "Why (and When) Judges Dissent"
- topic: when and why judges dissent; dissents increase with ambiguity and stakes
- relevance: dissent rates as direct proxy for admissible-set width; low dissent → narrow feasible set, high dissent → many defensible outcomes
Sunstein et al. (2006) — "Are Judges Political? An Empirical Analysis of the Federal Judiciary"
- topic: comprehensive empirical study of ideology and panel effects across case types
- relevance: extends the 2004 paper; shows doctrine sometimes leaves room for competing interpretations even on mixed panels; variation across legal domains proxies for variation in admissible-set size
Fowler et al. (2007) — "Network Analysis of the Supreme Court"
- topic: citation network structure; identifies hub precedents using PageRank and authority scores
- relevance: central precedents = binding constraints in our model; citation dispersion indicates doctrinal openness; network structure could proxy for constraint geometry
Random judge assignment studies (Kling; Chen & Yeh; Iaryczower & Shum)
- topic: exploit random assignment of judges to cases to identify causal effect of judge identity on outcomes
- relevance: cleanest identification strategy for admissible-set size — if judge identity explains outcome variance after controlling for case features, the residual = discretion within feasible set; directly tests our prediction that ideology matters more when doctrine is unsettled
Medvedeva, Vols & Wieling (2020) — ECtHR case outcome prediction
- topic: ML prediction of European Court of Human Rights outcomes; prediction difficulty varies across legal domains
- relevance: cross-domain variation in predictability maps to cross-domain variation in admissible-set size; some doctrines far less determinate than others
Livermore & Rockmore (2019) — computational analysis of semantic drift in judicial opinions
- topic: NLP methods tracking how meaning of legal terms and reasoning patterns shift over time
- relevance: semantic drift ≈ movement of feasible set center; dispersion of opinion embeddings ≈ volume of admissible set; tools for measuring our model's doctrinal drift empirically
Path dependence and regime shifts
Hathaway (2001) — "Path Dependence in the Law," Iowa L. Rev.
- topic: three mechanisms of legal path dependence: increasing returns (costlier to deviate from established rules), evolutionary/adaptive (early cases set direction), and sequencing (early contingent events lead law down one of multiple paths)
- relevance: directly mirrors our model — order of fact patterns and ideology of deciding judges lead to different doctrinal endpoints from same legal principles; overruling costs increase as doctrine becomes established; regime shifts occur when accumulated pressures break the path
Spriggs & Hansford (2001) — "Explaining the Overruling of U.S. Supreme Court Precedent," AJPS
- topic: empirical predictors of overruling — ideological distance between current and precedent-setting Court, negative subsequent treatments, doctrinal confusion
- relevance: supports our model's overruling mechanism; distinguishing erodes precedent force until formal overruling is almost a formality; their thresholds for overruling inform our parameter choices
Holdings: definition and scope
Abramowicz & Stearns (2005) — "Defining Dicta," 57 Stan. L. Rev. 953
- topic: rigorous definition of holding vs. dicta; holding = propositions along the chosen reasoning path that are actually decided, based on case facts, leading to the judgment
- relevance: justifies our treatment of holdings as extractable constraints; the breadth of a holding is partly in the opinion author's hands; later courts may dispute it by calling statements dicta; our distinguishing mechanism captures this contestability
Prediction and model selection
Katz, Bommarito & Blackman (2017) — "A General Approach for Predicting the Behavior of the Supreme Court," PLOS ONE
- topic: ML prediction of Supreme Court outcomes (~70% accuracy); features include issue area, lower court, ideology scores
- relevance: treats judging as a predictability problem, resonating with our hypothesis-class view of law; 70% success shows much is law-governed while 30% error leaves room for ideology and novel facts; unpredictable cases are those where feasible set is broad; judges in our model perform a kind of online learning
Ruger et al. (2004) — "The Supreme Court Forecasting Project," Columbia L. Rev.
- topic: statistical model slightly outperformed legal experts in predicting SCOTUS outcomes
- relevance: validates that precedent defines the playing field (it is predictive because judges follow it); each precedent adds information / reduces entropy about future cases; our linear constraints are an information measure removing uncertainty
Slippery slopes and doctrinal drift
Volokh (2003) — "Mechanisms of the Slippery Slope," Harv. L. Rev.
- topic: categorizes slippery slopes including precedent type (initial decision used to justify broader rules) and attitude type (attitudes shift after initial step, lowering resistance)
- relevance: our model formally captures the precedent slippery slope — each holding becomes a constraint that can be extended; judges push slightly further next time, distinguishing the prior holding's limit; over time the threshold moves while each step cites continuity
Formal analogues from learning theory and optimization
The model's core structure — a convex feasible set of admissible rules that shrinks via intersection with linear constraints as new observations arrive — has near-exact parallels in several literatures. These provide ready-made tools for comparative statics (how fast does doctrine rigidify? why do early precedents matter disproportionately?) and formal positioning for econ audiences.
Mitchell (1977, 1982) — Version Spaces / Candidate Elimination Algorithm
- citation: Mitchell, T. (1977) "Version Spaces: A Candidate Elimination Approach to Rule Learning," PhD Thesis, Stanford; (1982) "Generalization as Search," Artificial Intelligence
- topic: learner maintains set of hypotheses consistent with all labeled examples; each new example eliminates inconsistent hypotheses; when hypotheses are linear classifiers, version space is a polytope defined by linear inequalities
- relevance: closest mathematical analogue — precedents = labeled examples, admissible legal rules = version space, new holdings = constraints shrinking the space; monotone shrinkage via intersection is identical to our F_{t+1} = F_t ∩ H_t; candidate elimination maintains upper/lower bounds, paralleling how doctrine narrows from both sides
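A toy candidate-elimination run, assuming one-dimensional threshold rules (the setup and names are invented for illustration): precedents act as labeled examples, and each one intersects the feasible set exactly as in F_{t+1} = F_t ∩ H_t.

```python
# Each candidate "rule" is a threshold t: decide "liable" iff x >= t.
# A precedent (x, liable) eliminates every threshold inconsistent with it,
# i.e. the feasible set shrinks monotonically by intersection.
def candidate_elimination(precedents, grid):
    feasible = set(grid)
    for x, liable in precedents:
        if liable:
            feasible = {t for t in feasible if x >= t}   # rule must capture x
        else:
            feasible = {t for t in feasible if x < t}    # rule must exclude x
    return feasible

grid = [i / 10 for i in range(11)]                 # candidate thresholds 0.0 .. 1.0
precedents = [(0.8, True), (0.2, False), (0.6, True)]
print(sorted(candidate_elimination(precedents, grid)))
```

With hypotheses this simple the surviving set is an interval; with affine rules in R^k the same intersection logic yields the polytope geometry the model works with.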
Kelley (1960) — Cutting Plane Methods
- citation: Kelley, J. (1960) "The Cutting-Plane Method for Solving Convex Programs," J. SIAM
- topic: iteratively add linear constraints ("cuts") that exclude infeasible regions; feasible set = shrinking polytope
- relevance: near one-to-one geometric match — each precedent = hyperplane cut in rule space, doctrine = intersection of cuts; provides convergence results and tools for analyzing how quickly the feasible set tightens
Vapnik (1995) — Statistical Learning Theory / VC Dimension
- citation: Vapnik, V. (1995) "The Nature of Statistical Learning Theory," Springer
- topic: selecting functions from constrained hypothesis classes based on consistency + complexity penalties; VC dimension measures capacity of hypothesis class
- relevance: mirrors our judges' tradeoff — fit precedent (training error), prefer simplicity (capacity control), ideology (bias); VC dimension could formalize "how much doctrine can a given constraint language express"; structural risk minimization = optimal holding breadth
Set-Membership Estimation (Schweppe 1968; Milanese & Vicino 1991)
- topic: unknown parameter vector belongs to feasible polytope; new observations impose linear constraints shrinking the polytope; used in robust control and system identification
- relevance: direct structural analogue — unknown legal rule ≈ unknown parameter, holdings ≈ observations, admissible doctrines ≈ feasible parameter set; provides well-developed tools for analyzing polytope diameter, volume, and convergence rate; novel connection for law & econ readers
Convex body shrinkage / Center-of-Gravity methods (Grünbaum 1960; Bertsimas & Vempala 2004)
- topic: each new hyperplane constraint reduces feasible volume by a constant fraction; used in active learning and online optimization
- relevance: provides quantitative tools for binding force = volume reduction rate, why early precedents matter disproportionately (largest volume removed first), how overruling resets feasible volume; could give formal comparative statics on doctrinal rigidity over time
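A Monte Carlo sketch of the volume-reduction intuition (setup invented for illustration): the feasible set is a point cloud in the unit square, each "holding" is a halfplane cut through the current centroid, and the surviving volume shrinks by a roughly constant fraction per cut, so the earliest precedents remove the most volume in absolute terms.

```python
import random
random.seed(0)

# Feasible set = sample points in the unit square; each holding = a
# halfplane cut through the current centroid (a Gruenbaum-style cut).
N = 20000
points = [(random.random(), random.random()) for _ in range(N)]
volumes = [1.0]
for _ in range(5):
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    a, b = random.uniform(-1, 1), random.uniform(-1, 1)  # random cut direction
    points = [(x, y) for x, y in points if a * (x - cx) + b * (y - cy) <= 0]
    volumes.append(len(points) / N)

print([round(v, 3) for v in volumes])  # strictly decreasing feasible volume
```

Grünbaum's theorem bounds the fraction retained by each centroid cut between 1/e and 1 − 1/e, which is what makes "binding force = volume reduction rate" a well-behaved comparative-statics object.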
Passive-Aggressive Algorithms (Crammer et al. 2006)
- citation: Crammer, K. et al. (2006) "Online Passive-Aggressive Algorithms," JMLR
- topic: minimal parameter update sufficient to satisfy new constraint; update magnitude proportional to constraint violation
- relevance: "minimal update consistent with new data" parallels judicial minimalism — judges minimally adjust doctrine to accommodate new precedent; connects to our entailment constraint (holdings must be outcome-entailed, not more)
List & Pettit (2002+) — Judgment Aggregation
- topic: logical consistency constraints define admissible collective judgments; shows that majority voting can produce inconsistent collective judgment sets even from individually consistent ones (discursive dilemma)
- relevance: multi-judge courts selecting doctrines consistent with prior commitments; feasible judgment sets as constraints on collective legal reasoning; potential formal tool for extending our model to panel decision-making
Explicit bridges: learning theory applied to legal doctrine
These papers explicitly connect version spaces / learning theory to legal reasoning — not NLP prediction, but doctrine-as-learned-structure. The Rissland line (1986–2003) is the earliest prior art for our feasible-set framing. The newer papers (Hartline 2022, Dutz 2025) are the closest modern competitors. Key finding: no existing work uses continuous geometry (polytopes, cutting planes, linear constraints in R^k) for doctrine. The connection was noted using discrete/symbolic version spaces, but never developed with our affine-rule framework, judicial utility, or equilibrium analysis.
Rissland & Collins (1986) — "The Law as a Learning System"
- topic: treats doctrinal development as a learning process; explicitly analyzes sequences of cases as training examples for Mitchell-style candidate elimination / version spaces; uses concrete doctrinal areas to illustrate how concepts/rules evolve
- relevance: earliest explicit "common law as version-space learning" framing; must cite as conceptual predecessor; our model develops the geometric structure they sketch — continuous affine rules instead of discrete concepts, judicial utility instead of passive learning
Rissland (1990) — "Stepping Stones to a Model of Legal Reasoning"
- topic: foundational AI & Law piece connecting legal reasoning to learning and representation; explicitly references candidate elimination as part of how legal concepts are refined from examples
- relevance: strengthens the Rissland lineage; "how representation + learning interact in legal reasoning" is essentially our question, formalized differently
Ashley & Rissland (2003) — "Law, Learning and Representation," Artificial Intelligence
- topic: explicitly discusses legal learning/CBR in version-space terms; ties doctrinal change to search/refinement in a space of concepts
- relevance: cleanest bridge text in mainstream AI journal form; "doctrine as a learned structure constrained by cases, not merely a classifier trained for accuracy" — this is exactly our framing; key distinction: they work with symbolic factor-based representations, we work with continuous linear constraints
Hartline, Linna, Shan & Tang (2022) — "Algorithmic Learning Foundations for Common Law," arXiv / CSLaw
- topic: models common-law system explicitly as an online learning problem; information arrives through litigated cases; settlement prevents information revelation; focuses on whether/when the system "learns" efficiently given litigation incentives
- relevance: rare modern formal "common law as learning algorithm" paper that engages institutional features (costs, case selection); complements our model by analyzing information aggregation where we analyze constraint accumulation; does not model holdings separately from outcomes
Dutz, Shao, Blum & Cohen (2025) — "A Machine Learning Theory Perspective on Strategic Litigation," arXiv
- topic: cases as points in instance space; high court labels; lower court learns rule from precedent; strategic litigants "teach" by selecting cases to shape the learned rule; discusses overturning
- relevance: very close in spirit — "holdings as instruments" formalized with learning-theory primitives; their strategic case selection maps to our citation incentives; key difference: they model the litigant-side teaching problem, we model the judge-side constraint problem with ideology and sanctions
Psychological foundations
DeKay (2015)
- topic: predecisional information distortion and self-fulfilling prophecy of early preferences in choice
- relevance: psychological foundation for own-side bias in legal interpretation (cited in Holger's project description)
Russo (2015)
- topic: predecisional distortion of information
- relevance: psychological foundation for biased signal interpretation by judges (cited in Holger's project description)