Thinking and Open Ideas

Current open questions

Possible directions

Connections to literature

Methodological sketches

Importable mathematical frameworks

Several existing frameworks provide machinery we could use directly for formal results, rather than building from scratch.

Convex body shrinkage (Grünbaum 1960; Bertsimas & Vempala 2004)

VC dimension for affine classifiers (Vapnik 1995)

Niblett, Posner & Shleifer (2010) as the $k=1$ base case

Cutting-plane convergence (Kelley 1960)

Callander & Clark (2017) as direct predecessor to extend

Set-membership estimation (Schweppe 1968; Milanese & Vicino 1991)

Assessment: the most tractable path is probably to build on the Niblett / Callander & Clark tradition (economists recognize the setup), then import convex-geometry and VC-dimension results for comparative statics essentially for free. The version-space / cutting-plane connection gives us formal language for propositions without our having to prove the underlying geometry ourselves.
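To see what the version-space machinery buys us, here is a minimal sketch of the Niblett $k=1$ base case, assuming one-dimensional facts in $[0,1]$ and a monotone grant rule; the precedent data are purely illustrative.

```python
# Version-space sketch for one-dimensional threshold rules (the k=1 base case).
# A "legal theory" is a cutoff t: grant iff x >= t. Each precedent (x, outcome)
# eliminates the cutoffs inconsistent with it, shrinking the admissible set --
# the same logic the cutting-plane / convex-shrinkage results formalize.

def admissible_interval(precedents):
    """Return (lo, hi): the cutoffs t consistent with every precedent.

    A grant at x rules out t > x; a deny at x rules out t <= x.
    """
    lo, hi = 0.0, 1.0  # prior: the cutoff lies somewhere in [0, 1]
    for x, outcome in precedents:
        if outcome == "grant":
            hi = min(hi, x)   # cutoff cannot exceed a granted fact value
        else:
            lo = max(lo, x)   # cutoff must exceed a denied fact value
    return lo, hi

precedents = [(0.8, "grant"), (0.2, "deny"), (0.6, "grant"), (0.3, "deny")]
print(admissible_interval(precedents))  # shrinks to (0.3, 0.6)
```

Each new case either leaves the interval unchanged or cuts it down, which is the one-dimensional shadow of the convex-body shrinkage results.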

Ideas to explore later

Miscellaneous notes

Formalizing holdings

Monotonicity idea

Monotonicity can be seen as one way of formalizing "holding": all cases beyond this threshold must be "grant." This captures the intuition that a holding draws a line in fact space and determines outcomes on one side of it. If the holding is monotone in the relevant dimensions, it means that making the facts "stronger" (moving further along the dimension) cannot flip the outcome.

Neighborhood / decaying binding force (Holger idea)

Instead of a holding binding globally, assume that cases bind only in a neighborhood of their fact pattern, or that binding force decays with distance in fact space. This captures the intuition that a precedent is most constraining for similar cases and loses force as new cases become more factually distant. Could be formalized as a kernel or distance-weighted constraint.

Norm enforcement among judges

Kandori-style enforcement

For judges, the "viral tweet moment" (where some deviation from the law is made public) could serve as the Kandori marking device. Judges might shy away from deviation to avoid being associated with it once that moment comes. Or they can simply claim to be shocked (like the colleagues of the Stavanger lawyer). If you see something, you are obliged to report it.

Low cost of enforcement

When you decide who to invite for dinner, it is not costly to just invite someone else if you think one person is "tainted." So norm enforcement is not necessarily costly — the cost of excluding a norm-violator from social/professional circles is low for any individual enforcer.

Socialization into law

Slippery slope connection

Jan 12 conceptual outline (11 points)

  1. Fact patterns are highly multidimensional, and each past case is just one mapping from fact space to a decision. As a consequence, there will always be multiple legal theories / models that fit past cases perfectly. In and of itself, "following precedent" is therefore not binding, since you never see a new case with exactly the same facts.
  2. For precedents to have binding effects they must also provide guidance on how to extrapolate from that case — what really are the facts relevant for the decision (the "holding"). A theory of the binding force of law must include a definition of "holding."
  3. Norm to follow past decisions (for good reasons — legal stability + judges should "just apply the law").
  4. The law must have (and has) a way of dealing with errors (overruling and distinguishing).
  5. Reasonable judges can disagree about which "model" fits the data best.
  6. Cases (and doctrine) have summary statements (or "restatements") about what the law is — a simplified model that fits past cases reasonably well. One must also allow for such statements to be "erroneous."
  7. Judges have preferences among competing models, but also want to make sure it looks like they are "just applying the law." So they are constrained to pick among models that fit the data.
  8. If forces in society are relatively equal, could get a situation where two competing theories coexist.
  9. Hierarchical stare decisis is not our interest — reversals etc. provide incentives to lower-court judges to follow precedents.
  10. Courts could in theory make the law crystal clear. But there is a tradeoff between efficiency and clarity. Rules vs. standards discussion. Many examples of SCOTUS coming up with bright-line rules when standards have failed (e.g., Miranda). But rules are brittle, gameable, non-adaptive, and usually also statically inefficient.
  11. Judges can always engage in fact discretion, but that won't change the law.

Bright-line rules as self-binding

The Supreme Court is acutely aware that lower-court judges differ ideologically, appellate review is imperfect, standards invite discretion, and discretion invites ideological drift.

Bright-line rules reduce interpretive degrees of freedom, make deviations observable, increase reputational and reversal costs, and create focal points for coordination across courts. This is especially important in politically salient domains, constitutional rights, and criminal procedure.

The judge-constraint logic dominates at SCOTUS because:

Why SCOTUS rarely says this explicitly: Openly saying "we adopt this rule to restrain judges" would undermine judicial legitimacy, admit indeterminacy, and invite political attack. So the Court talks about administrability, predictability, fair notice, ease of application — public-facing proxies for judge-constraint.

Synthesis for the project: The Supreme Court adopts bright-line rules when the coordination problem among judges is more severe than the information loss from rigidity. Or more starkly: bright-line rules are a technology for disciplining adjudicators under conditions of disagreement and limited monitoring.

Litigation underenforcement (secondary mechanism)

Standards increase uncertainty → uncertainty raises expected litigation cost → risk-averse plaintiffs don't file → violations go unchallenged → law becomes underenforced. Bright-line rules reverse this chain. But the Court tolerates underenforcement more readily than ideological drift, so this mechanism is secondary.

Interaction between the two mechanisms

Judge constraint → predictable outcomes → more litigation → more data → clearer law. The mechanisms reinforce each other but are not symmetric.

Higher courts and strategic bright-line rules

Higher-court judges can have strategic incentives to announce bright-line rules partly to constrain lower-court judges whose ideological priors they don't trust. When higher courts anticipate noncompliance or "slippage" by ideologically distant lower courts, they have reason to write more constraining doctrine.

What prevents overuse: Error costs and injustice at the margin (bright lines can be badly over/underinclusive); unanticipated future fact patterns (rules age poorly); need to assemble and maintain a majority (broad hard rules can lose swing votes); legitimacy concerns (looking like "legislating"); diminishing returns (even bright lines leave room for disagreement about classification and framing).

Vagueness as design choice / option value

Vagueness is often a feature that preserves equilibrium, not a failure of language or logic.

Law is vague when: coordination is stable without precision; local information matters; adaptation is valuable; monitoring is costly but tolerable.

Law becomes crisp when: coordination is fragile; ex ante guidance is critical; actors have incentives to defect; errors are catastrophic or systematic.

This contrast explains why Miranda is a crisp rule while negligence remains a vague standard.

Important caveat: Because the world is open-ended, no finite rule system can anticipate all future factual configurations without either becoming infinitely complex or reintroducing discretion via catch-alls. "Crystal clear in all situations" is achievable only relative to a given state of the world. But that's a practical limitation, not an inherent indeterminacy of law.

Framing thoughts

Alternative introduction framings (model-selection / hypothesis-class)

Framing 1: Full model-selection version

"The law" as a hypothesis class of decision rules over a multidimensional case space. Past decisions populate this space with noisy data. An admissible legal theory is any decision rule that fits the core of the precedent set sufficiently well. The threshold of admissibility is determined by professional norms and the risk of sanctions. A judge deciding a new case selects a theory from this admissible set, trading off: (i) fidelity to precedent (adequate fit), (ii) preference for simple, coherent functional forms (bright-line rules, linear thresholds), and (iii) substantive outcome preferences. Past precedents that don't fit are rationalized as "errors," distinguished on narrow grounds, or explicitly overruled. Landmark reversals (like the treatment of Roe) are episodes where a new majority reclassifies a cluster of precedents as erroneous data points and re-fits the underlying model.

Three central implications

  1. Reconciles strong constraint with persistent polarization. Law constrains by shrinking the admissible set. Yet within the admissible region, ideology matters. Disagreement is most intense exactly where the law is most underdetermined.

  2. Explains observed simplicity of legal doctrine. Courts prefer low-dimensional, easily communicable decision boundaries (Hand-rule style linear tradeoffs, categorical thresholds) even when more complex mappings would better fit past cases. Simplicity preferences are a central determinant of which theories are selected.

  3. Generates path dependence and regime shifts. Early cases and random judge assignments push the legal system toward one region of model space. Composition changes trigger abrupt doctrinal shifts, as a new majority selects a different admissible theory and recodes existing outlier cases as mistakes.

Framing 2: More modest version

We do not claim to discover a new philosophy of adjudication. Jurisprudence has long emphasized that legal materials underdetermine outcomes, that interpretation trades off "fit" with justification and coherence, and that precedent is often permissive rather than strictly mandatory. Our contribution is to formalize these familiar ideas in a simple model-selection framework and connect them to observable data. Activities that loom large in legal practice — stating general rules (Hand formula), systematizing doctrine, writing restatements — are not mere rhetoric. They are attempts to freeze particular models of the case law into explicit decision rules that narrow the set of admissible interpretations for future judges.

Why EP doctrine is not settled (model-consistent explanation)

Despite decades of litigation and canonical tests, EP doctrine is not settled because precedent constrains the form of adjudication rather than its substance.

Key mechanisms:

  1. Doctrine settles form, not content. The hypothesis class is fixed; the optimal hypothesis is not. Tiers, tests, and named elements are settled. But what counts as compelling, how much fit is enough, and which factual dimensions deserve weight — these are not.
  2. Dense but underdetermining in high-dimensional fact space. EP cases vary along dozens of dimensions (facial vs functional classification, type/degree of stigma, nature of government interest, availability of alternatives, institutional setting, political/historical context). Even hundreds of cases don't densely cover this space.
  3. Tests are elastic by design. "Compelling," "important," "substantially related," "narrowly tailored," "discriminatory purpose" — these are containers that absorb disagreement while preserving the appearance of constraint. This allows adaptation, communicability, and legitimacy across ideological coalitions.
  4. New cases keep introducing new combinations of old dimensions. Race + technology, sex + athletics, sexual orientation + religious exemptions. Precedent rarely resolves how dimensions interact.
  5. EP is a site of ongoing moral and political disagreement. Persistent disagreement is expected where law is used to mediate unresolved normative conflict.
  6. The admissibility threshold prevents convergence. Judges must justify decisions as lawful, but many justifications remain plausible. This creates bounded disagreement rather than convergence.
  7. Composition changes reset which models are focal. A new majority reweights dimensions, reclassifies prior cases as "misapplied" rather than wrong, producing regime shifts without doctrinal collapse.

Summary: Despite decades of litigation, EP doctrine is not settled because precedent constrains the form of adjudication rather than its substance. Courts have fixed the hypothesis class — tiers of scrutiny, required elements, permissible evidentiary dimensions — but not the weights placed on those dimensions or their interaction in novel factual settings. In a high-dimensional fact space, even dense precedent leaves many admissible legal theories.

Model implications (5 big-picture)

  1. Law both constrains and underdetermines. Judges can't pick any rule; they must choose from models that fit precedent "well enough." But within the admissible set, preferences (ideological, policy, distributive) matter.

  2. Disagreement is endogenous and lawful. Liberal and conservative judges can both be "following the law" in a meaningful sense: they pick different admissible models rather than acting out of pure willfulness. Polarization is strongest where the admissible set is large; consensus emerges where it is small.

  3. Precedent is partly data, partly noise. Some precedents are treated as informative constraints; others as "error" or anomalies. Overruling/distinguishing is literally "data cleaning" in model space.

  4. Doctrinal shape is partly a simplicity prior. Courts prefer simple, stable rules (linear thresholds, bright lines) as long as they fit. When enough anomalies accumulate, they switch to a new, often still simple but differently oriented model.

  5. Path dependence and regime shifts. Early cases and random judge assignment can lock in a region of model space. Composition shocks (new majority) trigger a re-fit that reclassifies earlier cases as errors, causing abrupt doctrinal shifts.

Testable implications

A. Local underdetermination and polarization

Hypothesis: Cases in parts of the feature space where many models fit past decisions equally well will exhibit more frequent ideological splits, more dispersion across judges, and more instability over time.

How to test: Train multiple predictive models on past decisions (different algorithms, regularization, random seeds, feature subsets). For each new case, compute variation in predicted probabilities across models = measure of local underdetermination. Regress indicator for ideological split / vote margin on local model disagreement, controlling for salience, issue area. Prediction: higher model disagreement → higher probability of 5-4 along ideological lines.
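A stdlib stand-in for this pipeline, as a sketch: each "model" is a one-dimensional threshold rule fit to a bootstrap resample of past cases, and cross-model prediction dispersion is the per-case underdetermination score. The data, the midpoint fitting rule, and the bootstrap scheme are all illustrative substitutes for the "different algorithms, seeds, feature subsets" idea.

```python
import random
import statistics

# Sketch of the local-underdetermination measure from test A. Each "model"
# is a 1-D threshold rule fit to a bootstrap resample of past decisions;
# the variance of predictions across models measures how underdetermined
# the law is at a given fact pattern. All data are synthetic illustrations.

def fit_threshold(cases):
    """Midpoint rule for separable data: cutoff between max deny and min grant."""
    denies = [x for x, y in cases if y == 0]
    grants = [x for x, y in cases if y == 1]
    hi = max(denies, default=0.0)
    lo = min(grants, default=1.0)
    return (hi + lo) / 2

def disagreement(cases, x_new, n_models=200, seed=0):
    """Variance of grant predictions across bootstrap-fitted models."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_models):
        boot = [rng.choice(cases) for _ in cases]
        preds.append(1 if x_new >= fit_threshold(boot) else 0)
    return statistics.pvariance(preds)  # 0 = unanimity, 0.25 = maximal split

cases = [(0.1, 0), (0.2, 0), (0.4, 0), (0.6, 1), (0.8, 1), (0.9, 1)]
print(disagreement(cases, 0.5))   # borderline region: positive disagreement
print(disagreement(cases, 0.95))  # settled region: zero disagreement
```

The regression step then uses this per-case score as the right-hand-side variable for predicting ideological splits.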

B. Simplicity bias in doctrine

Hypothesis: Given the same set of precedents, the doctrine the court articulates will be simpler than what a purely predictive ML model would choose, even at the cost of some fit.

How to test: Pick an area with a crisp stated rule. Estimate a flexible model (random forest, boosted trees) and a simple model (linear threshold, 1-2 variables). Compare the court's stated rule (as a simple model) vs more complex ML models. Look for systematic underuse of available predictive structure — the court's "doctrine boundary" will typically look more linear / low-dimensional than the best predictive boundary.
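A toy version of the fit comparison, under stated assumptions: the "doctrine" is a one-variable threshold rule, the "flexible ML model" is a 1-nearest-neighbor memorizer standing in for boosted trees, and the two-dimensional fact patterns are synthetic. The point is only that the gap between the two accuracies is measurable.

```python
import math

# Sketch of test B's fit comparison: a simple one-variable threshold rule
# (stand-in for the court's stated doctrine) versus a flexible memorizing
# model (1-nearest-neighbor, stand-in for a boosted-tree model).

cases = [  # (facts, outcome): the second fact dimension carries extra signal
    ((0.2, 0.9), 1), ((0.3, 0.1), 0), ((0.6, 0.8), 1),
    ((0.7, 0.2), 0), ((0.8, 0.9), 1), ((0.4, 0.4), 0),
]

def simple_rule(facts, t=0.5):
    """Doctrine-style rule: look only at the first fact dimension."""
    return 1 if facts[0] >= t else 0

def knn_rule(facts):
    """Flexible model: copy the outcome of the nearest past case."""
    nearest = min(cases, key=lambda c: math.dist(c[0], facts))
    return nearest[1]

def accuracy(rule):
    return sum(rule(f) == y for f, y in cases) / len(cases)

print(accuracy(simple_rule))  # imperfect fit: ignores the second dimension
print(accuracy(knn_rule))     # 1.0: memorizes the precedent set
```

The hypothesis predicts that the court's articulated boundary will look like `simple_rule` even where the extra predictive structure exploited by `knn_rule` is available.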

C. Error-labeling and regime shifts

Hypothesis: When a court's composition shifts, opinions by the new majority will use more language framing old precedents as "errors" / "misreadings" / "departures from principle," and will especially target cases that generate large misfit relative to the new majority's preferred model.

How to test: Identify composition changes (before/after key appointment). Use text analysis on majority opinions: build dictionaries of error/correction language. Correlate increase in error-language with overruling, narrowing, or distinguishing older cases. Optionally link to model fit: fit a model on post-change cases, compute residuals for earlier ones, test whether high-residual precedents are most likely flagged as erroneous.
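The dictionary step can be sketched in a few lines. The word list and the two example sentences below are illustrative placeholders, not a validated dictionary.

```python
import re

# Sketch of test C's text measure: the share of tokens in an opinion drawn
# from an error/correction dictionary. The dictionary here is a toy list.

ERROR_TERMS = {"erroneous", "error", "misread", "misreading", "misapplied",
               "wrongly", "departure", "overrule", "overruled"}

def error_language_rate(opinion_text):
    """Share of tokens that are error/correction words."""
    tokens = re.findall(r"[a-z]+", opinion_text.lower())
    if not tokens:
        return 0.0
    return sum(t in ERROR_TERMS for t in tokens) / len(tokens)

before = "The rule announced in prior cases controls and we apply it."
after = "That decision misread the text; it was erroneous and is overruled."
print(error_language_rate(before))  # zero error language
print(error_language_rate(after))   # positive error language
```

The real design would compare this rate across opinions before and after a composition change, then link it to which precedents get narrowed or overruled.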

D. Case density and convergence

Hypothesis: As more cases accumulate in a region of feature space, the admissible model set shrinks and ideological dispersion decreases, overrulings become rarer, and doctrinal shifts require bigger "error corrections."

How to test: For each case, construct measures of local precedent density (number of prior cases with similar fact patterns, nearest neighbors in feature space). Relate precedent density to probability of ideological split, frequency of overruling, size of doctrinal shifts.
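A minimal density measure of the kind described, as a sketch: inverse mean distance to the $k$ nearest precedents in fact space. The choice of $k$, the Euclidean metric, and the data points are illustrative.

```python
import math

# Sketch of test D's local-density measure: inverse of the mean distance to
# the k nearest precedents in fact space (higher = denser neighborhood).

def precedent_density(new_facts, precedents, k=3):
    """Inverse mean k-nearest-neighbor distance; higher means denser."""
    dists = sorted(math.dist(new_facts, p) for p in precedents)
    return 1.0 / (sum(dists[:k]) / k + 1e-9)  # epsilon avoids divide-by-zero

precedents = [(0.50, 0.50), (0.52, 0.48), (0.49, 0.53), (0.90, 0.10)]
dense_case = (0.50, 0.50)   # sits inside a cluster of precedents
sparse_case = (0.10, 0.90)  # far from all precedents
print(precedent_density(dense_case, precedents))
print(precedent_density(sparse_case, precedents))
```

The hypothesis then relates this score to the probability of ideological splits and the frequency of overruling.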

E. LLMs as measurement tools

Use an LLM (fine-tuned on past decisions) as a measurement device for the admissible model set. Perturb the prompt or training data slightly and see how often the model's prediction for a case flips. High flip frequency = locally underdetermined region. Then check whether those cases are exactly the ones where human judges most often split along ideological lines. This gives a "computational jurisprudence" angle: LLM instability as a proxy for legal underdetermination.
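The flip-rate statistic can be prototyped without any model. Here `predict` is a hypothetical stand-in for the fine-tuned LLM, simulated as a noisy threshold so the block runs on its own; only the `flip_rate` logic is the point.

```python
import random

# Sketch of test E's instability measure. `predict` is a placeholder for an
# LLM fine-tuned on past decisions: it is simulated here as a noisy threshold
# at 0.5 so the block is self-contained. The flip rate under perturbations
# (here: the seed) proxies for local legal underdetermination.

def predict(case_x, seed):
    """Stand-in predictor: grant iff noisy facts clear 0.5 (pure simulation)."""
    rng = random.Random(seed)
    return 1 if case_x + rng.gauss(0, 0.1) >= 0.5 else 0

def flip_rate(case_x, n_perturbations=200):
    """Fraction of perturbations on the minority side of the prediction."""
    preds = [predict(case_x, seed) for seed in range(n_perturbations)]
    return min(sum(preds), n_perturbations - sum(preds)) / n_perturbations

print(flip_rate(0.5))   # borderline case: high instability
print(flip_rate(0.95))  # clear case: near-zero instability
```

In the real design, perturbations would be prompt paraphrases or training-data resamples rather than seeds, and high-flip cases would be matched against human ideological splits.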

Requirements for model

Terms that have drifted vs remained stable

Terms that have drifted from original meaning

Terms that have NOT drifted

What makes terms stable

  1. Concrete referents — tied to observable acts ("lying under oath," "taking property")
  2. Institutional continuity — linked to specific legal rituals or procedures (writs, oaths, filings)
  3. Functional persistence — they serve enduring social purposes (property transfer, debt relief, dispute resolution)

Early model sketches

Holger's project description: "A Model of Law"

Three ingredients:

  1. Language is open to interpretation. The sense of words is not given but a matter of convention, which may change over time and depend on context. Interpretation can be modeled as a subjective distribution over other people's likely views of meaning. The distribution is neither flat nor degenerate — law provides both freedom and constraint.
  2. Own-side bias. Well-documented psychological phenomenon that skews individuals' subjective distribution over legal meaning towards their content preferences (political preferences, position in litigation). Explains partisan splits on SCOTUS.
  3. Deference to authority if the alternative is worse. People's relationship to law can be modeled as a choice between the legally structured status quo and an unknown alternative (anarchy / turmoil). Decision-makers make decisions that a sufficiently large number of people will view as plausibly guided by the law. The binding force of law arises from the threat of upheaval when a decision seems too far out of line.

Key claims:

Pseudo-model setup (from Holger)

Dec 5 2025 model sketch

Model 1: Judges learning from past decisions

Assumptions:

  1. Facts in case $i$ are observed by judge $j$ as $X_i + \varepsilon_{ij}$, where $\varepsilon_{ij}$ is iid across judges and cases. Judges also see the decisions in past cases ("grant" or "deny").
  2. Judge $j$ at time $t$ "grants" if $X_i + \varepsilon_{ij} > X^*_{jt}$ for a threshold $X^*_{jt}$ = "that judge's interpretation of the law."
  3. A judge is punished if perceived by other judges to deviate "sufficiently" from their interpretation of the law.
  4. Judges select their threshold to avoid such punishment.

Expected results:

  1. A judge's threshold will roughly be determined by the point at which, as that judge reads past cases, other judges tend to "grant." Newer cases are given higher weight (since other judges' interpretations may change over time).
  2. In the long run, judges' thresholds will converge (since $\varepsilon_{ij}$ is iid) and the meaning of law will stabilize. There will still be uncertainty in the factual interpretation of each case, but judges will end up using the same threshold.
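The convergence claim can be checked with a small simulation. The updating rule (nudge 10% toward conflicting observations) and all parameters are illustrative assumptions the sketch above does not commit to.

```python
import random
import statistics

# Simulation sketch of Model 1's convergence claim. Judges start with
# dispersed thresholds; each period one judge decides a case, and every
# judge nudges her threshold toward decisions that conflict with her own
# noisy read of the facts (iid perception noise, as in the model).

rng = random.Random(1)
thresholds = [rng.uniform(0.2, 0.8) for _ in range(10)]  # initial interpretations
initial_spread = statistics.pstdev(thresholds)

for _ in range(500):
    x = rng.uniform(0, 1)                      # facts of the new case
    decider = rng.randrange(len(thresholds))
    grant = x + rng.gauss(0, 0.05) > thresholds[decider]
    for j in range(len(thresholds)):           # everyone updates on the decision
        x_seen = x + rng.gauss(0, 0.05)        # judge j's noisy read of the facts
        if grant and x_seen < thresholds[j]:   # a grant below my cutoff
            thresholds[j] -= 0.1 * (thresholds[j] - x_seen)
        elif not grant and x_seen > thresholds[j]:  # a denial above my cutoff
            thresholds[j] += 0.1 * (x_seen - thresholds[j])

final_spread = statistics.pstdev(thresholds)
print(initial_spread, final_spread)  # the spread shrinks: thresholds converge
```

Because all judges update on the same decisions with only idiosyncratic noise, their relative dispersion contracts toward a small noise floor, matching expected result 2.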

Adding judge biases: A richer model incorporates judge biases/preferences. Past cases are not perfectly informative of other judges' beliefs. In the long run one might statistically learn other judges' biases. Not clear what the consequences will be.

Issue: In this model, "stare decisis" is not assumed but appears as an endogenous feature. Not sure if that is good or not.

Model 2: Simplest possible model

Expected results:

  1. Judges impose their preferred decision as long as they are not blocked by precedent. E.g., Judge A prefers "liable" in a case with $x = 0.4$; she holds the defendant liable iff no prior judge held a defendant not liable in a case with $x > 0.4$.
  2. The law is determined by initial cases and their judge assignments (path dependence).
  3. Understanding of the law converges over time.
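The path-dependence claim can be sketched directly. A judge imposes her preferred cutoff unless precedent blocks her: she cannot hold liable at $x$ if a prior case with larger $x$ was held not liable, and vice versa. The cutoffs and case facts below are illustrative.

```python
# Sketch of Model 2's path-dependence claim: preferred cutoffs constrained
# by monotone precedent. Same cases and same judge pool, assigned in a
# different order, produce a different law.

def decide(x, cutoff, record):
    """record: list of (x, liable) past decisions binding this judge."""
    max_not_liable = max((xi for xi, li in record if not li), default=float("-inf"))
    min_liable = min((xi for xi, li in record if li), default=float("inf"))
    if x >= min_liable:
        return True            # precedent compels liability
    if x <= max_not_liable:
        return False           # precedent compels non-liability
    return x >= cutoff         # unconstrained region: preference governs

cases = [0.6, 0.4, 0.55]       # facts, in order of arrival

def run(assigned_cutoffs):
    """Decide the cases in order, each by a judge with the given cutoff."""
    record = []
    for x, cutoff in zip(cases, assigned_cutoffs):
        record.append((x, decide(x, cutoff, record)))
    return record

print(run([0.3, 0.7, 0.5]))  # early low-cutoff judge locks in liability
print(run([0.7, 0.3, 0.5]))  # early high-cutoff judge locks in non-liability
```

With the high-cutoff judge first, her non-liability holding at $x = 0.6$ binds every later case below it, so even the low-cutoff judge cannot hold liable at $x = 0.4$: the law is set by the initial assignment.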

Model 3: Extension with private information

  1. Each judge sees the case with error: $X_i + \varepsilon_{ij}$.
  2. Judges are punished if they deviate too much from past cases (in ways that cannot be justified by observational error). We need to be precise about "too much," since there could be conflicting jurisprudence.

Expected results:

This model explains: (1) law has binding force in the short term, (2) preferences matter in borderline cases, (3) drift in jurisprudence over time.

Open question: How do judiciaries solve problems of conflicting jurisprudence in practice?