Systemic Risk Monitor for Frontier AI Labs

📐 Methodology

Overview

The Fault Line provides a transparent, evidence-based assessment of structural vulnerabilities facing frontier AI laboratories. Rather than predicting outcomes, it monitors and quantifies dependencies and risk factors that could affect an organization's ability to operate, scale, or adapt.

Each lab receives a Fragility Score from 0–10, where higher scores indicate greater systemic fragility. Scores are derived from a simple checklist of binary indicators, each supported by publicly verifiable news events and sources.

Scoring Formula

The total fragility score is calculated as:

Total Score = (Compute + Cloud + Policy + Demand + Societal Impact + Talent & Governance) − Resilience

Each of the six fragility dimensions contributes 0–2 points; Resilience instead subtracts up to 2 points for demonstrated risk mitigation.

The final score is clamped to the range 0–10.
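As a minimal sketch (the function name and input layout here are illustrative, not the project's actual code), the formula and clamping can be expressed as:

```python
def fragility_score(points: dict[str, int]) -> int:
    """Sum the six fragility dimensions, subtract Resilience, clamp to 0-10."""
    fragility_dims = ["compute", "cloud", "policy", "demand",
                      "societal_impact", "talent_governance"]
    total = sum(points[d] for d in fragility_dims) - points["resilience"]
    return max(0, min(10, total))

# Example: four dimensions at maximum, two at 1 point, full resilience credit
print(fragility_score({
    "compute": 2, "cloud": 2, "policy": 2, "demand": 2,
    "societal_impact": 1, "talent_governance": 1, "resilience": 2,
}))  # → 8
```

Note that the raw sum can reach 12 (six dimensions at 2 points each with no resilience credit), which is why the clamp to 10 matters.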

Confidence-Weighted Variant

Alongside the binary score, Fault Line displays a confidence-weighted score. Events classified with low confidence carry 0.5× weight; medium confidence carries 0.75×; high confidence carries full weight. This variant helps users understand the uncertainty range — if the weighted score diverges significantly from the binary score, many indicators rest on weaker evidence.
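The weighting can be sketched as follows; only the 0.5×/0.75×/1× multipliers come from the methodology above, while the event record structure is an assumption:

```python
# Confidence multipliers from the methodology; record fields are hypothetical.
CONFIDENCE_WEIGHT = {"low": 0.5, "medium": 0.75, "high": 1.0}

def weighted_points(items: list[dict]) -> float:
    """Each triggered checklist item contributes its point value
    scaled by the confidence of its supporting evidence."""
    return sum(CONFIDENCE_WEIGHT[i["confidence"]] * i["points"] for i in items)

# Two +1 items: one backed by high-confidence evidence, one by low-confidence
print(weighted_points([
    {"points": 1, "confidence": "high"},
    {"points": 1, "confidence": "low"},
]))  # → 1.5
```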

Dimensions Explained

💾 Compute & Chips Dependence (0–2 pts)

Measures reliance on specific hardware vendors and exposure to supply chain disruptions. Labs with single-vendor GPU strategies or documented supply constraints score higher.

☁️ Cloud Concentration (0–2 pts)

Measures dependency on hyperscaler partnerships for training and inference infrastructure. Exclusive partnerships, deep integrations, and high switching costs contribute to fragility.

🏛️ Policy & Geopolitical Exposure (0–2 pts)

Measures sensitivity to regulatory action, export controls, and political shifts. Labs operating across jurisdictions with pending regulations or active investigations score higher.

📈 Demand & Commercialization (0–2 pts)

Measures revenue sustainability and market position risks. Signals of demand weakness, customer churn, or capex/opex overhang relative to adoption increase this score.

🛡️ Resilience Moves (0–2 pts, inverted)

Measures proactive risk mitigation. Multi-sourcing strategies, diversified infrastructure, long-term contracts, and demonstrated redundancy reduce the total fragility score.

🌍 Societal Impact (0–2 pts)

Measures signals of workforce displacement, misinformation amplification, privacy/surveillance concerns, power concentration, and safety incidents attributed to AI deployment.

👥 Talent & Governance (0–2 pts)

Measures leadership instability, key departures, board dysfunction, and organizational governance risks. The historical record contains 15+ high-impact events of this type — the Altman firing, Leike departure, Sutskever exit, and similar episodes — that the original five-dimension checklist could not score.

Checklist Items

Each dimension is scored based on specific, observable indicators:

A) Compute & Chips Dependence

A1. Single GPU Vendor Lock-in (+1 pt): Evidence of tight coupling to a single GPU vendor (e.g., NVIDIA-only training strategy)

A2. Supply Constraints Impact (+1 pt): Evidence of supply constraints or delivery risk impacting roadmap or operations

B) Cloud Concentration

B1. Single Hyperscaler Dependence (+1 pt): Primary dependence on one hyperscaler for training/inference infrastructure

B2. Platform Lock-in Signals (+1 pt): Switching costs, exclusivity signals, or deep integration with a single provider

C) Policy & Geopolitical Exposure

C1. Export Control Exposure (+1 pt): Exposure to export controls or cross-border restrictions affecting compute/chips

C2. Regulatory Sensitivity (+1 pt): High sensitivity to regulatory action (antitrust, safety regulation, procurement bans)

D) Demand & Commercialization

D1. Demand Weakness Signals (+1 pt): Credible signals of demand weakness or monetization challenges

D2. Capex/Opex Overhang (+1 pt): Evidence of capex/opex overhang relative to adoption (overbuild, runway strain)

E) Resilience Moves (Mitigations)

E1. Multi-sourcing Demonstrated (−1 pt): Demonstrated multi-sourcing or diversification (multi-cloud, alternative accelerators)

E2. Risk Reduction Actions (−1 pt): Concrete risk-reduction actions (long-term contracts, redundancy, modular deployment)

F) Societal Impact

F1. Workforce & Societal Disruption (+1 pt): Evidence of significant workforce displacement, misinformation amplification, or privacy/surveillance concerns attributed to AI deployment

F2. Power Concentration & Safety Incidents (+1 pt): Evidence of market power concentration, safety incidents, or actions increasing centralization without oversight

G) Talent & Governance

G1. Key Leadership Departure (+1 pt): Departure of CEO, CTO, chief scientist, or other critical leadership figure creating organizational instability

G2. Board & Governance Instability (+1 pt): Board dysfunction, governance disputes, organizational restructuring, or mass talent exodus signaling instability

Evidence Requirements

Each checklist item must be supported by publicly verifiable news events and sources. Items without recent supporting evidence automatically expire and are no longer counted in the score.

Decay Window

Evidence expires after 180 days by default unless reaffirmed by a new event. This ensures the tracker reflects current conditions rather than historical snapshots.

When an item is supported by multiple events, the most recent event date determines the expiration.
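The decay rule above can be sketched as follows; the function and field names are hypothetical, while the 180-day window comes from the methodology:

```python
from datetime import date, timedelta

DECAY_DAYS = 180  # default decay window from the methodology

def is_active(event_dates: list[date], today: date) -> bool:
    """An item stays active while its most recent supporting event
    falls within the decay window; with no events it is expired."""
    if not event_dates:
        return False
    return today - max(event_dates) <= timedelta(days=DECAY_DAYS)

# Item last reaffirmed ~90 days ago: still counted in the score
print(is_active([date(2024, 1, 1), date(2024, 4, 1)], date(2024, 6, 30)))  # → True
```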

Contested Items

If contradictory evidence exists for a checklist item (e.g., both signals of lock-in and diversification), the item is marked as "contested" and requires manual review before affecting the score.

Contested items are displayed in the UI with both supporting and contradicting evidence visible.
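Contested detection can be sketched as a simple check, assuming each event record carries a stance label (an assumption not specified above):

```python
def is_contested(events: list[dict]) -> bool:
    """An item is contested when it has both supporting and
    contradicting evidence, triggering manual review."""
    stances = {e["stance"] for e in events}
    return {"supports", "contradicts"} <= stances

# Lock-in signal plus a diversification signal for the same item
print(is_contested([
    {"stance": "supports"},
    {"stance": "contradicts"},
]))  # → True
```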

Data Sources

Fault Line ingests from multiple source tiers to catch signals at different stages.

Data Pipeline

The tracker updates automatically via the following process:

  1. Ingestion: RSS feeds and curated sources are checked for new articles
  2. Classification: Articles are mapped to labs, dimensions, and checklist items
  3. Deduplication: Duplicate events are detected and merged
  4. Scoring: Checklist states are updated and scores recalculated
  5. Publication: JSON data files are committed to the repository

The pipeline runs daily via GitHub Actions. Manual event submissions are accepted via GitHub Issues.
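The five steps above can be sketched as a single pipeline function; every body below is a stand-in stub to show the data flow, not the project's real implementation:

```python
def ingest_sources() -> list[str]:
    # 1. Ingestion: check RSS feeds and curated sources (stubbed here)
    return ["Article about GPU supply constraints"]

def classify(articles: list[str]) -> list[dict]:
    # 2. Classification: map each article to a lab and checklist item
    return [{"lab": "example-lab", "item": "A2", "title": a} for a in articles]

def deduplicate(events: list[dict]) -> list[dict]:
    # 3. Deduplication: merge events with identical titles
    seen: dict[str, dict] = {}
    for e in events:
        seen.setdefault(e["title"], e)
    return list(seen.values())

def recalculate_scores(events: list[dict]) -> dict[str, int]:
    # 4. Scoring: one fragility point per distinct triggered item per lab
    triggered: dict[str, set] = {}
    for e in events:
        triggered.setdefault(e["lab"], set()).add(e["item"])
    return {lab: len(items) for lab, items in triggered.items()}

def run_pipeline() -> dict[str, int]:
    # 5. Publication (committing JSON data files) is omitted in this sketch
    return recalculate_scores(deduplicate(classify(ingest_sources())))

print(run_pipeline())  # → {'example-lab': 1}
```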

Limitations & Caveats

This tracker has important limitations. Scores rest on binary indicators and publicly reported events, so they can miss private developments and nuance. Users should treat them as a starting point for analysis, not definitive assessments.

Contributing

This project welcomes contributions. You can help by submitting candidate events via GitHub Issues or by proposing improvements to the methodology.

All contributions are subject to review to maintain data quality and methodological consistency.