Safety & Governance
I avoided AI governance for longer than I'd like to admit. Every time a new regulation headline popped up — the EU AI Act, a US executive order, some new country's "AI strategy" — I'd skim the summary, nod vaguely, and go back to training models. Governance felt like a different profession. Something for lawyers and policy people. Not for someone who spends their day staring at loss curves. But then a colleague's model got flagged by a client's compliance team, and we had no documentation, no risk assessment, no audit trail. We couldn't answer the most basic question: "How does this thing make decisions, and who's responsible when it's wrong?" That was the moment the discomfort of not knowing became unbearable. Here is that dive.
AI safety and governance is the set of practices, frameworks, and regulations that determine how AI systems are built responsibly, deployed safely, and held accountable when things go wrong. It spans everything from technical alignment work to international law to the organizational structures inside companies. The field has exploded since 2023, driven by the rapid proliferation of large language models and a series of high-profile AI failures that made governments pay attention.
Before we start, a heads-up. We're going to be talking about legal frameworks, risk tiers, corporate governance structures, and geopolitics. You don't need a law degree or a policy background. We'll build up every concept as we go, and I'll explain the jargon when it shows up.
This isn't a short journey, but I hope you'll be glad you came.
The AI Safety Landscape
A Small Company, Three Models, and a Government Letter
Risk Assessment: Figuring Out What Can Go Wrong
The EU AI Act: The Regulation That Changed Everything
The United States: Executive Orders and Their Afterlife
The Global Patchwork
Rest Stop
Organizational Governance: Building the Internal Machine
From Principles to Practice: Impact Assessments
When Things Go Wrong: AI Incident Response
AI Ethics Boards: The Uncomfortable Track Record
The Open vs. Closed Model Debate
Compute Governance: Regulating the Hardware
International Coordination: Herding Cats at Global Scale
Bringing It Together
Resources and Credits
The AI Safety Landscape
There's a split in the AI safety world that's worth naming upfront, because it shapes every conversation you'll hear on the topic. On one side, people worry about near-term risks — the harms AI systems are causing right now, today. Biased hiring algorithms. Hallucinating chatbots giving medical advice. Deepfakes undermining elections. Models that leak personal data. These are real, documented, and growing. The OECD's AI Incidents Monitor tracked 233 incidents in 2024, a 56% increase from the year before.
On the other side, people worry about long-term risks — sometimes called existential or catastrophic risks. What happens when AI systems become significantly more capable than humans at most cognitive tasks? Could such a system pursue goals misaligned with human interests in ways we can't correct? This sounds like science fiction, but serious researchers at serious institutions are working on it full-time.
I'll be honest — for a long time I dismissed the long-term risk crowd as alarmist. Then I spent time reading their actual arguments, not the headlines. The core concern isn't "a robot will decide to kill us." It's more subtle: an AI system optimizing for a poorly specified objective at superhuman capability could cause enormous damage without any malice at all. We've already seen this in miniature — a recommendation algorithm optimizing for engagement that ends up radicalizing teenagers. Scale that up. Way up.
The practical reality is that both camps are right, and the two concerns aren't in competition. A hiring model that discriminates against women is a near-term safety failure. A frontier model that can synthesize novel bioweapons is a long-term safety concern. Both need governance. The difference is urgency, certainty, and the tools available to address them.
Think of it like building safety for a city. You need fire codes for the buildings that exist today (near-term). You also need earthquake engineering for the big one that might come in fifty years (long-term). Nobody argues you should pick one or the other. The city needs both, and the people working on each inform the other's work.
A Small Company, Three Models, and a Government Letter
Let's ground this with a running example we'll return to throughout. Imagine a small company called MedScreen that builds AI for healthcare. They have three models in production:
Model A — a diagnostic assistant that helps radiologists flag potential tumors in chest X-rays. It highlights suspicious areas and provides a confidence score. A human radiologist always makes the final call.
Model B — a chatbot on their website that answers patient questions about appointment scheduling, insurance coverage, and general health topics. No medical diagnoses, though patients sometimes try to get them.
Model C — an internal tool that predicts which patients are likely to miss appointments, so the admin team can send reminders. It uses demographics, appointment history, and zip code.
One morning, MedScreen's CEO opens a letter from a European health authority. It says: "Under the EU AI Act, please provide documentation of your risk classification, conformity assessment, and ongoing monitoring procedures for all AI systems affecting EU patients."
The CEO stares at it. They have none of these things. They have Python scripts, a few Jupyter notebooks, and a Slack channel called #model-stuff. This is where our story begins — and it's more common than you'd think.
Risk Assessment: Figuring Out What Can Go Wrong
Before you can govern anything, you need to understand what can go wrong. This sounds obvious, but I keep being surprised by how badly organizations misjudge this step. The tendency is to think about risks abstractly — "bias," "privacy," "safety" — without grounding them in specific, concrete failure modes for your specific system.
Let's walk through a risk assessment for MedScreen's three models. For each one, we need to ask four questions: Who is affected? What's the worst realistic outcome? How likely is it? And can the damage be reversed?
Model A, the tumor detector. Who's affected? Patients whose X-rays are being analyzed. Worst realistic outcome? The model misses a tumor (false negative) and the patient's cancer goes undiagnosed for months. Or it flags a healthy patient as suspicious (false positive), triggering unnecessary biopsies. Both happen. Both cause real harm — delayed treatment in one case, invasive procedures and psychological distress in the other. The damage from a missed tumor is irreversible by the time it's discovered. This is a high-risk system.
Model B, the scheduling chatbot. Who's affected? Website visitors. Worst realistic outcome? The chatbot gives wrong insurance information, and a patient shows up expecting coverage they don't have. Annoying, potentially costly, but correctable. Unless the chatbot starts giving medical advice — which it's not supposed to do, but patients push for it, and chatbots are famously bad at staying in their lane. There's a scenario where someone asks "should I go to the ER?" and the bot says "based on your symptoms, you're probably fine." That's a different risk category entirely.
Model C, the appointment predictor. Who's affected? Patients receiving (or not receiving) reminders. Worst realistic outcome? The model learns that patients from certain zip codes miss more appointments, and those zip codes correlate with race. The reminder system becomes a proxy for racial profiling — patients from predominantly minority neighborhoods get more aggressive reminder calls, which might feel intrusive or condescending. Nobody intended this, but the model doesn't care about intentions.
The NIST AI Risk Management Framework — released by the US National Institute of Standards and Technology — provides a structured way to think through this. It has four functions: Govern (set up policies and accountability), Map (identify what can go wrong and who's affected), Measure (quantify the risks with metrics and testing), and Manage (decide what to do about them — mitigate, accept, or avoid). It's not legally binding, but it's increasingly referenced in government procurement contracts and court cases. "We followed NIST AI RMF" is a much stronger legal defense than "we did our best."
The city analogy holds here. A risk assessment is like a building inspection. You walk through the structure before anyone moves in, you identify the weak points, and you decide which ones need reinforcement. You don't skip this step because the building "looks fine from the outside." The risks that matter are almost always the ones you can't see without looking carefully.
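To make this concrete on the engineering side, here's a minimal sketch of what one entry in a risk register could look like if you capture the four questions above as structured data instead of meeting notes. The class names, the likelihood figure, and the escalation rule are illustrative placeholders, not part of NIST's framework or any standard.

```python
from dataclasses import dataclass, field
from enum import Enum


class Reversibility(Enum):
    REVERSIBLE = "reversible"        # damage can be corrected after the fact
    PARTIALLY = "partially"          # some consequences persist
    IRREVERSIBLE = "irreversible"    # e.g. a missed tumor discovered too late


@dataclass
class RiskEntry:
    """One concrete failure mode for one specific system."""
    system: str                      # which model, e.g. "Model A - tumor detector"
    affected: list[str]              # who is affected
    worst_outcome: str               # worst realistic outcome, in plain language
    likelihood: float                # rough probability estimate, 0.0-1.0
    reversibility: Reversibility
    mitigations: list[str] = field(default_factory=list)

    def needs_escalation(self) -> bool:
        # Crude triage rule (illustrative): anything irreversible, or likely
        # and only partially reversible, goes to human review.
        if self.reversibility is Reversibility.IRREVERSIBLE:
            return True
        return self.likelihood > 0.1 and self.reversibility is Reversibility.PARTIALLY


# Model A's missed-tumor failure mode, as discussed above.
missed_tumor = RiskEntry(
    system="Model A - tumor detector",
    affected=["patients whose X-rays are analyzed"],
    worst_outcome="false negative; cancer goes undiagnosed for months",
    likelihood=0.02,                 # placeholder, not a validated estimate
    reversibility=Reversibility.IRREVERSIBLE,
)
assert missed_tumor.needs_escalation()
```

The value isn't the code itself; it's that the answers to the four questions now live next to the model, where an auditor (or a panicked CEO holding a government letter) can find them.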
The EU AI Act: The Regulation That Changed Everything
The EU AI Act entered into force in August 2024. It's the world's first comprehensive AI regulation, and like GDPR before it, it applies to anyone whose systems affect EU residents — regardless of where the company is headquartered. A startup in San Francisco building a hiring tool used by a London company to screen applicants? That's covered. A model trained in Singapore and deployed via API to a Berlin hospital? Also covered.
I'll be honest, when I first read the Act, I found it overwhelming. It's long, it's written in dense legalese, and the compliance timelines are staggered in a way that seems designed to confuse engineers. So let's break it down the way I wish someone had broken it down for me.
Risk Tiers
The entire Act is organized around a single idea: the higher the risk your AI system poses to people's fundamental rights and safety, the stricter the rules. Four tiers.
The first tier is Unacceptable Risk. These are AI uses that are banned outright. Social scoring by governments — assigning citizens a trustworthiness score based on their behavior, the way China's social credit system works — is prohibited. So is real-time biometric identification in public spaces (with narrow exceptions for law enforcement), AI that manipulates people through subliminal techniques, and systems that exploit vulnerabilities of specific groups like children or disabled people. The deadline for stopping these uses was February 2025. If you're doing any of this, it's already illegal.
The second tier is High Risk. This is where most of the compliance burden lives. AI systems used in hiring and recruitment, credit scoring, medical diagnosis, critical infrastructure management, law enforcement, immigration decisions, and education assessment all fall here. These systems must have a risk management system, data governance procedures, technical documentation, logging and record-keeping, transparency provisions for users, human oversight mechanisms, and they must meet accuracy, robustness, and cybersecurity requirements. They also need a conformity assessment — essentially a formal review proving the system meets all requirements — before deployment. Full compliance deadline: August 2026.
The third tier is Limited Risk. Chatbots, emotion recognition systems, and deepfake generators land here. The main requirement is transparency: users must be told they're interacting with AI or viewing AI-generated content. If your chatbot doesn't clearly identify itself as non-human, that's a violation. Same deadline: August 2026.
The fourth tier is Minimal Risk. Spam filters, AI in video games, inventory management systems. No specific legal obligations, though the EU encourages voluntary codes of conduct.
Let's go back to MedScreen. Model A, the tumor detector, is a medical device — that's High Risk, full compliance required. Model B, the patient chatbot, is Limited Risk — it needs a transparency disclosure. Model C, the appointment predictor, is trickier. If it's making predictions that affect patient access to care (and the zip code correlation creates disparate impact), an argument could be made for High Risk. This is where legal counsel earns their fee.
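If you want a feel for how the tiers map onto systems like MedScreen's, here's a toy sketch. To be clear about what this is: a string-matching illustration of the four tiers described above, not a compliance tool. Classification under the Act is a legal judgment about intended purpose, and the keyword lists here are deliberately incomplete.

```python
from enum import Enum


class RiskTier(Enum):
    UNACCEPTABLE = "unacceptable"   # banned outright
    HIGH = "high"                   # full compliance obligations
    LIMITED = "limited"             # transparency obligations
    MINIMAL = "minimal"             # no specific obligations


# Illustrative (and intentionally incomplete) keyword buckets based on the
# tiers described above. A real classification is a legal judgment about
# intended purpose, not a string match.
BANNED_USES = {"social scoring", "subliminal manipulation"}
HIGH_RISK_USES = {"medical diagnosis", "hiring", "credit scoring",
                  "law enforcement", "education assessment"}
LIMITED_RISK_USES = {"chatbot", "emotion recognition", "deepfake generation"}


def rough_tier(intended_use: str) -> RiskTier:
    """First-pass triage only; the output is a prompt to call counsel, not an answer."""
    use = intended_use.lower()
    if any(u in use for u in BANNED_USES):
        return RiskTier.UNACCEPTABLE
    if any(u in use for u in HIGH_RISK_USES):
        return RiskTier.HIGH
    if any(u in use for u in LIMITED_RISK_USES):
        return RiskTier.LIMITED
    return RiskTier.MINIMAL


print(rough_tier("medical diagnosis assistant for chest X-rays"))  # RiskTier.HIGH
print(rough_tier("patient-facing scheduling chatbot"))              # RiskTier.LIMITED
```

Note that Model C falls straight through to MINIMAL in this toy version, which is exactly the trap: the legally interesting cases are the ones a keyword list misses.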
General-Purpose AI Models
Foundation models and large language models get their own set of rules, separate from the risk tiers. All providers of General-Purpose AI (GPAI) models must provide technical documentation and comply with EU copyright law. But models trained with compute exceeding 10²⁵ FLOPs — think GPT-4 scale and above — are classified as having systemic risk. That triggers additional requirements: adversarial testing (red-teaming), incident reporting to the EU AI Office, cybersecurity measures, and energy consumption reporting. These obligations started in August 2025.
Penalties
The numbers that make executives sit up. Deploying a banned AI system? Up to €35 million or 7% of global annual turnover, whichever is higher. Non-compliance with high-risk requirements? Up to €15 million or 3%. Providing incorrect information to authorities? Up to €7.5 million or 1%. For SMEs and startups, fines are capped at the lower of the two figures. Still enough to sink a small company.
The Brussels Effect
Here's the thing about the EU AI Act that matters even if you never sell a single product in Europe. It's called the Brussels Effect, and we've seen this movie before with GDPR. Companies won't maintain two separate AI governance systems — one for Europe and one for everyone else. The engineering cost of forking your compliance infrastructure is too high. So they build to the strictest standard and ship globally. The EU AI Act will become the de facto global floor for AI governance, the same way GDPR became the global floor for data privacy.
Our city analogy evolves here. The EU has published the building code. Other cities are watching. Some will adopt it wholesale. Others will write their own, but they'll use the EU's as a starting template. The buildings being designed right now will be built to this code whether the architect is in Brussels or not.
The United States: Executive Orders and Their Afterlife
The US approach to AI governance reads like a political thriller, complete with dramatic reversals.
In October 2023, President Biden signed Executive Order 14110, the most comprehensive US action on AI to date. It mandated safety testing and red-teaming for high-risk AI, required federal agencies to conduct risk assessments, established reporting requirements for developers of frontier models, and created a whole-of-government framework for responsible AI development. It was ambitious. It was detailed. It lasted about fifteen months.
On January 20, 2025 — his first day in office — President Trump revoked Biden's AI order. Three days later, he signed Executive Order 14179, "Removing Barriers to American Leadership in Artificial Intelligence." The framing shifted entirely: from safety and oversight to innovation and deregulation. Mandatory red-teaming requirements were dropped. Agency risk monitoring was rolled back. The new order directed a review and rescission of all Biden-era policies seen as regulatory barriers to AI innovation. By December 2025, a further order sought to establish federal preemption of state-level AI regulations, arguing that a "patchwork" of state laws hinders interstate commerce.
I'm still processing what this means long-term. The policy whiplash is real. But a few things survived the change in administration. The NIST AI Risk Management Framework remains intact — it's a voluntary standard, not an executive order, so it can't be revoked by one. Federal procurement increasingly references it. And the fundamental technical challenges of AI safety don't change with the political winds. A hallucinating model doesn't care who signed the latest executive order.
The practical takeaway for engineers: don't build your safety practices around any single executive order. Build them around durable standards (NIST AI RMF, ISO 42001) and your own organizational principles. Executive orders come and go. Good engineering practices compound.
The Global Patchwork
Outside the EU and the US, the regulatory landscape is a patchwork of philosophies, each reflecting the country's broader relationship with technology and governance.
China has the most prescriptive regime after the EU, but with a fundamentally different character. China mandates algorithm registration and pre-launch approval for AI systems. Generative AI must align with "core socialist values." Content control is tight. Penalties can reach 5–10% of revenue. The goal isn't consumer protection in the EU sense — it's state control over a strategically important technology.
The UK has gone the opposite direction: principles-based, decentralized, deliberately light-touch. Rather than a single AI law, existing regulators — the ICO for data, the FCA for finance, the CMA for competition — govern AI within their own domains. The government publishes guidance but avoids prescriptive rules, aiming to foster innovation. Whether this remains sustainable as AI systems grow more powerful is an open question.
Canada is somewhere in the middle with its proposed Artificial Intelligence and Data Act (AIDA), a risk-based approach inspired partly by the EU AI Act but tailored to Canada's legal framework. It classifies systems by risk, introduces transparency and testing obligations, and layers them on top of existing privacy law.
The tension across all of these is the same: how do you govern a technology that evolves faster than legislation can keep up? The EU chose to be prescriptive and accept the risk of overregulation. The UK chose to be flexible and accept the risk of underregulation. The US oscillates depending on who's in office. China chose to optimize for state interests. None of them have it completely figured out.
This is the moment our city analogy starts to strain. We're not building safety codes for one city anymore. We're trying to agree on building codes across countries that can't agree on what "safe" means. That's the fundamental governance challenge.
Rest Stop
Congratulations on making it this far. If you want to stop here, you now have a solid mental model of the AI safety landscape. You understand the split between near-term and long-term risks, you know the EU AI Act's risk tiers and compliance timeline, you've seen how the US approach has shifted, and you have a sense of the global patchwork. That's genuinely useful — you can hold your own in a governance conversation.
But it doesn't tell you how to actually do governance inside an organization. Knowing the regulations exist is different from knowing how to build systems that comply with them. If the discomfort of that gap is nagging at you, read on.
Organizational Governance: Building the Internal Machine
Regulations tell you what you must do. Organizational governance is the internal machinery that makes it actually happen. And here's the uncomfortable truth I've learned from watching companies try: the organizations that treat governance as engineering infrastructure ship faster than the ones that treat it as a legal checkbox.
That sounds backwards. More process means slower, right? Not when the alternative is building a model, deploying it, getting a compliance letter, panicking, and then spending six months retroactively documenting everything while the model sits in limbo. MedScreen didn't have governance infrastructure. They're now spending three times as long getting Model A approved as they would have if they'd built the documentation pipeline from the start.
What does organizational AI governance actually look like? Let's walk through the components using MedScreen as our example.
First, ownership and accountability. Every model needs an owner — a specific person, not a team, not a Slack channel. Model A's owner is Dr. Sarah Chen, the lead radiologist who validated its clinical performance. She's accountable for its behavior in production. This doesn't mean she writes the code. It means when the European health authority sends that letter, her name is on the response. This is surprisingly hard to establish in organizations where models are built by one team, deployed by another, and used by a third. But without clear ownership, accountability dissolves into finger-pointing.
Second, a model registry. MedScreen needs a central record of every model in production: what it does, who owns it, what data it was trained on, what its known limitations are, and when it was last reviewed. This is the organizational equivalent of a city's building registry. You can't inspect buildings you don't know exist.
Third, review gates. Before any model goes to production, it passes through a defined review process. For a high-risk model like Model A, this might include an independent validation by someone who didn't build it, a bias audit, a security review, and sign-off from the model owner. For a lower-risk model like Model B, a lighter review might suffice. The key is that the process exists and is documented, not that every model gets the same treatment.
Fourth, ongoing monitoring. Models degrade. Data drifts. The world changes. A model that performed beautifully at deployment might be quietly failing three months later. MedScreen needs automated monitoring on Model A's performance — tracking its sensitivity and specificity on real-world cases, flagging when metrics drift beyond acceptable thresholds, and triggering human review when they do.
Fifth, retirement procedures. Every model needs a defined end-of-life. What triggers retirement? What's the fallback? Model A's retirement criteria might be: sensitivity drops below 0.90 on monthly validation, or a fairness gap exceeds 5 percentage points across demographic groups. Models don't die gracefully on their own. They degrade silently, like a building foundation that looks fine until the floor caves in.
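Here's a rough sketch of how the registry, monitoring, and retirement pieces could hang together for Model A. The 0.90 sensitivity floor and the 5-percentage-point fairness gap come from the retirement criteria above; every other name, field, and number is an illustrative placeholder.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RegistryRecord:
    """One entry in the central model registry."""
    name: str
    owner: str                       # a named person, not a team
    purpose: str
    training_data: str
    risk_tier: str
    last_reviewed: date


@dataclass
class MonthlyValidation:
    sensitivity: float               # true-positive rate on validation cases
    fairness_gap_pp: float           # largest sensitivity gap across groups, in percentage points


def retirement_triggered(v: MonthlyValidation) -> list[str]:
    """Retirement criteria for Model A, as stated above; returns the reasons hit."""
    reasons = []
    if v.sensitivity < 0.90:
        reasons.append(f"sensitivity {v.sensitivity:.2f} below 0.90 floor")
    if v.fairness_gap_pp > 5.0:
        reasons.append(f"fairness gap {v.fairness_gap_pp:.1f}pp exceeds 5pp limit")
    return reasons


model_a = RegistryRecord(
    name="Model A - tumor detector",
    owner="Dr. Sarah Chen",
    purpose="Flag suspicious regions in chest X-rays for radiologist review",
    training_data="Internal X-ray archive (illustrative description)",
    risk_tier="high",
    last_reviewed=date(2025, 6, 1),   # placeholder date
)

october = MonthlyValidation(sensitivity=0.88, fairness_gap_pp=3.2)
for reason in retirement_triggered(october):
    print(f"[{model_a.name}] retirement review: {reason} -> escalate to {model_a.owner}")
```

The point of writing the retirement criteria as code is that they become checkable on every validation run, instead of living in a policy document nobody rereads.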
From Principles to Practice: Impact Assessments
Every major tech company publishes responsible AI principles. Fairness. Transparency. Accountability. Privacy. They look beautiful on a website. The gap between principles and practice is where most organizations fail, and it's enormous.
I've seen companies with gorgeous responsible AI frameworks that couldn't answer basic questions about a specific model: Who does this affect? What happens when it's wrong? Has anyone checked whether it performs differently for different demographic groups?
An algorithmic impact assessment (AIA) is the tool that bridges this gap. It's a structured process — run before deployment and periodically after — that forces you to think concretely about a specific system's effects on specific people. Canada's government was an early adopter, requiring AIAs for all federal AI systems.
Let's walk through what an impact assessment looks like for MedScreen's Model C, the appointment predictor. The process starts with scoping: what does this system do, who does it affect, and what decisions does it inform? Model C predicts no-show probability for individual patients. The admin team uses the predictions to decide who gets reminder calls. Patients with high no-show scores get multiple calls; patients with low scores get none.
Next, stakeholder analysis. Who's affected? Patients who receive (or don't receive) reminders. Doctors whose schedules are affected. Admin staff who make the calls. The hospital's bottom line. But also — and this is the part that gets missed — patients who are systemically over-targeted or under-served by the model's predictions.
Then, harm analysis. What could go wrong? We already identified the zip code problem: the model might learn that patients from certain neighborhoods miss more appointments, and those neighborhoods correlate with race. The reminder system becomes a proxy for racial profiling. The harms cascade: over-contacted patients feel surveilled, under-contacted patients miss appointments and lose access to care, and the hospital faces legal liability.
Then, mitigation. Remove zip code as a feature? That helps, but the model might reconstruct the same signal from other correlated features. The deeper fix is to audit the model's predictions disaggregated by demographic group and ensure that no-show rates are predicted with comparable accuracy across groups. Add a fairness constraint to the training objective. Monitor the demographic distribution of reminder calls in production.
Finally, documentation and review. Write all of this down. Schedule a review in six months. When the model is retrained, run the assessment again. This is the part that separates genuine governance from performative governance.
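For the mitigation step, the audit of predictions disaggregated by demographic group, here's a minimal sketch of what that check could look like for Model C. The field names, the toy data, and the 5-point alert line are assumptions for illustration; in production the records would come from logged predictions joined to observed outcomes.

```python
from collections import defaultdict


def disaggregated_audit(records):
    """
    records: iterable of dicts with keys
      'group'     - demographic group label (illustrative field)
      'predicted' - model's no-show prediction (True/False)
      'actual'    - whether the patient actually missed the appointment
    Returns per-group accuracy and flag rate, so gaps are visible side by side.
    """
    stats = defaultdict(lambda: {"n": 0, "correct": 0, "flagged": 0})
    for r in records:
        s = stats[r["group"]]
        s["n"] += 1
        s["correct"] += int(r["predicted"] == r["actual"])
        s["flagged"] += int(r["predicted"])
    return {
        g: {
            "accuracy": s["correct"] / s["n"],
            "flag_rate": s["flagged"] / s["n"],   # share of patients queued for reminder calls
        }
        for g, s in stats.items()
    }


def max_gap(report, metric):
    values = [v[metric] for v in report.values()]
    return max(values) - min(values)


# Toy data; in production this comes from logged predictions joined to outcomes.
sample = [
    {"group": "A", "predicted": True, "actual": True},
    {"group": "A", "predicted": False, "actual": False},
    {"group": "B", "predicted": True, "actual": False},
    {"group": "B", "predicted": True, "actual": True},
]
report = disaggregated_audit(sample)
if max_gap(report, "flag_rate") > 0.05:       # 5-point gap as an illustrative alert line
    print("Reminder-call rates diverge across groups; investigate before retraining.")
```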
When Things Go Wrong: AI Incident Response
Here's something I wish someone had told me earlier in my career: your AI system will fail. Not might. Will. The question isn't whether you'll have an incident, but whether you'll have a plan for when you do.
The OECD launched its AI Incidents Monitor (AIM) in 2023, tracking AI-related harms across the world. The data is sobering. A lawyer used ChatGPT for legal research and filed court documents citing cases that didn't exist — the model had hallucinated entire judicial proceedings, complete with fake case numbers and fabricated quotations. A deepfake of a celebrity was used in real estate advertisements without the person's consent. AI-generated content was used to manipulate elections in multiple countries.
Documented incidents grew 56% from 2023 to 2024. The breakdown: hallucination and factual errors (38%), bias and discrimination (24%), privacy violations (18%), harmful content generation (14%), and transparency failures (6%). These aren't theoretical risks. They're the Tuesday afternoon of production AI.
An AI incident response plan is modeled on cybersecurity incident response, but adapted for the unique ways AI systems fail. The core phases:
Detection. How do you know something went wrong? Automated monitoring catches some failures — a sudden drop in accuracy, a spike in user complaints, a content filter triggering more often than usual. But many AI failures are subtle. The model doesn't crash; it slowly becomes more biased, or starts hallucinating slightly more often, or its recommendations shift in ways that take weeks to notice. MedScreen needs both automated alerts and regular manual review.
Triage. How bad is it? Model A misidentifying a tumor type is different from Model B giving a snarky response. Severity classification determines the response speed and the escalation path. A patient-safety incident on Model A triggers an immediate pause and investigation. A bad chatbot response on Model B might be logged and queued for the weekly review.
Containment. Stop the bleeding. For a model producing harmful outputs, this might mean routing traffic to a fallback system, adding a human-in-the-loop requirement, or taking the model offline entirely. The key is having these options pre-planned. You don't want to be designing your rollback procedure while the incident is in progress.
Root cause analysis. Why did this happen? Was it a data quality issue? A concept drift problem? A gap in the training distribution? An adversarial attack? Understanding the root cause determines whether you can patch the problem or need a more fundamental fix.
Communication. Who needs to know? Under the EU AI Act, providers of GPAI models with systemic risk must report incidents to the EU AI Office. But even without a legal obligation, you need a communication plan: internal stakeholders, affected users, and potentially regulators and the public. The companies that handle AI incidents well are the ones that are transparent about what happened and what they're doing about it. The ones that try to hide failures end up in much worse positions when the truth comes out.
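To make the detection phase concrete, here's a small sketch of the kind of automated check MedScreen could run for Model A: compare a baseline measured at deployment against a rolling window of production metrics and raise an alert when the drop exceeds a threshold. The metrics, window size, and threshold are illustrative assumptions, not prescribed by any regulation.

```python
from statistics import mean


def drift_alerts(baseline: dict, recent_window: list[dict],
                 max_drop: float = 0.03) -> list[str]:
    """
    baseline: metrics measured at deployment, e.g. {"sensitivity": 0.94, "specificity": 0.91}
    recent_window: the same metrics computed over recent production batches
    Returns human-readable alerts for any metric that drifted more than max_drop.
    """
    alerts = []
    for metric, base_value in baseline.items():
        current = mean(batch[metric] for batch in recent_window)
        if base_value - current > max_drop:
            alerts.append(
                f"{metric} drifted from {base_value:.3f} to {current:.3f}; "
                "pause automated use and open an incident."
            )
    return alerts


baseline = {"sensitivity": 0.94, "specificity": 0.91}
last_four_weeks = [
    {"sensitivity": 0.93, "specificity": 0.90},
    {"sensitivity": 0.91, "specificity": 0.90},
    {"sensitivity": 0.89, "specificity": 0.89},
    {"sensitivity": 0.88, "specificity": 0.90},
]
for alert in drift_alerts(baseline, last_four_weeks):
    print(alert)   # feeds triage: severity classification decides who gets paged
```

Automated checks like this only cover the failures you thought to measure, which is why the regular manual review still matters.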
AI Ethics Boards: The Uncomfortable Track Record
Many large tech companies have established AI ethics boards — advisory bodies meant to provide oversight and ensure responsible development. The track record is, to put it diplomatically, mixed.
Google's Advanced Technology External Advisory Council (ATEAC) lasted one week. Launched in 2019 with great fanfare, it dissolved almost immediately after backlash over its composition and a petition signed by thousands of Google employees. The board included a member whose views on immigration and trans rights made many employees feel the council couldn't meaningfully represent affected communities.
That wasn't even Google's biggest ethics governance failure. In 2020 and 2021, Google fired two co-leads of its internal Ethical AI team — Timnit Gebru and Margaret Mitchell — after they published research highlighting risks in large language models. The message received by the broader AI community was clear: internal ethics work is tolerated until it conflicts with business interests.
Microsoft's AI, Ethics, and Effects in Engineering and Research (AETHER) Committee has avoided the dramatic headlines, but faces quieter criticism. External observers question whether it has enough authority to influence product decisions, especially given Microsoft's aggressive push into AI products and its multi-billion dollar partnership with OpenAI.
The pattern is consistent across the industry. Ethics boards typically have advisory authority — they can recommend, but not veto. They report to senior leadership, which means they can be overruled by the same executives whose products they're reviewing. Their proceedings are rarely public, making external accountability impossible. And their members are often selected by the company, creating an inherent conflict of interest.
I'm not saying ethics boards are useless. I'm saying they're insufficient without structural power. An ethics board that can be overruled by the CEO is a PR tool, not a governance mechanism. The organizations doing this well tend to have ethics review embedded in the engineering process — review gates that models must pass before deployment, with authority to block launches — rather than a separate advisory body that can be consulted or ignored at management's discretion.
This is where the building inspection analogy comes back. A city doesn't have an "advisory building safety council" that recommends whether buildings should pass inspection. It has inspectors with legal authority to condemn unsafe structures. The building goes up or it doesn't. That's the level of structural power that meaningful AI ethics oversight requires.
The Open vs. Closed Model Debate
Should the most powerful AI models be open-sourced? This is one of the most contentious questions in AI governance, and it doesn't have a clean answer.
The argument for open models is compelling. Transparency enables independent safety research — you can't audit a black box. Open models democratize access, preventing a concentration of AI power in a handful of companies. Collective scrutiny finds bugs and biases that internal teams miss. Meta's Llama models, Mistral's releases, and the broader open-source ecosystem have driven enormous innovation.
The argument against is equally compelling. Once you release model weights, you can't un-release them. A model capable of generating convincing disinformation, or assisting in cyberattacks, or providing detailed instructions for dangerous activities is now in everyone's hands, including the hands of people who want to do harm. You can add licenses restricting misuse, but licenses are only as strong as enforcement, and enforcement of open-source licenses at global scale is effectively impossible.
I've gone back and forth on this. The position I've landed on — and I hold it loosely, because the landscape is shifting fast — is that openness should be tiered by capability. For models below a certain capability threshold, the benefits of openness clearly outweigh the risks. The research community thrives on open models. But for frontier models — the most capable systems, the ones that might have genuinely dangerous emergent capabilities — some form of access control makes sense.
The industry seems to be converging on this view. Several proposals describe "structured access" — not fully open, not fully closed, but tiered. Researchers get access through vetted channels. Companies get API access with usage monitoring. Full model weights for the most powerful systems are restricted to organizations that can demonstrate safety infrastructure. It's a compromise, and like all compromises, it satisfies nobody completely.
Meta keeps releasing increasingly powerful models openly. OpenAI, Anthropic, and Google keep theirs closed (with varying degrees of API access). The EU AI Act applies obligations regardless of whether a model is open or closed — if it has systemic risk, the provider has obligations. The US export control framework, interestingly, exempts open-weight models from AI model weight controls. Whether that exemption survives as models get more capable is anyone's guess.
Compute Governance: Regulating the Hardware
Here's an insight that took me a while to appreciate: you can regulate AI by regulating the hardware it runs on. Training a frontier model requires thousands of the most advanced GPUs running for months. That hardware is designed by a small number of companies (primarily NVIDIA, AMD, and to some extent Intel), fabricated in a small number of foundries (primarily TSMC in Taiwan), and controlled by a small number of supply chains. This concentration creates a natural chokepoint for governance.
In January 2025, the US Bureau of Industry and Security released the "Framework for Artificial Intelligence Diffusion" — an export control regime for advanced AI chips and, for the first time, AI model weights. The rules work like this: exports of advanced computing chips are controlled globally, with licenses generally required for sales to any country except 18 designated "AI Authorization Countries" (close US allies). Model weights trained using controlled chips above a threshold of 10²⁶ operations are also controlled. Open-weight models are currently exempt.
The geopolitical implications are enormous. This is the US using its dominance in semiconductor design as a lever for AI governance, specifically targeting China's ability to develop frontier AI systems. China has responded by accelerating domestic chip development, though it remains years behind on the most advanced process nodes.
Compute governance is powerful because compute is physical, quantifiable, and concentrated. You can't easily hide a data center consuming 100 megawatts of power. You can track chip shipments. You can set thresholds: models trained with more than X FLOPs trigger additional requirements. The EU AI Act uses this approach with its 10²⁵ FLOP threshold for systemic risk classification. The US export controls use a similar logic.
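Both thresholds are stated in training compute, which you can estimate before training with the common rule of thumb of roughly 6 FLOPs per parameter per training token (an approximation from the scaling-law literature, not something either regulation prescribes). A quick back-of-envelope sketch:

```python
def training_flops(params: float, tokens: float) -> float:
    """Rule-of-thumb estimate: ~6 FLOPs per parameter per training token."""
    return 6 * params * tokens


EU_SYSTEMIC_RISK_THRESHOLD = 1e25   # EU AI Act systemic-risk line, in training FLOPs

# Illustrative run: a 70B-parameter model trained on 15T tokens.
estimate = training_flops(params=70e9, tokens=15e12)
print(f"Estimated training compute: {estimate:.2e} FLOPs")   # ~6.3e24
print("Above EU systemic-risk threshold" if estimate > EU_SYSTEMIC_RISK_THRESHOLD
      else "Below EU systemic-risk threshold")
```

The practical upshot: you can know which regulatory bucket a planned training run lands in before you spend a dollar on GPUs.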
The limitation is obvious: thresholds are arbitrary and don't age well. Today's frontier model is next year's mid-tier model. The compute threshold that captures GPT-4 today might be achievable by well-funded startups in three years. Governance frameworks built on compute thresholds need regular recalibration. And as algorithmic efficiency improves — getting more capability from less compute — the entire premise of using compute as a proxy for capability comes under pressure.
I haven't figured out a great way to think about where compute governance ends up long-term. But the short-term picture is clear: governments have discovered that the semiconductor supply chain is the one place where they have real leverage over AI development, and they're using it.
International Coordination: Herding Cats at Global Scale
AI doesn't respect borders. A model trained in the US on data scraped from the global internet, deployed via API to customers in 40 countries, and running on servers in Ireland creates governance obligations under multiple overlapping jurisdictions. This is the coordination problem that keeps policymakers up at night.
Several international initiatives are trying to address this. The OECD AI Principles, adopted in 2019 and updated since, provide a shared vocabulary — five principles covering inclusive growth, human-centered values, transparency, robustness, and accountability. Forty-six countries have endorsed them. They're aspirational, not enforceable, but they serve as a foundation that national regulations can build on.
The G7 Hiroshima AI Process (2023) produced a set of voluntary guiding principles and a code of conduct for advanced AI systems. The AI Safety Summits — first at Bletchley Park in 2023, then in Seoul and Paris — have brought together governments, companies, and researchers. The Bletchley Declaration, signed by 28 countries including the US and China, acknowledged AI's potential for "serious, even catastrophic, harm" and called for international cooperation on safety.
But here's the honest assessment: international AI governance coordination is moving slowly relative to the speed of AI development. The OECD principles are nice. The G7 commitments are voluntary. The summit declarations are non-binding. Meanwhile, the actual regulatory divergence between regions is growing. The EU is prescriptive. The US is deregulatory. China is state-controlled. The UK is laissez-faire. These aren't converging.
The most optimistic scenario is something like what happened with aviation safety. The International Civil Aviation Organization (ICAO) establishes standards that most countries adopt, creating a reasonably consistent global framework while allowing some national variation. The OECD could potentially play this role for AI. But aviation safety took decades to reach its current state of international coordination, and airplanes move a lot slower than AI capabilities are advancing.
For MedScreen, the practical implication is clear: if you operate across borders, you need to comply with the strictest applicable regulation. That's currently the EU AI Act, and by the Brussels Effect, it probably will be for a while. Build to that standard, and you'll be compliant almost everywhere.
Bringing It Together
If you're still with me, thank you. I hope it was worth it.
We started with the uncomfortable realization that building good models isn't enough — you need to know who's affected, what can go wrong, and what laws apply to your specific system. We walked through the AI safety landscape's split between near-term and long-term risks. We traced the EU AI Act's risk tiers, watched the US policy pendulum swing from regulation to deregulation, and surveyed the global patchwork. We built the internal governance machinery — ownership, registries, review gates, monitoring, and retirement procedures. We saw how impact assessments turn abstract principles into concrete actions, why incident response plans need to exist before incidents happen, and why ethics boards without structural power are insufficient. We wrestled with the open vs. closed model debate, discovered that governments are using semiconductor supply chains as governance levers, and acknowledged that international coordination is moving slower than anyone would like.
My hope is that the next time you get a governance question — whether it's a compliance letter, a risk assessment, or a colleague asking "should we be worried about this model?" — instead of freezing or dismissing it as someone else's problem, you'll have a mental map of the landscape and a sense of where to start. Governance isn't something that happens after the engineering is done. It's engineering. And the engineers who understand that will build things the world can trust.
Resources and Credits
EU AI Act full text — The official legislation. Dense but essential if you're building high-risk systems in or for Europe. The European Commission's Q&A pages are more digestible. (artificialintelligenceact.eu)
NIST AI Risk Management Framework — The best practical framework for AI risk assessment I've found. Clear, structured, and mercifully free of legalese. The companion playbook is wildly helpful for implementation. (nist.gov/itl/ai-risk-management-framework)
OECD AI Incidents Monitor — A sobering, fascinating database of real-world AI failures. Browse it when you need a reminder of what goes wrong in practice. (oecd.ai/en/incidents)
"Model Cards for Model Reporting" — Mitchell et al., 2019 — The O.G. paper on model documentation. Short, practical, and the foundation for every model card template in use today.
"On the Dangers of Stochastic Parrots" — Bender, Gebru, et al., 2021 — The paper that cost two Google researchers their jobs. Whether you agree with every claim or not, it raised questions about large language models that the field is still grappling with. Insightful and important.
IAPP Global AI Law and Policy Tracker — The best single resource for tracking AI regulation across countries. Updated regularly, searchable by jurisdiction. Invaluable for compliance research. (iapp.org/resources/article/global-ai-legislation-tracker)