Remember the early days of generative AI? Just a few years ago, when the first powerful models were released, some labs restricted access out of fear that they might be misused — a caution that, at the time, seemed almost quaint. The models were novel but often flaky, their outputs grainy, and their real-world applications limited. Today that caution seems prophetic. The maturity and capability of these systems have progressed at breakneck speed, moving the conversation from a theoretical debate about future risks to an urgent, practical question: How do we maintain security control over these systems?
While this question touches on age-old debates about powerful technology, the stakes are entirely new. We are at a nexus point of unknown harms and immense possibilities, much like when unshielded X-ray machines were used to size shoes, blind to the long-term risks. Much of the industry is consumed with what AI can do, but that focus on capability overlooks the more foundational challenge — establishing clear and enforceable rules of security management for these autonomous systems.1
For decades, the ethos of Silicon Valley was (and, to some extent, still is): “Move fast and break things.” That model, for all its generative power, is untenable when dealing with a technology that can autonomously generate novel attacks. The potential for widespread, irreversible harm demands a new philosophy, one grounded in deliberate, thoughtful control.
Defining the Rules of Engagement
The only way to safely deploy powerful, cyber-capable AI is to begin with a new social contract, one I call the “AI Imperative.” This is a clear technical and operational compass for an AI system’s purpose, defining its explicit boundaries and prohibited uses. It requires rigorous, upfront offensive and defensive capability evaluations to understand a model’s potential for weaponization before it is ever released.
This imperative must be the foundation for evaluating the entire AI lifecycle. It must inform the integrity of the AI supply chain — the digital concrete and steel of our systems. It must be the benchmark against which internal and external expert red teams probe the system for hidden vulnerabilities, particularly for systems deemed critical infrastructure. And it must be the standard against which we conduct independent validation before a single line of code is deployed.
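To make that discipline concrete, here is a minimal sketch in Python of what encoding such an imperative might look like. Everything in it (the AIImperative and EvaluationReport classes, the field names, the thresholds) is hypothetical and illustrative, not a published standard or any vendor's API; the point is only that boundaries and evaluation ceilings can be written down as data and checked mechanically before release.

```python
from dataclasses import dataclass, field


@dataclass
class AIImperative:
    """Hypothetical, machine-readable statement of a system's purpose and limits."""
    stated_purpose: str
    prohibited_uses: set[str]
    # Maximum tolerated scores from offensive-capability evaluations (0.0 - 1.0).
    offensive_capability_ceilings: dict[str, float] = field(default_factory=dict)


@dataclass
class EvaluationReport:
    """Results of pre-release offensive/defensive capability evaluations."""
    offensive_scores: dict[str, float]   # e.g. {"exploit_generation": 0.42}
    open_red_team_findings: int          # unresolved red-team findings
    supply_chain_attested: bool          # provenance/SBOM checks passed


def release_gate(imperative: AIImperative, report: EvaluationReport) -> list[str]:
    """Return the reasons a release should be blocked; an empty list means pass."""
    blockers: list[str] = []
    for capability, ceiling in imperative.offensive_capability_ceilings.items():
        score = report.offensive_scores.get(capability)
        if score is None:
            blockers.append(f"capability '{capability}' was never evaluated")
        elif score > ceiling:
            blockers.append(f"'{capability}' scored {score:.2f}, above the ceiling of {ceiling:.2f}")
    if report.open_red_team_findings > 0:
        blockers.append(f"{report.open_red_team_findings} red-team findings unresolved")
    if not report.supply_chain_attested:
        blockers.append("supply-chain provenance not attested")
    return blockers


# Example: a model whose exploit-generation score exceeds its declared ceiling
# is blocked before deployment, regardless of how capable it is otherwise.
imperative = AIImperative(
    stated_purpose="defensive security triage",
    prohibited_uses={"autonomous exploitation"},
    offensive_capability_ceilings={"exploit_generation": 0.30},
)
report = EvaluationReport(
    offensive_scores={"exploit_generation": 0.42},
    open_red_team_findings=0,
    supply_chain_attested=True,
)
print(release_gate(imperative, report))
```

The specific fields matter far less than the habit they represent: boundaries and ceilings are declared before release, and deployment is blocked automatically when the evidence does not meet them.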
Non-Negotiable: An Architecture of Control
Yet these principles are meaningless unless they are enforced through concrete technical measures and controls. The second, and most critical, component of this framework is a robust architecture of control, built on the non-negotiable ability to revoke an AI system’s access the moment it acts outside its established bounds.
This capability must be architected into the fabric of our systems. An architecture of control requires a steadfast commitment to transparency, with access to the most powerful capabilities tightly controlled. It demands new standards of authentication and attestation that can verify every interaction across a complex ecosystem of agents. And it necessitates human-in-the-loop governance for high-stakes decisions, ensuring that ultimate accountability always rests with people, not an algorithm.
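As an illustration of how that non-negotiable revocation and human-in-the-loop gating might be wired in, here is a minimal sketch in Python. The ControlPlane class, its action names and the attestation flag are assumptions invented for this example, not a real product or protocol; a production system would enforce the same checks at the credential and network layer rather than in application code.

```python
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    REQUIRE_HUMAN = "require_human"
    REVOKE = "revoke"


class ControlPlane:
    """Hypothetical enforcement point between an AI agent and the systems it touches.
    Every action is checked against declared bounds; nothing is implicit."""

    def __init__(self, allowed_actions: set[str], high_stakes_actions: set[str]):
        self.allowed_actions = allowed_actions
        self.high_stakes_actions = high_stakes_actions
        self.revoked_agents: set[str] = set()

    def evaluate(self, agent_id: str, action: str, attested: bool) -> Verdict:
        # Revocation is sticky: once an agent leaves its bounds, it stays out
        # until a human deliberately restores its access.
        if agent_id in self.revoked_agents:
            return Verdict.REVOKE
        # A failed attestation or an action outside the declared bounds
        # triggers immediate revocation of the agent's access.
        if not attested or action not in self.allowed_actions:
            self.revoked_agents.add(agent_id)
            return Verdict.REVOKE
        # High-stakes actions always escalate to a human, keeping ultimate
        # accountability with people rather than the algorithm.
        if action in self.high_stakes_actions:
            return Verdict.REQUIRE_HUMAN
        return Verdict.ALLOW


# Example: a triage agent may read logs and open tickets; isolating a host is
# within bounds but always requires human sign-off; anything else revokes access.
plane = ControlPlane(
    allowed_actions={"read_logs", "open_ticket", "isolate_host"},
    high_stakes_actions={"isolate_host"},
)
print(plane.evaluate("triage-agent-1", "read_logs", attested=True))      # Verdict.ALLOW
print(plane.evaluate("triage-agent-1", "isolate_host", attested=True))   # Verdict.REQUIRE_HUMAN
print(plane.evaluate("triage-agent-1", "delete_backups", attested=True)) # Verdict.REVOKE
print(plane.evaluate("triage-agent-1", "read_logs", attested=True))      # Verdict.REVOKE (sticky)
```

The design choice that matters is that the gate sits outside the model: revocation is enforced by the surrounding architecture, not by trusting the model to behave.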
A Call for a New Standard of Control
This challenge transcends any single organization.2 While society must debate the ethical “red lines” — for instance, whether AI should ever autonomously manipulate critical infrastructure — our imperative as technologists is different. It is to pioneer the technical measures and controls that make enforcement of any rule possible. This requires a new, more radical form of collaboration to collectively build the foundational architecture for AI safety.
This radical collaboration is necessary because AI security controls are a shared cost center: consumers and enterprises buy products for their features, not for their safety constraints. Few buyers choose one car over another solely because of its seat belts; yet seat belts remain a non-negotiable element of auto safety standards. The complexity of building these “AI seat belts” makes them a nontrivial engineering challenge, and the universal risk of catastrophic failure means no single entity can or should bear the burden alone. That is precisely why this effort must be shared, making collective defense both an economic and a security imperative.
The wisdom of the controls we place on AI — not the power of the AI we build — will define the legacy we create. This work begins with a concrete first step — a shared commitment to a common framework for assessing a system’s power, and to the technical levers that moderate that power once deployed. This is hard, necessary work, but it is also what makes a safe, AI-enabled future possible.
1 McGregor, Sean, and Kathrin Grosse, “When it comes to AI incidents, safety and security are not the same,” OECD.AI, August 25, 2025.
2 U.S. AI Safety Institute, SP 800-53 Control Overlays for Securing AI Systems Concept Paper, NIST, August 13, 2025.