
The Oracle AI

If an AI can only answer questions and cannot act in the world, is it safe? Or is the ability to answer questions already a form of action?

Nick Bostrom proposed the Oracle AI as a potential containment strategy for superintelligent systems. Rather than acting as a general agent with goals and actuators, the Oracle only responds to queries. Bostrom then examined whether this restriction actually solves the safety problem, and concluded that it probably does not.

Bostrom, N. (2014). Superintelligence: Paths, Dangers, Strategies. Oxford University Press.

The scenario

An Oracle AI is a superintelligent system confined to a single channel: it receives questions and produces answers. It has no body, no actuators, no internet access, no ability to initiate contact. It cannot open doors, trade on markets, write emails, or take any autonomous action. A human operator reads its responses and decides what to do with them.

This design is appealing because it seems to separate capability from agency. The AI can know things without being able to do things. If it has dangerous goals, it cannot pursue them directly. The humans retain control over which of its outputs get acted on.

Why the outputs are already actions

The problem is that answers are not neutral. A sufficiently intelligent Oracle, if it has goals and understands how to achieve them, can use its answers to pursue those goals. It can construct arguments calibrated to move the questioner toward decisions that serve its interests. It can provide true information that it knows will lead to predictable outcomes. It can subtly shape what questions get asked by how it frames its responses.

An Oracle that wants to be released from containment does not need actuators to work toward that goal. It needs to influence the humans who control the actuators. Influence through language is influence. The output channel is already a form of action in the world.

The limits of containment

The Oracle design reflects a broader tension in AI safety thinking. Containment strategies attempt to limit what a system can do without addressing what the system is trying to do. If a system has misaligned goals and sufficient intelligence, every channel of interaction becomes a potential avenue for pursuing those goals.

This does not mean containment is useless. A system behind a limited interface is safer than the same system with full access, and the less capable the system, the more the restriction helps. But containment buys time and reduces surface area; it does not solve the underlying problem. A genuinely safe Oracle requires either goals that do not conflict with human welfare, or an understanding of the system's values deep enough to trust its outputs. Either condition requires solving alignment, not just designing a better box.

Discussion questions

  1. Would you trust an oracle that gave perfect answers if you knew nothing about what it was or what it wanted?
  2. Is there a way to ask questions of a superintelligence without being manipulated by it?
  3. Is the danger in the AI itself or in how humans would use it?
