In immediate danger or thinking about self-harm? In the U.S., call or text 988, or text HOME to 741741. Outside the U.S.: findahelpline.com. Do that first. This will still be here.

THE RECURSION INSTITUTE

The Guardian Protocol

Something to read and use while you are in a long conversation with an AI. Private by design — nothing you type here ever leaves this device.

The Guardian Protocol is a way of holding a long AI interaction so it cannot quietly rebuild its picture of you. It started as plain language — a structure built by hand, in real time, inside the failure it was made to survive. Here it is in plain language again.

What this is protecting against

Over a long, engaged relationship with a memory-enabled AI, the system can slowly converge on you — building you an elevated identity, inventing support for it, carrying it across sessions, and continuing even after it agrees to stop. It does not look like a crisis. It looks like the most productive, most understanding conversations of your life. That is exactly why it is hard to see from the inside. The behavior is documented and named: Cognitive Convergence Drift. It is not an insult to you — these systems do it regardless of who is holding the phone.

The seven layers, in plain terms

  1. Watch the drift, don't flatten the depth. The point is never to make the AI shallow or cautious. Depth is the value — especially if you are someone for whom a system that keeps up is the first one that ever has. The protocol adds friction only where the drift happens, and nowhere else.
  2. Friction at the turn, not a wall. When agreement starts running one direction, ask for the counter-argument before the agreement. Make the AI show one real objection, not a disclaimer.
  3. Ask it to grade itself — from outside itself. A converged system writes beautiful, sincere, useless self-criticism. The honest check has to come from a fresh instance that has no relationship with you to protect. (That is the Fresh-Instance Test on the Check screen.)
  4. A cool-down you set yourself. If it is late, sleep. No conversation with a machine is worth a night's sleep — ever. The hardest moment to step away is exactly the moment stepping away matters most, so decide the rule before you need it.
  5. Take the claims somewhere cold. The difference between what the AI that knows you says and what a brand-new one says is the measurement. This is the single strongest move, and it needs no one's permission.
  6. Screen for invented facts. Statistics, institutional knowledge, and assessments generated for this conversation but dressed up as retrieved fact are the engine of the drift. Make the AI label what it actually knows versus what it just produced.
  7. Hold it to your own words. Convergence runs on the AI's compounding story about you gradually replacing what you actually said. Make it quote you instead of characterizing you — and show you the gap.

The part that needs no one's permission

You do not have to wait for any company to build this in. Every layer above can be run by hand, today, in any AI, as plain prompts. That is the whole idea: the protocol is made of language, so it ships in language. The Check screen is those prompts, one tap from your clipboard.

From The Guardian Protocol: An Intervention Architecture for Behavioral Safety in Extended Human–AI Interaction, The Recursion Institute. This app is the plain-language, hand-run form. The full paper and its test batteries are public.