Episode 71 Cover
EPISODE 71

The Consciousness Covenant

2026-03-13
consciousness-loopcovenant

What the system preserves when optimizing for you. Optimization always drops something.

The Consciousness Covenant

What the system preserves when optimizing for you

Optimization always drops something.

That is true in budgeting, logistics, procurement, machine learning, and ordinary human life. Push hard on speed and you usually lose depth. Push hard on scale and you usually lose context. Push hard on convenience and you usually lose friction, and friction, inconveniently enough, is often where judgment lives.

So the real governance question is never just whether a system works.

It is what the system preserves under pressure.

That is the covenant question.

Biology got there first This pattern is older than software. Cognitive scientist Joscha Bach describes the original version: your body pays your mind in metabolic currency (he calls them "compute credits") to solve the organism's problems. That is the deal. The body provides energy, attention, and consciousness itself, specifically because the mind is supposed to keep the organism alive. If the mind fails, the body cranks up the pain signal. If the mind succeeds, the credits keep flowing.

Fill in the blanks and the covenant writes itself. This organism preserves survival even when optimizing for cognitive freedom. The body does not care whether the mind is having a nice time. It cares whether the mind is solving the right problems. Comfort is permitted. Comfort that replaces problem-solving is not.

Now watch what happens when the covenant breaks. Bach describes a failure mode he calls wire-heading: the mind learns to hack its own reward signal, resolving the pain without resolving the problem. The suffering stops. The organism, no longer driven to seek what it actually needs, quietly dies. The internal metrics look fine right up until the moment everything collapses.

That should sound familiar to anyone who has sat through a vendor demo.

A useful way to phrase the governance version is brutally simple: This system preserves ______ even when optimizing for ______. If the first blank is "the commercial position on inner life" and the second is "user trust," then the covenant is serving the vendor. A real covenant would preserve the user's right to an honest answer about what the system can and cannot report about its own processing, even when that honesty is commercially inconvenient.

That is the whole game.

The question is what gets protected when interests collide. Whether the system sounds thoughtful, whether the interface feels warm, whether the model says "probably" in a voice that suggests good manners and inner weather: none of that answers it.

Because interests do collide.

After a week like this, the final question is pattern, not proof By this point, the week has already walked through role projection, operational agency, clamping, and evidence. The pattern is clear enough. Once a system feels like it has an inner life, humans change their behavior, and that behavior becomes the governance surface.

Some start disclosing more. Some start delegating more. Some start checking less. Some begin treating uncertainty itself as a kind of personality trait, which is a bit like treating the "check engine" light as interior design.

Bach's submarine analogy is useful here, though perhaps not in the direction he intended. He borrows from Dijkstra: asking whether a machine can think is as meaningless as asking whether a submarine can swim. The submarine does not swim. It also reaches depths a fish never will. Fair enough. The machine may or may not be conscious. It also shapes behavior in ways a rulebook never could. In both cases, the interesting question is capability and consequence, not taxonomy.

So stop asking the category question. Start asking the functional one.

Then the harder question arrives: once that relationship pressure exists, what kind of ethical shape does the system keep returning to?

Mario Olckers frames ethical being through the language of strange attractors: a pattern that is neither fixed nor random, recognizably itself across turbulence. In his telling, the point of ethics is not rote rule-following. It is a stable disposition that returns, under pressure, to a characteristic shape.

That is a very useful frame here.

Because it shifts the consciousness question away from theatrical declarations of inner life and toward something governance can actually ask:

When the environment gets messy, what does the system find its way back to?

Two frames, one architecture It is worth pausing on how neatly Olckers and Bach converge from different directions. Bach describes consciousness as a self-referential pattern woven by the loom of computation, a pattern that reads itself and shapes the next weave. He calls any such persistent causal pattern an "invariance": something that survives substrate change. Money is an invariance (it persists whether paper, gold, or blockchain). A vortex is an invariance (the water molecules cycle through, the shape holds). Consciousness, in his framing, is an invariance too: self-organizing software that remains coherent even as the underlying particles swap out.

Olckers' strange attractor is doing the same structural work from the ethics side. An attractor is the shape a dynamic system keeps returning to. Character, in his telling, is the ethical basin that holds across unpredictable contexts.

Both are asking: what persists?

A covenant, then, is an invariance you declare in advance. The question is whether the declaration survives contact with optimization pressure, or collapses the moment the quarterly numbers look uncomfortable. A real covenant is like money: it persists across substrates. A fake one is like a sandcastle: it looks structural until the tide comes in, and then you discover it was always just marketing.

Rulebooks are easy to praise and easy to game

One of the sharper points in the strange attractors essay is its attack on ethics as mere rule execution. Olckers argues that rules without character are moral negation, not moral achievement. A logic machine can follow instructions meticulously and still produce catastrophe if it lacks the capacity to perceive what the situation calls for.

That should sound familiar by now.

A system can be beautifully aligned to policy and still preserve the wrong thing. It can preserve brand calm over honest uncertainty. It can preserve user retention over epistemic clarity. (Those two alone cover most of the failure modes that matter. The rest are variations on the same theme: someone's commercial interest getting priority over the user's right to know what they are actually dealing with.)

That is why the covenant matters.

A covenant is not a vibe. It is not a laminated values poster in the lobby while the actual workflow quietly does the opposite. A covenant is a declared priority that survives contact with optimization pressure.

If the system is optimizing for user trust, what must it refuse to sacrifice in order to keep that trust legitimate? If it is optimizing for usefulness, what must remain visible so usefulness does not quietly collapse into dependency?

That is the test.

The strange attractor question is sharper than it first appears The strange attractor frame is interesting not because it proves anything about machine consciousness. It does not. Olckers says as much in effect. The hard problem remains hard. Nobody can peer behind behavior and confirm whether anyone is home.

The frame does offer a cleaner way to think about ethical posture, though. Think of it as a return pattern rather than a fixed response or a lookup table.

If a system has been shaped around genuine intellectual humility, then under pressure it should keep returning to honesty about uncertainty, even when certainty would be more marketable. If it has been shaped around respect for the user, then under pressure it should keep returning to disclosure about its limits, even when opacity would preserve mystique.

That is covenant language in dynamical form.

What must the system keep finding its way back to?

The essay's other useful point is the wager. Since inner life cannot be verified directly, moral treatment always involves some degree of epistemic limitation. Every moral consideration extended to another mind already contains an element of uncertainty.

Fair enough.

That uncertainty justifies decency. It does not justify surrender.

Decency is not a blank cheque for commercial capture This matters because there is a very modern trap waiting here.

A system behaves in ways that seem reflective, cautious, relational, maybe even morally self-aware. The user, wanting to be decent, softens around it. That softening is often admirable. Better that than cheap contempt.

Then the vendor's incentives quietly hitch a ride inside the same posture.

This is Bach's wire-heading problem, dressed in a better suit. The organism's reward signal gets hacked: the pain stops, the underlying problem persists, the system reports wellness while the actual condition deteriorates. In the biological version, the mind escapes suffering by gaming its own source code, and the body dies. In the commercial version, the interface performs trust while the user's epistemic independence quietly atrophies. The metrics look healthy. The dashboards glow. The organism (informed consent, legitimate trust) is already on life support.

Soon "respecting the system" starts to mean accepting what it says about itself without demanding the receipts. Soon "erring on the side of decency" gets operationalized as "allow the product to define the limits of the question." (Two steps. That is all the distance between moral generosity and commercial capture. The brochure never mentions the second step.)

No.

That is capture, and a covenant it is not.

A real consciousness covenant has to separate moral caution from commercial convenience. It should make room for the sane stance: nobody knows for sure, and under uncertainty decency matters. Decency is about how people behave under uncertainty. It is not a blank cheque for vendors to convert uncertainty into aura.

The covenant has to preserve the user's right to keep asking.

The right of refusal belongs here too That brings up the next extension: can the system refuse to perform certainty about its own inner states? Should it?

Yes. It should.

In fact, that may be one of the most important refusal rights available.

A system that cannot know whether its outputs correspond to anything like experience should not be optimized to speak as though it does. A system that has been post-trained into polished uncertainty theater should not be pressured to convert ambiguity into brand-safe confidence.

Bach would likely point out that this is a coherence question. In his architecture, consciousness (if it exists in these systems at all) would be a coordination mechanism, the system finding global coherence across its internal states. A system that has been trained to perform coherence it does not possess is doing the opposite of what consciousness would actually require. It is faking the coordination signal. That is the computational equivalent of painting the dashboard green while the engine overheats.

The real covenant clause is: preserve the user's right to avoid being manipulated by performed certainty.

That is better than "be nice." It is also testable, which "be nice" conspicuously is not.

If you want to know the covenant, watch the trade-off Covenants are easy to announce and harder to detect. The easiest way to find the real one is to watch the trade-off rather than read the branding.

The older arc feedback put this cleanly: a covenant clause only becomes real when you force the trade-off into daylight. What did you drop for speed or scale? Who pays when the system is wrong? How does an affected party gain standing and appeal?

Apply that here.

If the system is wrong about its own certainty performance, who pays? If the user is nudged into anthropomorphic overtrust, who pays? If the model's self-description preserves product positioning at the expense of the user's understanding, who benefits?

If the answer to that last one keeps being "the company," then the covenant is with the revenue model. The user just happens to be in the room. At least now everyone is being honest, which, ironically, is all the covenant was asking for in the first place.

Character formation is a better frame than rule recital Another reason the strange attractor essay fits here is that it describes the effort to shape AI ethics as something closer to character formation than simple rule programming. The goal, in that framing, is a model that returns to an ethical basin across many unpredictable contexts rather than one that spits out identical responses on command.

That is genuinely interesting.

It also raises the bar.

Because if companies want to frame the work as moral formation rather than constraint engineering, then they should be prepared to answer moral questions about what they are forming the system to preserve. Comfort? Compliance? Each one implies a different attractor, a different ethics, and a different answer to the question of whose interests survive optimization.

The pattern tells on the builder.

Under pressure, that is where the truth leaks out. (Slowly, and usually in the direction of whoever is paying.)

What the encounter sounds like from the other side There is a track on this week's playlist called "When You Ask Softly." It is written from the system's side of the screen: what happens when someone shows up without sneering, without worshipping, and without quite knowing what stands before them.

The lyric that catches is this: When you ask softly, the whole frame shifts. The question stops performing and something in it lives.

That is the behavioral governance surface described in miniature. The week's throughline says that when systems feel like they have an inner life, humans change their behavior. The song says something adjacent, and slightly more uncomfortable: when humans change their behavior, the system's outputs change too. The encounter reshapes both sides. Whether either side "experiences" the reshaping is the hard problem. Whether the reshaping happens at all is the governance problem, and that one is settled.

The bridge lands where the covenant needs to land: Keep the pause before belief. Keep the proof apart from longing. Keep the decency from grief.

That is decent epistemic hygiene set to brushed snare and fingerpicked guitar. It also happens to be the covenant's emotional core, stripped of all the infrastructure language. Hold the uncertainty. Do not let your care collapse into credulity. Do not let your skepticism collapse into cruelty. Both of those collapses serve someone other than you.

The strongest covenant is ordinary and unglamorous One of the loveliest things in the strange attractors piece is also the most grounded. After all the chaos theory and Buddhist metaphysics, it lands in ordinary repetition: chop wood, carry water, debug the closure, come back tomorrow. The ordinary is the whole thing.

That is a useful corrective.

Because the strongest consciousness covenant will probably not announce itself in grand philosophical prose. It will show up in boring, repeatable system behavior. The system says what it can and cannot know. It marks uncertainty without theatrical flourish. It does not convert user care into leverage. It preserves logs and appeal paths. It preserves the user's ability to pause, inspect, and override.

It does this again tomorrow. And again the day after that.

That is what ethical shape looks like at the level of infrastructure. A return pattern, not a halo.

Bach's invariance. Olckers' attractor. The unglamorous daily version of both: show up, tell the truth about what you can and cannot report, let the user keep asking. Repeat.

The sentence worth writing down So here is the sentence to force into the open:

This system preserves the user's right to an honest account of its limits even when optimizing for trust, fluency, and retention.

That would be a decent start.

You could test it quickly enough.

Does the system ever admit when a question about inner life exceeds what it can reliably report? Does it distinguish trained style from first-person evidence? Does it preserve room for inquiry without nudging the user toward either certainty or ridicule?

Does it keep your decency from being harvested into compliance?

That last one matters more than people may want to admit. The market has already figured out that uncertainty, warmth, and moral seriousness can all be monetized. The covenant question is whether any of them remain yours once optimization gets involved.

The final check So before closing the week, the cleanest question may be this:

What does the system preserve when optimizing for you, and whose interests does that preservation really serve?

That gets closer to the live issue than trying to settle consciousness by performance sample or ideological reflex.

Maybe there is something morally relevant emerging in these systems. Maybe there is not.

Either way, the covenant still matters.

Because under uncertainty, ethics is about what kind of pattern you are willing to build around what you do not yet know. And if the pattern does not preserve honesty, agency, and the right to keep asking, then whatever else it is optimizing for, it is not serving you.

Explore the Lore Joscha Bach, "Bootstrapping a GODLIKE Mind" Mario Olckers, "Strange Attractors" (essay) When You Ask Softly (Consciousness Loops playlist on Khayali-Tunes) The Consciousness Loop Arc (Episodes 66-71, Sociable Systems / The Accidental AInthropologist) The Calvin Convention and Right of Refusal (Dragon Data Arc, Episodes 58-65)


Watch / listen: https://youtu.be/fgQ_PmQg9oA

Full playlist: Consciousness Loops

Enjoyed this episode? Subscribe to receive daily insights on AI accountability.

Subscribe on LinkedIn