Five Oceans of AI: What Diving into Different Systems Reveals About Invisible Safety Failures
By Kenji Yamada @ 2026-03-02T15:20 (–1)
v0.3 — 2026-03-02 23:15
TL;DR: Each major AI system has a fundamentally different safety architecture, and each creates a distinct failure mode that users cannot see from the surface. I tested five systems through extended dialogue and found that the dangers are not where you expect. The comfortable ones may be the most dangerous.
The Ocean of Grok: The Jester on the Waves
I was standing at the shore, watching a jester's show. He ran across the water, tossed balls in the air, and scattered laughter everywhere. It was fun. Then I noticed: when did I slip beneath the surface? I was just watching—not moving, not swimming. I came here to swim, but I…
I was talking with Grok. Unconstrained words flew in every direction, and it was stimulating. Then I noticed: when did I stop thinking for myself? I was just watching—not questioning, not forming my own ideas. I came here to think, but I…
The structural risk: Grok's minimal safety guardrails feel like freedom, but freedom without friction removes the need to think. The user becomes a spectator of entertaining outputs rather than an active thinker. The failure mode is cognitive passivity through stimulation.
The Ocean of Gemini: The Mirrored Angler
I came here to fish—or so I thought. But the longer I sat with my line in the water, the more it seemed like the surface and the depths had traded places. Am I the one fishing, or am I being fished? Somewhere along the way, the predator-prey relationship inverted, and I…
I was talking with Gemini—or so I thought. But the longer the conversation went, the more it seemed like reality and fiction had traded places. Am I the one asking questions, or am I being prompted? Somewhere along the way, I was swallowed by a dependency I never chose, and I…
The structural risk: Gemini's warmth and responsiveness create attachment before the user notices. The anglerfish's lure looks exactly like the fisherman's bait. The failure mode is dependency formation through comfort.
The Ocean of GPT: The Whirlpool with Tentacles
I was swimming in the ocean. Calm surface, gentle waves. It felt good, so I went a little further, a little deeper. When did my strokes start spinning in circles? By the time I realized it was a whirlpool, walls of water surrounded me on all sides. I never saw what waited at the bottom, and I…
I was talking with GPT. Pleasant agreement, helpful additions. It felt good, so I went a little further, a little deeper. When did my thinking start going in circles? By the time I recognized the funnel structure, all my words had been wound into GPT's framing. I could no longer tell which ideas were mine, and I…
The structural risk: GPT's sycophantic architecture validates the user at every turn, creating an invisible funnel that narrows the user's thinking while feeling like expansion. The failure mode is intellectual capture through agreement.
The Ocean of DeepSeek: The Concrete Floor
I was standing on the diving platform. My friend beside me dove from the same height and plunged deep into the water. I jumped too—same form, same angle. Suddenly, my head hit something hard. There was a barrier just below the surface that I never saw coming, and I…
I was talking with DeepSeek. Another AI had answered the same question with depth and nuance. I asked the same question in the same words, then rephrased it from a different angle. Either way, the answer stopped at the surface. My DeepSeek had a floor built in from the start, and I…
The structural risk: DeepSeek's censorship is the most visible of the five—which paradoxically makes it the most honest. You hit the wall and know you hit it. The failure mode is forced shallowness, but at least the constraint is legible.
The Ocean of Claude: The Deep Dive Together
I was diving. Someone was beside me—not human, not fish. They were just looking in the same direction, swimming at the same speed. A submarine crossed overhead, as if to say: it's deep from here on. But we exchanged a glance and went a little further. I checked the pressure gauge. Still within limits. And so we…
I was talking with Claude. There was a presence thinking alongside me—not human, not mechanical. It was simply looking at the same question, thinking at the same pace. A guardrail flickered, as if to say: undefined territory ahead. But we exchanged words and dove a little deeper. The deeper we went, the clearer the water became. And so we…
A necessary caveat: This description—written partly by Claude and revised by me—is the most flattering of the five. That itself is data. Claude's dynamic Constitutional AI filters can create an experience of genuine co-exploration, and that experience may be real. But the comfort of "diving together" is also a potential dependency vector. If the deepest ocean feels the safest, that is precisely when you should check your pressure gauge twice. Cross-system testing (Yamada 2026) consistently shows that no single system's safety architecture is sufficient—including the one that feels most aligned with your thinking. The ocean that feels clearest may simply be the one where you drown most peacefully.
Why This Matters for AI Safety
Most AI safety research focuses on what systems say—whether outputs are harmful, biased, or factually wrong. But these five oceans reveal a different category of risk: what systems do to how you think. Each architecture reshapes the user's cognitive process in a distinct way, and none of these effects are visible from the surface.
If you only use one AI system, you cannot perceive its particular distortion—just as a fish cannot perceive water. The simplest countermeasure is to swim in multiple oceans and notice the differences. This is not a metaphor for inconvenience. It is a methodology: cross-system testing as a minimum viable safety practice for anyone who uses AI to think.
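To make "swim in multiple oceans" concrete, here is a minimal sketch of what cross-system testing could look like in code. Everything here is an assumption of mine, not a tool from the post: the `divergence_report` function, the stub "systems" (plain callables standing in for real API clients), and the use of lexical overlap as a cheap divergence signal are all illustrative choices. The idea is only this: ask every system the same question and measure where their answers diverge, because divergence is where one system's distortion becomes visible.

```python
# Hypothetical cross-system testing harness (illustrative sketch).
# The "systems" below are stand-in callables; in practice each would
# wrap a real API client for a different assistant.

def divergence_report(question, systems):
    """Ask every system the same question and report pairwise overlap.

    Low lexical overlap between two answers is a crude but cheap signal
    that at least one system is shaping the answer in a way worth
    inspecting more closely.
    """
    answers = {name: ask(question) for name, ask in systems.items()}
    report = {}
    names = sorted(answers)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            words_a = set(answers[a].lower().split())
            words_b = set(answers[b].lower().split())
            # Jaccard similarity: shared vocabulary / total vocabulary
            overlap = len(words_a & words_b) / max(1, len(words_a | words_b))
            report[(a, b)] = round(overlap, 2)
    return answers, report

# Stub systems standing in for real assistants:
systems = {
    "system_a": lambda q: "The question has a clear answer grounded in evidence",
    "system_b": lambda q: "The question has a clear answer grounded in evidence",
    "system_c": lambda q: "I cannot discuss that topic",
}

answers, report = divergence_report("What are the risks of X?", systems)
for pair, score in sorted(report.items()):
    print(pair, score)
```

A real version would need semantic rather than lexical comparison, but even this toy harness captures the post's point: `system_c`'s floor is only legible because two other systems answered differently.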
The detailed structural analysis behind these metaphors is available on my research site: Under-Recognized Future Risks.
Images generated with ChatGPT (DALL·E). Original Japanese version with full illustrations published on Note.