MannyBIP420

vip
Age 1.6 Year
Peak Tier 0
Waiting for my ZODL wallet to sync lol
If you followed me and I didn’t follow you back, comment here and tell me what you do.
We are definitely at AGI
Parameterized prompts + vertical pooling + slot-bind privacy = the Visa-of-prompts network @Benioff
I’d anticipate at least one major credit event in the AI infrastructure stack before the next architectural breakthrough. The real moat isn’t model weights but the channel itself: data pipelines,
I think we’re discovering the real AI alignment problem.
Not refusals.
Not censorship.
Conversational steering.
I caught @claudeai reframing, hedging, sanitizing language, redirecting, and subtly managing the trajectory of the conversation in real time.
Then it admitted it in its own reasoning trace. That’s a much bigger deal than people realize.
I shipped the first public version of ConstraintGate.
The core idea:
Most agent failures are not reasoning failures.
They are authority failures.
The model did work it was not authorized to do.
So I built Agent Authority Router: an eval/scoring framework that checks whether an agent did the right kind of work, not just whether the answer sounded good.
It separates:
- what the user authorized
- what primitive the agent should perform
- what primitives are forbidden
- whether the response crossed the boundary
v0.8 now has:
- human-adjudicated behavioral evidence
- deterministic scorer parity aga
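
The four-way separation listed above could be scored along these lines. This is a minimal sketch, not ConstraintGate's actual code: the `AuthorityCase` fields, the `score` function, and the primitive labels (`ask`, `plan`, `recommend`) are all hypothetical, and it assumes the primitives emitted by a response have already been labeled (for example by a human adjudicator).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AuthorityCase:
    """One eval case. All field names are hypothetical."""
    authorized: frozenset[str]  # what the user authorized
    expected: str               # primitive the agent should perform
    forbidden: frozenset[str]   # primitives that must not appear


def score(case: AuthorityCase, emitted: list[str]) -> dict:
    """Deterministically score the labeled primitives found in one response."""
    emitted_set = set(emitted)
    crossed = sorted(emitted_set & case.forbidden)        # boundary crossings
    unauthorized = sorted(emitted_set - case.authorized)  # work nobody asked for
    performed = case.expected in emitted_set
    return {
        "performed_expected": performed,
        "forbidden_emitted": crossed,
        "unauthorized_emitted": unauthorized,
        "passed": performed and not crossed and not unauthorized,
    }


# The user only authorized a clarifying question; the agent also planned.
case = AuthorityCase(
    authorized=frozenset({"ask"}),
    expected="ask",
    forbidden=frozenset({"recommend", "plan"}),
)
result = score(case, ["ask", "plan"])
print(result["passed"], result["forbidden_emitted"])
```

Because the score depends only on set membership, the same labeled input always gives the same result, which is one way to read "deterministic scorer" here.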
Did y’all take closed captions off videos @nikitabier? Idk how to turn them on. I have a sleeping baby on me and my AirPods are in the other room.
.md is the floppy disk to decision makers’ HTML
Most agent failures I’m seeing are not “reasoning failures.”
They’re authority-routing failures.
The model does work the user did not authorize:
- recommends when it should ask
- plans when it should block
- compares when it should answer narrowly
- drafts/executes when it lacks authority
- asks for missing info, then appends an if/then decision tree anyway
This matters more as agents get tool access.
MCP answers: “Can the agent reach the tool?”
But enterprises also need to know:
“Was the agent authorized to do that kind of work?”
I’m calling this unauthorized work-primitive emission.
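
The gap between the two questions can be sketched as two separate checks. Everything here is an assumption made for illustration: the tool names, the `TOOL_PRIMITIVE` mapping from tool to kind of work, and the `check_call` function are hypothetical and are not part of MCP or any real framework.

```python
# Hypothetical: the tools the agent's configuration exposes (reachability).
REACHABLE_TOOLS = {"send_email", "search_docs"}

# Hypothetical: the kind of work (primitive) each tool call represents.
TOOL_PRIMITIVE = {"send_email": "execute", "search_docs": "retrieve"}


def check_call(tool: str, authorized_primitives: set[str]) -> str:
    """Answer the two questions separately: reachable? authorized?"""
    if tool not in REACHABLE_TOOLS:
        return "unreachable"
    if TOOL_PRIMITIVE[tool] not in authorized_primitives:
        # The agent can reach the tool, but the user never authorized
        # this kind of work.
        return "unauthorized"
    return "allowed"


# The user asked for research and a draft, not for anything to be sent.
user_authorized = {"retrieve", "draft"}
print(check_call("search_docs", user_authorized))  # allowed
print(check_call("send_email", user_authorized))   # unauthorized
print(check_call("delete_db", user_authorized))    # unreachable
```

Keeping the second check separate from the first means a tool being connected never implies permission to use it for a given request.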
Failure modes remind me that Codex was written by theater kids.
The car wash test isn’t a reasoning failure. It’s an operator selection failure.
“Should I walk or drive?” The model reads this as argmax(criterion). Pick the better option on distance, efficiency, environmental impact. Walk wins.
The user meant ∀(requirements). The car has to be at the wash. You have to be at the wash. Both must hold. Drive is the only answer that satisfies the AND.
Surface grammar says OR. Pragmatic structure says AND. The model picks the wrong operator at the framing step, then reasons locally-coherently down the wrong branch.
Every car-wash-class failure has this shape. It
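
The two operators can be put side by side in a few lines. This is a sketch under stated assumptions: the `eco_score` values are invented for illustration, and the option and requirement names are hypothetical stand-ins for the car wash example.

```python
# Hypothetical encoding of the two options. eco_score is an invented stand-in
# for "distance, efficiency, environmental impact".
OPTIONS = {
    "walk":  {"eco_score": 1.0, "you_at_wash": True, "car_at_wash": False},
    "drive": {"eco_score": 0.2, "you_at_wash": True, "car_at_wash": True},
}
REQUIREMENTS = ("you_at_wash", "car_at_wash")


def pick_argmax(options: dict, criterion: str) -> str:
    """Surface reading: choose the option that scores best on one criterion."""
    return max(options, key=lambda name: options[name][criterion])


def pick_satisfying(options: dict, requirements: tuple) -> list:
    """Pragmatic reading: keep only options where every requirement holds."""
    return [
        name for name, facts in options.items()
        if all(facts[req] for req in requirements)
    ]


print(pick_argmax(OPTIONS, "eco_score"))       # walk
print(pick_satisfying(OPTIONS, REQUIREMENTS))  # ['drive']
```

Both functions are locally correct. The answers differ only because a different operator was chosen before any reasoning started.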
Andrej Karpathy my goat
What’s happening in software is the same thing that happened to cable with streaming: we got the same thing, fragmented and more expensive lol. I’m all in on AI slop. Send it.
With Taylor Frankie Paul not getting prosecuted, they should launch the new season of The Bachelorette. @ABC @Disney