I shipped the first public version of ConstraintGate.


The core idea:
Most agent failures are not reasoning failures.
They are authority failures.
The model did work it was not authorized to do.
So I built Agent Authority Router: an eval/scoring framework that checks whether an agent did the right kind of work, not just whether the answer sounded good.
It separates:
- what the user authorized
- what primitive the agent should perform
- what primitives are forbidden
- whether the response crossed the boundary
v0.8 now has:
- human-adjudicated behavioral evidence
- deterministic scorer parity against the frozen human-labeled set
- 38/39 behavioral pass under adjudication
- 195/195 field-level scorer parity
- h019 resolved as an invalid fixture artifact
- no claim of a new automated benchmark pass
The point is not “better prompts.”
The point is measuring whether the agent stayed inside the work it was authorized to do.
Constraint precision beats constraint theater.
This page may contain third-party content, which is provided for information purposes only (not representations/warranties) and should not be considered as an endorsement of its views by Gate, nor as financial or professional advice. See Disclaimer for details.
  • Reward
  • Comment
  • Repost
  • Share
Comment
Add a comment
Add a comment
No comments
  • Pinned