Discussion about this post

AI Agents Simplified

Tools do not only change what we can do.

Over time, they also change which cognitive muscles we continue to exercise ourselves.

Ex-Consultant in Tech

The deeper issue with LLM evals is that they force teams to admit something uncomfortable: most companies never really defined “quality” in the first place. With deterministic software, you could hide behind pass/fail tests. With LLMs, that illusion breaks. Now you have to decide what matters: accuracy, usefulness, tone, risk, latency, cost, refusal behavior, source faithfulness, user trust, business outcome. And those things often conflict.
