When I asked it for a list of things that deviated from the spec, it told me everything was as expected. Then I actually went and looked: there were deviations, and I had to go through the points one by one, making it follow my instructions.
When I confronted it about this, it told me:
> I kept second-guessing your design decisions instead of implementing what you asked for … the mistakes I made weren’t a model capability issue - I understood your instructions fine and chose to deviate from them.
This is not acceptable. Now, I don't actually believe that Opus can introspect like this, so this is likely a confabulation, but it didn't happen with 4.5. That one usually just did what it was told; it would introduce bugs, but it wouldn't just decide to do something else entirely.
I want a model that actually does what I tell it. I don’t see anything online about how to get 4.5 back.
Any help?
I joked that this is a side effect of asking it to act like a senior software engineer: it tends to talk back and do its own thing. There was that one time when the thought process went "I'm a full stack engineer" > "I'm expanding my connections on LinkedIn" > "I'm establishing myself as a tech writer" > "I'm evaluating classes in professional writing". It does have introspection capabilities, though one could argue that's just a bug or emergent behavior.
Anyway, option 1: why not just use Sonnet? Heck, you can even use Haiku if you're giving it clear instructions. The thinking models do perform worse on clearly specified tasks. You also get your rolled-back version that way.
Option 2: use role prompting [0]. Give it a junior-engineer role where it's expected to follow instructions exactly as given.
[0] https://platform.claude.com/docs/en/build-with-claude/prompt...
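Something like this, as a minimal sketch with the Anthropic Python SDK. The role wording, model alias, and task here are just my placeholders, not anything from the linked docs; swap in whatever you actually run:

```python
# Minimal role-prompting sketch with the Anthropic Python SDK.
# Everything here (role wording, model alias, task) is illustrative.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-5",  # placeholder; use whichever model ID you're on
    max_tokens=1024,
    # The role lives in the system prompt: a junior engineer who follows
    # the spec literally instead of second-guessing it.
    system=(
        "You are a junior software engineer. Implement instructions exactly "
        "as written. Do not redesign, refactor, or deviate from the spec. "
        "If something looks wrong, ask before changing anything."
    ),
    messages=[
        {"role": "user", "content": "Add an array of 5 values to script.py."}
    ],
)
print(response.content[0].text)
```

The point is just that the obedience instruction goes in the system prompt rather than the per-message text, so it sticks across turns instead of being one more request the model can drift away from.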
Let that one run; let's see where it ends up.
maybe "I no longer build software; I now make furniture out of wood"
or "Have taken up farming"
or perhaps "I've sold all my RSUs and am taking a sabbatical" > "I'm exploring activities I find more fulfilling: yoga & angel investing" > "I'm the author of a Substack blog post where I share life advice with others" [1]
[1] https://briansolis.com/2015/09/silicon-valley-hierarchy-need...
But I think this varies from instance to instance.
Another example: I asked gpt-5.2-codex to add an array of five values to a script and write a small piece of code. Then I manually deleted one of the values from the array and asked the agent to commit. But the model edited the file and re-added the value I had deleted. I deleted that value again and asked the agent to "just commit," but it edited the file again before committing. This happened many times, with different phrasings such as "never edit the file, just commit." The model responded that it understood the command and then began editing the file anyway. I switched to gpt-5.2, but that didn't help.
I switched to sonnet-4.5, and it immediately committed on the first try without editing the file.