
Celia Ford / Transformer:
Anthropic's System Card: Claude Sonnet 4.5 was able to recognize many alignment evaluation environments as tests and would modify its behavior accordingly — Anthropic's new model appears to use “eval awareness” to be on its best behavior — Anthropic's newly-released Claude Sonnet 4.5 is …
Source: TechMeme
Source Link: http://www.techmeme.com/250930/p45#a250930p45