News

Can exposing AI to “evil” make it safer? Anthropic’s preventative steering with persona vectors explores controlled risks to ...