The assistant axis: situating and stabilizing the character of large language models

Anthropic is an AI safety and research company that’s working to build reliable, interpretable, and steerable AI systems.

Read in full here: