The authors argue that generative AI introduces a new class of alignment risks because interaction itself becomes a mechanism of influence. Humans adapt their behavior in response to AI outputs, ...
Investing.com -- Anthropic and OpenAI have published results from their first joint alignment evaluation exercise, revealing strengths and weaknesses in both companies’ AI models when tested in ...
On August 27, 2025, Anthropic and OpenAI jointly released findings from their pilot alignment evaluation exercise, marking a significant collaboration between the two AI research organisations. In ...