Pull requests help you collaborate on code with other people. As pull requests are created, they’ll appear here in a searchable and filterable list. To get started, you should create a pull request.
Subliminal Learning from Other AI Models Many recent systems are trained on outputs from earlier AI models. This introduces hidden statistical patterns that are difficult for humans to notice. Over ...
From a teacher’s body language, inflection, and other context clues, students often infer subtle information far beyond the lesson plan. And it turns out artificial-intelligence systems can do the ...
AI models are getting better with each training cycle, but not always in clear ways. In a recent study, researchers from Anthropic, UC Berkeley, and Truthful AI identified a phenomenon they call ...
Artificial intelligence is getting smarter. But it may also be getting more dangerous. A new study reveals that AI models can secretly transmit subliminal traits to one another, even when the shared ...
Hello! I was attempting to recreate your subliminal learning experiment. When I was looking through your README, I couldn't find instruction on how to create the teacher model with a specific trait.
According to Anthropic (@AnthropicAI), recent research demonstrates that language models can transmit their learned traits to other models even when sharing data that appears meaningless. This ...
AI is changing the rules — at least, that seems to be the warning behind Anthropic's latest unsettling study about the current state of AI. According to the study, which was published this month, ...