Sparse Autoencoders - Search News

DeepMind makes big jump toward interpreting LLMs with sparse autoencoders

Large language models (LLMs) have made remarkable progress in recent years. But understanding how they work remains a challenge, and scientists at artificial intelligence labs are trying to peer into ...

Harvard Business School

Temporal Sparse Autoencoders: Leveraging the Sequential Nature of Language for Interpretability

Bhalla, Usha, Alex Oesterling, Claudio Mayrink Verdun, Himabindu Lakkaraju, and Flavio Calmon. "Temporal Sparse Autoencoders: Leveraging the Sequential Nature of Language for Interpretability." ...

The Scientist

Researchers Decode How Protein Language Models Think, Making AI More Transparent

For large language models (LLMs) like ChatGPT, accuracy often means complexity. To be able to make good predictions, ChatGPT must deeply understand the concepts and features that are associated with ...

Harvard Business School

Evaluating Adversarial Robustness of Concept Representations in Sparse Autoencoders

Li, Aaron Jiaxun, Suraj Srinivas, Usha Bhalla, and Himabindu Lakkaraju. "Evaluating Adversarial Robustness of Concept Representations in Sparse Autoencoders." Proceedings of the Conference of the ...
