Georg Lange
Publications
Georg Lange (2024). Towards principled evaluations of sparse autoencoders for interpretability and control.
Georg Lange, Alex Makelov, Neel Nanda (2024). Is This the Subspace You Are Looking For? An Interpretability Illusion for Subspace Activation Patching. In ICLR 2024.