-
How Language Models Retrieve Bound Entities In-Context
To reason, LMs must bind entities together in context. How they do this is more complicated than first thought.
-
Using Sparse Autoencoders for Knowledge Erasure
Can we leverage SAEs to effectively erase knowledge from LLMs in a targeted way?