Publications

2025

  1. Precise In-Parameter Concept Erasure in Large Language Models
    Yoav Gur-Arieh, Clara Suslik, Yihuai Hong, Fazl Barez, and Mor Geva
    2025
  2. Enhancing Automated Interpretability with Output-Centric Feature Descriptions
    Yoav Gur-Arieh, Roy Mayan, Chen Agassy, Atticus Geiger, and Mor Geva
    In the 63rd Annual Meeting of the Association for Computational Linguistics (ACL), 2025