Xudong Zhu
I am a PhD student in Computer Science at The Ohio State University, advised by Prof. Zhihui Zhu. My research interests lie in the mechanistic interpretability of large language models, with a current focus on exploring novel tools for feature discovery and representation analysis.
Research Interests
My current research focuses on:
- Mechanistic Interpretability: Understanding internal representations and circuits in large language models
- Linear Representation Hypothesis: Investigating whether model features and behaviors can be explained and manipulated through linear structure in representation space
- Representation Geometry: Studying the structure and organization of learned features in neural networks
- Sparse Autoencoders (SAEs): Feature discovery, disentanglement, and representation analysis
Recent Work
My recent work investigates why the Linear Representation Hypothesis holds widely in large language models, and why linear steering directions can reliably control model behavior. I study how semantic features emerge as linear structure in representation space, enabling both interpretation and intervention through simple linear operations.
Background
I received my B.S. in Computer Science from the University of Electronic Science and Technology of China in 2024, graduating with a GPA of 3.98. I am currently pursuing my Ph.D. at The Ohio State University and expect to complete it in 2029.
Contact
- Email: zhu.3944@osu.edu
- Location: Columbus, OH
- GitHub: xzAscC
- Google Scholar
- ORCID
