Research
My research focuses on natural language processing, machine learning, large language models (LLMs), and explainable AI. The goal is to uncover the mechanisms behind LLMs and use that understanding to build trustworthy models that are reliable, truthful, and safe.
Mechanistic Study of LLMs
How can we open the black box of LLMs to uncover their internal mechanisms, especially those enabling complex reasoning?
Mechanism-Guided Trustworthy LLMs
How can insights from mechanistic understanding be translated into practice for building LLMs that are reliable, truthful, and safe in real-world applications?
Featured Research