Eshaan Nichani

I am an incoming postdoctoral researcher at Microsoft Research NYC. I received my PhD in Electrical and Computer Engineering from Princeton University, advised by Jason D. Lee and Yuxin Chen. During my PhD I spent time at the Flatiron Institute, and my research was supported by the DoD NDSEG Fellowship and the IBM PhD Fellowship. I previously received my B.S. degrees in Math and Computer Science (2020) and my M.Eng degree in EECS (2021) from MIT, where I was advised by Caroline Uhler.

Research

My research is focused on developing the mathematical and scientific foundations of modern AI. I've recently been interested in the following directions:

News

Publications

2026

Sharp Capacity Scaling of Spectral Optimizers in Learning Associative Memory

Juno Kim*, Eshaan Nichani*, Denny Wu, Alberto Bietti, Jason D. Lee

Preprint, 2026 Optimization

Fine-Tuning Dynamics of In-Context Factual Recall in Transformers

Ruomin Huang, Eshaan Nichani, Jason D. Lee, Rong Ge

Preprint, 2026 Transformers

Sharp Capacity Thresholds in Linear Associative Memory: From Winner-Take-All to Listwise Retrieval

Nicholas Barnfield, Juno Kim, Eshaan Nichani, Jason D. Lee, Yue M. Lu

Preprint, 2026 Misc. Statistics

On the Statistical Query Complexity of Learning Semiautomata: A Random Walk Approach

George Giapitzakis, Kimon Fountoulakis, Eshaan Nichani, Jason D. Lee

COLT 2026 Transformers

Quantitative Bounds for Length Generalization in Transformers

Zachary Izzo*, Eshaan Nichani*, Jason D. Lee

ICLR 2026 Oral Transformers

2025

Emergence and Scaling Laws in SGD Learning of Shallow Neural Networks

Yunwei Ren*, Eshaan Nichani*, Denny Wu, Jason D. Lee

NeurIPS 2025 Repr. Learning

Learning Compositional Functions with Transformers from Easy-to-Hard Data

Zixuan Wang*, Eshaan Nichani*, Alberto Bietti, Alex Damian, Daniel Hsu, Jason D. Lee, Denny Wu

COLT 2025 Transformers

Understanding Factual Recall in Transformers via Associative Memories

Eshaan Nichani, Jason D. Lee, Alberto Bietti

ICLR 2025 Spotlight Transformers

Learning Hierarchical Polynomials of Multiple Nonlinear Features with Three-Layer Networks

Hengyu Fu, Zihao Wang, Eshaan Nichani, Jason D. Lee

ICLR 2025 Repr. Learning

2024

How Transformers Learn Causal Structure with Gradient Descent

Eshaan Nichani, Alex Damian, Jason D. Lee

ICML 2024 Transformers

Learning Hierarchical Polynomials with Three-Layer Neural Networks

Zihao Wang, Eshaan Nichani, Jason D. Lee

ICLR 2024 Repr. Learning

Metastable Mixing of Markov Chains: Efficiently Sampling Low Temperature Exponential Random Graphs

Guy Bresler*, Dheeraj Nagaraj*, Eshaan Nichani*

Annals of Applied Probability, 2024 Misc. Statistics

2023

Fine-Tuning Language Models with Just Forward Passes

Sadhika Malladi, Tianyu Gao, Eshaan Nichani, Alex Damian, Jason D. Lee, Danqi Chen, Sanjeev Arora

NeurIPS 2023 Oral Optimization

Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models

Alex Damian, Eshaan Nichani, Rong Ge, Jason D. Lee

NeurIPS 2023 Oral Repr. Learning

Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks

Eshaan Nichani, Alex Damian, Jason D. Lee

NeurIPS 2023 Spotlight Repr. Learning

Self-Stabilization: The Implicit Bias of Gradient Descent at the Edge of Stability

Alex Damian*, Eshaan Nichani*, Jason D. Lee

ICLR 2023 Optimization

2022

Causal Structure Discovery between Clusters of Nodes Induced by Latent Factors

Chandler Squires, Annie Yun, Eshaan Nichani, Raj Agrawal, Caroline Uhler

CLeaR 2022 Misc. Statistics

Workshop Papers

Increasing Depth Leads to U-Shaped Test Risk in Over-parameterized Convolutional Networks

Eshaan Nichani*, Adityanarayanan Radhakrishnan*, Caroline Uhler

ICML 2021 Workshop on Overparameterization: Pitfalls & Opportunities

On Alignment in Deep Linear Neural Networks

Adityanarayanan Radhakrishnan*, Eshaan Nichani*, Daniel Irving Bernstein, Caroline Uhler

ICML 2021 Workshop on Overparameterization: Pitfalls & Opportunities

Adaptive Diagonal Curvature: A Quasi-Newton Method for Stochastic Optimization

David Saxton, Eshaan Nichani

ICML 2020 Workshop on Beyond First Order Methods in ML Systems

Thesis

Older Work

Assessment of Circulating Copy Number Variant Detection for Cancer Screening

Bhuvan Molparia, Eshaan Nichani, Ali Torkamani

PLoS ONE, 2017

* Equal contribution.