Publications

Emergence and scaling laws in SGD learning of shallow neural networks
Yunwei Ren*, Eshaan Nichani*, Denny Wu, Jason D. Lee.
arXiv preprint, 2025.

Learning Compositional Functions with Transformers from Easy-to-Hard Data
Zixuan Wang*, Eshaan Nichani*, Alberto Bietti, Alex Damian, Daniel Hsu, Jason D. Lee, Denny Wu.
Conference on Learning Theory (COLT), 2025.

Understanding Factual Recall in Transformers via Associative Memories
Eshaan Nichani, Jason D. Lee, Alberto Bietti.
International Conference on Learning Representations (ICLR), 2025 (Spotlight).
Oral presentation at M3L workshop, NeurIPS 2024.

Learning Hierarchical Polynomials of Multiple Nonlinear Features with Three-Layer Networks
Hengyu Fu, Zihao Wang, Eshaan Nichani, Jason D. Lee.
International Conference on Learning Representations (ICLR), 2025.

How Transformers Learn Causal Structure with Gradient Descent
Eshaan Nichani, Alex Damian, Jason D. Lee.
International Conference on Machine Learning (ICML), 2024.

Learning Hierarchical Polynomials with Three-Layer Neural Networks
Zihao Wang, Eshaan Nichani, Jason D. Lee.
International Conference on Learning Representations (ICLR), 2024.

Metastable Mixing of Markov Chains: Efficiently Sampling Low Temperature Exponential Random Graphs
Guy Bresler*, Dheeraj Nagaraj*, Eshaan Nichani*.
Annals of Applied Probability, 2024.

Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks
Eshaan Nichani, Alex Damian, Jason D. Lee.
Neural Information Processing Systems (NeurIPS), 2023 (Spotlight).

Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models
Alex Damian, Eshaan Nichani, Rong Ge, Jason D. Lee.
Neural Information Processing Systems (NeurIPS), 2023 (Oral).

Fine-Tuning Language Models with Just Forward Passes
Sadhika Malladi, Tianyu Gao, Eshaan Nichani, Alex Damian, Jason D. Lee, Danqi Chen, Sanjeev Arora.
Neural Information Processing Systems (NeurIPS), 2023 (Oral).

Self-Stabilization: The Implicit Bias of Gradient Descent at the Edge of Stability
Alex Damian*, Eshaan Nichani*, Jason D. Lee.
International Conference on Learning Representations (ICLR), 2023.

Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials
Eshaan Nichani, Yu Bai, Jason D. Lee.
Neural Information Processing Systems (NeurIPS), 2022.

Causal Structure Discovery between Clusters of Nodes Induced by Latent Factors
Chandler Squires, Annie Yun, Eshaan Nichani, Raj Agrawal, Caroline Uhler.
Conference on Causal Learning and Reasoning (CLeaR), 2022.

Workshop Papers

Increasing Depth Leads to U-Shaped Test Risk in Over-parameterized Convolutional Networks
Eshaan Nichani*, Adityanarayanan Radhakrishnan*, Caroline Uhler.
ICML 2021 Workshop on Overparameterization: Pitfalls & Opportunities.

On Alignment in Deep Linear Neural Networks
Adityanarayanan Radhakrishnan*, Eshaan Nichani*, Daniel Irving Bernstein, Caroline Uhler.
ICML 2021 Workshop on Overparameterization: Pitfalls & Opportunities.

Adaptive diagonal curvature: a quasi-Newton method for stochastic optimization
David Saxton, Eshaan Nichani.
ICML 2020 Workshop on Beyond First Order Methods in ML Systems.

Theses

An Empirical and Theoretical Analysis of the Role of Depth in Convolutional Neural Networks
Thesis for Master of Engineering in EECS, MIT, 2021.

Older Work

Assessment of circulating copy number variant detection for cancer screening
Bhuvan Molparia, Eshaan Nichani, Ali Torkamani.
PLoS ONE, 2017.

(* denotes equal contribution or alphabetical authorship)