Data Efficiency: Robust Algorithms for Challenging Statistical Conditions

The Challenge: Sample efficiency and training stability are major bottlenecks in reinforcement learning, particularly under heavy-tailed noise and in limited-data regimes, both of which are common in real-world applications.

My Approach: I investigate how the statistical properties of data influence training dynamics, with an emphasis on developing robust algorithms that maintain performance under these challenging conditions while remaining tuning-free.

Key Contributions:

  • Optimal sample efficiency guarantees for SGD under infinite-variance noise (NeurIPS OptML Workshop, 2025).
  • Normalized SGD methods with high-probability fast convergence under heavy-tailed noise (AISTATS 2025); a minimal sketch of the normalization idea follows this list.
  • Breaking sample efficiency limits for stochastic policy gradient methods, with improved theory and strong continuous-control results (ICML 2023a; ICML 2023b).
  • Hessian clipping: optimal second-order optimization under heavy-tailed noise (NeurIPS 2025, to appear); a second sketch below illustrates the clipping idea.
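
In this line of work, "heavy-tailed" typically means the gradient noise has a bounded p-th moment only for some p in (1, 2], so its variance may be infinite. The Python sketch below illustrates the normalization and clipping ideas behind the AISTATS 2025 paper; the oracle `noisy_grad`, the step sizes, and the Student-t noise model are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def normalized_sgd_step(x, grad_fn, lr=0.05):
    """Normalized SGD: a step of fixed length lr along the stochastic gradient."""
    g = grad_fn(x)
    return x - lr * g / max(np.linalg.norm(g), 1e-12)  # guard against g == 0

def clipped_sgd_step(x, grad_fn, lr=0.05, clip=1.0):
    """Clipped SGD: rescale the gradient only when its norm exceeds `clip`."""
    g = grad_fn(x)
    g = g * min(1.0, clip / max(np.linalg.norm(g), 1e-12))
    return x - lr * g

# Toy demo: quadratic loss with Student-t gradient noise (df = 1.5), whose
# variance is infinite (it is finite only for df > 2). A fixed-length update
# stays stable no matter how large a single noise sample is.
rng = np.random.default_rng(0)
noisy_grad = lambda x: x + rng.standard_t(df=1.5, size=x.shape)

x = np.full(10, 5.0)
for _ in range(2000):
    x = normalized_sgd_step(x, noisy_grad)
print(f"distance to optimum: {np.linalg.norm(x):.3f}")
```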

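The Hessian clipping contribution (NeurIPS 2025) applies the same principle at second order. Below is a minimal, hedged sketch assuming hypothetical stochastic oracles `grad_fn` and `hess_fn` and a locally convex problem: it rescales the Hessian estimate so its spectral norm stays below a threshold before taking a damped Newton step. It is not the paper's algorithm verbatim.

```python
import numpy as np

def hessian_clipped_step(x, grad_fn, hess_fn, lr=1.0, clip=10.0, damping=1e-3):
    """Illustrative second-order update with a spectrally clipped Hessian estimate.

    grad_fn / hess_fn are hypothetical stochastic oracles. Clipping bounds the
    influence of a single heavy-tailed Hessian sample on the Newton direction;
    damping keeps the linear solve well posed. Assumes the clipped estimate is
    positive semidefinite (e.g., a locally convex objective).
    """
    g = grad_fn(x)
    H = hess_fn(x)
    H = 0.5 * (H + H.T)                        # symmetrize the noisy estimate
    spec = np.linalg.norm(H, 2)                # spectral norm (largest singular value)
    H = H * min(1.0, clip / max(spec, 1e-12))  # Hessian clipping
    return x - lr * np.linalg.solve(H + damping * np.eye(x.size), g)
```
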
Impact: In practice, my proposed methods widen the range of stable step-sizes by a factor of three compared to standard SGD on challenging benchmarks such as the Humanoid task, demonstrating both theoretical rigor and practical effectiveness.

Figure: Robustness to step-size tuning in Humanoid agent training.

Selected Publications

  • Can SGD Handle Heavy-Tailed Noise? with F. Hübler, G. Lan. arXiv:2508.04860, 2025.

  • From Gradient Clipping to Normalization for Heavy-Tailed SGD. with F. Hübler, N. He. AISTATS, 2025.

  • Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies. with A. Barakat, A. Kireeva, N. He. ICML, 2023.

  • Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space. with A. Barakat, N. He. ICML, 2023.

  • Second-order Optimization under Heavy-Tailed Noise: Hessian Clipping and Sample Complexity Limits. with A. Sadiev, P. Richtárik. NeurIPS, 2025.

Research Impact

These robust, tuning-free algorithms improve training stability and sample efficiency, making reinforcement learning more practical for real-world applications with heavy-tailed noise and limited data.