An Asymptotically Optimal Method for Constrained Stochastic Optimization

Abstract

We perform statistical inference for the solution of stochastic optimization problems with equality and box inequality constraints. The considered problems are prevalent in statistics and machine learning, encompassing constrained $M$-estimation, PDE-constrained problems, physics-inspired networks, and algorithmic fairness. We introduce a stochastic sequential quadratic programming method (StoSQP) to solve these problems, where we determine the search direction by performing a quadratic approximation of the objective with a linear approximation of the constraints. Despite having access to unbiased estimates of population gradients, a key challenge in constrained problems lies in dealing with the bias in the search direction. To address this challenge, we introduce a novel gradient averaging technique to debias the direction step, leading to Debiased-StoSQP. Our method achieves global almost sure convergence and exhibits local asymptotic normality with an optimal limiting covariance matrix in Hájek and Le Cam’s sense. Additionally, a plug-in covariance matrix estimator is provided for practical inference purposes. To our knowledge, Debiased-StoSQP is the first fully online method to achieve asymptotic minimax optimality without relying on projection operators to the constraint set, which are incomputable for nonlinear problems. Through extensive experiments on benchmark nonlinear problems in the CUTEst test set, as well as on constrained generalized linear models and portfolio allocation problems, with both synthetic and real data, we demonstrate the superior performance of the method.

Publication
In submission to Annals of Statistics
Sen Na
Sen Na
Postdoc in Statistics and ICSI

Sen Na is a postdoctoral scholar in the Statistics department and ICSI at UC Berkeley. His research interests broadly lie in the mathematical foundations of data science, with topics including high-dimensional statistics, graphical models, semiparametric models, optimal control, and large-scale and stochastic nonlinear optimization. He is also interested in applying machine learning methods in biology, neuroscience, and engineering.

Michael W. Mahoney
Michael W. Mahoney
Professor in Statistics and ICSI, Amazon Scholar

Michael Mahoney is a Professor in the Statistics department and ICSI at UC Berkeley. He is also the director of the NSF/TRIPODS-funded Foundations of Data Analysis (FODA) Institute at UC Berkeley. He works on the algorithmic and statistical aspects of modern large-scale data analysis. Much of his recent research has focused on large-scale machine learning, including randomized matrix algorithms and randomized numerical linear algebra.