We initiate the study of stochastic optimization with oblivious noise, broadly generalizing the standard heavy-tailed noise setup. In our setting, in addition to random observation noise, the stochastic gradient may be subject to independent \emph{oblivious noise}, which may not have bounded moments and is not necessarily centered. Specifically, we assume access to a noisy oracle for the stochastic gradient of $f$ at $x$, which returns a vector $\nabla f(\gamma, x) + \xi$, where $\gamma$ is the bounded-variance observation noise and $\xi$ is the oblivious noise that is independent of $\gamma$ and $x$. The only assumption we make on the oblivious noise $\xi$ is that $\Pr[\xi = 0] \ge \alpha$ for some $\alpha \in (0, 1)$. In this setting, it is not information-theoretically possible to recover a single solution close to the target when the fraction of inliers $\alpha$ is less than $1/2$. Our main result is an efficient \emph{list-decodable} learner that recovers a small list of candidates, at least one of which is close to the true solution. On the other hand, if $\alpha = 1 - \epsilon$, where $0 < \epsilon < 1/2$ is a sufficiently small constant, the algorithm recovers a single solution. Along the way, we develop a rejection-sampling-based algorithm to perform noisy location estimation, which may be of independent interest.
Ilias Diakonikolas, Sushrut Karmalkar, Jongho Park, Christos Tzamos
NeurIPS 2023
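
To make the oracle model concrete, below is a minimal Python sketch that simulates the noisy gradient oracle described in the abstract. It is illustrative only: the names (`oblivious_noise_oracle`, `grad_f`) are hypothetical, and the Gaussian choice for the observation noise $\gamma$, the additive form $\nabla f(\gamma, x) = \nabla f(x) + \gamma$, and the shifted-Cauchy choice for the oblivious noise $\xi$ are assumptions made for the simulation, not the paper's construction. The only property the sketch enforces from the paper's model is $\Pr[\xi = 0] \ge \alpha$.

```python
import numpy as np

def oblivious_noise_oracle(grad_f, x, alpha=0.6, sigma=0.1, rng=None):
    """Simulate one query to the noisy gradient oracle.

    Returns grad_f(x) + gamma + xi, where gamma is bounded-variance
    observation noise (here: Gaussian with std sigma, an assumption)
    and xi is oblivious noise: zero with probability alpha, and
    otherwise an arbitrary corruption (here: a shifted heavy-tailed
    Cauchy draw, so it has no bounded moments and is not centered).
    xi is drawn independently of gamma and x, as in the model.
    """
    rng = rng or np.random.default_rng()
    d = x.shape[0]
    gamma = rng.normal(0.0, sigma, size=d)       # observation noise, variance sigma^2
    if rng.random() < alpha:
        xi = np.zeros(d)                         # inlier: Pr[xi = 0] >= alpha
    else:
        xi = 5.0 + rng.standard_cauchy(size=d)   # outlier: heavy-tailed, non-centered
    return grad_f(x) + gamma + xi

if __name__ == "__main__":
    grad = lambda x: x                           # gradient of f(x) = ||x||^2 / 2
    x0 = np.ones(3)
    print(oblivious_noise_oracle(grad, x0, alpha=0.6))
```

Note that when `alpha < 0.5`, most oracle calls can be corrupted, which is why the paper's guarantee in that regime is list decoding rather than recovery of a single solution.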