We initiate the study of stochastic optimization with oblivious noise, broadly generalizing the standard heavy-tailed noise setup. In our setting, in addition to random observation noise, the stochastic gradient may be subject to independent \emph{oblivious noise}, which may not have bounded moments and is not necessarily centered. Specifically, we assume access to a noisy oracle for the stochastic gradient of $f$ at $x$, which returns a vector $\nabla f(\gamma, x) + \xi$, where $\gamma$ is the bounded-variance observation noise and $\xi$ is the oblivious noise that is independent of $\gamma$ and $x$. The only assumption we make on the oblivious noise $\xi$ is that $\Pr[\xi = 0] \ge \alpha$ for some $\alpha \in (0, 1)$. In this setting, it is not information-theoretically possible to recover a single solution close to the target when the fraction of inliers $\alpha$ is less than $1/2$. Our main result is an efficient \emph{list-decodable} learner that recovers a small list of candidates, at least one of which is close to the true solution. On the other hand, if $\alpha = 1 - \epsilon$, where $0 < \epsilon < 1/2$ is a sufficiently small constant, the algorithm recovers a single solution. Along the way, we develop a rejection-sampling-based algorithm to perform noisy location estimation, which may be of independent interest.
Ilias Diakonikolas, Sushrut Karmalkar, Jongho Park, Christos Tzamos
NeurIPS 2023
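
To make the oracle model concrete, below is a minimal Python sketch that simulates the noisy gradient oracle described in the abstract. It is illustrative only: the names (`oblivious_noise_oracle`, `grad_f`) are hypothetical, and the Gaussian choice for the observation noise $\gamma$, the additive form $\nabla f(\gamma, x) = \nabla f(x) + \gamma$, and the shifted-Cauchy choice for the oblivious noise $\xi$ are assumptions made for the simulation, not the paper's construction. The only property the sketch enforces from the paper's model is $\Pr[\xi = 0] \ge \alpha$.

```python
import numpy as np

def oblivious_noise_oracle(grad_f, x, alpha=0.6, sigma=0.1, rng=None):
    """Simulate one query to the noisy gradient oracle.

    Returns grad_f(x) + gamma + xi, where gamma is bounded-variance
    observation noise (here: Gaussian with std sigma, an assumption)
    and xi is oblivious noise: zero with probability alpha, and
    otherwise an arbitrary corruption (here: a shifted heavy-tailed
    Cauchy draw, so it has no bounded moments and is not centered).
    xi is drawn independently of gamma and x, as in the model.
    """
    rng = rng or np.random.default_rng()
    d = x.shape[0]
    gamma = rng.normal(0.0, sigma, size=d)       # observation noise, variance sigma^2
    if rng.random() < alpha:
        xi = np.zeros(d)                         # inlier: Pr[xi = 0] >= alpha
    else:
        xi = 5.0 + rng.standard_cauchy(size=d)   # outlier: heavy-tailed, non-centered
    return grad_f(x) + gamma + xi

if __name__ == "__main__":
    grad = lambda x: x                           # gradient of f(x) = ||x||^2 / 2
    x0 = np.ones(3)
    print(oblivious_noise_oracle(grad, x0, alpha=0.6))
```

Note that when `alpha < 0.5`, most oracle calls can be corrupted, which is why the paper's guarantee in that regime is list decoding rather than recovery of a single solution.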