Implicit Regularization in Nonconvex Statistical Estimation
发布时间:2018年03月19日
浏览次数:6658
发布者: Xiaoni Tan
主讲人: Yuejie Chi (Carnegie Mellon University)
活动时间: 从 2018-05-11 10:30 到 11:30
场地: 北京国际数学研究中心,全斋全29教室
Recent years have seen a flurry of activities in designing provably efficient nonconvex procedures for solving statistical estimation problems. Due to the highly nonconvex nature of the empirical loss, state-of-the-art procedures often require proper regularization (e.g. trimming, regularized cost, projection) in order to guarantee fast convergence. For vanilla procedures such as gradient descent, however, prior theory either recommends highly conservative learning rates to avoid overshooting, or completely lacks performance guarantees. This talk uncovers a striking phenomenon in nonconvex optimization: even in the absence of explicit regularization, vanilla gradient descent enforces proper regularization implicitly under various statistical models. In fact, gradient descent follows a trajectory staying within a basin that enjoys nice geometry, consisting of points incoherent with the sampling mechanism. This “implicit regularization” feature allows gradient descent to proceed in a far more aggressive fashion without overshooting, which in turn results in substantial computational savings. Focusing on three fundamental statistical estimation problems, i.e. phase retrieval, low-rank matrix completion, and blind deconvolution, we establish that gradient descent achieves near-optimal statistical and computational guarantees without explicit regularization. In particular, by marrying statistical modeling with generic optimization theory, we develop a general recipe for analyzing the trajectories of iterative algorithms via a leave-one-out perturbation argument. Joint work with Yuxin Chen, Cong Ma, Kaizheng Wang, and Yuanxin Li.