Selected Research Topics (2008-2017)

We study optimization theory and algorithms for

  • Convex optimization and nonlinear programming

  • Eigenvalue optimization and optimization with orthogonality constraints

  • Matrix optimization, sparse optimization, low-rank matrix factorization

  • Applications in data science, materials science, and other areas

Sparse and low-rank matrix reconstruction

Compressive sensing aims to recover the sparsest solution from relatively few incomplete measurements, typically by solving an l1-norm regularized problem. The primary challenges are the nonsmoothness of the objective and the need to handle data sets at extremely large scales. In [40, 41] we proposed a framework that separates the task of identifying the support of the solution from the task of recovering its magnitudes, using different optimization tools for each. Specifically, inexpensive first-order approaches are used to produce a sparsity pattern, and then second-order methods are applied to a smaller but smooth subspace problem defined by the estimated support. Hence, the computational cost of each stage remains affordable. The corresponding code FPC_AS has been downloaded more than 2000 times. Extensive numerical comparisons in [2, 21, 13] show that our method is competitive in many cases.
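
To make the two-stage idea concrete, here is a minimal Python sketch (not the FPC_AS implementation): plain shrinkage (ISTA) iterations produce a support estimate, and the magnitudes are then refined on the small smooth subproblem restricted to that support. The function names, the support-selection heuristic, and all parameter values are illustrative.

```python
# Minimal sketch of the two-stage idea: first-order shrinkage to identify a
# support, then a smooth subspace problem on the estimated support.
# Illustration only, not the FPC_AS code.
import numpy as np

def shrink(x, t):
    """Soft-thresholding, the proximal operator of t*||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def two_stage_l1(A, b, mu, iters=200):
    m, n = A.shape
    step = 1.0 / np.linalg.norm(A, 2) ** 2        # 1/L for the smooth term
    x = np.zeros(n)
    # Stage 1: proximal gradient (ISTA) on 0.5*||Ax - b||^2 + mu*||x||_1
    for _ in range(iters):
        x = shrink(x - step * A.T @ (A @ x - b), step * mu)
    # Simple heuristic to extract the estimated support
    support = np.flatnonzero(np.abs(x) > 0.1 * np.abs(x).max())
    # Stage 2: refine the magnitudes on the small, smooth subspace problem
    x_ref = np.zeros(n)
    sol, *_ = np.linalg.lstsq(A[:, support], b, rcond=None)
    x_ref[support] = sol
    return x_ref

# Example: recover a 10-sparse signal from 80 random measurements
rng = np.random.default_rng(0)
A = rng.standard_normal((80, 200))
x0 = np.zeros(200)
x0[rng.choice(200, 10, replace=False)] = rng.choice([-1.0, 1.0], 10)
b = A @ x0
x = two_stage_l1(A, b, mu=0.05 * np.abs(A.T @ b).max())
print(np.linalg.norm(x - x0))
```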

We proposed nonconvex factorization methods for low-rank matrix optimization problems, including matrix completion [42] and matrix separation (robust PCA) [28]. Solving convex models of these problems typically requires costly operations such as singular value decompositions at every iteration. The nonconvex factorization models, together with the alternating direction method, improve our capacity for handling large-scale problems; in particular, the linear least squares subproblems are solved in an equivalent yet cheaper way. The approach has been applied to seismic data reconstruction [47], and it performs well in comparisons with several other algorithms [32, 22].
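
Below is a plain alternating least squares sketch of the factorization model for matrix completion, meant only to illustrate the idea; the actual solver in [42] uses a nonlinear successive over-relaxation scheme, and all names and parameter values here are illustrative.

```python
# Alternating least squares sketch of the low-rank factorization model for
# matrix completion: min_{X,Y} 0.5*||P_Omega(X Y - M)||_F^2.
# Illustration only; the solver in [42] uses a nonlinear SOR scheme.
import numpy as np

def altmin_completion(M, mask, k, iters=100):
    m, n = M.shape
    rng = np.random.default_rng(0)
    X = rng.standard_normal((m, k))
    Y = rng.standard_normal((k, n))
    Z = mask * M                            # observed entries, zeros elsewhere
    for _ in range(iters):
        W = Z + (1 - mask) * (X @ Y)        # fill unobserved entries with the model
        X = W @ np.linalg.pinv(Y)           # argmin_X ||X Y - W||_F
        Y = np.linalg.pinv(X) @ W           # argmin_Y ||X Y - W||_F
    return X, Y

# Example: complete a rank-3 matrix from 40% of its entries
rng = np.random.default_rng(1)
M = rng.standard_normal((100, 3)) @ rng.standard_normal((3, 80))
mask = (rng.random(M.shape) < 0.4).astype(float)
X, Y = altmin_completion(M, mask, k=3)
print(np.linalg.norm(X @ Y - M) / np.linalg.norm(M))
```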

Semidefinite programming (SDP)

Semidefinite programming is concerned with minimizing a linear function of a symmetric positive semidefinite matrix X subject to linear constraints. In [35], we present a row-by-row (RBR) method. By fixing any (n−1)-dimensional principal submatrix of X and using its Schur complement, the positive semidefinite constraint is reduced to a simple second-order cone constraint, and a sequence of second-order cone programming subproblems constructed from the primal augmented Lagrangian function is then solved. With suitable modifications, the method can be used for sparse principal component analysis [51] and phase retrieval [33]. In [36], we present and analyze an alternating direction method of multipliers (ADMM) for solving SDP problems. At each iteration, the algorithm minimizes the dual augmented Lagrangian function over each block of dual variables separately while the other blocks are fixed, and then updates the primal variables. This approach has been applied to exact and stable recovery of rotations for robust synchronization [34]. The evaluations in [29, 26, 25, 24, 46, 23] demonstrate that our method can be efficient in many cases.
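
The following is a minimal sketch (not the solver of [36]) of this dual ADMM on a toy SDP with the constraint diag(X) = b, chosen so that A A* = I and every subproblem is explicit; the dominant cost per iteration is one eigenvalue decomposition for the projection onto the PSD cone. The function names and the fixed penalty parameter mu are illustrative.

```python
# Dual ADMM sketch for  min <C, X>  s.t.  diag(X) = b,  X PSD.
# Blocks are updated in the order y, S, X; S-update is a PSD projection.
import numpy as np

def proj_psd(W):
    """Projection onto the cone of positive semidefinite matrices."""
    d, V = np.linalg.eigh(W)
    return (V * np.maximum(d, 0.0)) @ V.T

def admm_sdp_diag(C, b, mu=1.0, iters=500):
    n = C.shape[0]
    X = np.eye(n)                       # primal variable (multiplier of the dual)
    S = np.zeros((n, n))                # dual slack variable
    for _ in range(iters):
        # y-update: minimize the dual augmented Lagrangian over y (here A A* = I)
        y = (b - np.diag(X)) / mu + np.diag(C - S)
        W = C - np.diag(y) - X / mu
        S = proj_psd(W)                 # S-update: projection onto the PSD cone
        X = mu * proj_psd(-W)           # X-update: multiplier step, automatically PSD
    return X, y, S

rng = np.random.default_rng(0)
B = rng.standard_normal((30, 30)); C = (B + B.T) / 2
b = np.ones(30)
X, y, S = admm_sdp_diag(C, b)
print(np.abs(np.diag(X) - b).max(), np.trace(C @ X) - b @ y)  # feasibility, duality gap
```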

Recently, we developed a second-order type method [45, 14] based on the equivalence between ADMM and the Douglas-Rachford splitting (DRS). This equivalence allows the SDP to be reformulated as a system of nonlinear equations. Although these equations may be non-differentiable, they are often semi-smooth, and their generalized Jacobian is positive semidefinite due to monotonicity. We propose an adaptive semi-smooth Newton method and establish its convergence to global optimality. Our second-order type algorithms can achieve superlinear or quadratic convergence under certain assumptions. Numerical results show that our algorithm can be faster than SDPNAL+ [46], one of the best available codes, while achieving comparable accuracy.
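
Written for a generic composite problem min_x f(x) + g(x) (rather than the SDP-specific operators of [45, 14]), the reformulation described above amounts to the DRS fixed-point residual and a regularized semi-smooth Newton step of the following schematic form.

```latex
% DRS fixed-point residual for min_x f(x) + g(x) with step size t > 0
% (schematic; the SDP-specific operators are given in [45, 14]):
\[
  F(z) \;=\; \operatorname{prox}_{tg}(z)
        \;-\; \operatorname{prox}_{tf}\!\bigl(2\,\operatorname{prox}_{tg}(z) - z\bigr)
        \;=\; 0,
  \qquad x^{\star} \;=\; \operatorname{prox}_{tg}(z^{\star}).
\]
% A regularized semi-smooth Newton step solves, with J_k in the generalized
% Jacobian of F at z_k and a regularization parameter mu_k > 0,
\[
  (J_k + \mu_k I)\, d_k \;=\; -F(z_k).
\]
```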

Optimization with orthogonality constraints

Minimization under orthogonality constraints and/or spherical constraints [39] arises in many problems in science and engineering, such as polynomial optimization, combinatorial optimization, linear and nonlinear eigenvalue problems, sparse PCA, density functional theory, Bose-Einstein condensates, phase retrieval, Cryo-EM, and matrix rank minimization. These problems are difficult because the constraints are not only non-convex but also numerically expensive to preserve during iterations. To deal with these difficulties, we apply the Cayley transform (a Crank-Nicolson-like update scheme) to maintain the constraints and, based on it, develop curvilinear search algorithms whose per-iteration cost in flops is lower than that of methods based on projections and geodesics. Under certain conditions the globally optimal solutions can even be identified, which is useful in synchronization and community detection [1]. The extensive numerical comparisons in [8, 10] show that our method is indeed efficient. The algorithms have been demonstrated on a variety of the problems mentioned above [39, 6, 31, 11, 50, 44]. In particular, they can provide multi-fold accelerations over SDP algorithms on several problems arising from polynomial optimization, combinatorial optimization, and correlation matrix estimation. They have also been applied to robust orthogonal complement principal component analysis in [27] and to Bayesian group factor analysis with structured sparsity in [52].
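
As an illustration of the constraint-preserving update (not the full curvilinear search of [39]), the sketch below applies the Cayley-transform step with a fixed step size to a toy problem of maximizing tr(X^T A X) over X^T X = I; all names and parameter values are illustrative.

```python
# Cayley-transform (Crank-Nicolson-like) feasible update for min f(X)
# subject to X^T X = I; a fixed step size replaces the curvilinear search
# used in [39].  Illustration only.
import numpy as np

def cayley_step(X, G, tau):
    """Y(tau) = (I + tau/2 W)^{-1} (I - tau/2 W) X with W = G X^T - X G^T;
    W is skew-symmetric, so Y(tau) stays on the Stiefel manifold."""
    n = X.shape[0]
    W = G @ X.T - X @ G.T
    return np.linalg.solve(np.eye(n) + 0.5 * tau * W,
                           (np.eye(n) - 0.5 * tau * W) @ X)

# Toy problem: leading 5-dimensional eigenspace of a symmetric A,
# i.e. minimize f(X) = -0.5 * tr(X^T A X) over X^T X = I.
rng = np.random.default_rng(0)
B = rng.standard_normal((50, 50)); A = (B + B.T) / 2
X = np.linalg.qr(rng.standard_normal((50, 5)))[0]
tau = 0.5 / np.linalg.norm(A, 2)
for _ in range(1000):
    X = cayley_step(X, -A @ X, tau)            # Euclidean gradient of f is -A X
print(np.linalg.norm(X.T @ X - np.eye(5)),     # feasibility (machine precision)
      np.trace(X.T @ A @ X),                   # achieved objective value
      np.sort(np.linalg.eigvalsh(A))[-5:].sum())
```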

Recently, we proposed an adaptive regularized Newton method [37, 44, 7] which approximates the original objective function by its second-order Taylor expansion in Euclidean space while keeping the orthogonality constraints. The regularization term in the objective function of the subproblem enables us to establish a Cauchy-point-like condition, as in the standard trust-region method, for proving global convergence. The subproblem can be solved inexactly either by first-order methods or by a truncated Riemannian Newton method; in the latter case, it can further take advantage of negative curvature directions. This Newton-type method can be more efficient than the gradient-type method [39] in density functional theory [37] and for Bose-Einstein condensates [44].
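
Schematically, the subproblem solved at each iteration has the following form (our notation; sigma_k denotes the adaptive regularization parameter).

```latex
% Regularized Newton subproblem at iterate X_k: a quadratic Euclidean model of
% f plus a proximal regularization term, minimized subject to the orthogonality
% constraints (schematic form; sigma_k is adapted as in a trust-region framework).
\[
  \min_{X^{\mathsf T} X = I}\;
    \langle \nabla f(X_k),\, X - X_k \rangle
    + \tfrac{1}{2}\,\langle \nabla^2 f(X_k)[X - X_k],\, X - X_k \rangle
    + \tfrac{\sigma_k}{2}\, \lVert X - X_k \rVert_F^2 .
\]
```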

Linear eigenvalue optimization

The linear eigenvalue problem is a typical optimization problem with orthogonality constraints. Most iterative algorithms for eigenpair computation consist of two main steps: a subspace update (SU) step that generates bases for approximate eigenspaces, followed by a Rayleigh-Ritz (RR) projection step that extracts approximate eigenpairs. So far the predominant methodology for the SU step is based on Krylov subspaces, which build orthonormal bases piece by piece in a sequential manner. In [43, 18, 38, 17], we investigate block methods for the SU step that allow a higher level of concurrency than is reachable by Krylov subspace methods. A limited memory block Krylov subspace optimization approach is studied in [17]; the subspace optimization technique can significantly accelerate the classic simultaneous iteration method. In [18], we further derive and study a Gauss-Newton method for computing a symmetric low-rank product that is closest to a given symmetric matrix in Frobenius norm. Our Gauss-Newton method, which has a particularly simple form, shares the same order of iteration complexity as a gradient method but can be significantly faster on a broad range of problems. Global convergence and a Q-linear convergence rate are established.
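
The SU/RR split can be seen in the bare-bones block eigensolver below, in which the SU step is plain simultaneous (subspace) iteration; the limited-memory and augmented variants of [17, 43] replace this SU step with richer subspaces. Function names and parameters are illustrative.

```python
# Bare-bones block eigensolver: subspace-update (SU) step by simultaneous
# iteration, followed by a Rayleigh-Ritz (RR) projection step.
import numpy as np

def block_eig(A, k, iters=300):
    n = A.shape[0]
    rng = np.random.default_rng(0)
    X = np.linalg.qr(rng.standard_normal((n, k)))[0]
    for _ in range(iters):
        X = np.linalg.qr(A @ X)[0]       # SU step: block multiply + orthonormalize
        T = X.T @ A @ X                  # RR step: project onto the subspace ...
        w, V = np.linalg.eigh(T)         # ... and extract approximate eigenpairs
        X = X @ V
    return w, X

rng = np.random.default_rng(1)
B = rng.standard_normal((200, 200)); A = B @ B.T         # SPD test matrix
w, X = block_eig(A, k=6)
print(np.linalg.norm(A @ X - X * w))                      # residual of the six Ritz pairs
```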

To achieve speed competitive with state-of-the-art eigensolvers, we propose in [43] an augmented Rayleigh-Ritz (ARR) procedure and analyze its rate of convergence under realistic conditions. Combining this ARR procedure with a set of polynomial accelerators, and utilizing a few other techniques such as continuation and deflation, we construct a block algorithm designed to reduce the number of RR steps and to elevate concurrency in the SU steps. Extensive computational experiments are conducted in C on a representative set of test problems to evaluate the performance of our algorithm against the well-established, high-quality eigensolvers ARPACK, FEAST, and SLEPc. Numerical results, obtained on a many-core computer without explicit code parallelization, show that when computing a relatively large number of eigenpairs, the performance of our algorithm is competitive with, and frequently superior to, that of these three state-of-the-art eigensolvers. Moreover, the algorithm possesses a higher degree of concurrency than Krylov subspace methods, thus offering better scalability on modern multi/many-core computers.

Our code LMSVD has been used in [20] for computing symmetric tensor network states, in [12] for estimating extreme singular values and vectors in the tensor train format, and in [3] for ultrasonic beamforming clutter reduction. Our Gauss-Newton method has been extended in [30] to low-rank matrix optimization.

Nonlinear eigenvalue optimization

Density functional theory (DFT) in electronic structure calculation is an important source of optimization problems with orthogonality constraints. The Kohn-Sham (KS) equations seek orthogonal eigenvectors that satisfy a nonlinear eigenvalue problem, while the KS minimization problem minimizes the KS total energy functional under orthogonality constraints; the two formulations are connected by the optimality conditions. In [16, 15], we study a few theoretical issues in the discretized KS density functional theory. The equivalence between a local or global minimizer of the KS total energy minimization problem and a solution of the KS equation is established under certain assumptions. The nonzero charge densities of a strong local minimizer are shown to be uniformly bounded below by a positive constant. We analyze the self-consistent field (SCF) iteration by formulating the KS equation as a fixed point map with respect to the potential, and the Jacobian of this fixed point map is derived explicitly. Both global and local convergence of the simple mixing scheme can be established if the gap between the occupied and unoccupied states is sufficiently large. This assumption can be relaxed if the charge density is computed using the Fermi-Dirac distribution, and it is not required if the exchange-correlation functional is zero. Although our assumption on the gap is very stringent, the analysis is still valuable for a better understanding of the KS minimization problem, the KS equation, and the SCF iteration.
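
The fixed-point view of SCF can be illustrated on a toy nonlinear eigenvalue problem in which the "Hamiltonian" is a discrete Laplacian plus a density-dependent diagonal potential; this is only a schematic of simple mixing, not a DFT code, and the names and parameter values (alpha, beta) are made up for the illustration.

```python
# Toy SCF iteration with simple mixing for the schematic nonlinear eigenvalue
# problem H(rho) X = X Lambda, H(rho) = L + alpha*Diag(rho), rho = diag(X X^T),
# where X holds the k lowest eigenvectors.  Illustration only, not a DFT code.
import numpy as np

def scf_simple_mixing(L, alpha, k, beta=0.5, iters=100):
    n = L.shape[0]
    rho = np.full(n, k / n)                       # initial charge density
    for _ in range(iters):
        H = L + alpha * np.diag(rho)              # density-dependent "Hamiltonian"
        w, V = np.linalg.eigh(H)
        rho_new = np.sum(V[:, :k] ** 2, axis=1)   # density of the k occupied states
        err = np.linalg.norm(rho_new - rho)       # self-consistency residual
        rho = (1 - beta) * rho + beta * rho_new   # simple mixing
    return rho, w[:k], err

n, nocc = 100, 5
L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1D discrete Laplacian
rho, evals, err = scf_simple_mixing(L, alpha=0.1, k=nocc)
print(err, evals)
```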

The SCF method with heuristics such as charge mixing often works remarkably well on many problems, but it is well known that its convergence can be unpredictable. Hence, we have developed a few efficient algorithms from the point of view of solving the KS minimization problem. In [31, 50], gradient-type methods are shown to be competitive with the SCF iteration. We further regularize the SCF iteration and establish rigorous global convergence to the first-order optimality conditions in [37], where the Hessian of the total energy functional is exploited. By adding the part of the Hessian that SCF does not consider, our methods can always achieve a highly accurate solution on problems for which SCF fails and exhibit a better convergence rate than the SCF implementation in the Matlab toolbox KSSOLV. It has been noted in [5, 53] that our analysis and approaches for DFT are particularly useful.

Pursuing global solutions of orthogonality-constrained optimization

We have constructed an efficient algorithm for finding approximate global minimizers of problems with orthogonality constraints [49]. The main idea is a noisy gradient flow constructed from stochastic differential equations (SDEs) on the Stiefel manifold, the differential geometric characterization of orthogonality constraints. We derive an explicit representation of the SDE on the Stiefel manifold endowed with the canonical metric and propose a numerically efficient scheme, based on the Cayley transformation, to simulate this SDE with a theoretical convergence guarantee. Convergence to global optimizers is proved when the stochastic diffusion is sufficiently strong. The effectiveness and efficiency of the proposed algorithms are demonstrated on a variety of problems including homogeneous polynomial optimization, computation of the stability number, the iterative closest point problem, and 3D structure determination from common lines in Cryo-EM.
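
A crude illustration of the idea (not the SDE discretization derived in [49]): Gaussian noise is injected into the gradient before the skew-symmetric Cayley direction is formed, so every iterate stays exactly feasible, and the noise level is annealed toward zero; the annealing schedule and all parameter values are made up for the illustration.

```python
# Annealed stochastic diffusion combined with the feasibility-preserving
# Cayley update.  Illustration only, not the scheme of [49].
import numpy as np

def noisy_cayley_step(X, G, tau, sigma, rng):
    n = X.shape[0]
    Gn = G + sigma * rng.standard_normal(G.shape)    # perturbed gradient
    W = Gn @ X.T - X @ Gn.T                           # skew-symmetric direction
    return np.linalg.solve(np.eye(n) + 0.5 * tau * W,
                           (np.eye(n) - 0.5 * tau * W) @ X)

# Toy problem: maximize tr(X^T A X) over X^T X = I with decaying noise
rng = np.random.default_rng(0)
B = rng.standard_normal((30, 30)); A = (B + B.T) / 2
X = np.linalg.qr(rng.standard_normal((30, 3)))[0]
tau = 0.2 / np.linalg.norm(A, 2)
for t in range(2000):
    sigma = 5.0 * 0.997 ** t                          # annealed noise level
    X = noisy_cayley_step(X, -A @ X, tau, sigma, rng)
print(np.trace(X.T @ A @ X), np.sort(np.linalg.eigvalsh(A))[-3:].sum())
```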

In [19], we consider a nonlinear least squares model and propose a modified Levenberg-Marquardt (LM) method for phase retrieval. Under assumptions similar to those of the well-known Wirtinger flow (WF) algorithm [4], global linear convergence and local quadratic convergence to the global solution set are established by estimating the smallest nonzero eigenvalues of the Gauss-Newton matrix, establishing local error bound properties, and constructing a modified regularization condition. The computational cost becomes tractable when a preconditioned conjugate gradient (PCG) method is applied to solve the LM equation inexactly. Specifically, the preconditioner is constructed from the expectation of the LM coefficient matrix by assuming independence between the measurements and the iterates. Our numerical experiments show that the algorithm is robust and often faster than the WF method on both random examples and natural image recovery.
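
To fix ideas, here is a simplified real-valued LM sketch for phase retrieval with squared measurements y_i = (a_i^T x)^2, solving the LM equation matrix-free with SciPy's plain CG rather than the expectation-based preconditioner of [19]; the spectral initialization, the complex-valued model, and the adaptive choice of the damping parameter mu are omitted, and every name and value below is illustrative.

```python
# Simplified real-valued Levenberg-Marquardt sketch for phase retrieval with
# residuals r_i(x) = (a_i^T x)^2 - y_i, so the Jacobian is J = 2*diag(Ax)*A.
# The LM equation (J^T J + mu I) d = -J^T r is solved matrix-free by CG;
# [19] instead uses a preconditioned CG and an adaptive mu.  Illustration only.
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def lm_phase_retrieval(A, y, x0, mu=1.0, iters=30):
    x = x0.copy()
    for _ in range(iters):
        Ax = A @ x
        r = Ax ** 2 - y                         # residual of squared measurements
        g = 2 * A.T @ (Ax * r)                  # gradient J^T r
        def matvec(d):                          # (J^T J + mu I) d without forming J
            return 4 * A.T @ (Ax ** 2 * (A @ d)) + mu * d
        H = LinearOperator((x.size, x.size), matvec=matvec)
        d, _ = cg(H, -g, maxiter=50)
        x = x + d
    return x

rng = np.random.default_rng(0)
n, m = 64, 6 * 64
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
y = (A @ x_true) ** 2
x = lm_phase_retrieval(A, y, x0=x_true + 0.1 * rng.standard_normal(n))
print(min(np.linalg.norm(x - x_true), np.linalg.norm(x + x_true)))  # up to global sign
```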

Problems with combinatorial structures

We have studied first-order type methods for optimization problems with combinatorial structures. In [9], we relax the permutation matrices to the more tractable doubly stochastic matrices and add an Lp-norm (0<p<1) regularization term to the objective function. The optimal solutions of the Lp-regularized problem coincide with those of the original problem if the regularization parameter is sufficiently large. A lower bound on the nonzero entries of the stationary points and some connections between the local minimizers and the permutation matrices are further established. We then propose a regularization algorithm with local refinements. The algorithm approximately solves a sequence of regularization subproblems by the projected gradient method, using a nonmonotone line search with Barzilai-Borwein step sizes. Its performance can be further improved by combining it with certain local search methods, cutting plane techniques, and a new negative proximal point scheme. Extensive numerical results on QAPLIB and the bandwidth minimization problem show that the proposed algorithms can often find reasonably high-quality solutions within a competitive amount of time.
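
One building block of this approach is the Euclidean projection onto the doubly stochastic matrices required by the projected gradient method; a sketch using Dykstra's alternating projections is given below, together with a toy projected-gradient loop on a QAP-type objective. The Lp regularization, the BB step sizes, the nonmonotone line search, and the local refinements of [9] are all omitted, and the names and parameters are illustrative.

```python
# Projection onto the doubly stochastic matrices (Birkhoff polytope) by
# Dykstra's alternating projections, plus a toy projected-gradient loop for
# min tr(A X B X^T).  Illustration only; see [9] for the full algorithm.
import numpy as np

def proj_rowcol(X):
    """Projection onto the affine set {X : X 1 = 1, X^T 1 = 1}."""
    n = X.shape[0]
    r = X.sum(axis=1) - 1
    c = X.sum(axis=0) - 1
    s = r.sum()
    u = -(r - s / (2 * n)) / n
    v = -(c - s / (2 * n)) / n
    return X + u[:, None] + v[None, :]

def proj_doubly_stochastic(X, iters=200):
    """Dykstra's method for the intersection with the nonnegative orthant."""
    P = np.zeros_like(X); Q = np.zeros_like(X); Y = X
    for _ in range(iters):
        Z = proj_rowcol(Y + P); P = Y + P - Z
        Y = np.maximum(Z + Q, 0.0); Q = Z + Q - Y
    return Y

# Toy QAP relaxation: projected gradient with a fixed step size
rng = np.random.default_rng(0)
n = 20
A = rng.random((n, n)); A = (A + A.T) / 2
B = rng.random((n, n)); B = (B + B.T) / 2
X = np.full((n, n), 1.0 / n)
step = 0.5 / (np.linalg.norm(A, 2) * np.linalg.norm(B, 2))
for _ in range(200):
    G = A @ X @ B + A.T @ X @ B.T            # gradient of tr(A X B X^T)
    X = proj_doubly_stochastic(X - step * G)
print(np.abs(X.sum(axis=0) - 1).max(), np.abs(X.sum(axis=1) - 1).max(), X.min())
```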

In [48], we study treatment plan optimization in volumetric modulated arc therapy (VMAT) for cancer treatment. It is a complicated integer programming problem due to the physical constraints. We alternately minimize over the aperture shapes and the beam intensities. The proposed algorithm is further improved by an incremental random importance sampling of the voxels to reduce the computational cost of the energy functional. Numerical simulations on two clinical cancer data sets demonstrate that our method is highly competitive with state-of-the-art algorithms in terms of both computational time and quality of the treatment plan.

References:

  1. Afonso S. Bandeira, Nicolas Boumal, and Vladislav Voroninski. On the low-rank approach for semidefinite programs arising in synchronization and community detection. February 2016.

  2. Stephen Becker, Jerome Bobin, and Emmanuel J. Candes. NESTA: a fast and accurate first-order method for sparse recovery. SIAM J. Imaging Sci., 4(1):1–39, 2011.

  3. Brett Byram, Kazuyuki Dei, Jaime Tierney, and Douglas Dumont. A model and regularization scheme for ultrasonic beamforming clutter reduction. IEEE Trans. Ultrason. Ferroelectr. Freq. Control, 62(11):1913–1927, November 2015.

  4. Emmanuel J. Candes, Xiaodong Li, and Mahdi Soltanolkotabi. Phase retrieval via Wirtinger flow: theory and algorithms. IEEE Trans. Inf. Theory, 61(4):1985–2007, April 2015.

  5. Xiaoying Dai, Zhuang Liu, Liwei Zhang, and Aihui Zhou. A conjugate gradient method for electronic structure calculations. January 2016.

  6. Donald Goldfarb, Zaiwen Wen, and Wotao Yin. A curvilinear search method for p-harmonic flows on spheres. SIAM J. Imaging Sci., 2(1):84–109, 2009.

  7. Jiang Hu, Andre Milzarek, Zaiwen Wen, and Yaxiang Yuan. Adaptive regularized Newton method for Riemannian optimization. Technical report, Peking University, 2017.

  8. Bo Jiang and Yu-Hong Dai. A framework of constraint preserving update schemes for optimization on Stiefel manifold. Math. Program., 153(2):535–575, November 2015.

  9. Bo Jiang, Ya-Feng Liu, and Zaiwen Wen. Lp-norm regularization algorithms for optimization over permutation matrices. SIAM J. Optim., 26(4):2284–2313, 2016.

  10. Rongjie Lai and Stanley Osher. A splitting method for orthogonality constrained problems. J. Sci. Comput., 58(2):431–449, February 2014.

  11. Rongjie Lai, Zaiwen Wen, Wotao Yin, Xianfeng Gu, and Lok Ming Lui. Folding-free global conformal mapping for genus-0 surfaces by harmonic energy minimization. J. Sci. Comput., 58(3):705–725, 2014.

  12. Namgil Lee and Andrzej Cichocki. Estimating a few extreme singular values and vectors for large-scale matrices in tensor train format. SIAM J. Matrix Anal. Appl., 36(3):994–1014, 2015.

  13. Xudong Li, Defeng Sun, and Kim-Chuan Toh. A highly efficient semismooth Newton augmented Lagrangian method for solving Lasso problems. July 2016.

  14. Yongfeng Li, Zaiwen Wen, Chao Yang, and Yaxiang Yuan. A Semi-smooth Newton Method for Solving Semidefinite Programs in Electronic Structure Calculations. Technical report, Peking University, 2017.

  15. Xin Liu, Xiao Wang, Zaiwen Wen, and Yaxiang Yuan. On the convergence of the self-consistent field iteration in Kohn-Sham density functional theory. SIAM J. Matrix Anal. Appl., 35(2):546–558, 2014.

  16. Xin Liu, Zaiwen Wen, Xiao Wang, Michael Ulbrich, and Yaxiang Yuan. On the analysis of the discretized Kohn-Sham density functional theory. SIAM J. Numer. Anal., 53(4):1758–1785, 2015.

  17. Xin Liu, Zaiwen Wen, and Yin Zhang. Limited memory block Krylov subspace optimization for computing dominant singular value decompositions. SIAM J. Sci. Comput., 35(3):A1641–A1668, 2013.

  18. Xin Liu, Zaiwen Wen, and Yin Zhang. An efficient Gauss-Newton algorithm for symmetric low-rank product matrix approximations. SIAM J. Optim., 25(3):1571–1608, 2015.

  19. Chao Ma, Xin Liu, and Zaiwen Wen. Globally convergent Levenberg-Marquardt method for phase retrieval. 2016.

  20. Jia-Wei Mei, Ji-Yao Chen, Huan He, and Xiao-Gang Wen. Gapped spin liquid with Z2 topological order for the kagome Heisenberg model. June 2016.

  21. Andre Milzarek and Michael Ulbrich. A semismooth Newton method with multidimensional filter globalization for l1-optimization. SIAM J. Optim., 24(1):298–333, 2014.

  22. B. Mishra, G. Meyer, F. Bach, and R. Sepulchre. Low-rank optimization with trace norm penalty. SIAM J. Optim., 23(4):2124–2149, 2013.

  23. Renato D. C. Monteiro, Camilo Ortiz, and Benar F. Svaiter. A first-order block-decomposition method for solving two-easy-block structured semidefinite programs. Math. Program. Comput., 6(2):103–150, 2014.

  24. Renato D. C. Monteiro, Camilo Ortiz, and Benar F. Svaiter. Implementation of a block-decomposition algorithm for solving large-scale conic semidefinite programming problems. Comput. Optim. Appl., 57(1):45–69, January 2014.

  25. Brendan O’Donoghue, Eric Chu, Neal Parikh, and Stephen Boyd. Conic optimization via operator splitting and homogeneous self-dual embedding. J. Optim. Theory Appl., 169(3):1042–1068, June 2016.

  26. Neal Parikh and Stephen Boyd. Block splitting for distributed optimization. Math. Program. Comput., 6(1):77–102, 2014.

  27. Yiyuan She, Shijie Li, and Dapeng Wu. Robust orthogonal complement principal component analysis. J. Am. Stat. Assoc., 111(514):763–771, June 2016.

  28. Y. Shen, Z. Wen, and Y. Zhang. Augmented Lagrangian alternating direction method for matrix separation based on low-rank factorization. Optim. Methods Softw., 29(2):239–263, 2014.

  29. Defeng Sun, Kim-Chuan Toh, and Liuqin Yang. A convergent 3-block semiproximal alternating direction method of multipliers for conic programming with 4-type constraints. SIAM J. Optim., 25(2):882–915, 2015.

  30. Quoc Tran-Dinh and Zheqi Zhang. Extended Gauss-Newton and Gauss-Newton-ADMM algorithms for low-rank matrix optimization. June 2016.

  31. Michael Ulbrich, Zaiwen Wen, Chao Yang, Dennis Klöckner, and Zhaosong Lu. A proximal gradient method for ensemble density functional theory. SIAM J. Sci. Comput., 37(4):A1975–A2002, 2015.

  32. Bart Vandereycken. Low-rank matrix completion by Riemannian optimization. SIAM J. Optim., 23(2):1214–1236, 2013.

  33. Irene Waldspurger, Alexandre d’Aspremont, and Stephane Mallat. Phase recovery, MaxCut and complex semidefinite programming. Math. Program., 149(1-2):47–81, February 2015.

  34. Lanhui Wang and Amit Singer. Exact and stable recovery of rotations for robust synchronization. Inf. Inference, 2(2):145–193, 2013.

  35. Zaiwen Wen, Donald Goldfarb, and Katya Scheinberg. Block coordinate descent methods for semidefinite programming. In Handbook on semidefinite, conic and polynomial optimization, volume 166 of Internat. Ser. Oper. Res. Management Sci., pages 533–564. Springer, New York, 2012.

  36. Zaiwen Wen, Donald Goldfarb, and Wotao Yin. Alternating direction augmented Lagrangian methods for semidefinite programming. Math. Program. Comput., 2(3-4):203–230, 2010.

  37. Zaiwen Wen, Andre Milzarek, Michael Ulbrich, and Hongchao Zhang. Adaptive regularized self-consistent field iteration with exact Hessian for electronic structure calculation. SIAM J. Sci. Comput., 35(3):A1299–A1324, 2013.

  38. Zaiwen Wen, Chao Yang, Xin Liu, and Yin Zhang. Trace-penalty minimization for large-scale eigenspace computation. J. Sci. Comput., 66(3):1175–1203, 2016.

  39. Zaiwen Wen and Wotao Yin. A feasible method for optimization with orthogonality constraints. Math. Program., 142(1-2, Ser. A):397–434, 2013.

  40. Zaiwen Wen, Wotao Yin, Donald Goldfarb, and Yin Zhang. A fast algorithm for sparse reconstruction based on shrinkage, subspace optimization, and continuation. SIAM J. Sci. Comput., 32(4):1832–1857, 2010.

  41. Zaiwen Wen, Wotao Yin, Hongchao Zhang, and Donald Goldfarb. On the convergence of an active-set method for l1 minimization. Optim. Methods Softw., 27(6):1127–1146, 2012.

  42. Zaiwen Wen, Wotao Yin, and Yin Zhang. Solving a low-rank factorization model for matrix completion by a nonlinear successive over-relaxation algorithm. Math. Program. Comput., 4(4):333–361, 2012.

  43. Zaiwen Wen and Yin Zhang. Accelerating Convergence by Augmented Rayleigh–Ritz Projections For Large-Scale Eigenpair Computation. SIAM J. Matrix Anal. Appl., 38(2):273–296, 2017.

  44. Xinming Wu, Zaiwen Wen, and Weizhu Bao. A regularized Newton method for computing ground states of Bose-Einstein condensates. J. Sci. Comput., 2017.

  45. Xiantao Xiao, Yongfeng Li, Zaiwen Wen, and Liwei Zhang. A regularized semi-smooth Newton method with projection steps for composite convex programs, 2016.

  46. Liuqin Yang, Defeng Sun, and Kim-Chuan Toh. SDPNAL+: a majorized semismooth Newton-CG augmented Lagrangian method for semidefinite programming with nonnegative constraints. Math. Program. Comput., 7(3):331–366, 2015.

  47. Yi Yang, Jianwei Ma, and Stanley Osher. Seismic data reconstruction via matrix completion. Inverse Probl. Imaging, 7(4):1379–1392, November 2013.

  48. Yu Yang, Bin Dong, and Zaiwen Wen. Randomized algorithms for high quality treatment planning in volumetric modulated arc therapy. Inverse Probl., 33(2), February 2017.

  49. Honglin Yuan, Xiaoyi Gu, Rongjie Lai, and Zaiwen Wen. Global optimization with orthogonality constraints via stochastic diffusion. Technical report, Peking University, 2017.

  50. Xin Zhang, Jinwei Zhu, Zaiwen Wen, and Aihui Zhou. Gradient type optimization methods for electronic structure calculations. SIAM J. Sci. Comput., 36(3):C265–C289, 2014.

  51. Youwei Zhang and Laurent El Ghaoui. Large-scale sparse principal component analysis with application to text data. In Proceedings of the 24th International Conference on Neural Information Processing Systems, NIPS’11, pages 532–539, USA, 2011. Curran Associates Inc.

  52. Shiwen Zhao, Chuan Gao, Sayan Mukherjee, and Barbara E. Engelhardt. Bayesian group factor analysis with structured sparsity. J. Mach. Learn. Res., 17(1), 2016.

  53. Zhi Zhao, Zheng-Jian Bai, and Xiao-Qing Jin. A Riemannian Newton algorithm for nonlinear eigenvalue problems. SIAM J. Matrix Anal. Appl., 36(2):752–774, 2015.