Pymanopt is a Python toolbox for optimization on manifolds, that computes gradients and Hessians automatically. It builds upon the Matlab toolbox Manopt but is otherwise independent of it. Pymanopt aims to lower the barriers for users wishing to use state of the art techniques for optimization on manifolds, by relying on automatic differentiation for computing gradients and Hessians, saving users time and saving them from potential calculation and implementiation errors.

Pymanopt is modular and hence easy to use. All of the automatic differentiation is done behind the scenes, so that the amount of setup the user needs to do is minimal. Usually only the following steps are required:

  1. Instantiate a manifold $\mathcal{M}$ to optimise over
  2. Define a cost function $f:\mathcal{M}\to \mathbb{R}$ to minimise
  3. Instantiate a Pymanopt solver

Experimenting with optimization on manifolds is simple with Pymanopt. The minimum working example below demonstrates this. The steps will be discussed in more detail in the subsequent three sections. Please refer to this example for a cool application of Riemannian optimization using Pymanopt for Inference in MoG models!

We encourage users and developers to report problems, request features, ask for help, or leave general comments either on github, gitter, or via email to one of the maintainers. Please refer to the dev documentation and the file if you wish to extend Pymanopt's functionality and/or contribute to the project. Pymanopt is distributed under the open source 3-clause BSD license. Please cite this JMLR paper if you publish work using Pymanopt (BibTeX).

import autograd.numpy as np

from pymanopt.manifolds import Stiefel
from pymanopt import Problem
from pymanopt.solvers import SteepestDescent

# (1) Instantiate a manifold
manifold = Stiefel(5, 2)

# (2) Define the cost function (here using autograd.numpy)
def cost(X): return np.sum(X)

problem = Problem(manifold=manifold, cost=cost)

# (3) Instantiate a Pymanopt solver
solver = SteepestDescent()

# let Pymanopt do the rest
Xopt = solver.solve(problem)



Pymanopt is compatible with Python 2.7 and Python 3.4+, and depends on NumPy and SciPy. Additionally, to use Pymanopt's built-in automatic differentiation, which we strongly recommend, you need to setup your cost functions using either Autograd, Theano or TensorFlow. If you're unfamiliar with these packages and not sure which to go for, it's probably best to start with Autograd. Autograd wraps thinly around NumPy, and is very simple to use, particularly if you're already familiar with NumPy (see below).

Instructions for installing NumPy, SciPy, and Theano on different operating systems can be found here, for installing Autograd here and for installing TensorFlow here.

Installing Pymanopt

Pymanopt can be installed with the following command:

pip install --user pymanopt


The rigorous mathematical definition of manifold is abstract and too technical for this tutorial. However if you're unfamiliar with the idea, it's fine just to visualize it as a smooth subset of Euclidean space. Simple examples include the surface of a sphere or of a torus, or the Möbius strip. If you need to solve an optimization problem with a search space which is constrained in some smooth way, then performing optimization on manifolds may well be the natural approach to take.

The manifolds that we currently support are listed below. We plan to implement more depending on the needs of users, so if there's a particular manifold you'd like to optimize over, please let us know. If you wish to implement your own manifold for Pymanopt, you will need to inherit from the Manifold abstract base class.

Euclidean manifold
Euclidean(n1, n2, ..., nk)
Unconstrained Euclidean space of shape (n1, n2, ..., nk) tensors. Useful for unconstrained problems or for unconstrained hyperparameters, as part of a product manifold.
from pymanopt.manifolds import Euclidean

# Space of shape (3, ) vectors
manifold = Euclidean(3)

# Space of (4 x 2) matrices
manifold = Euclidean(4, 2)
Symmetric matrices
Symmetric(n, k=1)
If k = 1, this is the manifold of n x n symmetric matrices. If k > 1 then this is a product of k symmetric matrices, arranged as an array of shape (k, n, n).
from pymanopt.manifolds import Symmetric

# Manifold of 3 x 3 symmetric matrices
manifold = Symmetric(3)
Sphere manifold
Sphere(n1, n2, ..., nk)
Shape (n1, n2, ..., nk) tensors with unit 2-norm.
from pymanopt.manifolds import Sphere

# The 'usual' sphere S^2, the set of points lying
# on the surface of a ball in 3D space:
manifold = Sphere(3)
Stiefel manifold
Stiefel(n, p, k=1)
Manifold $\mathrm{St}(n, p)$ of n x p matrices whose columns are orthonormal. That is $\mathrm{St}(n, p):=\{X\in \mathbb{R}^{n \times p}:X^TX=I_p\}$. If k > 1 then this instantiates a product $\mathrm{St}(n, p)^k$ of k Stiefel manifolds.
from pymanopt.manifolds import Stiefel

# Manifold of 3 x 2 matrices with orthonormal
# columns.
manifold = Stiefel(3, 2)
Grassmann manifold
Grassmann(n, p, k=1)
Manifold of p-dimensional subspaces of $\mathbb{R}^n$, denoted $\mathrm{Gr}(n, p)$. If optional argument k > 1, instantiates the product $\mathrm{Gr}(n, p)^k$ of k Grassmann manifolds.
If k = 1 then elements (subspaces) are represented by n x p Stiefel matrices whose columns span the subspace (see above for definition). If k > 1 then elements are represented by a k x n x p array containing k (n x p) Stiefel matrices.
from pymanopt.manifolds import Grassmann

# Manifold of planes through the origin in R^3
manifold = Grassmann(3, 2)
Oblique manifold
Oblique(m, n)
Manifold of m x n matrices with unit-norm columns. See this paper for an example doing Independent Components Analysis by optimizing over this manifold.
from pymanopt.manifolds import Oblique

# Two 3-vectors, each with norm 1
manifold = Oblique(3, 2)
Positive-definite matrices
PositiveDefinite(n, k=1)
If k = 1, this is the manifold of n x n positive-definite matrices. That is, symmetric matrices whose eigenvalues are all strictly positive. If k > 1 then this is a product of k positive definite matrices, arranged as an array of shape (k, n, n). This is useful in the Mixture of Gaussians example.
from pymanopt.manifolds import PositiveDefinite

# Manifold of 3 x 3 positive-definite matrices
manifold = PositiveDefinite(3)
Elliptope manifold
Elliptope(n, k)
Manifold of n x n positive-semidefinite matrices with rank k and unit diagonal elements. Note, a correlation matrix is of this form. Used in this paper for rank reduction of correlation matrices.
from pymanopt.manifolds import Elliptope

# Manifold of 3 x 3 correlation matrices
# of rank 2
manifold = Elliptope(3, 2)
Fixed rank positive-semidefinite matrices
PSDFixedRank(n, k)
PSDFixedRankComplex(n, k)
Manifold of real (or complex) valued n x n symmetric (or hermitian) positive-semidefinite matrices of rank k.
See this paper for details.
from pymanopt.manifolds import PSDFixedRank, PSDFixedRankComplex

# 3 x 3 rank 2 p.s.d. matrices
# Real
manifold = PSDFixedRank(3, 2)

# Complex
manifold = PSDFixedRankComplex(3, 2)
Fixed rank matrices
FixedRankEmbedded(m, n, k)
Manifold of real valued m x n matrices of rank k. This follows the embedded geometry described in this paper.
Note: this manifold is a work in progress, see here.
from pymanopt.manifolds import FixedRankEmbedded

# 5 x 4 rank 2 matrices
manifold = FixedRankEmbedded(5, 4, 2)
Product manifold
Product((man1, man2, ..., manN))
Product manifold of any of the manifolds listed above. This enables you to optimize over multiple manifolds simultaneously. Useful for problems with multiple parameters that have different constraints.
from pymanopt.manifolds import Product, Stiefel, Euclidean

# Product of Euclidean 3-vector with
# 5 x 2 Stiefel manifold.
manifold = Product((Euclidean(3), Stiefel(5, 2)))

Cost Functions

Together, a manifold and cost function fully describe a manifold optimization problem that is to be solved. They are combined into a Pymanopt Problem object that is then passed to a Pymanopt solver.

from pymanopt import Problem
problem = Problem(manifold=manifold, cost=cost)

The cost function passed to Pymanopt should take a single input (a point on the manifold), and return a scalar. You have three options for how to define the cost function:

Method Additional requirements
(a)Use AutogradNone
(b)Use TheanoNone
(c)Use TensorFlowNone
(d)Use a regular Python functionCalculate and implement derivatives (gradient and Hessian) by hand.

The first three options are recommended – indeed, they are what makes manifold optimization with Pymanopt so easy!

(a/b/c) Use Autograd, Theano or TensorFlow

Currently Pymanopt supports Autograd, Theano and TensorFlow as autodiff backends.

Setting up the cost function using Autograd

If you already know how to use NumPy, then this approach will be easy. Just import autograd.numpy and setup your cost as a Python function, using the autograd numpy to perform the computation.

import autograd.numpy as np

def cost(X):
    return np.exp(-np.sum(X**2))

problem = Problem(manifold=manifold, cost=cost)

Setting up the cost function using Theano

This approach requires you to setup your cost as a Theano graph. A tutorial on using Theano can be found here.

import theano.tensor as T

X = T.matrix()
cost = T.exp(-T.sum(X**2))

problem = Problem(manifold=manifold, cost=cost, arg=X)

Setting up the cost function using TensorFlow

This approach requires you to setup your cost as a TensorFlow graph. A tutorial on using TensorFlow can be found here.

import tensorflow as tf
import numpy as np

X = tf.Variable(tf.placeholder(tf.float32))
cost = tf.exp(-tf.reduce_sum(tf.square(X)))

problem = Problem(manifold=manifold, cost=cost, arg=X)

(d) Use a regular Python function

If you wish to use one of the derivative-free solvers (perhaps your cost function is nowhere smooth), then this approach makes sense. If you want to use a solver which requires derivatives (these solvers generally perform far better than derivative-free methods) this approach enables you to calculate and implement gradients and Hessians by hand.

Using Pymanopt in this way is similar to Manopt. You can refer to the Manopt tutorial and forum for advice on calculating gradients/hessians by hand. However, we would like to stress that there is little or no advantage to doing things in this way. The gradients and Hessians calculated by Pymanopt should be both correct and efficient.

problem = Problem(manifold=manifold, cost=cost, egrad=egrad, ehess=ehess)


The Pymanopt Solver classes provide the algorithms for optimization. Once a Pymanopt Problem object has been set up and a solver instantiated the optimization is run as follows:

xoptimal = solver.solve(problem)

The solvers' parameters are specified when instantiating the solver object. The following parameters are shared by all solvers (argumentname=defaultvalue):

Maximum time in seconds for the solver to run. You can set maxtime=float('inf') for no time limit.
Maximum number of iterations to run.
Terminate optimization if the norm of the gradient is below this.
Terminate optimization if the stepsize is below this.
Maximum number of allowed cost evaluations, especially important if cost evaluation is computationally expensive.
Level of information logged by the solver while it operates, 0 is silent, 2 is most information. If set to a non-zero value, the solver will return both the final point on the manifold as well as a dictionary that holds the log information, otherwise the solve routine only returns the final point, i.e.,
xoptimal = solver.solve(problem, logverbosity=0)
xoptimal, optlog = solver.solve(problem, logverbosity=2)
See below for a minimalistic example.

The solvers implemented in Pymanopt are listed below. Note, all of these are currently based on solvers from Manopt, and more details may be found on the Manopt website. Solvers may have individual parameters to adjust their behaviour. These are documented in the respective source files. If you wish to implement your own solver, you must inherit from the Solver abstract base class. The steepest_descent solver is reasonably simple and could be used as a model for custom solvers.

Trust regions
Second-order method that approximates the objective function by a quadratic surface and then updates the current estimate based on that.
from pymanopt.solvers import TrustRegions
solver = TrustRegions()
Xopt = solver.solve(problem)
Steepest descent
Classical first-order steepest descent algorithm. By default uses a back-tracking linesearch. To set linesearch parameters you can instantiate the LineSearchBackTracking object with your desired parameters and pass it to the SteepestDescent solver using the optional linesearch parameter: SteepestDescent(linesearch=LinesearchObject). You can also implement your own linesearch class, just take this code as a starting point.
from pymanopt.solvers import SteepestDescent
solver = SteepestDescent()
Xopt = solver.solve(problem)
Conjugate gradient
Classical first-order conjugate gradient descent algorithm. By default uses an adaptive linesearch. To set linesearch parameters you can instantiate the LineSearchAdaptive object with your desired parameters and pass it to the Conjugate gradient solver using the optional linesearch parameter: ConjugateGradient(linesearch=LinesearchObject). You can also implement your own linesearch class, just take this code as a starting point.
from pymanopt.solvers import ConjugateGradient
solver = ConjugateGradient()
Xopt = solver.solve(problem)
Derivative-free optimization
from pymanopt.solvers import NelderMead
solver = NelderMead()
Xopt = solver.solve(problem)
Particle swarm
Derivative-free optimization
from pymanopt.solvers import ParticleSwarm
solver = ParticleSwarm()
Xopt = solver.solve(problem)


Several examples that demonstrate how to use Pymanopt can be found here. Below are some examples that demonstrate certain use-cases that may be of general interest and give an idea of what can be done with Pymanopt.

Example Notes Links
Log optimization behaviour Demonstrates how to log and inspect the optimization behaviour, i.e. how the cost function's value changes over iterations, which of the stopping criterions caused the solver to stop, etc.
Riemannian optimization using Pymanopt for Inference in MoG models Given samples of a Mixture of Gaussians model, infer the parameters using optimization on manifolds instead of expectation maximisation (EM). MoG.html / MoG.ipynb
Linear regression as optimization over the product manifold Optimises the weights $w$ and the offset $b$ in the linear regression problem $\min_{w,b} ||Y-w^\top X-b||_2^2$, for given label vector $Y$ and covariates matrix $X$ over the product manifold $\mathbb{R}^3 \times \mathbb{R}$ to demonstrate optimization over product manifolds. / /
Solve the same optimization problems for several data instances Demonstrates how to solve the same optimization problems for several data instances, i.e., in this case solving a regression problem for five different datasets. /