123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394 |
- .. highlight:: c++
- .. default-domain:: cpp
- .. cpp:namespace:: ceres
- .. _chapter-nnls_covariance:
- =====================
- Covariance Estimation
- =====================
- Introduction
- ============
- One way to assess the quality of the solution returned by a non-linear
- least squares solver is to analyze the covariance of the solution.
- Let us consider the non-linear regression problem
- .. math:: y = f(x) + N(0, I)
- i.e., the observation :math:`y` is a random non-linear function of the
- independent variable :math:`x` with mean :math:`f(x)` and identity
- covariance. Then the maximum likelihood estimate of :math:`x` given
- observations :math:`y` is the solution to the non-linear least squares
- problem:
- .. math:: x^* = \arg \min_x \|f(x) - y\|^2
- And the covariance of :math:`x^*` is given by
- .. math:: C(x^*) = \left(J'(x^*)J(x^*)\right)^{-1}
- Here :math:`J(x^*)` is the Jacobian of :math:`f` at :math:`x^*`. The
- above formula assumes that :math:`J(x^*)` has full column rank.
- If :math:`J(x^*)` is rank deficient, then the covariance matrix :math:`C(x^*)`
- is also rank deficient and is given by the Moore-Penrose pseudo inverse.
- .. math:: C(x^*) = \left(J'(x^*)J(x^*)\right)^{\dagger}
- Note that in the above, we assumed that the covariance matrix for
- :math:`y` was identity. This is an important assumption. If this is
- not the case and we have
- .. math:: y = f(x) + N(0, S)
- Where :math:`S` is a positive semi-definite matrix denoting the
- covariance of :math:`y`, then the maximum likelihood problem to be
- solved is
- o.. math:: x^* = \arg \min_x f'(x) S^{-1} f(x)
- and the corresponding covariance estimate of :math:`x^*` is given by
- .. math:: C(x^*) = \left(J'(x^*) S^{-1} J(x^*)\right)^{-1}
- So, if it is the case that the observations being fitted to have a
- covariance matrix not equal to identity, then it is the user's
- responsibility that the corresponding cost functions are correctly
- scaled, e.g. in the above case the cost function for this problem
- should evaluate :math:`S^{-1/2} f(x)` instead of just :math:`f(x)`,
- where :math:`S^{-1/2}` is the inverse square root of the covariance
- matrix :math:`S`.
- Gauge Invariance
- ================
- In structure from motion (3D reconstruction) problems, the
- reconstruction is ambiguous up to a similarity transform. This is
- known as a *Gauge Ambiguity*. Handling Gauges correctly requires the
- use of SVD or custom inversion algorithms. For small problems the
- user can use the dense algorithm. For more details see the work of
- Kanatani & Morris [KanataniMorris]_.
- :class:`Covariance`
- ===================
- :class:`Covariance` allows the user to evaluate the covariance for a
- non-linear least squares problem and provides random access to its
- blocks. The computation assumes that the cost functions compute
- residuals such that their covariance is identity.
- Since the computation of the covariance matrix requires computing the
- inverse of a potentially large matrix, this can involve a rather large
- amount of time and memory. However, it is usually the case that the
- user is only interested in a small part of the covariance
- matrix. Quite often just the block diagonal. :class:`Covariance`
- allows the user to specify the parts of the covariance matrix that she
- is interested in and then uses this information to only compute and
- store those parts of the covariance matrix.
- Rank of the Jacobian
- ====================
- As we noted above, if the Jacobian is rank deficient, then the inverse
- of :math:`J'J` is not defined and instead a pseudo inverse needs to be
- computed.
- The rank deficiency in :math:`J` can be *structural* -- columns
- which are always known to be zero or *numerical* -- depending on the
- exact values in the Jacobian.
- Structural rank deficiency occurs when the problem contains parameter
- blocks that are constant. This class correctly handles structural rank
- deficiency like that.
- Numerical rank deficiency, where the rank of the matrix cannot be
- predicted by its sparsity structure and requires looking at its
- numerical values is more complicated. Here again there are two
- cases.
- a. The rank deficiency arises from overparameterization. e.g., a
- four dimensional quaternion used to parameterize :math:`SO(3)`,
- which is a three dimensional manifold. In cases like this, the
- user should use an appropriate
- :class:`Manifold`. Not only will this lead to better
- numerical behaviour of the Solver, it will also expose the rank
- deficiency to the :class:`Covariance` object so that it can
- handle it correctly.
- b. More general numerical rank deficiency in the Jacobian requires
- the computation of the so called Singular Value Decomposition
- (SVD) of :math:`J'J`. We do not know how to do this for large
- sparse matrices efficiently. For small and moderate sized
- problems this is done using dense linear algebra.
- :class:`Covariance::Options`
- .. class:: Covariance::Options
- .. member:: int Covariance::Options::num_threads
- Default: ``1``
- Number of threads to be used for evaluating the Jacobian and
- estimation of covariance.
- .. member:: SparseLinearAlgebraLibraryType Covariance::Options::sparse_linear_algebra_library_type
- Default: ``SUITE_SPARSE`` Ceres Solver is built with support for
- `SuiteSparse <http://faculty.cse.tamu.edu/davis/suitesparse.html>`_
- and ``EIGEN_SPARSE`` otherwise. Note that ``EIGEN_SPARSE`` is
- always available.
- .. member:: CovarianceAlgorithmType Covariance::Options::algorithm_type
- Default: ``SPARSE_QR``
- Ceres supports two different algorithms for covariance estimation,
- which represent different tradeoffs in speed, accuracy and
- reliability.
- 1. ``SPARSE_QR`` uses the sparse QR factorization algorithm to
- compute the decomposition
- .. math::
- QR &= J\\
- \left(J^\top J\right)^{-1} &= \left(R^\top R\right)^{-1}
- The speed of this algorithm depends on the sparse linear algebra
- library being used. ``Eigen``'s sparse QR factorization is a
- moderately fast algorithm suitable for small to medium sized
- matrices. For best performance we recommend using
- ``SuiteSparseQR`` which is enabled by setting
- :member:`Covariance::Options::sparse_linear_algebra_library_type`
- to ``SUITE_SPARSE``.
- ``SPARSE_QR`` cannot compute the covariance if the
- Jacobian is rank deficient.
- 2. ``DENSE_SVD`` uses ``Eigen``'s ``JacobiSVD`` to perform the
- computations. It computes the singular value decomposition
- .. math:: U D V^\top = J
- and then uses it to compute the pseudo inverse of J'J as
- .. math:: (J'J)^{\dagger} = V D^{2\dagger} V^\top
- It is an accurate but slow method and should only be used for
- small to moderate sized problems. It can handle full-rank as
- well as rank deficient Jacobians.
- .. member:: double Covariance::Options::column_pivot_threshold
- Default: :math:`-1`
- During QR factorization, if a column with Euclidean norm less than
- ``column_pivot_threshold`` is encountered it is treated as zero.
- If ``column_pivot_threshold < 0``, then an automatic default value
- of `20*(m+n)*eps*sqrt(max(diag(J’*J)))` is used. Here `m` and `n`
- are the number of rows and columns of the Jacobian (`J`)
- respectively.
- This is an advanced option meant for users who know enough about
- their Jacobian matrices that they can determine a value better
- than the default.
- .. member:: int Covariance::Options::min_reciprocal_condition_number
- Default: :math:`10^{-14}`
- If the Jacobian matrix is near singular, then inverting :math:`J'J`
- will result in unreliable results, e.g, if
- .. math::
- J = \begin{bmatrix}
- 1.0& 1.0 \\
- 1.0& 1.0000001
- \end{bmatrix}
- which is essentially a rank deficient matrix, we have
- .. math::
- (J'J)^{-1} = \begin{bmatrix}
- 2.0471e+14& -2.0471e+14 \\
- -2.0471e+14& 2.0471e+14
- \end{bmatrix}
- This is not a useful result. Therefore, by default
- :func:`Covariance::Compute` will return ``false`` if a rank
- deficient Jacobian is encountered. How rank deficiency is detected
- depends on the algorithm being used.
- 1. ``DENSE_SVD``
- .. math:: \frac{\sigma_{\text{min}}}{\sigma_{\text{max}}} < \sqrt{\text{min_reciprocal_condition_number}}
- where :math:`\sigma_{\text{min}}` and
- :math:`\sigma_{\text{max}}` are the minimum and maximum
- singular values of :math:`J` respectively.
- 2. ``SPARSE_QR``
- .. math:: \operatorname{rank}(J) < \operatorname{num\_col}(J)
- Here :math:`\operatorname{rank}(J)` is the estimate of the rank
- of :math:`J` returned by the sparse QR factorization
- algorithm. It is a fairly reliable indication of rank
- deficiency.
- .. member:: int Covariance::Options::null_space_rank
- When using ``DENSE_SVD``, the user has more control in dealing
- with singular and near singular covariance matrices.
- As mentioned above, when the covariance matrix is near singular,
- instead of computing the inverse of :math:`J'J`, the Moore-Penrose
- pseudoinverse of :math:`J'J` should be computed.
- If :math:`J'J` has the eigen decomposition :math:`(\lambda_i,
- e_i)`, where :math:`\lambda_i` is the :math:`i^\textrm{th}`
- eigenvalue and :math:`e_i` is the corresponding eigenvector, then
- the inverse of :math:`J'J` is
- .. math:: (J'J)^{-1} = \sum_i \frac{1}{\lambda_i} e_i e_i'
- and computing the pseudo inverse involves dropping terms from this
- sum that correspond to small eigenvalues.
- How terms are dropped is controlled by
- `min_reciprocal_condition_number` and `null_space_rank`.
- If `null_space_rank` is non-negative, then the smallest
- `null_space_rank` eigenvalue/eigenvectors are dropped irrespective
- of the magnitude of :math:`\lambda_i`. If the ratio of the
- smallest non-zero eigenvalue to the largest eigenvalue in the
- truncated matrix is still below min_reciprocal_condition_number,
- then the `Covariance::Compute()` will fail and return `false`.
- Setting `null_space_rank = -1` drops all terms for which
- .. math:: \frac{\lambda_i}{\lambda_{\textrm{max}}} < \textrm{min_reciprocal_condition_number}
- This option has no effect on ``SPARSE_QR``.
- .. member:: bool Covariance::Options::apply_loss_function
- Default: `true`
- Even though the residual blocks in the problem may contain loss
- functions, setting ``apply_loss_function`` to false will turn off
- the application of the loss function to the output of the cost
- function and in turn its effect on the covariance.
- .. class:: Covariance
- :class:`Covariance::Options` as the name implies is used to control
- the covariance estimation algorithm. Covariance estimation is a
- complicated and numerically sensitive procedure. Please read the
- entire documentation for :class:`Covariance::Options` before using
- :class:`Covariance`.
- .. function:: bool Covariance::Compute(const std::vector<std::pair<const double*, const double*> >& covariance_blocks, Problem* problem)
- Compute a part of the covariance matrix.
- The vector ``covariance_blocks``, indexes into the covariance
- matrix block-wise using pairs of parameter blocks. This allows the
- covariance estimation algorithm to only compute and store these
- blocks.
- Since the covariance matrix is symmetric, if the user passes
- ``<block1, block2>``, then ``GetCovarianceBlock`` can be called with
- ``block1``, ``block2`` as well as ``block2``, ``block1``.
- ``covariance_blocks`` cannot contain duplicates. Bad things will
- happen if they do.
- Note that the list of ``covariance_blocks`` is only used to
- determine what parts of the covariance matrix are computed. The
- full Jacobian is used to do the computation, i.e. they do not have
- an impact on what part of the Jacobian is used for computation.
- The return value indicates the success or failure of the covariance
- computation. Please see the documentation for
- :class:`Covariance::Options` for more on the conditions under which
- this function returns ``false``.
- .. function:: bool GetCovarianceBlock(const double* parameter_block1, const double* parameter_block2, double* covariance_block) const
- Return the block of the cross-covariance matrix corresponding to
- ``parameter_block1`` and ``parameter_block2``.
- Compute must be called before the first call to ``GetCovarianceBlock``
- and the pair ``<parameter_block1, parameter_block2>`` OR the pair
- ``<parameter_block2, parameter_block1>`` must have been present in the
- vector covariance_blocks when ``Compute`` was called. Otherwise
- ``GetCovarianceBlock`` will return false.
- ``covariance_block`` must point to a memory location that can store
- a ``parameter_block1_size x parameter_block2_size`` matrix. The
- returned covariance will be a row-major matrix.
- .. function:: bool GetCovarianceBlockInTangentSpace(const double* parameter_block1, const double* parameter_block2, double* covariance_block) const
- Return the block of the cross-covariance matrix corresponding to
- ``parameter_block1`` and ``parameter_block2``.
- Returns cross-covariance in the tangent space if a local
- parameterization is associated with either parameter block;
- else returns cross-covariance in the ambient space.
- Compute must be called before the first call to ``GetCovarianceBlock``
- and the pair ``<parameter_block1, parameter_block2>`` OR the pair
- ``<parameter_block2, parameter_block1>`` must have been present in the
- vector covariance_blocks when ``Compute`` was called. Otherwise
- ``GetCovarianceBlock`` will return false.
- ``covariance_block`` must point to a memory location that can store
- a ``parameter_block1_local_size x parameter_block2_local_size`` matrix. The
- returned covariance will be a row-major matrix.
- Example Usage
- =============
- .. code-block:: c++
- double x[3];
- double y[2];
- Problem problem;
- problem.AddParameterBlock(x, 3);
- problem.AddParameterBlock(y, 2);
- <Build Problem>
- <Solve Problem>
- Covariance::Options options;
- Covariance covariance(options);
- std::vector<std::pair<const double*, const double*> > covariance_blocks;
- covariance_blocks.push_back(make_pair(x, x));
- covariance_blocks.push_back(make_pair(y, y));
- covariance_blocks.push_back(make_pair(x, y));
- CHECK(covariance.Compute(covariance_blocks, &problem));
- double covariance_xx[3 * 3];
- double covariance_yy[2 * 2];
- double covariance_xy[3 * 2];
- covariance.GetCovarianceBlock(x, x, covariance_xx)
- covariance.GetCovarianceBlock(y, y, covariance_yy)
- covariance.GetCovarianceBlock(x, y, covariance_xy)
|