Optimization

namespace: optimization

As mentioned in Loop Detection, tracking will have accumulated errors, and we need closed-loop detection to provide constraints to optimize and eliminate errors. Optimization is the content of this part.

Back-end optimization generally uses an optimizer, libraries like g2o, ceres are very good, but unfortunately the author is only familiar with Eigen. The optimization part is implemented by Eigen, which should be more helpful for learning. OnePiece implements two Two back-end optimization methods: Fast Bundle Adjustment and Bundle Adjustment.

Fast BundleAdjustment

Frame correspondence is defined in class Correspondence:

class Correspondence
{
    public:
    int source_id = -1;
    int target_id = -1;
    float average_disparity = 1e6;
    geometry::PointCorrespondenceSet correspondence_set;
};

A Frame Correspondence means a pair of matched frames, where the source id is the id of the source frame, and the target id is the id of the target frame in the optimized variable. And average_disparity is used to measure the distance of these matching points in a 2-dimensional picture, and to determine whether to start a new keyframe. geometry::PointCorrespondenceSet is the set of matching points contained in this pair of matching frames.

Fast BundleAdjustment (FBA) is faster than the traditional Bundle Adjustment. In fact, FBA is called only in OnePiece and does not have a very standard name. The difference between it and BundleAdjustment is the optimization variables and objective functions. FBA only optimizes the camera pose, The optimization objective function is:

\[ E(T) = \sum_{i}^{\vert\Omega\vert}\sum_{j}^{\vert C_{i}\vert}\Vert T_sp_{s(j)} - T_tp_{t(j)} \Vert^2 \]

$\Omega$ refers to all frame correspondences, $C_{i}$ refers to the $i$ frame correspondence, which is composed of multiple point correspondences. $T_s and T_t$ represent the poses of the source frame and target frame in the frame correspondence, respectively. $p_(s(j))$ represents the source point in the $j$th point correspondence. For the optimization of this part, you can refer to the article fastgo, you can also find it in related_paper/fastgo.pdf. This article contains more optimization techniques to improve system efficiency, while OnePiece only implements the basic parts. The functions are as follows:

void FastBA(const std::vector<Correspondence> &correspondences, geometry::SE3List &poses, int max_iteration =5);

Bundle Adjutment

n addition to optimizing the pose, the traditional BundleAdjustment (BA) also optimizes the 3D point coordinates in the world coordinate system, and the optimization goal is the 2D reprojection error. Generally speaking, the input of the traditional BA is the 3d point in the world coordinate system, camera poses, the projected 2D coordinates of 3D points in each frame. The objective function is:

\[ E(T, p) = \sum_i^{N_f}\sum_{j}^{N_p}\Vert u_{i,j} - \omega(T_i^{-1} p_j) \Vert, \]

$N_f$ is the number of camera poses (frame), $N_p$ is the number of world coordinate points, and $u_{i,j}$ is the actual projection position (uv coordinates during feature matching), and $\omega $ is a function for calculating reprojection.

For this optimization, in addition to optimizing the camera pose, it also optimizes the position of the point in the world, which means the optimization of the noise in the projection process. However, because the number of world points is often very large, the final optimization matrix dimension is tens of thousands, even millions, which highly limits the solving speed. BA is a very classic method, which was proposed long before the emergence of SLAM, and there are many variants of it, acceleration methods. This part is just an implementation of the basic idea.

void BundleAdjustment(geometry::Point3List &world_points, 
    std::vector<ProjectedPointsOnFrame> &projected_points, 
    geometry::SE3List &poses,
    const camera::PinholeCamera &camera, int max_iteration = 20);

When implementing FastBA, OnePiece uses the Gauss-Newton optimization method, and BA must use the Levenberg-Marquardt algorithm, because it is easy to appear singular matrices, which causes Gauss-Newton to become very fragile and makes the result unable to converge. For the detailed derivation of BA, refer to related_paper/BARevisited.pdf.

Look the examples: bundle adjustment of two frames.