This project implements a projective-ICP-based visual odometry pipeline that estimates both the trajectory of a robot and the 3D map of the visual features. The robot is equipped with a monocular camera with known intrinsic and extrinsic parameters.
The input is a set of 120 measurements, where each measurement contains a variable number of (`image_point`, `appearance`) pairs:

- `image_point`: \([r, c]\), a 1x2 vector.
- `appearance`: \([f_1, ..., f_{10}]\), a 1x10 vector (feature descriptor).

The initial pose (the one relative to measurement 0) is set to the identity. Then the estimate of the first pose is computed with the following steps:
These steps provide a first estimate of the pose of the camera.
Then a first estimate of the map is computed by triangulating the image points of the first two measurements, using the estimated pose.
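A minimal sketch of this triangulation step, assuming a standard linear (DLT) two-view method; `triangulate_point` and its argument layout are illustrative, and the \([r, c]\) image points are assumed to be reordered into the \((u, v)\) convention of \(K\):

```python
import numpy as np

def triangulate_point(K, w_T_c0, w_T_c1, p0, p1):
    """Linear (DLT) triangulation of one point from two views.

    p0, p1 are 2D image points in the (u, v) convention of K;
    w_T_c0, w_T_c1 are 4x4 camera-in-world poses.
    Returns the 3D point in world coordinates.
    """
    # Projection matrices mapping world points to homogeneous image points.
    P0 = K @ np.linalg.inv(w_T_c0)[:3, :]
    P1 = K @ np.linalg.inv(w_T_c1)[:3, :]
    # Each view contributes two linear constraints on the homogeneous point.
    A = np.vstack([
        p0[0] * P0[2] - P0[0],
        p0[1] * P0[2] - P0[1],
        p1[0] * P1[2] - P1[0],
        p1[1] * P1[2] - P1[1],
    ])
    # Solution: right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    p_hom = Vt[-1]
    return p_hom[:3] / p_hom[3]
```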
The update step takes one measurement at a time and performs the projective ICP algorithm between the current measurement and the current estimated 3D map to recover the relative pose of the camera, with the steps detailed below.
Using the new camera pose estimated by the projective ICP, a new set of 3D points is triangulated and added to the estimated map.
The update step is repeated for each measurement.
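As an illustration, the overall update loop could look like the following sketch; the helper names `data_association`, `projective_icp`, and `triangulate_new_points` are hypothetical, and only the loop structure is taken from the text:

```python
# Sketch of the per-measurement update step (helper names are hypothetical).
w_T_c = initial_pose                      # current camera-in-world estimate
for meas in measurements[2:]:
    # Match measurement points to map points via their appearance descriptors.
    matches = data_association(meas, world_map)
    # Projective ICP refines the camera pose against the current map.
    w_T_c = projective_icp(matches, world_map, w_T_c, K)
    # Points newly observed in this measurement are triangulated and
    # appended to the estimated map.
    world_map += triangulate_new_points(meas, world_map, w_T_c, K)
```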
A single step of the projective ICP is divided into two parts:
The linearization part takes as input the reference image points (from the measurement), the current world points from the estimated map (already matched), and the current pose of the camera with respect to the world, \( {}^wT_{c_0}\). It then builds the matrix \(H\) and the vector \(b\) by computing, for each pair of matched points \( ({}^rp, {}^wp) \) (reference image point, estimated world point), the error \(e\) and the Jacobian \(J\) as follows:
\[ \begin{align} \text{World point in camera coordinates (hom): } &{}^{c_0}p_{hom} = inv({}^wT_{c_0}) {}^wp_{hom}\\ \text{World point in camera coordinates: } &{}^{c_0}p_{norm} = {}^{c_0}p_{hom}[:3]/{}^{c_0}p_{hom}[3]\\ \text{World point on image plane (hom): } &{}^{c_0}p_{cam}=K {}^{c_0}p_{norm}\\ \end{align} \]
\[ \begin{align} e = {}^rp - \pi({}^{c_0}p_{cam}), \qquad \pi([x, y, z]^T) = [x/z,\ y/z]^T \end{align} \]
\[ \begin{align} J &= J_{proj}({}^{c_0}p_{cam})\, K\, J_{icp}\\ J_{icp} &= [I_{3\times3} \,|\, \lfloor -{}^{c_0}p_{norm} \rfloor_\times] \\ J_{proj}({}^{c_0}p_{cam}) &= \begin{bmatrix}\frac{1}{z} & 0 & -\frac{x}{z^2} \\ 0 & \frac{1}{z} & -\frac{y}{z^2} \end{bmatrix} \end{align} \]

where \([x, y, z]^T = {}^{c_0}p_{cam}\).
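A minimal NumPy sketch of the linearization for a single matched pair, following the formulas above (function and variable names are illustrative):

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix such that skew(v) @ u == np.cross(v, u)."""
    return np.array([[0, -v[2], v[1]],
                     [v[2], 0, -v[0]],
                     [-v[1], v[0], 0]])

def error_and_jacobian(r_p, w_p, w_T_c, K):
    """Error e and Jacobian J for one matched pair (reference image point,
    estimated world point), given the camera-in-world pose w_T_c."""
    # World point in camera coordinates (homogeneous, then dehomogenized).
    p_hom = np.linalg.inv(w_T_c) @ np.append(w_p, 1.0)
    p_norm = p_hom[:3] / p_hom[3]
    # Projection onto the image plane (still homogeneous).
    p_cam = K @ p_norm
    x, y, z = p_cam
    # Reprojection error; r_p is assumed ordered consistently with K.
    e = r_p - p_cam[:2] / z
    # Chain rule: dehomogenization Jacobian * K * pose (ICP) Jacobian.
    J_icp = np.hstack([np.eye(3), skew(-p_norm)])   # 3x6
    J_proj = np.array([[1/z, 0, -x/z**2],
                       [0, 1/z, -y/z**2]])          # 2x3
    J = J_proj @ K @ J_icp                          # 2x6
    return e, J
```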
Then the error is used to compute \(\chi = e^T e\), which is compared against the kernel threshold to decide whether the point is an inlier or an outlier.
If the point is an inlier, the error and the Jacobian are used to update \(H\) and \(b\) as:
\[ \begin{align} H &= H + J^T J \\ b &= b + J^T e \end{align} \]
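Building on the previous sketch, the accumulation of \(H\) and \(b\) with the inlier test could look like this; whether outliers are skipped or rescaled by a robust kernel is an assumption, since the text only states the \(\chi\) test:

```python
import numpy as np

def build_linear_system(matched_pairs, w_T_c, K, kernel_threshold):
    """Accumulate H and b over all matched pairs, rejecting outliers."""
    H = np.zeros((6, 6))
    b = np.zeros(6)
    chi_tot = 0.0
    for r_p, w_p in matched_pairs:
        e, J = error_and_jacobian(r_p, w_p, w_T_c, K)
        chi = e @ e
        if chi > kernel_threshold:
            continue  # outlier: skipped here (a robust kernel could rescale e)
        chi_tot += chi
        H += J.T @ J
        b += J.T @ e
    return H, b, chi_tot
```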
Finally, a 6D vector describing the relative pose of the camera with respect to the previous pose is calculated by solving:
\[ \begin{align} dx &\leftarrow solve_{lsq}(H\, dx = -b) \\ {}^{c_0}T_{c_1} &= v2T(dx) \\ {}^wT_{c_1} &= {}^wT_{c_0}\, {}^{c_0}T_{c_1} \end{align} \]
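The corresponding solve-and-update step, assuming `v2T` maps a 6D perturbation vector to a 4x4 homogeneous transform as in the formulas above, and `H`, `b` come from the accumulation sketch:

```python
import numpy as np

# Solve H dx = -b in the least-squares sense, then chain the resulting
# relative pose onto the current camera-in-world estimate.
dx = np.linalg.lstsq(H, -b, rcond=None)[0]
c0_T_c1 = v2T(dx)
w_T_c1 = w_T_c0 @ c0_T_c1
```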
The following results have been obtained using the following parameters:

Moreover, techniques to dynamically adapt the kernel threshold and the damping factor have been implemented. The maximum number of iterations has been set to 250, but the algorithm stops early when the error drops below \(10^{-2}\) or when it has been stuck for 100 iterations. Then, regardless of the last result produced by the algorithm, the iterate with the lowest error is selected to compute the new transformation.
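Putting the pieces together, the outer loop with early stopping and best-error selection could be sketched as follows; `kernel_threshold` and `damping` are shown as fixed values since their adaptation rules are not specified here, and `build_linear_system` and `v2T` are taken from the sketches above:

```python
import numpy as np

# Sketch of the outer ICP loop with early stopping (assumptions noted above).
best_chi, best_T, stuck = np.inf, w_T_c.copy(), 0
for it in range(250):
    H, b, chi = build_linear_system(matched_pairs, w_T_c, K, kernel_threshold)
    # Damped least-squares step: H + damping * I regularizes the solve.
    dx = np.linalg.lstsq(H + damping * np.eye(6), -b, rcond=None)[0]
    w_T_c = w_T_c @ v2T(dx)
    if chi < best_chi:
        best_chi, best_T, stuck = chi, w_T_c.copy(), 0
    else:
        stuck += 1  # no improvement this iteration
    if best_chi < 1e-2 or stuck >= 100:
        break
w_T_c = best_T  # keep the lowest-error pose, not necessarily the last one
```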
| Rotation Errors | | Translation Errors | | Translation Error Ratios | |
|---|---|---|---|---|---|
| Max rotation error | 0.313 [rad] | Max translation error | 0.343 [m] | Max translation error ratio | 5.987 |
| Min rotation error | 0.000 [rad] | Min translation error | 0.002 [m] | Min translation error ratio | 4.962 |
| Mean rotation error | 0.074 [rad] | Mean translation error | 0.135 [m] | Mean translation error ratio | 4.962 |
| Map Results | |
|---|---|
| Number of points in the estimated map | 395 |
| RMSE of the estimated map | 0.194 [m] |
| Scale | 0.202 |
The error metrics reported above are computed as follows:

\[ \begin{align} T_{rel}^{est} &= {}^{r_i}T_{r_{i+1}}^{est} = inv({}^wT_{r_i}^{est})\, {}^wT_{r_{i+1}}^{est} \\ T_{rel}^{gt} &= {}^{r_i}T_{r_{i+1}}^{gt} = inv({}^wT_{r_i}^{gt})\, {}^wT_{r_{i+1}}^{gt} \\ T_{error} &= {}^{r_{i+1}^{est}}T_{r_{i+1}^{gt}}^{error} = inv(T_{rel}^{est})\, T_{rel}^{gt} \\ R_{error} &= T_{error}[0\!:\!3,\, 0\!:\!3] \\ t^{est}_{r_i} &= {}^wT_{r_i}^{est}[0\!:\!3,\, 3] \\ t^{gt}_{r_i} &= {}^wT_{r_i}^{gt}[0\!:\!3,\, 3] \end{align} \]

\[ \begin{align} rot\_error~[rad] &= \arccos\big((trace(R_{error})-1)/2\big) \\ translation\_ratio &= norm(t^{gt}_{r_i}) / norm(t^{est}_{r_i}) \\ translation\_error~[m] &= norm(translation\_ratio \cdot t^{est}_{r_i} - t^{gt}_{r_i}) \end{align} \]
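A sketch of these metrics for one pair of consecutive poses; the `clip` guards `arccos` against round-off, and everything else mirrors the formulas above:

```python
import numpy as np

def trajectory_errors(w_T_est_i, w_T_est_j, w_T_gt_i, w_T_gt_j):
    """Relative-pose error metrics between consecutive estimated/GT poses
    at indices r_i (…_i) and r_{i+1} (…_j)."""
    # Relative transforms r_i -> r_{i+1} for estimate and ground truth.
    T_rel_est = np.linalg.inv(w_T_est_i) @ w_T_est_j
    T_rel_gt = np.linalg.inv(w_T_gt_i) @ w_T_gt_j
    T_err = np.linalg.inv(T_rel_est) @ T_rel_gt
    # Rotation error from the trace of the residual rotation.
    cos_arg = (np.trace(T_err[:3, :3]) - 1) / 2
    rot_error = np.arccos(np.clip(cos_arg, -1.0, 1.0))
    # Scale-compensated translation error on the absolute positions at r_i.
    t_est, t_gt = w_T_est_i[:3, 3], w_T_gt_i[:3, 3]
    ratio = np.linalg.norm(t_gt) / np.linalg.norm(t_est)
    trans_error = np.linalg.norm(ratio * t_est - t_gt)
    return rot_error, ratio, trans_error
```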