ORB_SLAM2 code reading and summary

Before learning ORB_SLAM2, start with OpenCV

[OpenCV3] Detailed explanation of the cv::Mat class and its member functions

There are many ways to obtain digital images from the real world: digital cameras, scanners, computed tomography or magnetic resonance imaging are some of them. In each case, what we (humans) see is an image. When an image is transferred to a digital device, however, what it records is a numerical value for each point of the image.
Q: How are pixel values obtained and stored?
Mat is essentially a class composed of two data parts: the matrix header, which contains information such as the size of the matrix, the method used for storage, the address at which the matrix is stored, etc., and a pointer to the matrix containing the pixel values.

cv::Mat A,C;                              // Create only the headers
A = imread(argv[1],CV_LOAD_IMAGE_COLOR);  // Read the image
// Mat imread(const string& fileName, int flags = 1)
// fileName: picture file name
// flags: load flag, which specifies the color type of the loaded image.
//   The default value is 1, which means loading a three-channel color image;
//   -1: imread reads the image as decoded (unchanged);
//    0: imread reads the image as a single channel, i.e. a grayscale image.

• Assignment operators and copy constructors copy only the header; the underlying matrix data is shared.
• Use the clone() or copyTo() function to also copy the underlying matrix of the image.

cv::Mat F = A.clone(); // Copy the matrix data itself
cv::Mat G;
A.copyTo(G);           // Also a deep copy

How are pixel values stored? You can select the color space and the data type used.
The smallest data type is char, which is one byte or 8 bits. It may be signed (values from -128 to 127) or unsigned (so that values from 0 to 255 can be stored). With three such components (as in the RGB case) this already gives more than 16 million possible colors, but we may obtain even finer control by using a floating-point (4 bytes = 32 bits) or double (8 bytes = 64 bits) data type for each component. However, remember that increasing the size of the components also increases the size of the entire image in memory.

• Mat() constructor

Mat M(2,2, CV_8UC3, Scalar(0,0,255));
// For a two-dimensional image, we first give the size: 2 rows and 2 columns
// Then we specify the data type used to store the elements and the number of channels per matrix point:
// CV_[number of bits per item][signed or unsigned][type prefix]C[number of channels]

For instance, CV_8UC3 means we use unsigned char types that are 8 bits long, and each pixel has three of these to form the three channels.
cv::Scalar is a four-element vector. Specify it and you can initialize all matrix points with a custom value.

Mat L(3,3, CV_8UC3, Scalar::all(0));
// Two-dimensional, with all channels initialized to 0

Summary: with Scalar::all(0) or Scalar(0,0,255), the number of elements in the brackets depends on the number of channels set up before. Each matrix point is filled with 0 or (0,0,255) respectively.

// cv::Mat::create function:
    M.create(4,4, CV_8UC(2));
    cout << "M = "<< endl << " "  << M << endl << endl;

Summary: the difference between cv::Mat::create and the cv::Mat(..., Scalar(x)) constructor is that create cannot initialize the matrix values.

Mat_<double>(3,3): a way of constructing an explicit Mat with the comma initializer

cv::Mat A = (Mat_<double>(3,3) << 1,2,3,4,5,6,7,8,9);
std::cout << "A is" << endl << A << endl;

Summary: note the small details: the Mat_<double>(3,3) expression is wrapped in parentheses together with the << 1,2,3,4,5,6,7,8,9 list, and the result is assigned to A.

**Note:** You can fill a matrix with random values using the cv::randu() function. You need to give the lower and upper bounds for the random values:

cv::Mat A = Mat(3,3, CV_8UC3);
cv::randu(A, Scalar::all(0), Scalar::all(255));

ORB-SLAM2 is mainly divided into three threads: Tracking, LocalMapping and LoopClosing, implemented in three corresponding files: Tracking.cpp, LocalMapping.cpp and LoopClosing.cpp.
ORB_SLAM2 completely encapsulates the whole system and defines a System class as its entry point.
The constructor of the System class mainly performs the following operations:

1. Determine the sensor type
2. Check the configuration file
3. Load the bag-of-words vocabulary
4. Instantiate the related member variables, including the local mapping thread, the loop-closing thread and the viewer thread. Note: the main thread is the tracking thread!
5. Set the cross pointers between the threads

The constructor of the System class prepares all the conditions for the system to run and waits for the image sequence to be passed in.
SLAM.TrackStereo(imLeft,imRight,tframe) passes in the image sequence:

cv::Mat System::TrackStereo(const cv::Mat &imLeft, const cv::Mat &imRight, const double &timestamp)
{
    if(mSensor!=STEREO)
    {
        cerr << "ERROR: you called TrackStereo but input sensor was not set to STEREO." << endl;
        exit(-1);
    }

    // Check mode change
    {
        unique_lock<mutex> lock(mMutexMode); // mMutexMode is locked for the lifetime of this scope; other threads cannot modify the localization mode
        if(mbActivateLocalizationMode)       // Activate localization mode
        {
            mpLocalMapper->RequestStop();
            // Wait until Local Mapping has effectively stopped
            while(!mpLocalMapper->isStopped())
            {
                usleep(1000);
            }
            mpTracker->InformOnlyTracking(true);
            mbActivateLocalizationMode = false;
        }
        if(mbDeactivateLocalizationMode)     // Deactivate localization mode
        {
            mpTracker->InformOnlyTracking(false);
            mpLocalMapper->Release();
            mbDeactivateLocalizationMode = false;
        }
    }

    // Check reset
    {
        unique_lock<mutex> lock(mMutexReset);
        if(mbReset)
        {
            mpTracker->Reset();
            mbReset = false;
        }
    }

    cv::Mat Tcw = mpTracker->GrabImageStereo(imLeft,imRight,timestamp);
    // Feed the image data to the tracker; it returns the transformation matrix from the world coordinate system to the camera coordinate system

    unique_lock<mutex> lock2(mMutexState);
    mTrackingState = mpTracker->mState;                        // Update tracking state
    mTrackedMapPoints = mpTracker->mCurrentFrame.mvpMapPoints; // Update map point vector
    mTrackedKeyPointsUn = mpTracker->mCurrentFrame.mvKeysUn;   // Update keypoint vector
    return Tcw; // Returns the transformation matrix from the world coordinate system to the camera coordinate system
}

The TrackStereo() function mainly performs the following operations:

1. First verify the sensor type. If the sensor type is wrong, exit directly.
2. Check for a change of localization mode. If localization mode is activated, the local mapping function is stopped and the system only performs localization. If localization mode is deactivated, the previously built local map is released (cleared) and mapping resumes.
3. Check whether the system should be reset (controlled from the UI); if so, reset the tracker.
4. Feed the image data to the tracker, which returns the transformation matrix from the world coordinate system to the camera coordinate system.
5. Update the state variables.

The trajectory-saving function:

void System::SaveTrajectoryKITTI(const string &filename)
{
    cout << endl << "Saving camera trajectory to " << filename << " ..." << endl;
    if(mSensor==MONOCULAR)
    {
        cerr << "ERROR: SaveTrajectoryKITTI cannot be used for monocular." << endl;
        return;
    }

    vector<KeyFrame*> vpKFs = mpMap->GetAllKeyFrames(); // Get all keyframes
    sort(vpKFs.begin(),vpKFs.end(),KeyFrame::lId);      // Sort keyframes by id

    // Transform all keyframes so that the first keyframe is at the origin.
    // After a loop closure the first keyframe might not be at the origin.
    cv::Mat Two = vpKFs[0]->GetPoseInverse();

    ofstream f;
    f.open(filename.c_str());
    f << fixed;

    // Frame pose is stored relative to its reference keyframe (which is optimized by BA and pose graph).
    // We need to get first the keyframe pose and then concatenate the relative transformation.
    // Frames not localized (tracking failure) are not saved.

    // For each frame we have a reference keyframe (lRit) and the timestamp (lT).
    // The list mlpReferences saves the reference keyframe of every image frame;
    // the list mlRelativeFramePoses stores the pose of every image frame relative to its reference keyframe.
    list<ORB_SLAM2::KeyFrame*>::iterator lRit = mpTracker->mlpReferences.begin();
    list<double>::iterator lT = mpTracker->mlFrameTimes.begin();
    for(list<cv::Mat>::iterator lit=mpTracker->mlRelativeFramePoses.begin(), lend=mpTracker->mlRelativeFramePoses.end();lit!=lend;lit++, lRit++, lT++)
    {
        ORB_SLAM2::KeyFrame* pKF = *lRit;

        cv::Mat Trw = cv::Mat::eye(4,4,CV_32F);

        while(pKF->isBad())
        {
            // cout << "bad parent" << endl;
            Trw = Trw*pKF->mTcp;
            pKF = pKF->GetParent();
        }

        Trw = Trw*pKF->GetPose()*Two; // Transformation from the global coordinate system to the reference keyframe coordinate system

        cv::Mat Tcw = (*lit)*Trw;
        /* Tcw is the transformation from the global coordinate system to the camera coordinate system.
         * Its inverse is the transformation from the camera coordinate system to the global coordinate system,
         * and for a rigid-body transform the inverse can be computed as
         *   T^-1 = [ R^T  -R^T*t
         *            0^T    1    ]
         */
        cv::Mat Rwc = Tcw.rowRange(0,3).colRange(0,3).t(); // Rotation from the camera to the global coordinate system, from Tcw
        cv::Mat twc = -Rwc*Tcw.rowRange(0,3).col(3);       // Translation from the camera to the global coordinate system, from Tcw
        // Write the rotation matrix and translation vector to the file
        f << setprecision(9) << Rwc.at<float>(0,0) << " " << Rwc.at<float>(0,1)  << " " << Rwc.at<float>(0,2) << " "  << twc.at<float>(0) << " " <<
             Rwc.at<float>(1,0) << " " << Rwc.at<float>(1,1)  << " " << Rwc.at<float>(1,2) << " "  << twc.at<float>(1) << " " <<
             Rwc.at<float>(2,0) << " " << Rwc.at<float>(2,1)  << " " << Rwc.at<float>(2,2) << " "  << twc.at<float>(2) << endl;
    }
    f.close();
    cout << endl << "trajectory saved!" << endl;
}

Two important lists are maintained in the Tracker: mlpReferences and mlRelativeFramePoses. mlpReferences saves the reference keyframe of every image frame; mlRelativeFramePoses saves the pose of every image frame relative to its reference keyframe, that is, the transformation matrix between the image and its reference keyframe. It is precisely because of these two lists that the whole motion trajectory can be saved.
The flow of computing the pose of each frame relative to the origin is as follows:

1. Get all keyframes from the mpMap variable and sort them by ID.
2. Get the pose of the first keyframe.
3. Traverse the lists mlpReferences and mlRelativeFramePoses to compute the pose of each frame. In this process:
   1) First compute Trw, the transformation from the global coordinate system to the reference keyframe coordinate system.
   2) Compute Tcw, the pose of the current frame relative to the global coordinate system. Note: Tcw is the transformation from the global coordinate system to the camera coordinate system, while the required result is the transformation from the camera coordinate system to the global coordinate system; the two are inverses of each other.
   3) From Tcw, compute the rotation matrix Rwc and the translation vector twc from the camera coordinate system to the global coordinate system.

As noted in previous articles, a mind map can intuitively reflect the structure of the whole code. It consists of four parts:

1. **Load images:** load the images and timestamps in the relevant folder according to the command-line parameters: LoadImages(string(argv[3]), vstrImageLeft, vstrImageRight, vTimestamps); Note: this step only saves the image names in the corresponding vector variables.
2. **Create the SLAM system:** create and initialize the SLAM system according to the command-line parameters: ORB_SLAM2::System SLAM(argv[1],argv[2],ORB_SLAM2::System::STEREO,true); where argv[1] is the path of the vocabulary (bag-of-words) file, argv[2] is the path of the configuration file, ORB_SLAM2::System::STEREO indicates that the created SLAM system is a stereo system, and true enables the viewer.
3. **Pass images in for tracking:** the images are passed to the SLAM system for tracking in a loop: SLAM.TrackStereo(imLeft,imRight,tframe);
4. **Stop the SLAM system and save the trajectory:** stop all threads of the SLAM system with the Shutdown() function and save the trajectory with the SaveTrajectoryKITTI() function.

The main interface is in System.cc.

C++ knowledge supplement
vector is a sequence type that can hold elements of any single data type, so it is also called a container (it can be understood as an encapsulated dynamic array).
Add the header file #include <vector> before using vector.

There are five ways to initialize a vector:

  • vector<int> a(10);
    // Defines a vector with 10 integer elements (the angle brackets give the element type, which can be any legal data type); there is no initial value, so the values are undetermined
  • vector<int> a(10,1);
    // Defines a vector with 10 integer elements, each with an initial value of 1
  • vector<int> a(b);
    // Initializes vector a from vector b; the value of a becomes completely equivalent to that of b
  • vector<int> a(b.begin(),b.begin()+3);
    // Initializes a with the first three elements of b
  • // Gets the initial values from an array
    int b[7]={1,2,3,4,5,6,7};
    vector<int> a(b,b+7);

Common built-in functions of vector objects:

vector<int> a,b;
a.assign(b.begin(),b.begin()+3); // b is a vector; elements 0-2 of b are assigned to vector a
a.assign(4,2);                   // a now contains four elements with the value 2
a.back();                        // Returns the last element of a
a.front();                       // Returns the first element of a
a[i];                            // Returns the i-th element of a, valid if and only if that element exists
a.clear();                       // Removes all elements from a
a.empty();                       // Returns true if a is empty, false otherwise
a.pop_back();                    // Deletes the last element of a
a.erase(a.begin()+1,a.begin()+3);// Deletes the second and third elements of a, i.e. from a.begin()+1 (inclusive) to a.begin()+3 (exclusive)
a.push_back(5);                  // Appends an element with the value 5 after the last element of a
a.insert(a.begin()+1,5);         // Inserts the value 5 at the position of the first element of a (counting from 0)
a.insert(a.begin()+1,3,5);       // Inserts 3 elements, all with the value 5, at the position of the first element of a (counting from 0)
a.insert(a.begin()+1,b+3,b+6);   // b is an array; inserts the elements b[3] up to (but excluding) b[6] at the position of the first element of a (counting from 0)
a.size();                        // Returns the number of elements in a
a.capacity();                    // Returns the total number of elements a can hold without reallocating
a.resize(10);                    // Adjusts the number of elements of a to 10; extra elements are deleted, new elements are value-initialized
a.resize(10,2);                  // Adjusts the number of elements of a to 10; extra elements are deleted, new elements get the value 2
a.reserve(100);                  // Expands the capacity of a to 100
a.swap(b);                       // b is a vector; exchanges the elements of a and b as a whole
a==b;                            // b is a vector; vectors also support the comparisons !=, >=, <=, >, <

mono_kitti sample code

* This file is part of ORB-SLAM2.
* Copyright (C) 2014-2016 Raúl Mur-Artal <raulmur at unizar dot es> (University of Zaragoza)
* For more information see <https://github.com/raulmur/ORB_SLAM2>
* ORB-SLAM2 is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
* ORB-SLAM2 is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
* You should have received a copy of the GNU General Public License
* along with ORB-SLAM2. If not, see <http://www.gnu.org/licenses/>.




#include<iostream>
#include<algorithm>
#include<fstream>
#include<chrono>
#include<iomanip>

#include<opencv2/core/core.hpp>

#include<System.h>

using namespace std;

void LoadImages(const string &strSequence, vector<string> &vstrImageFilenames,
                vector<double> &vTimestamps); //Picture reading function declaration

int main(int argc, char **argv)
{
    if(argc != 4)
    {
        cerr << endl << "Usage: ./mono_kitti path_to_vocabulary path_to_settings path_to_sequence" << endl;
        return 1;
    }

    // Retrieve paths to images
    vector<string> vstrImageFilenames;
    vector<double> vTimestamps;
    LoadImages(string(argv[3]), vstrImageFilenames, vTimestamps);
    // Image extraction: argv[3] points to the sequence directory; the image names are stored in the vector vstrImageFilenames (of strings)

    int nImages = vstrImageFilenames.size(); // Number of images found

    // Create SLAM system. It initializes all system threads and gets ready to process frames.
    ORB_SLAM2::System SLAM(argv[1],argv[2],ORB_SLAM2::System::MONOCULAR,true);
    // argv[1] is the path of the vocabulary file, argv[2] the path of the configuration file

    // Vector for tracking time statistics
    vector<float> vTimesTrack;
    vTimesTrack.resize(nImages); // One slot per image

    cout << endl << "-------" << endl;
    cout << "Start processing sequence ..." << endl;
    cout << "Images in the sequence: " << nImages << endl << endl;
    // At this point, the image paths are loaded

    // Main loop
    cv::Mat im; // Matrix holding the current image
    for(int ni=0; ni<nImages; ni++)
    {
        // Read image from file
        im = cv::imread(vstrImageFilenames[ni],CV_LOAD_IMAGE_UNCHANGED); // Store the image in the matrix
        double tframe = vTimestamps[ni];

        if(im.empty())
        {
            cerr << endl << "Failed to load image at: " << vstrImageFilenames[ni] << endl;
            return 1;
        }

#ifdef COMPILEDWITHC11
        std::chrono::steady_clock::time_point t1 = std::chrono::steady_clock::now();
#else
        std::chrono::monotonic_clock::time_point t1 = std::chrono::monotonic_clock::now();
#endif

        // Pass the image to the SLAM system
        SLAM.TrackMonocular(im,tframe);

#ifdef COMPILEDWITHC11
        std::chrono::steady_clock::time_point t2 = std::chrono::steady_clock::now();
#else
        std::chrono::monotonic_clock::time_point t2 = std::chrono::monotonic_clock::now();
#endif

        double ttrack= std::chrono::duration_cast<std::chrono::duration<double> >(t2 - t1).count();

        vTimesTrack[ni]=ttrack;

        // Wait to load the next frame
        double T=0;
        if(ni<nImages-1)
            T = vTimestamps[ni+1]-tframe;
        else if(ni>0)
            T = tframe-vTimestamps[ni-1];

        if(ttrack<T)
            usleep((T-ttrack)*1e6);
    }

    // Stop all threads
    SLAM.Shutdown();

    // Tracking time statistics
    sort(vTimesTrack.begin(),vTimesTrack.end());
    float totaltime = 0;
    for(int ni=0; ni<nImages; ni++)
    {
        totaltime+=vTimesTrack[ni];
    }
    cout << "-------" << endl << endl;
    cout << "median tracking time: " << vTimesTrack[nImages/2] << endl;
    cout << "mean tracking time: " << totaltime/nImages << endl;

    // Save camera trajectory
    SLAM.SaveKeyFrameTrajectoryTUM("KeyFrameTrajectory.txt");

    return 0;
}

void LoadImages(const string &strPathToSequence, vector<string> &vstrImageFilenames, vector<double> &vTimestamps)
{
    ifstream fTimes;
    string strPathTimeFile = strPathToSequence + "/times.txt";
    fTimes.open(strPathTimeFile.c_str());
    while(!fTimes.eof())
    {
        string s;
        getline(fTimes,s);
        if(!s.empty())
        {
            stringstream ss;
            ss << s;
            double t;
            ss >> t;
            vTimestamps.push_back(t);
        }
    }

    string strPrefixLeft = strPathToSequence + "/image_0/";

    const int nTimes = vTimestamps.size();
    vstrImageFilenames.resize(nTimes);

    for(int i=0; i<nTimes; i++)
    {
        stringstream ss;
        ss << setfill('0') << setw(6) << i;
        vstrImageFilenames[i] = strPrefixLeft + ss.str() + ".png";
    }
}

Tags: OpenCV slam Autonomous vehicles

Posted on Mon, 18 Oct 2021 00:39:03 -0400 by pkmleo