Creation time: November 1, 2021
abstract
- With the rapid development of science and technology, many fields, from autonomous driving to virtual reality, need to build maps and develop applications on top of them. Maps are mainly divided into sparse maps, dense maps and semantic maps, each with different advantages, disadvantages and applications. Dense maps occupy a very important position and can be applied to navigation, obstacle avoidance, interaction and so on.
- This paper focuses on dense point cloud mapping. RGB-D dense mapping is carried out on the TUM dataset under Ubuntu 18.04. The paper first briefly explains the principle of dense mapping, then describes the RGB-D camera, and then summarizes the TUM dataset, explaining some of its characteristics and how to download and use it. Next, the open source libraries needed for mapping, such as OpenCV and Eigen, are introduced from four aspects: introduction, features, installation and tutorials, together with websites for learning them. After that, a simple mapping program written in C++ and based on the fixed-frame key frame selection method is described[^1], part of the code is analyzed in detail, including which of the above open source libraries are used, and the mapping results are shown. Finally, the paper briefly summarizes the mapping code and points out directions for future exploration.
- It is hoped that readers will gain something from this report and become map builders themselves. At the same time, readers are warmly invited to point out the shortcomings of the article and offer valuable suggestions.
Mapping principle
- Dense mapping is mainly divided into monocular dense mapping and RGB-D dense mapping. RGB-D dense mapping uses an RGB-D camera for shooting, which can directly measure the depth of each pixel, without computing depth through feature point matching, triangulation, epipolar search and block matching as monocular dense mapping does.
- Real-time dense mapping processes the collected images one by one: using the RGB and depth information of each pixel, the pixel is converted from pixel coordinates to camera coordinates, and then from camera coordinates to world coordinates according to the camera pose, generating three-dimensional map points (see the sketch after this list).
- The quality of the point cloud generated by mapping is affected by the camera intrinsics, the quality of the images, the accuracy of the camera poses, and so on.
- Mapping cannot and does not need to make full use of the information in every image; usually only the key frames are processed.
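- To make the pixel-to-camera-to-world conversion described above concrete, here is a minimal sketch. It assumes the camera intrinsics cx, cy, fx, fy, the depth scale depthScale, the raw depth value d of pixel (u, v) and the camera pose T are already known; the same computation reappears in the Mapping Practice section below.
Eigen::Vector3d point; // Camera coordinates of the pixel
point[2] = double(d) / depthScale; // Convert the raw depth value to meters
point[0] = (u - cx) * point[2] / fx; // X from the pinhole camera model
point[1] = (v - cy) * point[2] / fy; // Y from the pinhole camera model
Eigen::Vector3d pointWorld = T * point; // Camera coordinates to world coordinates via the pose T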
RGBD camera
1. Introduction
- An RGB-D camera, also known as a depth camera, is a kind of camera that emerged around 2010. It can measure the distance between an object and the camera through infrared structured light or the ToF (time-of-flight) principle; this distance is very important for mapping.
- At present, commonly used RGB-D cameras include the Kinect/Kinect V2, Xtion Pro Live, RealSense, etc.; similar sensors are also used for face recognition on some mobile phones.
2. Features
- Small amount of computation: an RGB-D camera measures pixel depth by physical means, which saves a lot of computing resources.
- The technology is not yet mature: at present most RGB-D cameras still have many problems, such as a narrow measurement range, large noise, a small field of view, susceptibility to sunlight interference and inability to measure transparent materials. For mapping, RGB-D cameras are therefore mainly used indoors and are difficult to use outdoors.
TUM dataset
1. Introduction
- The TUM dataset was published by the Computer Vision Group of the Technical University of Munich (TUM).
- The RGB images and depth images of the TUM dataset were captured with an RGB-D camera.
- The TUM dataset contains sequences with different structures and textures, sequences with moving objects, and sequences for 3D object reconstruction.
- The sequences of the TUM dataset differ in frame count and size, the camera motion may be fast or slow, and some sequences were taken with a handheld camera.
2. Features
- Large scale: the dataset is very large and contains many scenes.
- Accurate camera poses: the TUM dataset provides accurate camera poses obtained with a motion capture system.
- A set of Python scripts for processing the dataset: the provided scripts can align files in time, draw trajectory plots, compare estimated results with the ground-truth trajectory, and so on.
3. Download
- Enter the web page:
- Download the required dataset:
4. * tutorial
- TUM dataset file format
- Camera internal parameters of TUM dataset
- TUM dataset scripting tool
- Unofficial QuickPass tutorial for TUM dataset
Open source library
1. OpenCV
1.1. Introduction
- OpenCV provides a large number of open source image processing algorithms and is a widely used algorithm library in computer vision.
- OpenCV consists of a series of C functions and a small number of C++ classes, and provides interfaces to Python, Ruby, MATLAB and other languages.
- OpenCV is an open source computer vision and machine learning software library that can run on Linux, Windows, Android and macOS.
1.2. Features
- Lightweight.
- Fast execution and processing speed.
1.3. Installation
- First install the OpenCV dependencies:
sudo apt-get install build-essential libgtk2.0-dev libvtk7-dev libjpeg-dev libtiff5-dev libjasper-dev libopenexr-dev libtbb-dev
- If this error occurs:
E: Unable to locate package libjasper-dev
- Solution:
sudo apt-get update
sudo apt-get install software-properties-common
sudo apt install libjasper1 libjasper-dev
- Enter the OpenCV official website and select the OpenCV for Linux version to download:
- To build and install OpenCV, run the following in the extracted OpenCV source directory:
mkdir build && cd build
cmake ..
make -j4
sudo make install
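- To quickly check that the OpenCV installation works, a minimal sketch like the following can be compiled and run; the file names rgb.png and depth.png are placeholders for illustration, not files from the dataset:
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
int main() {
    cv::Mat color = cv::imread("rgb.png"); // Read an 8-bit, 3-channel BGR image
    cv::Mat depth = cv::imread("depth.png", -1); // -1 keeps 16-bit depth values unchanged
    if (color.empty() || depth.empty()) return 1; // Make sure both images were loaded
    cv::imshow("color", color); // Display the color image in a window
    cv::waitKey(0); // Wait for a key press before exiting
    return 0;
}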
1.4. * tutorial
2. Eigen
2.1. Introduction
- Eigen is an open source library based on C++ templates. It provides fast linear algebra operations on matrices, and also supports numerical analysis and related algorithms.
- Eigen supports multiple compilation environments.
2.2. Features
- Wide scope of application: it supports all matrix sizes, all standard numeric types, a variety of matrix decompositions, and geometry features.
- Fast: expression templates intelligently remove temporary variables and enable lazy evaluation where appropriate.
- High reliability: the algorithms are carefully selected and verified by an extensive test suite.
- Header-only: Eigen is built purely from header files; to use it, you only need to include the Eigen headers.
2.3. Installation
- Open the terminal and enter:
sudo apt-get install libeigen3-dev
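- After installation, a minimal sketch like the one below can be used to check that Eigen is found. It builds a pose from a quaternion and a translation and transforms a point, which is exactly the pattern used later in the mapping code; the numbers are made up for illustration:
#include <iostream>
#include <Eigen/Geometry>
int main() {
    Eigen::Quaterniond q(1.0, 0.0, 0.0, 0.0); // Identity rotation, given as (w, x, y, z)
    Eigen::Isometry3d T(q); // Rotation part of the pose
    T.pretranslate(Eigen::Vector3d(0.1, 0.2, 0.3)); // Translation part of the pose
    Eigen::Vector3d p(1.0, 2.0, 3.0); // A point in camera coordinates
    std::cout << (T * p).transpose() << std::endl; // Prints the point in world coordinates: 1.1 2.2 3.3
    return 0;
}
- Because Eigen is header-only, this compiles with only an include path, for example g++ demo.cpp -I /usr/include/eigen3 (the header location may differ on other systems).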
2.4. * tutorial
3. PCL
3.1. Introduction
- PCL (Point Cloud Library) is a large cross-platform open source C++ programming library. It implements a large number of general algorithms and efficient data structures related to point clouds, covering point cloud acquisition, filtering, segmentation, registration, retrieval, feature extraction, recognition, tracking, surface reconstruction, visualization, and so on.
- PCL supports a variety of operating system platforms and can run on Windows, Linux, Android, Mac OS X and some embedded real-time systems.
3.2. Features
- High real-time performance: PCL uses high-performance computing techniques and parallelization to improve the real-time performance of programs.
- Shared pointers: all modules and algorithms in PCL pass data through Boost shared pointers, which avoids repeatedly copying data that already exists in the system.
3.3. Installation
- Open the terminal and enter:
sudo apt-get install libpcl-dev pcl-tools
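- Similarly, a minimal PCL sketch like the following exercises the point cloud type, the voxel grid filter and the PCD output used later in the mapping code; the single point and the leaf size are made up for illustration:
#include <pcl/point_types.h>
#include <pcl/filters/voxel_grid.h>
#include <pcl/io/pcd_io.h>
int main() {
    pcl::PointCloud<pcl::PointXYZRGB>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZRGB>); // Create an XYZRGB point cloud
    pcl::PointXYZRGB p;
    p.x = 0.1f; p.y = 0.2f; p.z = 0.3f; // Position of the point
    p.r = 255; p.g = 0; p.b = 0; // Color of the point
    cloud->push_back(p); // Insert the point into the cloud
    pcl::VoxelGrid<pcl::PointXYZRGB> voxel_filter; // Downsample with a voxel grid, as the mapping code does
    voxel_filter.setLeafSize(0.005f, 0.005f, 0.005f); // 5 mm voxels
    voxel_filter.setInputCloud(cloud);
    voxel_filter.filter(*cloud);
    pcl::io::savePCDFileBinary("demo.pcd", *cloud); // Save a PCD file that pcl_viewer can open
    return 0;
}
- PCL programs are usually compiled with CMake using find_package(PCL), which sets up the required include paths and libraries.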
3.4. * tutorial
- PCL official introduction and tutorial
- PCL unofficial QuickPass tutorial (1)
- PCL unofficial QuickPass tutorial (2)
- PCL unofficial QuickPass tutorial (3)
Mapping Practice
1. Description
- To highlight the libraries used, each step below indicates in parentheses which library it relies on; these libraries are introduced in the Open source library section above.
- The mapping in this paper uses the TUM dataset (the rgbd_dataset_freiburg3_long_office_household sequence).
- The mapping in this paper uses the iostream library, the OpenCV library, the Eigen library and the PCL library.
- The key frame selection method used in this paper is the fixed-frame (fixed-interval) selection method.
2. Mapping framework
- The mapping in this paper is divided into four steps, as shown in the figure below:
3. * framework analysis
3.1. Downloading data sets
- Download the TUM dataset.
- Open the terminal and use associate.py to align the RGB images, depth images and camera poses:
python associate.py rgb.txt depth.txt > associate.txt
python associate.py associate.txt groundtruth.txt > associate_with_groundtruth.txt
3.2. Data set preprocessing
- Read the information in the associate_with_groundtruth.txt file (iostream):
ifstream fin("~/associate_with_groundtruth.txt"); // Open the file with the aligned RGB file names, depth file names and camera poses
string data[12]; // Alignment information of one image
for (int i = 0; i < 12; i++) fin >> data[i]; // Read in the alignment information
- Read the aligned images and camera poses:
vector<cv::Mat> colorImgs, depthImgs; // Color images and depth images
vector<Eigen::Isometry3d> poses; // Camera poses
colorImgs.push_back(cv::imread("RGB file name")); // Read the aligned RGB image
depthImgs.push_back(cv::imread("depth file name", -1)); // Read the aligned depth image (-1 keeps the 16-bit depth values)
poses.push_back(T); // Store the aligned camera pose
3.3. Calculate and splice point clouds
- Define the format of points and point clouds (PCL):
typedef pcl::PointXYZRGB PointT; // Point type (world coordinates)
typedef pcl::PointCloud<PointT> PointCloud; // Point cloud type
- Create new point clouds (PCL):
PointCloud::Ptr pointCloud(new PointCloud); // Create the global point cloud
PointCloud::Ptr slide_win(new PointCloud); // Create the point cloud compression window
- Select and process the key frames:
- Select images evenly as key frames:
for (int i = 0; i < NImage; i += IStep) { /* ... */ } // Select key frames evenly
- Calculate the camera coordinates of the pixel (Eigen):
Eigen::Vector3d point; // Camera coordinates of the pixel
point[2] = double(d) / depthScale; // Calculate the camera coordinates
point[0] = (u - cx) * point[2] / fx;
point[1] = (v - cy) * point[2] / fy;
- Calculate the world coordinates of the pixel (Eigen):
Eigen::Vector3d pointWorld = T * point; // Convert camera coordinates to world coordinates
- Insert the pixel into the point cloud of the current key frame (PCL):
current->points.push_back(p); //Insert points into the current point cloud
- Perform depth filtering and statistical outlier removal on the point cloud of the current key frame (PCL):
pcl::StatisticalOutlierRemoval<PointT> statistical_filter; // Depth filter and statistical removal
statistical_filter.setInputCloud(current); // Filter the current point cloud
statistical_filter.filter(*tmp); // Store the result in the tmp point cloud
- Insert the filtered point cloud into the compression window (PCL):
(*slide_win) += *tmp; // Insert the filtered point cloud into the compression window slide_win
- When the compression window meets the size condition, voxel-filter it and merge it into the global point cloud (PCL):
pcl::VoxelGrid<PointT> voxel_filter; // Voxel filter (voxel compression)
voxel_filter.setInputCloud(slide_win); // Voxel-filter the slide_win point cloud
voxel_filter.filter(*slide_win); // Store the result back in slide_win
(*pointCloud) += *slide_win; // Insert the filtered window into the global point cloud
3.4. Display video and point cloud
- Show the video (OpenCV):
cv::imshow("img",colorImgs[i]); // Show video
- Show the point cloud (PCL):
pcl::visualization::CloudViewer viewer("Viewer"); // Window showing the point cloud
viewer.showCloud(pointCloud); // Show the generated point cloud
3.5. Final processing
- Perform the final voxel filtering on the point cloud (PCL):
voxel_filter.setInputCloud(pointCloud); // Perform the final global compression on the point cloud
voxel_filter.filter(*pointCloud); // Store the result back in the point cloud
- Save the point cloud file (PCL):
pcl::io::savePCDFileBinary("map.pcd", *pointCloud); //Save point cloud file
- Wait until the display window is closed and end the program:
while (!viewer.wasStopped ()){} //Make the viewer window not exit automatically
- To view the point cloud after the program has finished, open a terminal and run:
pcl_viewer map.pcd
4. Complete mapping code based on the fixed-frame selection method
#include <iostream>
#include <fstream>
using namespace std;
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <Eigen/Geometry>
#include <boost/format.hpp>
#include <pcl/point_types.h>
#include <pcl/io/pcd_io.h>
#include <pcl/filters/voxel_grid.h>
#include <pcl/visualization/pcl_visualizer.h>
#include <pcl/filters/statistical_outlier_removal.h>
#include <pcl/visualization/cloud_viewer.h>

int NImage = 2486;         // Number of images after the two alignment steps
double NSlide = 1000000;   // Size threshold of the point cloud compression window
int IStep = 360;           // Step size for selecting key frames
double resolution = 0.005; // Point cloud resolution (voxel size)

int main(int argc, char **argv) {
    vector<cv::Mat> colorImgs, depthImgs; // Color images and depth images
    vector<Eigen::Isometry3d> poses;      // Camera poses

    // Use fin to read associate_with_groundtruth.txt (RGB file names, depth file names, camera pose alignment information)
    ifstream fin("/home/lujunying/rgbd_dataset_freiburg3_long_office_household/associate_with_groundtruth.txt");
    if (!fin) {
        cerr << "cannot find pose file" << endl;
        return 1;
    }

    // Read the aligned images and camera poses (RGB image, depth image, camera pose)
    for (int i = 0; i < NImage; i++) { // Process each image
        string data[12]; // Alignment information of one image
        for (int j = 0; j < 12; j++) fin >> data[j];
        colorImgs.push_back(cv::imread("/home/lujunying/rgbd_dataset_freiburg3_long_office_household/" + data[1]));     // Read the aligned RGB image (path to the dataset)
        depthImgs.push_back(cv::imread("/home/lujunying/rgbd_dataset_freiburg3_long_office_household/" + data[3], -1)); // Read the aligned depth image; -1 keeps the 16-bit depth values
        Eigen::Quaterniond q(atof(data[11].c_str()), atof(data[8].c_str()),
                             atof(data[9].c_str()), atof(data[10].c_str())); // Read the aligned camera rotation [quaternion (w, x, y, z)]
        Eigen::Isometry3d T(q);
        T.pretranslate(Eigen::Vector3d(atof(data[5].c_str()), atof(data[6].c_str()), atof(data[7].c_str()))); // Read the aligned camera translation vector
        poses.push_back(T); // Store the aligned camera pose
    }

    // Compute and splice the point clouds
    // Camera intrinsics (see the camera intrinsics of the TUM dataset)
    double cx = 320.1;
    double cy = 247.6;
    double fx = 535.4;
    double fy = 539.2;
    double depthScale = 5000.0;
    cout << "Converting images to point cloud..." << endl;

    // Define the format used by the point cloud: XYZRGB is used here
    typedef pcl::PointXYZRGB PointT;            // Point type (world coordinates)
    typedef pcl::PointCloud<PointT> PointCloud; // Point cloud type

    PointCloud::Ptr pointCloud(new PointCloud); // Create the global point cloud
    PointCloud::Ptr slide_win(new PointCloud);  // Create the point cloud compression window

    // Select and process the key frames, generate and display the twice-filtered point cloud
    pcl::visualization::CloudViewer viewer("Viewer"); // Open the window showing the point cloud
    for (int i = 0; i < NImage; i += IStep) { // Select images at a fixed interval as key frames and process them
        PointCloud::Ptr current(new PointCloud);
        cout << "Converting key frame: " << i + 1 << endl;
        cv::Mat color = colorImgs[i];
        cv::Mat depth = depthImgs[i];
        Eigen::Isometry3d T = poses[i];
        for (int v = 0; v < color.rows; v++) // Process the pixels of one image
            for (int u = 0; u < color.cols; u++) {
                unsigned int d = depth.ptr<unsigned short>(v)[u]; // Get the depth value
                if (d == 0) continue; // If the depth of this pixel was not measured, skip to the next pixel
                Eigen::Vector3d point;             // Camera coordinates of the pixel
                point[2] = double(d) / depthScale; // Divide d by the scale to obtain meters
                point[0] = (u - cx) * point[2] / fx;
                point[1] = (v - cy) * point[2] / fy;
                Eigen::Vector3d pointWorld = T * point; // Convert camera coordinates to world coordinates
                PointT p; // Point (XYZRGB format)
                p.x = pointWorld[0];
                p.y = pointWorld[1];
                p.z = pointWorld[2];
                p.b = color.data[v * color.step + u * color.channels()];
                p.g = color.data[v * color.step + u * color.channels() + 1];
                p.r = color.data[v * color.step + u * color.channels() + 2];
                current->points.push_back(p); // Insert the point into the current point cloud
            }
        // Depth filter and statistical outlier removal
        cout << "The current point cloud has " << current->size() << " points." << endl;
        PointCloud::Ptr tmp(new PointCloud);
        pcl::StatisticalOutlierRemoval<PointT> statistical_filter;
        statistical_filter.setMeanK(50);
        statistical_filter.setStddevMulThresh(1.0);
        statistical_filter.setInputCloud(current); // Filter the current point cloud
        statistical_filter.filter(*tmp);           // Store the result in the tmp point cloud
        current.reset(); // Free the memory of the current point cloud
        cout << "The tmp point cloud has " << tmp->size() << " points." << endl;
        (*slide_win) += *tmp; // Insert the filtered point cloud into the compression window slide_win
        tmp.reset(); // Free the memory of the tmp point cloud
        cout << "The slide_win point cloud has " << slide_win->size() << " points." << endl;
        if (slide_win->size() >= NSlide || i == 0 || (i + IStep) > (NImage - 1)) { // If slide_win satisfies the voxel compression condition
            // Voxel filter (voxel compression)
            pcl::VoxelGrid<PointT> voxel_filter;
            voxel_filter.setLeafSize(resolution, resolution, resolution);
            voxel_filter.setInputCloud(slide_win); // Voxel-filter the slide_win point cloud
            voxel_filter.filter(*slide_win);       // Store the result back in slide_win
            cout << "After compression the slide_win point cloud has " << slide_win->size() << " points." << endl;
            (*pointCloud) += *slide_win; // Insert the filtered slide_win point cloud into the global point cloud
            slide_win.reset(new PointCloud); // Clear the compression window
        }
        cout << "The global point cloud has " << pointCloud->size() << " points." << endl;
        cout << "-------------------------------------------------------------------------------" << endl;
        // Show the video and the generated point cloud
        cv::imshow("img", colorImgs[i]); // Show the video
        cv::waitKey(30);
        viewer.showCloud(pointCloud); // Show the point cloud
    }

    // Perform the final global compression on the point cloud
    cout << "After two rounds of filtering, the point cloud has " << pointCloud->size() << " points." << endl;
    pointCloud->is_dense = false; // Not strictly needed
    pcl::VoxelGrid<PointT> voxel_filter; // Final global compression
    voxel_filter.setLeafSize(resolution, resolution, resolution);
    voxel_filter.setInputCloud(pointCloud);
    voxel_filter.filter(*pointCloud);
    viewer.showCloud(pointCloud);

    // Save the generated point cloud file
    cout << "After the final filtering, the point cloud has " << pointCloud->size() << " points." << endl;
    pcl::io::savePCDFileBinary("map.pcd", *pointCloud);

    // End the program
    while (!viewer.wasStopped()) {} // Keep the viewer window open; do not exit automatically
    return 0;
}
5. Mapping results
- The final generated point cloud is shown below:
Present and future
- In this paper, a window compression method is used when building the point cloud, which greatly reduces unnecessary point cloud memory and increases the scale of the map that can be built.
- The mapping in this paper is built from the pre-recorded TUM dataset rather than from a live camera; in the future, knowledge of ROS will be used to build maps from video captured by a handheld camera.
- For key frame selection, this paper adopts the fixed-frame selection method, which is easy to understand and simple to program, but is rather arbitrary and of only average quality. For example, when the motion is very slow a large number of similar, redundant key frames are selected, and when the motion is fast many important key frames are missed. In the future, the key frame extraction and dense reconstruction algorithms of ORB-SLAM2 [5] and GCNv2 [6] will be studied to improve this.
References
[1] Gao Xiang, Zhang Tao. Fourteen Lectures on Visual SLAM: From Theory to Practice, 2nd ed. Beijing: Publishing House of Electronics Industry, 2019.
[2] Stephen Prata. C++ Primer Plus, 6th ed. (Chinese edition). Beijing: People's Posts and Telecommunications Press, July 2020.
[3] T. Barfoot. State Estimation for Robotics: A Matrix Lie Group Approach, 2016.
[4] R. Mur-Artal, J. M. M. Montiel, and J. D. Tardós, "ORB-SLAM: A versatile and accurate monocular SLAM system," IEEE Trans. Robot., vol. 31, no. 5, pp. 1147–1163, Oct. 2015.
[5] R. Mur-Artal and J. D. Tardós, "ORB-SLAM2: An open-source SLAM system for monocular, stereo and RGB-D cameras," IEEE Trans. Robot., vol. 33, no. 5, pp. 1255–1262, Oct. 2017.
[6] J. Tang, J. Folkesson, and P. Jensfelt, "GCNv2: Efficient correspondence prediction for real-time SLAM," IEEE Robot. Autom. Lett., vol. 4, no. 4, Oct. 2019.
[^1]: See point 4 of Mapping Practice for the code.