The camera calibration algorithms we developed allow the semi-automatic registration of laser range data and intensity images. The registered intensity images are used as texture maps to enhance the 3D models computed from range imagery, but all 3D geometry is still extracted only from range data analysis.
A logical next step is to use the registered intensity images not only for texture mapping but also to improve 3D range segmentation and model geometry. This makes it possible to improve the quality of the models without acquiring extra range data, by using the already available 2D intensity information as an additional source of 3D geometry. With two or more registered images, 3D information can be triangulated directly from the intensity images, providing a second source of 3D information.
In the following sections, we will present the passive triangulation method implemented to extract 3D information from two or more images. This information will be used for several purposes:
Calibration tuning: to improve the camera model by comparing passive triangulated points with range points;
Dense mapping: to introduce additional points into the original range cloud by fusing the information from laser scanners and digital cameras into a single cloud of points.
Depth extraction from intensity images is not a new topic of investigation. Many techniques such as stereoscopic vision or photogrammetry already deal extensively with this problem. The algorithms used by our system to compute correspondences over intensity images are based on standard tools such as epipolar constraints, the fundamental matrix, image rectification and cross correlation. All these techniques are already well known and well documented [Faugeras93, Pollefeys00, Hartley00].
The main difficulty to overcome when working with intensity images in 3D reconstruction is the matching problem: how to get accurate corresponding points over different images? The technique we adopted uses the fundamental matrix to link intensity images through epipolar geometry, according to the following equation:
$$X'^{T} F X = 0$$
where X = (x, y, 1) is a homogeneous co-ordinate in the first image, X' = (x', y', 1) is the corresponding co-ordinate in the second image and F is the 3×3 fundamental matrix that links the two images through epipolar geometry.
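As a minimal illustration of the standard epipolar constraint X'^T F X = 0, the sketch below computes the epipolar line F X in the second image and the residual of a candidate correspondence. The function names and the example matrix (that of an ideal rectified pair, where the cameras differ only by a horizontal translation) are assumptions chosen for illustration and are not taken from our system.

```python
def mat_vec(F, X):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(F[i][k] * X[k] for k in range(3)) for i in range(3)]


def epipolar_line(F, X):
    """Epipolar line F X in the second image, as (a, b, c) with a*x' + b*y' + c = 0."""
    return mat_vec(F, X)


def epipolar_residual(F, X, X2):
    """Value of X'^T F X; it is zero for a perfect correspondence."""
    line = epipolar_line(F, X)
    return sum(X2[i] * line[i] for i in range(3))


# Assumed example: fundamental matrix of an ideal rectified pair
# (pure horizontal translation), so epipolar lines are image rows.
F_RECTIFIED = [[0.0, 0.0, 0.0],
               [0.0, 0.0, -1.0],
               [0.0, 1.0, 0.0]]

X = [120.0, 45.0, 1.0]       # point in the first image
X_SAME_ROW = [97.5, 45.0, 1.0]   # candidate on the same row of the second image
X_OTHER_ROW = [97.5, 48.0, 1.0]  # candidate three rows below
```

With this matrix the epipolar line of X is the row y' = 45, so the first candidate satisfies the constraint and the second does not. In practice F is estimated from noisy matches, and the residual is compared with a tolerance rather than with zero.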
Our matching technique also uses the initial camera calibrations (see camera calibration algorithms) to guide a cross-correlation matching algorithm over the intensity images. The correspondences feed a RANSAC 8-point algorithm that computes the fundamental matrix between images [Torr97, Fischler81]. This matrix is used to find corresponding epipolar lines and to build rectified images in which each row corresponds to the same epipolar line in both images. Using rectified images to compute correspondences reduces the matching problem from two dimensions to one.
Figure 1 presents two digital photographs of a church in Laveno, Italy, used to test our algorithm. In this figure, we also display the epipolar lines obtained through the fundamental matrix estimation step.

Figure 1: Two images of a church in Laveno (Italy) and the epipolar lines.
The rectified images computed with these epipolar lines appear in Figure 2:

Figure 2: Rectified images of the Laveno church model.
Cross correlation is used on the rectified images to find corresponding points along the epipolar lines. Once more, the seed for the search space comes from the existing calibrations starting from points of interest extracted in the first image. Sub-pixel accuracy is obtained by fitting a second-degree curve to the cross correlation coefficients in the neighbourhood of the disparity [Sun02, Anadan89].
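The search along a rectified row and the second-degree fit can be sketched as follows. This is a simplified 1-D version under stated assumptions: windows are taken along a single row rather than over a 2-D patch, the correlation is the normalised cross-correlation coefficient, and the names `ncc` and `match_along_row` as well as the parameters `half`, `d_seed` and `d_range` are hypothetical.

```python
import math


def ncc(a, b):
    """Normalised cross-correlation of two equal-length windows (range -1..1)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den > 0 else 0.0


def match_along_row(left, right, x, half, d_seed, d_range):
    """Match pixel x of a rectified row; disparity d means left[x] ~ right[x - d].

    Disparities d_seed - d_range .. d_seed + d_range are tested (the seed would
    come from the existing calibration). Returns (sub-pixel disparity, score),
    or None when no candidate window fits inside the row.
    """
    window = left[x - half: x + half + 1]
    scores = {}
    for d in range(d_seed - d_range, d_seed + d_range + 1):
        lo, hi = x - d - half, x - d + half + 1
        if lo < 0 or hi > len(right):
            continue
        scores[d] = ncc(window, right[lo:hi])
    if not scores:
        return None
    best = max(scores, key=scores.get)
    # Fit a parabola through the scores at best-1, best, best+1 and use its vertex.
    if best - 1 in scores and best + 1 in scores:
        c0, c1, c2 = scores[best - 1], scores[best], scores[best + 1]
        curvature = c0 - 2.0 * c1 + c2
        if curvature < 0:
            return best + 0.5 * (c0 - c2) / curvature, c1
    return float(best), c1
```

Because the vertex of the fitted parabola always lies within half a pixel of the best integer disparity, the refinement can only adjust the integer result, never replace it.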
The matching points in the images are used to compute the camera projective rays by inverting the perspective projection equations of the Tsai model, which gives the parametric equations of the two rays. A user threshold defines the maximum distance between the two rays above which triangulated points are discarded. Otherwise, the triangulated point is computed as the centre of the segment joining the two closest points on the rays [Eberly00].
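The midpoint construction described above can be sketched as follows, assuming each ray is already available as an origin (the camera centre) and a direction vector. The function name `triangulate_midpoint` and the parameter `max_gap` (standing for the user threshold) are hypothetical, and rejecting points that fall behind a camera is an added assumption.

```python
import math


def triangulate_midpoint(p1, d1, p2, d2, max_gap):
    """Midpoint of the shortest segment between rays p1 + s*d1 and p2 + t*d2.

    Returns None when the rays are (nearly) parallel, when the closest points
    lie behind a camera (s < 0 or t < 0), or when the gap between the rays
    exceeds max_gap.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    w = [a - b for a, b in zip(p1, p2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        return None
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    if s < 0 or t < 0:
        return None
    q1 = [p + s * k for p, k in zip(p1, d1)]  # closest point on the first ray
    q2 = [p + t * k for p, k in zip(p2, d2)]  # closest point on the second ray
    gap = math.sqrt(sum((u - v) ** 2 for u, v in zip(q1, q2)))
    if gap > max_gap:
        return None
    return [(u + v) / 2.0 for u, v in zip(q1, q2)]
```

For example, rays starting at (-1, 0, 0) and (1, 0.2, 0) with directions (1, 0, 5) and (-1, 0, 5) pass 0.2 units apart near (0, 0.1, 5); the point is kept with `max_gap=0.5` and rejected with `max_gap=0.1`.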
Figure 3 illustrates the triangulation process. It presents the range cloud of points of the church of Laveno, the position of the camera for the two considered images (as two cones) and the projective rays for a few points.

Figure 3: The passive triangulation process
The process presented in the previous section gives, for each point of interest, the 3D position of the corresponding triangulated point. It is possible to use this information to evaluate the quality of the camera calibration, by measuring the distance from the triangulated points to the 3D cloud acquired with the laser. To optimise this computation, the range cloud of points is bucketed into small referenced cubes. This permits fast navigation inside the cloud of points and fast computation of the closest range points.
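The bucketing idea can be sketched with a dictionary keyed by integer cube indices. This is an illustrative structure, not the one used in our system: the class name `CubeGrid` and the parameter `cell` are hypothetical, and the query only inspects the 3×3×3 block of cubes around the query point, so it is exact only when the closest point lies within one cube side of the query.

```python
import math
from collections import defaultdict


class CubeGrid:
    """Range points bucketed into cubes of side `cell` for fast closest-point queries."""

    def __init__(self, points, cell):
        self.cell = cell
        self.buckets = defaultdict(list)
        for p in points:
            self.buckets[self._key(p)].append(p)

    def _key(self, p):
        # Integer index of the cube containing point p.
        return tuple(int(math.floor(c / self.cell)) for c in p)

    def closest(self, q):
        """Closest stored point within the 3x3x3 block of cubes around q, or None."""
        kx, ky, kz = self._key(q)
        best, best_d2 = None, float("inf")
        for i in (-1, 0, 1):
            for j in (-1, 0, 1):
                for k in (-1, 0, 1):
                    for p in self.buckets.get((kx + i, ky + j, kz + k), ()):
                        d2 = sum((a - b) ** 2 for a, b in zip(p, q))
                        if d2 < best_d2:
                            best, best_d2 = p, d2
        return best
```

Choosing `cell` equal to the maximum accepted distance between a triangulated point and the range data keeps the restricted search consistent with that threshold: any point that the query could miss would be rejected anyway.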
For each triangulated point, the closest orientation discontinuity point in the range image is found. Figure 4a presents, in the same image, the cloud of points from the range image and the 3D triangulated points obtained from the initial calibrations. The segments in the figure indicate the triangulated points and their closest orientation discontinuity in the range data. Points for which the distance between the triangulated and range discontinuity is larger than a given distance are not considered in the iteration to compute the new camera model.
To improve the calibration and force triangulated points to converge into the 3D cloud, the closest orientation discontinuity in the range image is used with the matching points to perform a new Tsai camera calibration for each image. The process is iterative: in each loop of the cycle, points are triangulated and compared with the range points in the 3D cloud.
This process continues as long as the average distance between triangulated and range points decreases. At this stage an additional optimisation is introduced in the loop to correct matching errors in the intensity images. In these new cycles of the process, pixel positions of the correspondences are updated using the current camera model. The closest 3D range orientation discontinuities (used to calibrate the cameras) are re-projected into the images, and the new match co-ordinates in every intensity image are computed as the centre between the re-projected point and the original position of the match. Figure 4b presents the results of the triangulation at the end of the process using the optimised camera parameters.
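The control flow of this refinement can be sketched as a loop that stops as soon as the mean distance no longer decreases. The sketch is deliberately abstract: `triangulate`, `closest_edge` and `recalibrate` are hypothetical callbacks standing for the passive triangulation, the closest-discontinuity search and the Tsai calibration, and `max_dist` and `max_iter` are assumed parameters. The helper `update_match` shows the match correction as a simple midpoint.

```python
import math


def refine_calibration(model, matches, triangulate, closest_edge, recalibrate,
                       max_dist, max_iter=50):
    """Recalibrate while the mean triangulated-to-range distance decreases.

    Returns the best camera model found and its mean distance.
    """
    best_model, best_err = model, float("inf")
    for _ in range(max_iter):
        pairs, dists = [], []
        for m in matches:
            p = triangulate(model, m)
            q = closest_edge(p) if p is not None else None
            if q is None:
                continue
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
            if d <= max_dist:  # points too far from the range data are ignored
                pairs.append((m, q))
                dists.append(d)
        if not pairs:
            break
        err = sum(dists) / len(dists)
        if err >= best_err:  # no improvement: keep the previous model
            break
        best_model, best_err = model, err
        model = recalibrate(model, pairs)
    return best_model, best_err


def update_match(original_px, reprojected_px):
    """New match position: centre between the original match and the re-projected range point."""
    return tuple((a + b) / 2.0 for a, b in zip(original_px, reprojected_px))
```

Returning the best model rather than the last one means that a calibration step that increases the error is never kept.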

(a) (b)
Figure 4: Range points (black) and the intensity triangulated points (black) before and after the optimisation process
The tools developed to fine-tune the camera calibration ensure that the cameras are well registered with the range data. These tools offer a new possibility when computing our 3D models: they make it possible to add new 3D points to the original range cloud. This can be particularly useful in areas where range data is missing (non-reflective areas, occlusions, missing scans, etc.) or in parts of the models that are highly textured and rich in 3D content. In these situations, additional 3D data can be computed from pairs of intensity images to compensate for the lack of range data or to increase the 3D point density. Intensity images can be a valuable source of data since they can be acquired easily, quickly and at high resolution.
The process here is approximately the same as the one used to fine-tune the camera models, but applied to as many points as possible. This corresponds to the dense depth-mapping step in stereo/photogrammetry techniques. The main difference is that the range data is no longer used to guide the matching, since we are trying to add data in regions where range information may not be available.
The matching is done along epipolar lines considering that the images were taken from close viewpoints (such as in stereo techniques) meaning that matching points are spatially close in both images. As in the previous section, thresholds for cross correlation as well as the symmetry condition are used during the matching phase. In addition, we also consider an ordering condition to guarantee that the order of detected correspondences in the two images is the same. These conditions are widely used in dynamic matching techniques [Cox96].
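The symmetry and ordering conditions can be sketched for a single rectified row as follows. The function names are hypothetical, matches are represented as dictionaries from a column in one image to its best-matching column in the other, and the ordering filter is a simple greedy pass, which is a simplification and not the dynamic-programming formulation of [Cox96].

```python
def symmetric_matches(left_to_right, right_to_left):
    """Keep pairs (xl, xr) where xl's best match is xr and xr's best match is xl."""
    return [(xl, xr) for xl, xr in sorted(left_to_right.items())
            if right_to_left.get(xr) == xl]


def enforce_ordering(pairs):
    """Greedy ordering filter along one rectified row.

    Pairs are visited by increasing xl; a pair is dropped when its xr does not
    lie to the right of the last accepted xr.
    """
    kept, last_xr = [], float("-inf")
    for xl, xr in sorted(pairs):
        if xr > last_xr:
            kept.append((xl, xr))
            last_xr = xr
    return kept
```

For example, with left-to-right matches {10: 7, 20: 18, 30: 15, 40: 35} and right-to-left matches {7: 10, 18: 20, 15: 30, 35: 41}, the symmetry check drops column 40 (its match points back to column 41), and the ordering filter then drops (30, 15) because it would cross (20, 18).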
In practice, all points where the variation of the gradient is significant are triangulated. Only uniform areas where the matching is not reliable are excluded (e.g. large areas of the same colour such as the white walls in the Barza data set, see Figure 6).
The triangulated points can then be introduced into the range cloud of points before entering the 3D reconstruction process, which then works with data coming from both range and intensity sensors.
In Figure 5 we present some results obtained on the Laveno church. In this example, the range data was acquired with a Zoller and Fröhlich IMAGER 5003 laser scanner [Frölich98] and the photographs were acquired with a Canon PowerShot70 digital camera. Similar results are presented with a model of a church in Barza, Italy (Figure 6). In this case, the range image was acquired with a Riegl LMS-Z210 laser scanner [Riegl99] and the photographs were taken with a Canon PowerShot70 digital camera. The intensity images cover only a small part of the model in order to demonstrate how high-resolution photographs can be used to increase 3D point density in the range cloud (the “rosace” of the church in this example).

Figure 5: Addition of triangulated points (black) in the range data (light grey) of the Laveno church, Italy.

Figure 6: Addition of triangulated points (black) in the range data (light grey) of a church in Barza, Italy.
We have demonstrated that passive triangulation of intensity images can be used to improve 3D models computed from range data. A method was presented to improve initial camera calibrations by measuring the error between range data and points that are triangulated from two or more intensity images. Once the data (range and intensity) is fully registered, the 3D information can come from the two sources, making it possible to select the best data for a given area of the model or for a given application. The intensity data can be used to add 3D points in some areas of the model, as shown with several examples acquired with different sensors. In these examples, the resolution has been significantly increased in some parts of the final model using digital photographs.
The quality of the process depends mainly on two factors. The original parameters of the cameras influence the process used to refine the camera calibrations, since they are used as the initial estimate. This means that if images are badly registered initially, the algorithm cannot compensate and the result is of poor quality. The other important factor in the process is the quality of the 3D edge detection in range data, since orientation discontinuities in the cloud of points are used in the calibration refining process to select the closest 3D point for the next calibration. Fortunately, the resolution of new lasers has increased, making detection of discontinuities easier and much more reliable.
Finally, the acquisition of intensity data is also an important matter. Because of the dense mapping step, it is important to acquire data from close viewpoints and rich in texture information, in order to ensure a reliable matching between features.
This research was funded by the Portuguese Foundation for Science and Technology through the Ph.D. grant PRAXIS XXI/BD/19555/99.
| [Anadan89] | P. Anandan, A Computational Framework and an Algorithm for the Measurement of Visual Motion, International Journal of Computer Vision, Vol. 2, No. 3, pp. 283-310, January 1989. |
| [Cox96] | I. Cox, S. Hingorani, S. Rao, A Maximum Likelihood Stereo Algorithm, Computer Vision and Image Understanding, Vol. 63, No. 3, pp. 542-567, May 1996. |
| [Eberly00] | D.H. Eberly, 3D Game Engine Design: A Practical Approach to Real-Time Computer Graphics, Morgan Kaufmann Publishers, September 2000. |
| [Faugeras93] | O. Faugeras, Three-Dimensional Computer Vision, MIT Press, 1993. |
| [Fischler81] | M.A. Fischler, R.C. Bolles, Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, Communications of the ACM, Vol. 24, No. 6, pp. 381-395, 1981. |
| [Frölich98] | C. Fröhlich, M. Mettenleiter, F. Haertl, Imaging Laser Radar for High-Speed Monitoring of the Environment, Zoller&Fröhlich technical report, 1998. |
| [Haralick93] | R.M. Haralick, L.G. Shapiro, Computer and Robot Vision, Addison-Wesley, 1992 and 1993. |
| [Hartley00] | R.I. Hartley, A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, June 2000. |
| [Pollefeys00] | M. Pollefeys, 3D Modelling from Images, Tutorial notes, in conjunction with ECCV 2000, Dublin, Ireland, June 2000. |
| [Riegl99] | RIEGL GmbH, Laser Mirror Scanner LMS-Z210 - Technical document & user’s instruction manual, November 1999. |
| [Sun02] | C. Sun, Fast Stereo Matching Using Rectangular Subregioning and 3D Maximum-Surface Techniques, International Journal of Computer Vision, Vol. 47, No. 1-3, pp. 99-117, May 2002. |
| [Torr97] | P.H.S. Torr, D.W. Murray, The Development and Comparison of Robust Methods for Estimating the Fundamental Matrix, International Journal of Computer Vision, Vol. 24, No. 3, pp. 271-300, September/October 1997. |
Author: Paulo Dias at IEETA/Universidade de Aveiro, Portugal - 15/01/2004