This page presents some techniques developed to perform an automatic matching of points between reflectance (acquired with a laser scanner) and intensity images (acquired with digital cameras). We consider an un-calibrated situation: no previous knowledge is available about the position or internal parameters of the sensors. The choice of using reflectance images rather than three-dimensional information for the matching was motivated by the fact that reflectance images are 2D images such as photographs. The matching can then be performed in 2D using techniques already existing for matching intensity images [Heipke96, Lang95, Brown92].
In our case, an additional difficulty appears compared to traditional matching techniques, since the images to match do not come from the same sensor. The camera is a high-resolution passive sensor whereas the laser scanner is an active sensor of lower resolution. This leads to images with different fields of view, resolutions, and properties (illumination and colour, for instance).
Usually, a matching procedure uses either feature-based or area-based techniques. Feature-based techniques use pre-processed features (edges, for example) to match the images. Area-based techniques work directly on the grey values of the images to extract matching information. Our system uses a mixed feature-based and area-based matching technique to get corresponding points between the images to be registered. The technique can be divided into two steps.
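Resizing of the reflectance image based on the intensity image. This step uses edges to solve the problem of the difference in resolution and field of view between the images to match. The output is a resized reflectance image with the same size as the intensity image and with the main features located in the same areas.
Matching of the resized reflectance image with the intensity image. This step uses area-based techniques on corners detected in both images to extract corresponding points.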
The goal of the resizing algorithm is to apply a planar affine transformation to the reflectance images to make them “fit” as well as possible to the intensity images. The resulting images will be used in the matching process. The whole algorithm uses the red channel of the intensity image because of its higher similarity with the infrared reflectance image.
The transformation between the images is a 3D transformation characterised by a rotation, a translation and a distortion due to the sensors used (camera and scanning device parameters). Depending on the devices used, this transformation can be characterised by up to 15 parameters [Luong94]. In this phase, a first approximation is computed using only five parameters: the planar translation co-ordinates along both axes, the X and Y scaling factors, and the planar rotation angle between both images. These parameters are modelled with a 3x2 affine matrix.
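As an illustration, the five parameters can be assembled into an affine matrix as in the minimal sketch below. The function names and the composition order (scaling, then rotation, then translation) are assumptions made for this example, and the matrix is stored as two rows of three values, i.e. the transpose of the 3x2 layout mentioned above.

```python
import math

def affine_matrix(tx, ty, sx, sy, theta):
    """Build a planar affine matrix (2 rows x 3 columns) from five parameters.

    Assumed order: scale by (sx, sy), rotate by theta (radians),
    then translate by (tx, ty).
    """
    c, s = math.cos(theta), math.sin(theta)
    return [[sx * c, -sy * s, tx],
            [sx * s,  sy * c, ty]]

def apply_affine(A, point):
    """Map one (x, y) point through the affine matrix A."""
    x, y = point
    return (A[0][0] * x + A[0][1] * y + A[0][2],
            A[1][0] * x + A[1][1] * y + A[1][2])
```

For example, `apply_affine(affine_matrix(10, 20, 2, 3, 0.0), (1, 1))` scales the point to (2, 3) and then shifts it to (12, 23).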
Obviously, the result will not be exact but an approximation of the transformation between the two images. Still, the main features of both images will be overlapping enough to allow the use of normal area based matching algorithms to extract matching points.
The whole affine resizing process is illustrated in Figure 1.
Figure 1: The affine resizing process
The algorithm starts by transforming the reflectance image from spherical to Cartesian co-ordinates. Once this transformation is computed, the user must select the part of the Cartesian reflectance image that corresponds approximately to the area covered by the intensity image. The position, size and orientation of this window are used to compute the first affine transformation.
An iterative algorithm then improves the initial approximation. In each step, several affine transforms are computed within an interval around the current estimates. The transform that minimizes the average distance between edges is selected as the new transformation for the next iteration. The average distance between edges is computed by applying the affine-transformed edges as a mask over the distance transform image. The distance to the closest edge is determined for each edge point (as the pixel value in the distance transform image) and used to compute the average distance between edges in the two images.
The algorithm converges when the average distance does not change between two iterations. In addition, a maximum of ten iterations is set to avoid situations where the algorithm oscillates between two values.
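The refinement loop described above can be sketched as follows. This is a simplified illustration, not the original implementation: edges are given as lists of (x, y) pixels, the distance transform is computed by brute force, and the candidates are taken at minus one step, zero and plus one step around each current parameter. The names `refine` and `steps` and this sampling scheme are assumptions.

```python
import itertools
import math

def distance_map(edges, width, height):
    """Brute-force distance transform: distance from every pixel to the nearest edge pixel."""
    return [[min(math.hypot(x - ex, y - ey) for ex, ey in edges)
             for x in range(width)] for y in range(height)]

def transform(params, points):
    """Apply the five-parameter transform (tx, ty, sx, sy, theta) to (x, y) points."""
    tx, ty, sx, sy, theta = params
    c, s = math.cos(theta), math.sin(theta)
    return [(sx * c * x - sy * s * y + tx, sx * s * x + sy * c * y + ty)
            for x, y in points]

def average_edge_distance(dist, points):
    """Mean distance-map value under the points that fall inside the image."""
    h, w = len(dist), len(dist[0])
    cells = [(int(round(x)), int(round(y))) for x, y in points]
    values = [dist[y][x] for x, y in cells if 0 <= x < w and 0 <= y < h]
    return sum(values) / len(values) if values else float("inf")

def refine(params, refl_edges, dist, steps, max_iter=10):
    """Search around the current estimate until the average distance stops changing."""
    best = average_edge_distance(dist, transform(params, refl_edges))
    for _ in range(max_iter):
        previous, base = best, params
        # Try every combination of -step, 0, +step around the current estimate.
        for deltas in itertools.product(*[(-s, 0.0, s) for s in steps]):
            candidate = tuple(p + d for p, d in zip(base, deltas))
            score = average_edge_distance(dist, transform(candidate, refl_edges))
            if score < best:
                best, params = score, candidate
        if best == previous:  # converged: no change between two iterations
            break
    return params, best
```

Here `dist` is the distance map of the intensity edges and `refl_edges` are the reflectance edge points. The exhaustive grid grows as 3^5 candidates per iteration, which is acceptable for a sketch; a real implementation would likely use a precomputed distance transform and a coarser-to-finer search.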
Figure 2 presents some results of the resizing process with images from an indoor environment: a laboratory at the Joint Research Centre in Italy. For each image, the figure presents the superposition of edges (in green from the fixed intensity image, and in red from the resized reflectance image) at the beginning (Figure 2a) and at the end (Figure 2b) of the resizing algorithm, as well as the evolution of the average distance between the edges in the images (Figure 2c).
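Figure 2: (a) superposition of edges before resizing, (b) superposition of edges after resizing, (c) evolution of the average distance between edges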
A limitation of the algorithm is the importance of the first approximation depending on the area of interest defined by the user. A large error in this approximation can influence the algorithm and make it perform poorly. Still, this user selection is necessary for the system to work with un-calibrated images taken from unknown viewpoints. If some restrictions are imposed in the acquisition system (for example a fixed camera on the top of the laser scanner), additional information about the relative position of the sensors can be used to make this selection automatic.
The resizing algorithm presented in the previous section provides two images with the same resolution and with overlapping global features (edges). The next step of our matching procedure uses classical area based techniques to extract matching points from these images. Matching between reflectance and intensity images is a difficult task because of the different origins of the data. First of all, the light in reflectance images comes only from the laser instead of the natural light for photographs. Secondly, the wavelength of the laser will interfere with the colours on the scene and will affect the final colours of the object in the reflectance image. Finally, orientation and reflecting properties of the scene can prevent the laser beam from returning to the laser scanner and thus holes can appear in the reflectance image.
The implemented matching algorithm starts by selecting points of interest in both the reflectance and intensity images. We use corners since they correspond to areas of the images with a large gradient variation and are thus easier to identify using area-based techniques. They are also more robust to errors coming from the resizing algorithm. Once corners are identified in the images, the displaced frame difference is computed between these points to select corresponding corners.
The corner detector implemented is the one presented in [Haralick93]. This corner detector first selects windows of interest and afterwards computes more precisely the position of the points inside the selected windows of interest. The similarity between intensity images can be evaluated using different measures. The most popular one is the cross correlation but other measures have been developed. In our case, we use the displaced frame difference (dfd). This measure gives approximately the same results as cross correlation but is computationally more efficient, making the matching process faster.
The algorithm performs as follows: for each corner in the reflectance image, the dfd is computed with the corners in the intensity image located inside a search space. The matching pixel is selected as the intensity corner, within the given window, with the lowest dfd. Different thresholds are used in reflectance and intensity corner detection to ensure that more corners will be detected in intensity images.
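A minimal sketch of this corner matching step is given below. The text does not give the exact dfd formula, so the dfd is taken here as the mean absolute grey-level difference over a square window, which is an assumption. The function names and the parameters `search` and `half` are hypothetical, and corners are assumed to lie at least `half` pixels away from the image borders.

```python
def dfd(img_a, pa, img_b, pb, half=2):
    """Mean absolute grey-level difference between two square windows.

    Images are lists of rows; pa and pb are (x, y) window centres.
    """
    (xa, ya), (xb, yb) = pa, pb
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(img_a[ya + dy][xa + dx] - img_b[yb + dy][xb + dx])
    return total / (2 * half + 1) ** 2

def match_corners(refl, refl_corners, inten, inten_corners, search=5, half=2):
    """For each reflectance corner, keep the intensity corner inside the
    search window that has the lowest dfd."""
    matches = {}
    for rc in refl_corners:
        candidates = [ic for ic in inten_corners
                      if abs(ic[0] - rc[0]) <= search and abs(ic[1] - rc[1]) <= search]
        if candidates:
            matches[rc] = min(candidates,
                              key=lambda ic: dfd(refl, rc, inten, ic, half))
    return matches
```

Reflectance corners with no intensity corner inside the search window are simply left unmatched in this sketch.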
Figure 3 presents the final correspondences computed over the Laboratory image.
Given the matching between reflectance and intensity images, and since the reflectance image is fully registered with the range data, it is easy to associate the 3D co-ordinates in the range data with the corresponding pixel positions in the digital photograph. A Tsai camera calibration technique [Tsai86, Tsai87] is used to find the extrinsic and intrinsic parameters of the camera. The Tsai model is based on a pinhole perspective projection, and the following eleven parameters must be estimated:
f - focal length of camera,
k - radial lens distortion coefficient,
Cx, Cy - co-ordinates of centre of radial lens distortion,
Sx - scale factor to account for any uncertainty due to imperfections in hardware timing for scanning and digitisation,
Rx, Ry, Rz - rotation angles for the transformation between the world and camera co-ordinates,
Tx, Ty, Tz - translation components for the transformation between the world and camera co-ordinates.
An online implementation of the Tsai calibration algorithm proposed by Reg Willson is available here.

Figure 4: Tsai camera re-projection model with perspective projection and radial lens distortion
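The re-projection model can be sketched with the eleven parameters listed above. This is an illustrative sketch only: the rotation composition order (Rz, then Ry, then Rx applied from right to left), the sensor element sizes `dx` and `dy` (fixed camera constants, not part of the eleven parameters) and the fixed-point inversion of the radial distortion are assumptions made for this example.

```python
import math

def rotation(rx, ry, rz):
    """Rotation matrix R = Rz(rz) * Ry(ry) * Rx(rx); the order is an assumption."""
    ca, sa = math.cos(rx), math.sin(rx)
    cb, sb = math.cos(ry), math.sin(ry)
    cg, sg = math.cos(rz), math.sin(rz)
    return [[cb * cg, cg * sa * sb - ca * sg, sa * sg + ca * cg * sb],
            [cb * sg, sa * sb * sg + ca * cg, ca * sb * sg - cg * sa],
            [-sb,     cb * sa,                ca * cb]]

def project(pw, cam, dx=1.0, dy=1.0):
    """Project a world point (x, y, z) to pixel co-ordinates.

    cam is a dict with keys f, k, Cx, Cy, Sx, Rx, Ry, Rz, Tx, Ty, Tz.
    """
    R = rotation(cam["Rx"], cam["Ry"], cam["Rz"])
    T = (cam["Tx"], cam["Ty"], cam["Tz"])
    # World to camera co-ordinates.
    xc, yc, zc = (sum(R[i][j] * pw[j] for j in range(3)) + T[i] for i in range(3))
    # Pinhole perspective projection (undistorted sensor co-ordinates).
    xu, yu = cam["f"] * xc / zc, cam["f"] * yc / zc
    # Radial distortion: solve xu = xd * (1 + k * r^2) for the distorted
    # co-ordinates by fixed-point iteration (adequate for small k).
    xd, yd = xu, yu
    for _ in range(20):
        factor = 1.0 + cam["k"] * (xd * xd + yd * yd)
        xd, yd = xu / factor, yu / factor
    # Sensor to pixel co-ordinates.
    return (cam["Sx"] * xd / dx + cam["Cx"], yd / dy + cam["Cy"])
```

With no rotation, translation or distortion, a focal length of 2 and a centre at (100, 50), the point (1, 2, 4) projects to (100.5, 51.0).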
Automatic matching systems normally return a significant number of wrong correspondences. Some correspondences are completely wrong and, if considered, can prevent any precise estimation: these points are known as outliers. Outlier handling is widely studied in the literature, and robust regression methods used to deal with this effect can be found, for example, in [Rousseeuw87, Torr97, Gracias97]. These works show that random techniques can be a valid alternative to achieve a good estimation of the camera's parameters for datasets containing a large percentage of outliers. Given their different natures, reflectance and intensity images are difficult to match. As a result, the output of the matching algorithm (see previous section) will be corrupted with many outliers. To ensure a good calibration estimation even in these conditions, a well-known random method has been selected: the RANdom SAmple Consensus estimation technique (RANSAC) [Fischler81].
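The principle of RANSAC can be illustrated with the short sketch below. To keep it self-contained, it fits a 2D line rather than the Tsai camera model: in the calibration setting, each random sample would be a small set of 2D-3D correspondences, the model a camera calibration, and the error a re-projection distance. The function names, the threshold, the number of trials and the seed are hypothetical values chosen for the example.

```python
import random

def fit_line(p, q):
    """Line y = a * x + b through two points with distinct x."""
    a = (q[1] - p[1]) / (q[0] - p[0])
    return a, p[1] - a * p[0]

def ransac_line(points, threshold=0.5, trials=200, seed=0):
    """Return the model supported by the largest set of inliers."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(trials):
        # 1. Draw a minimal random sample and fit a model to it.
        p, q = rng.sample(points, 2)
        if p[0] == q[0]:
            continue  # vertical pair: cannot be written as y = a * x + b
        a, b = fit_line(p, q)
        # 2. Collect the points that agree with this model (the consensus set).
        inliers = [pt for pt in points
                   if abs(pt[1] - (a * pt[0] + b)) <= threshold]
        # 3. Keep the model with the largest consensus set.
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers
```

In practice, the final model is usually re-estimated from all the inliers of the best consensus set, which is omitted here.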
After the calibration procedure, a full model for the camera is available. Using this information, each 3D co-ordinate in the range image can be re-projected into the intensity colour image according to the camera model. Since range and reflectance are directly registered, it is possible to establish an association between pixels in reflectance and intensity images, and compute a new reflectance image based on the intensity colour values.
The final image is useful to evaluate the quality of the registration in an easy and fast way. It can also be used directly to texture map the 3D models, giving a much more realistic impression than a model textured only with the reflectance image. The complete procedure is summarized in Figure 5 with the example acquired at the laboratory.

Figure 5: The re-projection procedure with an image of the laboratory
Typically, the field of view of a range sensor is much larger than that of a normal camera: laser scanners capable of acquiring 270° or even 360° images are now common. In these cases, more than one photograph is necessary to cover the whole reflectance image. To solve this problem, a feathering operation is applied after each image is re-projected into the laser co-ordinate frame. This process consists of blending the data values in the zones where the images overlap. The operation results in a gradual transition from one image to another and hides small misalignments between images. The implemented blending technique uses the Euclidean distance to the closest border of the re-projected area as a measure of the importance of the pixel in the final texture map. The technique leads to good feathering results, with smooth transitions between images, as presented in Figure 6.
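The distance-weighted blending can be sketched as below. The sketch assumes greyscale re-projected images stored as lists of rows in the same co-ordinate frame, with `None` marking pixels that an image does not cover. The function names are hypothetical, the distance is computed by brute force, and only uncovered pixels inside the frame are treated as the border of the re-projected area.

```python
import math

def border_distance(mask):
    """Distance from each covered pixel to the nearest uncovered pixel."""
    h, w = len(mask), len(mask[0])
    outside = [(x, y) for y in range(h) for x in range(w) if not mask[y][x]]

    def dist(x, y):
        if not mask[y][x]:
            return 0.0   # uncovered pixels get no weight
        if not outside:
            return 1.0   # fully covered image: constant weight
        return min(math.hypot(x - ox, y - oy) for ox, oy in outside)

    return [[dist(x, y) for x in range(w)] for y in range(h)]

def feather(images):
    """Blend re-projected images, weighting each pixel by its distance to
    the border of the area covered by its own image."""
    h, w = len(images[0]), len(images[0][0])
    weights = [border_distance([[v is not None for v in row] for row in img])
               for img in images]
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = sum(wt[y][x] for wt in weights)
            if total > 0:
                out[y][x] = sum(wt[y][x] * img[y][x]
                                for wt, img in zip(weights, images)
                                if img[y][x] is not None) / total
    return out
```

Pixels close to the border of one image receive a low weight, so the other image dominates there, which produces the gradual transition across the overlap. For colour images, the same weights would be applied to each channel.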
Figure 7 and Figure 8 present the different steps of the re-projection and feathering process for two scenes that have been used to illustrate our algorithms. The figures present an indoor scene (the laboratory at JRC) and an outdoor scene (a farmhouse in Laveno). The scenes were acquired with a RIEGL LMS-Z210 laser scanner and a Canon Proshot 70 digital camera.
For each example we present: the final texture map obtained from the re-projection and feathering of the intensity images (a) and snapshots of the model after applying the texture map (b,c).
Figure 7: (a) texture map (12 images) and two snapshots (b,c) of the textured model of the laboratory
Figure 8: (a) texture map (3 images) and two snapshots (b,c) of the textured model of the Laveno farmhouse
The main characteristics of the whole registration algorithm presented in this chapter are:
· Independence from the acquisition sensors. The technique has been successfully used with different experimental set ups (laser scanners and cameras from different manufacturers with different resolutions and options);
· No need for previous calibration;
· Simple evaluation of the quality of the registration with the computed re-projected images;
· Possibility for the user to interact with the system and, if necessary, guide the process;
· Polyvalent system that allows the use of several configurations to optimise the registration step. For example, if internal parameters of the camera are known, these parameters can be directly fixed and only external camera parameters will be evaluated.
The performance of the algorithm depends on the quantity and quality of the 2D features and on the similarity between the images (better results are obtained with images taken from close viewpoints).
The main innovations introduced are the colour feathering of the merged digital photographs, the occlusion handling for selecting the pixels to be used in the final texture map and a calibration-tuning algorithm to refine the first calibration computed based on robust estimation.
A weak point of the calibration procedure is still the difficulty of automatically obtaining fully reliable correspondences between the reflectance and intensity images.
This research was partly funded by the EC CAMERA TMR (Training and Mobility of Researchers) network, contract number ERBFMRXCT970127 and by the Portuguese Foundation for Science and Technology through the Ph.D. grant PRAXIS XXI/BD/19555/99.
[Brown92] L.G. Brown, A Survey of Image Registration Techniques. ACM Computing Surveys, Vol. 24, No. 4, pp. 325-376, December 1992.
[Fischler81] M.A. Fischler, R.C. Bolles, Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, Vol. 24, No. 6, pp. 381-395, 1981.
[Gracias97] N. Gracias, Application of Robust Estimation to Computer Vision: Video Mosaics and 3D-Reconstruction. Master Degree thesis, IST-Technical University of Lisbon, Portugal, 1997.
[Haralick93] R.M. Haralick, L.G. Shapiro, Computer and Robot Vision. Addison-Wesley, 1992 and 1993.
[Heipke96] C. Heipke, Overview of image matching techniques. OEEPE Workshop on the application of Digital Photogrammetric Workstations, 1996.
[Lang95] F. Lang, W. Förstner, Matching Techniques. Proc. 2nd Course in Digital Photogrammetry, Bonn, Feb. 1995.
[Luong94] Q.T. Luong, O.D. Faugeras, The Fundamental Matrix: Theory, Algorithms, and Stability Analysis. International Journal of Computer Vision, 17, pp. 43-75, 1996.
[Rousseeuw87] P.J. Rousseeuw, A.M. Leroy, Robust Regression and Outlier Detection. John Wiley & Sons, New York, 1987.
[Torr97] P.H.S. Torr, D.W. Murray, The Development and Comparison of Robust Methods for Estimating the Fundamental Matrix. International Journal of Computer Vision, Vol. 24, No. 3, pp. 271-300, September-October 1997.
[Tsai86] R.Y. Tsai,
[Tsai87] R.Y. Tsai,
[Wilson94] Reg G. Willson
Author: Paulo Dias at IEETA/Universidade de Aveiro, Portugal - 15/01/2004