A robust multi-camera 3D ellipse fitting for contactless measurements

Filippo Bergamasco, Luca Cosmo, Andrea Albarelli and Andrea Torsello
Dipartimento di Scienze Ambientali, Informatica e Statistica
Università Ca' Foscari - Venice, Italy
Email: bergamasco@dais.unive.it lcosmo@dais.unive.it albarelli@unive.it torsello@dais.unive.it

Abstract

Ellipses are a widely used cue in many 2D and 3D object recognition pipelines. In fact, they exhibit a number of useful properties. First, they are naturally occurring in many man-made objects. Second, the projective invariance of the class of ellipses makes them detectable even without any knowledge of the acquisition parameters. Finally, they can be represented by a compact set of parameters that can be easily adopted within optimization tasks. While a large body of work exists in the literature about the localization of ellipses as 2D entities in images, less effort has been put in the direct localization of ellipses in 3D, exploiting images coming from a known camera network. In this paper we propose a novel technique for fitting elliptical shapes in 3D space, by performing an initial 2D guess on each image followed by a multi-camera optimization refining a 3D ellipse simultaneously on all the calibrated views. The proposed method is validated both with synthetic data and by measuring real objects captured by a specially crafted imaging head. Finally, to evaluate the feasibility of the approach within real-time industrial scenarios, we tested the performance of a GPU-based implementation of the algorithm.

1. Introduction

Among all the visual cues, ellipses offer several advantages that prompt their adoption within many machine vision tasks. To begin with, the class of ellipses is invariant to projective transformations, thus an elliptical shape remains so when it is captured from any viewpoint by a pinhole camera [4]. This property makes it easy to recognize objects that contain ellipses [11, 8] or partially elliptical features [18].
When the parameters of one or more coplanar 3D ellipses that originated the projection are known, the class of homographies that make it orthonormal to the image plane can be retrieved. This is a useful step for many tasks, such as the recognition of fiducial markers [1, 13], orthonormalization of playfields [7], forensic analysis of organic stains [20] or any other planar metric rectification [2]. Furthermore, ellipses (including circles) are regular shapes that often appear in manufactured objects and can be used as optical landmarks for tracking and manipulation [22] or measured for accurate in-line quality assurance [16].

Figure 1. Schematic representation of a multi-camera system for industrial in-line pipes inspection.

Because of their usefulness and broad range of applicability, it is not surprising that ellipse detection and fitting methods abound in the literature. In particular, when points belonging to the ellipse are known, they are often fitted through ellipse-specific least squares methods [6]. In order to find co-elliptical points in images, traditional parameter-space search schemes, such as RANSAC or the Hough Transform, can be employed. Unfortunately, the high dimensionality of the 2D ellipse parametrization (which counts 5 degrees of freedom) makes the direct application of those techniques infeasible. For this reason many efficient variants have appeared. Some try to reduce the number of samples for a successful RANSAC selection [17, 21]. Others attempt to escape from the curse of dimensionality that plagues the Hough accumulator [12, 3]. If high accuracy is sought, point-fitted ellipses can be used as an initial guess to be refined through intensity-based methods. Those approaches make it possible to obtain a sub-pixel estimation by exploiting the raw gradient of the image [14] or by preserving quantities such as intensity moments and gradients [9]. Multiple view geometry has also been exploited to get a better 3D ellipse estimation.
In [19], multiple cameras are used to track an elliptical feature on a glove to obtain the estimation of the hand pose. The ellipses fitted in the images are triangulated with the algorithm proposed in [15] and the best pair is selected. In [10], holes in metal plates and industrial components are captured by a pair of calibrated cameras and the resulting conics are then used to reconstruct the hole in the Euclidean space. Also in [5] the intersection of two independently extracted conics is obtained through a closed form. All these approaches, however, exploit 3D constraints in an indirect manner, as triangulation always happens on the basis of the ellipses fitted over 2D data.

In this paper we present a rather different technique that works directly in 3D space. Specifically, we adopt a parametric level-set approach, where the parameters of a single elliptical object that is observed by a calibrated network of multiple cameras (see Fig. 1) are optimized with respect to an energy function that simultaneously accounts for each point of view. The goal of our method is to bind the 2D intensity and gradient-based energy maximization that happens within each image to a common 3D ellipse model. The performance of the solution has been assessed both through synthetic experiments and by applying it to a real world scenario. Finally, to make the approach feasible regardless of the high computational requirements, we propose a GPU implementation whose performance has been compared with a well optimized CPU-based version.

2. Multiple Camera Ellipse Fitting

In our approach we are not seeking independent optima over each image plane, as is the case with most ellipse fitting methods. Rather, our search domain is the parametrization of an ellipse in the 3D Euclidean space, and the optimum is sought with respect to its concurrent 2D reprojections over the captured images. In order to perform such optimization we need to sort out a number of issues.
The first problem is the definition of a 3D ellipse parametrization that is well suited for the task (that is, it makes it easy to relate the parameters with the 2D projections). The second one is the definition of an energy function that is robust and accounts for the usual cues for curve detection (namely the magnitude and direction of the intensity gradient). The last issue is the computation of the derivative of the energy function with respect to the 3D ellipse parameters to be able to perform a gradient descent.

2.1. Parameterization of the 3D Ellipse

In its general case, any 2-dimensional ellipse in the image plane is defined by 5 parameters, namely: the length of the two axes, the angle of rotation and a translation vector with respect to the origin. In matrix form it can be expressed by the locus of points $\mathbf{x} = (x_1\ x_2\ 1)^T$ in homogeneous coordinates for which the equation $\mathbf{x}^T \mathbf{A} \mathbf{x} = 0$ holds, for

$$\mathbf{A} = \begin{pmatrix} a & b & d \\ b & c & f \\ d & f & g \end{pmatrix} \tag{1}$$

with $\det(\mathbf{A}) < 0$ and $ac - b^2 > 0$.

In the 3-dimensional case it is subject to 3 more degrees of freedom (i.e. rotation around two more axes and the z-component of the translation vector). More directly, we can define the ellipse by first defining the plane $T$ it resides on and then defining the 2D equation of the ellipse on a parametrization of such plane. In particular, let $\mathbf{c} = (c_1, c_2, c_3, 1)^T \in T$ be the origin of the parametrization, and $\mathbf{u} = (u_1, u_2, u_3, 0)^T$, $\mathbf{v} = (v_1, v_2, v_3, 0)^T$ be the generators of the linear subspace defining $T$; then each point on the 3D ellipse will be of the form $\mathbf{c} + \alpha\mathbf{u} + \beta\mathbf{v}$ with $\alpha$ and $\beta$ satisfying the equation of an ellipse.

By setting the origin $\mathbf{c}$ to be at the center of the ellipse and selecting the directions $\mathbf{u}$ and $\mathbf{v}$ appropriately, we can transform the equation of the ellipse on the plane coordinates in such a way that it will take the form of the equation of a circle.
Hence, the 3D ellipse is fully defined by the parametrization of the plane on which it resides. However, this representation still has one more parameter than the actual degrees of freedom of the ellipse. To solve this we can, without any loss of generality, set $u_3 = 0$; thus, by defining the matrix

$$\mathbf{U}_c = \begin{pmatrix} u_1 & v_1 & c_1 \\ u_2 & v_2 & c_2 \\ 0 & v_3 & c_3 \\ 0 & 0 & 1 \end{pmatrix} \tag{2}$$

and the vector $\mathbf{x} = (\alpha, \beta, 1)^T$, we can express any point $\mathbf{p}$ in the 3D ellipse as:

$$\mathbf{p} = \mathbf{U}_c \mathbf{x} \quad \text{subject to} \quad \mathbf{x}^T \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \mathbf{x} = 0. \tag{3}$$

Even if $\mathbf{U}_c$ embeds all the parameters needed to describe any 3D ellipse, it is often the case that an explicit representation through center $\mathbf{c}$ and axes $\mathbf{a}_1, \mathbf{a}_2 \in \mathbb{R}^3$ is needed. Let $\mathbf{U}$ be the $3 \times 2$ matrix composed by the first two columns of $\mathbf{U}_c$ (without the last row of zeros). The two axes $\mathbf{a}_1, \mathbf{a}_2$ can be extracted as the two columns of the matrix:

$$\mathbf{K} = (\mathbf{a}_1\ \mathbf{a}_2) = \mathbf{U} \phi^T$$

where $\phi^T$ is the matrix of left singular vectors of $\mathbf{U}^T \mathbf{U}$ computed via SVD. The vector $\mathbf{c}$ is trivially composed by the parameters $(c_1\ c_2\ c_3)^T$.

Conversely, from two axes $\mathbf{a}_1, \mathbf{a}_2$, the matrix $\mathbf{U}$ can be expressed as:

$$\mathbf{U} = \mathbf{K} \begin{pmatrix} \alpha & -\beta \\ \beta & \alpha \end{pmatrix}$$

by imposing that

$$\alpha K_{31} + \beta K_{32} = 0, \qquad \alpha^2 + \beta^2 = 1.$$

Once $\mathbf{U}$ has been computed, the 3D ellipse matrix can be composed in the following way:

$$\mathbf{U}_c = \begin{pmatrix} \mathbf{U} & \mathbf{c} \\ \mathbf{0} & 1 \end{pmatrix}$$

Finally, with this parametrization it is very easy to obtain the equation of the ellipse projected onto any camera. Given a projection matrix $\mathbf{P}$, the matrix $\mathbf{A}_P$ describing the 2-dimensional ellipse after the projection can be expressed as:

$$\mathbf{A}_P = (\mathbf{P}\mathbf{U}_c)^{-T} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix} (\mathbf{P}\mathbf{U}_c)^{-1} \tag{4}$$

2.2. Energy Function over the Image

To estimate the equation of the 3D ellipse we set up a level-set based optimization scheme that updates the ellipse matrix $\mathbf{U}_c$ by simultaneously taking into account its re-projection in every camera of the network. The advantages of this approach are essentially threefold.
First, the equation of the estimated 3D ellipse and its re-projection in all cameras are always consistent. Second, erroneous calibrations that affect the camera network itself can be effectively attenuated, as shown in the experimental section. Third, the ellipse can be partially occluded in one or more camera images without heavily hindering the fitting accuracy.

In order to evolve the 3D ellipse geometry to fit the observation, we need to define the level set functions $\varphi_i : \mathbb{R}^2 \to \mathbb{R}$ describing the shape of the ellipse $\mathbf{U}_c$ re-projected to the $i$-th camera. Given each level set, we cast the multi-view fitting problem as the problem of maximizing the energy function:

$$E_{I_1 \dots I_n}(\mathbf{U}_c) = \sum_{i=1}^{n} E_{I_i}(\mathbf{U}_c) \tag{5}$$

which sums the energy contributions of each camera:

$$E_{I_i}(\mathbf{U}_c) = \int_{\mathbb{R}^2} \langle \nabla H(\varphi(\mathbf{x})), \nabla I_i(\mathbf{x}) \rangle^2 \, d\mathbf{x} \tag{6}$$

$$= \int_{\mathbb{R}^2} \langle H'(\varphi(\mathbf{x})) \nabla\varphi(\mathbf{x}), \nabla I_i(\mathbf{x}) \rangle^2 \, d\mathbf{x}, \tag{7}$$

where $H$ is a suitable relaxation of the Heaviside function. In our implementation, we used:

$$H(t) = \frac{1}{1 + e^{-t/\sigma}} \tag{8}$$

where the parameter $\sigma$ models the band size (in pixels) of the ellipse region to be considered. By varying $\sigma$ we can manage the trade-off between the need of a regularization term in the energy function to handle noise in the image gradient and the estimation precision that has to be achieved.

The level set for a generic ellipse is rather complicated and cannot be easily expressed in closed form; however, since it appears only within the Heaviside function and its derivative, we only need to have a good analytic approximation in the boundary around the ellipse.
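We approximate the level set in the boundary region as:

$$\varphi_i(\mathbf{x}) \approx \frac{\mathbf{x}^T \mathbf{A}_i \mathbf{x}}{2\sqrt{\mathbf{x}^T \mathbf{A}_i^T \mathbf{I}_0 \mathbf{A}_i \mathbf{x}}} \tag{9}$$

where $\mathbf{I}_0 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ and $\mathbf{A}_i$ is the re-projection of the ellipse $\mathbf{U}_c$ into the $i$-th camera computed using equation (4). The function has negative values outside the boundaries of the ellipse, positive values inside and is exactly $0$ for each point $\{\mathbf{x} \mid \mathbf{x}^T \mathbf{A}_i \mathbf{x} = 0\}$.

The gradient of the level set function $\nabla\varphi : \mathbb{R}^2 \to \mathbb{R}^2$ can actually be defined exactly in closed form:

$$\nabla\varphi_i(\mathbf{x}) = \frac{\mathbf{A}_i \mathbf{x}}{\sqrt{\mathbf{x}^T \mathbf{A}_i^T \mathbf{I}_0 \mathbf{A}_i \mathbf{x}}}. \tag{10}$$

Starting from an initial estimation, given by a simple triangulation of 2D ellipses between just two cameras, we maximize the energy function (5) over the plane parameters $\mathbf{U}_c$ by means of a gradient scheme.

2.3. Gradient of the Energy Function

The gradient of the energy function can be computed as a summation of the gradient of each energy term. This gradient can be obtained by analytically computing the partial derivatives of equation (6) with respect to the eight parameters $(p_1 \dots p_8) = (u_1, v_1, c_1, u_2, v_2, c_2, v_3, c_3)$:

$$\frac{\partial}{\partial p_i} E_{I_i}(\mathbf{U}_c) = \frac{\partial}{\partial p_i} \int_{\mathbb{R}^2} E_{I_i}(\mathbf{U}_c, \mathbf{x})^2 \, d\mathbf{x} = \int_{\mathbb{R}^2} 2\, E_{I_i}(\mathbf{U}_c, \mathbf{x}) \, \frac{\partial}{\partial p_i} E_{I_i}(\mathbf{U}_c, \mathbf{x}) \, d\mathbf{x}$$

where

$$E_{I_i}(\mathbf{U}_c, \mathbf{x}) = \langle H'(\varphi(\mathbf{x})) \nabla\varphi(\mathbf{x}), \nabla I_i(\mathbf{x}) \rangle$$

and

$$\frac{\partial}{\partial p_i} E_{I_i}(\mathbf{U}_c, \mathbf{x}) = \left\langle \left( \frac{\partial}{\partial p_i} H'(\varphi(\mathbf{x})) \right) \nabla\varphi(\mathbf{x}), \nabla I_i(\mathbf{x}) \right\rangle + \left\langle H'(\varphi(\mathbf{x})) \left( \frac{\partial}{\partial p_i} \nabla\varphi(\mathbf{x}) \right), \nabla I_i(\mathbf{x}) \right\rangle.$$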
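As an illustration of how equations (2), (4), (9) and (10) fit together, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`ellipse_matrix`, `project_conic`, `level_set`) and the tiny matrix helpers are assumptions made for this example, and pixel coordinates are taken to be already expressed in the units of the projection matrix.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def inv3(m):
    """Inverse of a 3x3 matrix through its adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj]

def ellipse_matrix(u1, v1, c1, u2, v2, c2, v3, c3):
    """4x3 matrix U_c of equation (2); u3 is fixed to 0."""
    return [[u1, v1, c1], [u2, v2, c2], [0.0, v3, c3], [0.0, 0.0, 1.0]]

def project_conic(P, Uc):
    """A_P = (P U_c)^-T diag(1, 1, -1) (P U_c)^-1, equation (4)."""
    Minv = inv3(matmul(P, Uc))
    D = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]
    return matmul(transpose(Minv), matmul(D, Minv))

def level_set(A, x1, x2):
    """Approximate level set (9) and its 2D gradient (10) at (x1, x2)."""
    x = [x1, x2, 1.0]
    Ax = [sum(A[i][k] * x[k] for k in range(3)) for i in range(3)]
    norm = math.hypot(Ax[0], Ax[1])        # sqrt(x^T A^T I_0 A x)
    phi = sum(x[i] * Ax[i] for i in range(3)) / (2.0 * norm)
    return phi, (Ax[0] / norm, Ax[1] / norm)

if __name__ == "__main__":
    # Hypothetical setup: identity camera looking at a circle of radius 2
    # lying in the plane z = 5, which projects to a circle of radius 0.4.
    P = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
    Uc = ellipse_matrix(2.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 5.0)
    A = project_conic(P, Uc)
    print(level_set(A, 0.4, 0.0))   # point on the projected ellipse
    print(level_set(A, 0.5, 0.0))   # point slightly off the ellipse
```

In this sketch the first two components of the gradient always have unit norm, so the returned value behaves like a first-order distance to the curve near the boundary. Note that the sign of the value flips with the sign of the conic matrix: with the matrix built literally from equation (4), this code returns positive values outside the projected ellipse.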
Figure 2. Evaluation of the accuracy of the proposed method with respect to different noise sources. The metric adopted is the relative error between the minor axis of the ground truth and of the fitted ellipse. [Four plots of Error against, respectively, Focal length error [%], Noise sigma, Perimeter clutter [%] and Distortion K1; each compares 2View with MultiView for sigma = 3.0, 6.0 and 9.0.]

The derivatives of the parametric level set functions can be computed analytically. At the beginning of each iteration we compute the derivative of the projected ellipse matrices $\mathbf{A}_i$, which are constant with respect to $\mathbf{x}$:

$$\frac{\partial}{\partial p_i} \mathbf{A}_i = \mathbf{T} + \mathbf{T}^T \tag{11}$$

where

$$\mathbf{T} = \left( \frac{\partial}{\partial p_i} \left[ (\mathbf{P}_i \mathbf{U}_c)^{-1} \right] \right)^T \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix} (\mathbf{P}_i \mathbf{U}_c)^{-1} \tag{12}$$

and

$$\frac{\partial}{\partial p_i} \left[ (\mathbf{P}_i \mathbf{U}_c)^{-1} \right] = -(\mathbf{P}_i \mathbf{U}_c)^{-1} \left( \mathbf{P}_i \frac{\partial}{\partial p_i} \mathbf{U}_c \right) (\mathbf{P}_i \mathbf{U}_c)^{-1}. \tag{13}$$

Then, using (11), we can compute the level set derivatives for each pixel:

$$\frac{\partial}{\partial p_i} \nabla\varphi(\mathbf{x}) = \frac{\left( \frac{\partial}{\partial p_i} \mathbf{A}_i \right) \mathbf{x}}{\sqrt{\mathbf{x}^T \mathbf{A}_i^T \mathbf{I}_0 \mathbf{A}_i \mathbf{x}}} - \frac{\mathbf{A}_i \mathbf{x} \left( \mathbf{x}^T \left( \frac{\partial}{\partial p_i} \mathbf{A}_i \right)^T \mathbf{I}_0 \mathbf{A}_i \mathbf{x} + \mathbf{x}^T \mathbf{A}_i^T \mathbf{I}_0 \left( \frac{\partial}{\partial p_i} \mathbf{A}_i \right) \mathbf{x} \right)}{2 \left( \mathbf{x}^T \mathbf{A}_i^T \mathbf{I}_0 \mathbf{A}_i \mathbf{x} \right)^{3/2}} \tag{14}$$

$$\frac{\partial}{\partial p_i} \varphi(\mathbf{x}) = \frac{1}{2} \left\langle \mathbf{x}, \frac{\partial}{\partial p_i} \nabla\varphi(\mathbf{x}) \right\rangle \tag{15}$$

$$\frac{\partial}{\partial p_i} H'(\varphi(\mathbf{x})) = H''(\varphi(\mathbf{x})) \, \frac{\partial}{\partial p_i} \varphi(\mathbf{x}). \tag{16}$$

By summing the derivative $\frac{\partial}{\partial p_i} E_{I_i}(\mathbf{U}_c, \mathbf{x})$ over all images and all pixels in the active band in each image, we obtain the gradient $\mathbf{G} = \nabla E_{I_1 \dots I_n}(\mathbf{U}_c)$.
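The derivative of the projected conic in equations (11) to (13) can be sketched as follows. This is an illustrative plain-Python version, not the authors' CPU or GPU code: the names `build_Uc`, `conic`, `conic_derivative` and `PARAM_POS` are hypothetical, and the parameter index is called `k` in the code to keep it distinct from the camera index.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def inv3(m):
    """Inverse of a 3x3 matrix through its adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj]

D = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]

# (row, column) inside U_c of each parameter
# (p_1 ... p_8) = (u1, v1, c1, u2, v2, c2, v3, c3).
PARAM_POS = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 1), (2, 2)]

def build_Uc(p):
    """4x3 ellipse matrix of equation (2) from the eight parameters."""
    Uc = [[0.0] * 3 for _ in range(4)]
    for value, (r, c) in zip(p, PARAM_POS):
        Uc[r][c] = value
    Uc[3][2] = 1.0
    return Uc

def conic(P, p):
    """Projected conic A = (P U_c)^-T D (P U_c)^-1, equation (4)."""
    Minv = inv3(matmul(P, build_Uc(p)))
    return matmul(transpose(Minv), matmul(D, Minv))

def conic_derivative(P, p, k):
    """dA/dp_k = T + T^T, following equations (11), (12) and (13)."""
    Minv = inv3(matmul(P, build_Uc(p)))
    dUc = [[0.0] * 3 for _ in range(4)]
    r, c = PARAM_POS[k]
    dUc[r][c] = 1.0                      # U_c is linear in each parameter
    inner = matmul(Minv, matmul(matmul(P, dUc), Minv))
    dMinv = [[-v for v in row] for row in inner]        # equation (13)
    T = matmul(transpose(dMinv), matmul(D, Minv))       # equation (12)
    Tt = transpose(T)
    return [[T[i][j] + Tt[i][j] for j in range(3)] for i in range(3)]
```

A convenient sanity check for a sketch like this is to compare `conic_derivative(P, p, k)` against a central finite difference of `conic(P, p)` in the k-th parameter; the two should agree up to the discretization error of the step.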
At this point, we update the 3D ellipse matrix $\mathbf{U}_c$ through the gradient step

$$\mathbf{U}_c^{(t+1)} = \mathbf{U}_c^{(t)} + \eta \mathbf{G} \tag{17}$$

where $\eta$ is a constant step size.

3. Experimental evaluation

We evaluated the proposed approach both on a set of synthetic tests and on a real world quality control task where we measure the diameter of a pipe with a calibrated multi-camera setup. In both cases, lacking a similar 3D based optimization framework, we compared the accuracy of our method with respect to the results obtained by triangulating ellipses optimally fitted over the single images. The rationale of the synthetic experiments is to be able to evaluate the accuracy of the measure with an exactly known ground truth (which is very difficult to obtain on real objects with very high accuracy). Further, the synthetically generated imagery permits us to control the exact nature and amount of noise, allowing for a separate and independent evaluation for each noise source. By contrast, the setup employing real cameras does not give an accurate control over the scene; nevertheless it is fundamental to assess the ability of the approach to deal with the complex set of distractors that arise from the imaging process (such as reflections, variable contrast, defects of the object, bad focusing and so on).

Figure 3. Examples of images with artificial noise added. Respectively additive Gaussian noise and blur in the left image and occlusion in the right image. The red line shows the fitted ellipse.

In both cases the ellipse detection is performed by extracting horizontal and vertical image gradients with an oriented derivative of Gaussian filter. Edge pixels are then found by non-maxima suppression and by applying a very permissive threshold (no hysteresis is applied). The obtained edge pixels are thus grouped into contiguous curves, which are in turn fitted to find ellipse candidates. The candidate that exhibits the highest energy is selected and refined using [14].
The refined ellipses are then triangulated using the two images that score the lowest triangulation error. The obtained 3D ellipse is finally used both as the result of the baseline method (labeled as 2view in the following experiments) and as the initialization ellipse for our refinement process (labeled as multiview). All the experiments have been performed with 3Mp images and the processing is done with a modern 3.2 GHz Intel Core i7 PC equipped with the Windows 7 operating system. The CPU implementation was written in C++ and the GPU implementation uses the CUDA library. The video card used was based on the Nvidia 670 chipset with 1344 CUDA cores.

3.1. Synthetic Experiments

For this set of experiments we chose to evaluate the effect of four different noise sources over the optimization process. Specifically, we investigated the sensitivity of the approach to errors on the estimation of the focal length and of the radial distortion parameters of the camera and the influence of Gaussian noise and clutter corrupting the images. In Fig. 3 examples of Gaussian noise and clutter are shown (note that these are details of the images; in the experiments the ellipse was viewed in full). For each test we created 5 synthetic snapshots of a black disc as seen from 5 different cameras looking at the disc from different points of view (see Fig. 1 and Fig. 4). The corruption by Gaussian noise has been produced by adding to each pixel a normally distributed additive error of variable $\sigma$, followed by a blurring of the image with a Gaussian kernel with $\sigma = 6$. The artificial clutter has been created by occluding the perimeter of the disc with a set of random white circles until a given percentage of the original border was corrupted. This simulates the effect of local imaging effects, such as the presence of specular highlights, that severely affect the edge detection process.
The focal length error was obtained by changing the correct focal length of the central camera by a given percentage. Finally, the distortion error was introduced by adding an increasing amount to the correct radial distortion parameter K1. In Fig. 2 we show the results obtained using the baseline triangulation and our optimization with different values of the parameter $\sigma$ used for the Heaviside function (respectively 3, 6 and 9 pixels). As expected, in all the tests performed the relative error grows with the level of noise. In general, all the methods seem to be minimally sensitive to Gaussian noise, whereas the clutter has a big effect even at low percentages. The baseline method performs consistently worse and, among the multiview configurations, the one with the lowest Heaviside band appears to be the most robust for almost all noise levels. This is probably due to the fact that the images have already been smoothed by the gradient calculation step, and thus further smoothing is not required and, to some degree, leads to a more prominent signal displacement.

3.2. Real World Application

For the experiments with real images we built an imaging device that holds 5 PointGrey Flea3 3.2Mp Monochrome USB3 machine vision cameras (see Fig. 4). The 5 cameras were calibrated for both intrinsic and extrinsic parameters.

Figure 4. The experimental multiple-camera imaging head.