
A General Framework for the Selection of World Coordinate Systems in Perspective and Catadioptric Imaging Applications

International Journal of Computer Vision 57(1), 23–47, 2004
© 2004 Kluwer Academic Publishers. Manufactured in The Netherlands.

JOÃO P. BARRETO AND HELDER ARAUJO
Institute of Systems and Robotics, Department of Electrical and Computer Engineering, University of Coimbra, Coimbra, Portugal

Received January 15, 2002; Revised November 19, 2002; Accepted April 17, 2003

Abstract. An imaging system with a single effective viewpoint is called a central projection system. The conventional perspective camera is an example of a central projection system. A catadioptric realization of omnidirectional vision combines reflective surfaces with lenses. Catadioptric systems with a unique projection center are also examples of central projection systems. Whenever an image is acquired, points in 3D space are mapped into points in the 2D image plane. The image formation process represents a transformation from ℜ³ to ℜ², and mathematical models can be used to describe it. This paper discusses the definition of world coordinate systems that simplify the modeling of general central projection imaging. We show that an adequate choice of the world coordinate reference system can be highly advantageous. Such a choice does not imply that new information will be available in the images. Instead, the geometric transformations will be represented in a common and more compact framework, while simultaneously enabling new insights. The first part of the paper focuses on static imaging systems that include both perspective cameras and catadioptric systems. A systematic approach to select the world reference frame is presented. In particular we derive coordinate systems that satisfy two differential constraints (the "compactness" and the "decoupling" constraints). These coordinate systems have several advantages for the representation of the transformations between the 3D world and the image plane.
The second part of the paper applies the derived mathematical framework to active tracking of moving targets. In applications of visual control of motion the relationship between motion in the scene and image motion must be established. In the case of active tracking of moving targets these relationships become more complex due to camera motion. Suitable world coordinate reference systems are defined for three distinct situations: a perspective camera with planar translation motion, a perspective camera with pan and tilt rotation motion, and a catadioptric imaging system rotating around an axis going through the effective viewpoint and the camera center. Position and velocity equations relating image motion, camera motion and target 3D motion are derived and discussed. Control laws to perform active tracking of moving targets using visual information are established.

Keywords: sensor modeling, catadioptric, omnidirectional vision, visual servoing, active tracking

1. Introduction

Many applications in computer vision, such as surveillance and model acquisition for virtual reality, require that a large field of view is imaged. Visual control of motion can also benefit from enhanced fields of view. The computation of camera motion from a sequence of images obtained with a traditional camera suffers from the problem that the direction of translation may lie outside the field of view. Panoramic imaging overcomes this problem, making the uncertainty of camera motion estimation independent of the motion direction (Gluckman and Nayar, 1998). In position based visual servoing, keeping the target in the field of view during motion raises severe difficulties (Malis et al., 1999). With a large field of view this problem no longer exists. One effective way to enhance the field of view of a camera is to use mirrors (Bogner, 1995; Nalwa, 1996; Yagi and Kawato, 1990; Yamazawa et al., 1993, 1995).
The general approach of combining mirrors with conventional imaging systems is referred to as catadioptric image formation (Hecht and Zajac, 1974).

The fixed viewpoint constraint is a requirement ensuring that the visual sensor only measures the intensity of light passing through a single point in 3D space (the projection center). Vision systems verifying the fixed viewpoint constraint are called central projection systems. Central projection systems present interesting geometric properties. A single effective viewpoint is a necessary condition for the generation of geometrically correct perspective images (Baker and Nayar, 1998), and for the existence of epipolar geometry inherent to the moving sensor and independent of the scene structure (Svoboda et al., 1998). It is highly desirable for any vision system to have a single viewpoint. The conventional perspective CCD camera is widely used in computer vision applications. In general it is described by a central projection model with a single effective viewpoint. Central projection cameras are specializations of the general projective camera that can be modeled by a 3 × 4 matrix with rank 3 (Hartley and Zisserman, 2000). Baker and Nayar (1998) derive the entire class of catadioptric systems with a single effective viewpoint. Systems built using a parabolic mirror with an orthographic camera, or a hyperbolic, elliptical or planar mirror with a perspective camera, verify the fixed viewpoint constraint. Geyer and Daniilidis (2000) introduce a unifying theory for all central catadioptric systems where conventional perspective imaging appears as a particular case. They show that central panoramic projection is isomorphic to a projective mapping from the sphere to a plane with a projection center on the perpendicular to the plane. A modified version of this unifying model is introduced in Barreto and Araujo (2001).

General central projection image formation can be represented by a transformation from ℜ³ to ℜ².
Whenever an image is acquired, points in 3D space are mapped into points in the 2D image plane. Cartesian coordinate systems are typically used to reference points both in space and in the image plane. The mapping is non-injective and implies loss of information. The relationships between position and velocity in the 3D space and position and velocity in the image are in general complex and non-linear. This paper shows that the choice of the coordinate system to reference points in the 3D space is important. The intrinsic nature of the image formation process is kept unchanged, but the mathematical relationship between the world and the image becomes simpler and more intuitive. This can help not only the understanding of the imaging process but also the development of new algorithms and applications.

The first part of the paper focuses on static imaging systems that include both perspective cameras and catadioptric systems. A general framework to describe the mapping from 3D points to 2D points in the image plane is presented. The mathematical expression of this global mapping depends on the coordinate system used to reference points in the scene. A systematic approach to select the world coordinate system is presented and discussed. Differential constraints are defined to enable the choice of a 3D reference frame. Coordinate transformations satisfying these differential constraints bring advantageous properties when mapping 3D space velocities into 2D image velocities. One such coordinate transformation is described for the case of the perspective camera and then generalized for central catadioptric image formation. Using these coordinate transformations does not imply that new information is available in the images. Instead, the geometric transformations are represented in a common and more compact framework, while simultaneously enabling new insights into the image formation process.
Examples and applications that benefit from an adequate choice of the world coordinate system are presented and discussed.

The second part of our article applies the derived mathematical framework to active tracking of moving targets. For this purpose it is assumed that the imaging sensor is mounted on a moving platform. Three different cases are considered: a perspective camera with translational motion in the XY plane, a perspective camera with rotational pan and tilt motion, and a parabolic omnidirectional camera with a rotational degree of freedom around the Z axis. The goal of the tracking application is to control the motion of the platform in such a way that the position of the target in the image plane is kept constant.

In the classical eye-in-hand positioning problem the camera is typically attached to the end effector of a 6 d.o.f. manipulator. The platforms considered in this work have less than 3 d.o.f. For the purpose of controlling the constrained 3D motion of these robots it is not necessary to determine the full pose of the target. It is assumed that target motion is characterized by the 3D position and velocity of the corresponding mass center in an inertial reference frame. It is also assumed that the position of each degree of freedom is known (possibly via an encoder).

In active tracking applications the image motion depends both on target and camera 3D motion. The derived general framework to describe the mapping between 3D points and points in the 2D image plane is extended to central catadioptric imaging systems with rigid motion. The mathematical expression of the global mapping depends on the world coordinates used to reference points in the scene. General criteria to select suitable coordinate systems are discussed.
Adequate choices are presented for each type of platform. The derived mathematical framework is used to establish the position and velocity relationships between target 3D motion, camera motion and image motion. The expressions obtained are used to implement image based active visual tracking. Simplifications of the equations obtained (to decouple the degrees of freedom of the pan and tilt vision system) are discussed.

2. Static Imaging Systems

This section focuses on static central projection vision systems. Examples of such systems are the perspective camera and catadioptric systems that verify the fixed viewpoint constraint (Baker and Nayar, 1998). The image acquisition process maps points from the 3D space into the 2D image plane. Image formation performs a transformation from ℜ³ to ℜ² that can be denoted by F. A generic framework to illustrate the transformation F is proposed. This framework is general to both conventional perspective cameras and central projection catadioptric systems. It is desirable that F be as simple and as compact as possible. This can be achieved by selecting a specific coordinate system to reference the world points. General criteria to select the world coordinate system are presented and discussed. Advantages of using different world coordinate systems to change the format of the F mapping are presented.

2.1. Mapping Points from the 3D Space into the 2D Image Plane

Figure 1 depicts a generic framework to illustrate the transformation F from ℜ³ in ℜ² performed by a central projection vision system.

[Figure 1. Schematic of the mapping performed by general central projection imaging systems.]

If the vision system has a unique viewpoint, it preserves central projection and geometrically correct perspective images can always be generated (Gluckman and Nayar, 1998).
X_w = (X, Y, Z)^t is a vector with the Cartesian 3D coordinates of a point in space. The domain of the transformation is the set D of visible points in the world, with D ⊂ ℜ³. Function f_h maps ℜ³ into the projective space ℘³. It is a non-injective and surjective function transforming X_w = (X, Y, Z)^t into X_h = (X, Y, Z, 1)^t, the homogeneous world point coordinates. P is an arbitrary 3 × 4 homogeneous matrix with rank 3. It represents a general projective transformation performing a linear mapping of ℘³ into the projective plane ℘² (x = P X_h). The rank 3 requirement is due to the fact that if the rank is less than 3 then the range of the matrix will be a line or a point and not the whole plane. The rank 3 requirement guarantees that the transformation is surjective. In the case of P being a camera model it can be written as P = KR[I | −C̃], where I is a 3 × 3 identity matrix, K is the intrinsic parameters matrix, R the rotation matrix between camera and world coordinate systems, and C̃ the projection center in world coordinates (Hartley and Zisserman, 2000). If nothing is stated we will assume K = I and standard central projection with P = [I | 0]. Function f_i transforms coordinates in the projective plane x = (x, y, z)^t into Cartesian coordinates in the image plane x_i = (x_i, y_i)^t. It is a non-injective, surjective function of ℘² in ℜ² that maps projective rays in the world into points in the image. For conventional perspective cameras x_i = f_i(x) ⇔ (x_i, y_i) = (x/z, y/z). However, as will be shown later, these relations are more complex for generic catadioptric systems.

The transformation F maps 3D world points into 2D points in the image. Points in the scene were represented using standard Cartesian coordinates. However, a different coordinate system can be used to reference points in the 3D world space.
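The chain of functions just described can be sketched in a few lines of code. The following is a minimal numerical sketch, assuming K = I and standard central projection P = [I | 0] as stated above; the names mirror f_h, P and f_i from the text.

```python
import numpy as np

def f_h(X_w):
    # f_h: R^3 -> P^3, Cartesian world point to homogeneous coordinates
    return np.append(X_w, 1.0)

# Standard central projection: K = I, R = I, C~ = 0, so P = [I | 0]
P = np.hstack([np.eye(3), np.zeros((3, 1))])

def f_i(x):
    # f_i: P^2 -> R^2, projective ray to image-plane Cartesian coordinates
    return np.array([x[0] / x[2], x[1] / x[2]])

def F(X_w):
    # Full mapping F = f_i o P o f_h for the conventional perspective camera
    return f_i(P @ f_h(X_w))

print(F(np.array([1.0, 2.0, 4.0])))   # -> [0.25 0.5]
```

For the point (1, 2, 4)^t this gives (0.25, 0.5), i.e. exactly (x/z, y/z).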
Assume that Ω = (φ, ψ, ρ)^t are point coordinates in the new reference frame and that X_w = T(Ω), where T is a bijective function from ℜ³ in ℜ³. The transformation F, mapping 3D world points Ω into image points x_i (see Eq. (1)), can be written as the composition of Eq. (2).

    x_i = F(Ω)                           (1)
    F(Ω) = f_i(P f_h(T(Ω)))              (2)

Equation (3), obtained by differentiating Eq. (1) with respect to time, establishes the relationship between velocity in 3D space Ω̇ = (φ̇, ψ̇, ρ̇)^t and velocity in the image ẋ_i = (ẋ_i, ẏ_i)^t. ẋ_i and Ω̇ are related by the Jacobian matrix J_F of the transformation F. Equation (4) shows J_F as the product of the Jacobians of the transformations that make up F.

    ẋ_i = J_F Ω̇                         (3)
    J_F = J_fi · J_P · J_fh · J_T        (4)

Function T represents a change of coordinates. It must be bijective, which guarantees that it admits an inverse. Assume that Γ is the inverse function of T (Γ = T^-1). Function Γ, from ℜ³ into ℜ³, transforms Cartesian coordinates X_w into the new coordinates Ω (Eq. (5)). J_Γ is the Jacobian matrix of Γ (Eq. (6)). If T is injective then the Jacobian matrix J_T is non-singular with inverse J_Γ. Replacing J_T by J_Γ^-1 in Eq. (4) yields Eq. (7), which shows the Jacobian matrix of F expressed in terms of the scalar functions of Γ and their partial derivatives.

    Γ(X_w) = (φ(X, Y, Z), ψ(X, Y, Z), ρ(X, Y, Z))^t    (5)

          | φ_X  φ_Y  φ_Z |
    J_Γ = | ψ_X  ψ_Y  ψ_Z |                            (6)
          | ρ_X  ρ_Y  ρ_Z |

    J_F = J_fi · J_P · J_fh · J_Γ^-1                   (7)

2.2. Criteria to Select the World Coordinate System

Function F is a transformation from ℜ³ (3D world space) into ℜ² (image plane). In Eqs. (8) and (9), F and J_F are written in terms of scalar functions and their partial derivatives. The relationship between world and image points can be complex and counterintuitive. The mathematical expression of the mapping function F depends on the transformation T (see Eqs.
(2), (4) and (7)). The selection of a certain coordinate system to reference points in the scene changes the way F is written but does not change the intrinsic nature of the mapping. However, with an adequate choice of the world coordinate system, the mathematical relationship between position and velocity in space and position and velocity in the image plane can become simpler, more intuitive or simply more suitable for a specific application. In this section we discuss criteria for the selection of the world coordinate system.

    F(Ω) = (h(φ, ψ, ρ), g(φ, ψ, ρ))^t    (8)

    J_F = | h_φ  h_ψ  h_ρ |
          | g_φ  g_ψ  g_ρ |               (9)

2.2.1. The Compactness Constraint. Consider central projection vision systems as mappings of 3D points, expressed in Cartesian coordinates X_w = (X, Y, Z)^t, into the 2D image coordinates x_i = (x_i, y_i). The transformation is a function from ℜ³ into ℜ² with loss of information (depth). In general the two coordinates in the image plane depend on the three coordinates in space. The image gives partial information about each one of the three world coordinates, but we are not able to recover any of those parameters without further constraints. The imaging process implies loss of information and there is no additional transformation T that can change that. However, it would be advantageous if the image coordinates depended only on two of the 3D parameters. In many situations that can be achieved by means of a change of coordinates T. The coordinate change must be performed in such a way that F only depends on two of those coordinates. Assuming that Ω = (φ, ψ, ρ) are the new 3D coordinates, F becomes a function of only φ and ψ, which means that the partial derivatives h_ρ and g_ρ are both equal to zero. Whenever a certain change of coordinates T leads to a Jacobian matrix J_F with a zero column, it is said that the mapping F is in a compact form and the coordinate transformation T verifies the "compactness constraint".
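As a concrete illustration of this constraint, take the perspective camera with P = [I | 0] and ordinary spherical coordinates Ω = (φ, ψ, ρ) as a candidate change of coordinates. This is an illustrative choice for the sketch below, not necessarily the coordinate system the paper derives; with it, the image point depends only on the two angles, so the ρ-column of J_F vanishes.

```python
import numpy as np

def F_spherical(phi, psi, rho):
    # Candidate T: spherical (phi, psi, rho) -> Cartesian (X, Y, Z),
    # followed by standard central projection (x_i, y_i) = (X/Z, Y/Z)
    X = rho * np.sin(phi) * np.cos(psi)
    Y = rho * np.sin(phi) * np.sin(psi)
    Z = rho * np.cos(phi)
    return np.array([X / Z, Y / Z])

# Varying rho leaves the image point unchanged, i.e. h_rho = g_rho = 0
a = F_spherical(0.5, 1.1, 1.0)
b = F_spherical(0.5, 1.1, 7.3)
print(np.allclose(a, b))   # -> True
```

Note that x_i = tan φ cos ψ still depends on both angles, so this particular candidate is compact but not decoupled in the sense discussed next.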
Assume that a world coordinate system satisfying the "compactness constraint" is selected. If Eq. (10) is verified then the image coordinates (x_i, y_i) depend only on (φ, ψ) and F becomes a function from ℜ² in ℜ² (x_i = F(Ω_c) with Ω_c = (φ, ψ)^t). A function from ℜ³ into ℜ² is never invertible, thus putting F in a compact form is a necessary condition for finding an inverse mapping F^-1. If F^-1 exists then two of the three 3D parameters of motion can be recovered from the image (Ω_c = F^-1(x_i)) and the Jacobian matrix J_F can be written in terms of the image coordinates x_i and y_i. By verifying the "compactness constraint" the relationships in position and velocity between the 3D world and the image plane tend to be more compact and intuitive, and vision yields all the information about two of the 3D world coordinates and none about the third one.

    h_ρ = 0 ∧ g_ρ = 0    (10)

2.2.2. The Decoupling Constraint. Assume that the "compactness constraint" is verified. This means that a coordinate transformation T is used such that the image coordinates (x_i, y_i) depend only on (φ, ψ). It would also be advantageous to define a world coordinate system such that x_i depends only on φ and y_i depends only on ψ. This is equivalent to saying that h_ψ and g_φ are both zero. The one to one correspondence is an advantageous feature allowing a better understanding of the imaging process and simplifying subsequent calculations. If a coordinate transformation T is used such that both Eqs. (10) and (11) are verified, then it is said that F is in a compact and decoupled form and that T verifies both the "compactness constraint" and the "decoupling constraint".

    h_ψ = 0 ∧ g_φ = 0    (11)

In short, given a general central projection mapping, the goal is to select a coordinate transformation T verifying both:

• the "compactness constraint" (Eq. (10))
• the "decoupling constraint" (Eq.
(11)).

The coordinate system used to reference points in the scene does not change the intrinsic nature of the mapping, nor does it introduce any additional information. There are situations where it is impossible to find a world coordinate transformation that verifies the "compactness constraint" and/or the "decoupling constraint." Methodologies to find out whether such a transformation exists will be introduced later.

2.3. Conventional Perspective Camera

Consider image acquisition performed by a static conventional perspective camera. The image formation process follows the scheme depicted in Fig. 1, where function f_i is given by Eq. (12). Assume that the matrix of intrinsic parameters is K = I and P = [I | 0] (the origin of the Cartesian reference frame is coincident with the camera center and the image plane is perpendicular to the Z axis). This section derives a world coordinate system that verifies both the compactness and decoupling constraints. If nothing is stated we will work with the inverse transformation Γ instead of the direct transformation T.

    f_i() : (x, y, z) → (x/z, y/z)    (12)

2.3.1. Constraining Γ to Obtain a New World Coordinate System. Functions f_i, P and f_h, as well as their Jacobian matrices, are defined for the perspective camera case. Replacing J_Γ (Eq. (6)) in Eq. (7) yields J_F in terms of the partial derivatives of the scalar functions of Γ (the computation is omitted). If F is in a compact form then the third column of J_F must be zero (Eq. (10)), which leads to Eqs. (13). A transformation of coordinates Γ that verifies the compactness constraint can be computed by solving the partial differential Eqs. (13) with respect to the scalar functions φ, ψ and ρ (Eq. (5)).

    Z(φ_Y ψ_Z − φ_Z ψ_Y) + X(φ_Y ψ_X − φ_X ψ_Y) = 0
    Z(φ_Z ψ_X − φ_X ψ_Z) + Y(φ_Y ψ_X − φ_X ψ_Y) = 0    (13)

The partial differential equations corresponding to the "decoupling constraint" can be derived in a similar way.
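Since the derivation works with Γ = T^-1, the identity J_T = J_Γ^-1 underlying Eq. (7) can be checked numerically for any concrete bijective pair (T, Γ). The sketch below uses spherical coordinates as a stand-in transformation (an illustrative assumption, not the transformation the paper derives) and central-difference Jacobians.

```python
import numpy as np

def T(omega):
    # Stand-in bijective T: spherical (phi, psi, rho) -> Cartesian (X, Y, Z)
    phi, psi, rho = omega
    return rho * np.array([np.sin(phi) * np.cos(psi),
                           np.sin(phi) * np.sin(psi),
                           np.cos(phi)])

def Gamma(X_w):
    # Inverse function Gamma = T^-1
    rho = np.linalg.norm(X_w)
    return np.array([np.arccos(X_w[2] / rho), np.arctan2(X_w[1], X_w[0]), rho])

def num_jacobian(f, p, eps=1e-6):
    # Central-difference Jacobian of f at point p
    return np.column_stack([(f(p + eps * e) - f(p - eps * e)) / (2 * eps)
                            for e in np.eye(len(p))])

omega = np.array([0.8, 0.3, 2.0])       # a test point (phi, psi, rho)
J_T = num_jacobian(T, omega)
J_G = num_jacobian(Gamma, T(omega))
# J_Gamma is the inverse of J_T, as used when replacing J_T by J_Gamma^-1:
print(np.allclose(J_G @ J_T, np.eye(3), atol=1e-5))   # -> True
```

The same numerical check can be applied to any candidate Γ before attempting the symbolic integration of the constraint equations.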
If the mapping F is decoupled then both h_ψ and g_φ must be zero, which leads to Eqs. (14). A world coordinate transformation Γ verifying both the compactness and the decoupling constraints can be computed by solving Eqs. (13) and (14) simultaneously. Nevertheless, the integration of systems of partial differential equations can be difficult and in general it generates many solutions. Adequate coordinate systems will be derived by geometrical means. Equations (13) and (14) will be used to prove that the selected coordinate transformation verifies the compactness and/or decoupling constraints.

    Z(φ_Z ρ_Y − φ_Y ρ_Z) + X(φ_Y ρ_X − φ_X ρ_Y) = 0
    Z(ψ_Z ρ_X − ψ_X ρ_Z) + Y(ψ_Y ρ_X − ψ_X ρ_Y) = 0    (14)
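Before integrating such systems symbolically, a candidate Γ can be substituted into Eqs. (13) numerically. The sketch below differentiates the ordinary spherical-coordinate Γ (again only an illustrative candidate; it satisfies the compactness Eqs. (13) but is not claimed to satisfy the decoupling Eqs. (14)) by central differences and evaluates both left-hand sides.

```python
import numpy as np

def Gamma(X_w):
    # Candidate Gamma = (phi, psi, rho): ordinary spherical coordinates
    X, Y, Z = X_w
    rho = np.sqrt(X**2 + Y**2 + Z**2)
    return np.array([np.arccos(Z / rho), np.arctan2(Y, X), rho])

def partials(f, p, eps=1e-6):
    # Rows: (phi, psi, rho); columns: partials w.r.t. (X, Y, Z)
    return np.column_stack([(f(p + eps * e) - f(p - eps * e)) / (2 * eps)
                            for e in np.eye(3)])

X_w = np.array([0.7, -1.2, 2.5])        # an arbitrary visible point
X, Y, Z = X_w
J = partials(Gamma, X_w)
(phX, phY, phZ), (psX, psY, psZ) = J[0], J[1]

# Left-hand sides of Eqs. (13): both vanish for this candidate Gamma
lhs1 = Z * (phY * psZ - phZ * psY) + X * (phY * psX - phX * psY)
lhs2 = Z * (phZ * psX - phX * psZ) + Y * (phY * psX - phX * psY)
print(abs(lhs1) < 1e-6, abs(lhs2) < 1e-6)   # -> True True
```

A residual that vanishes at generic test points is of course only evidence, not a proof; the symbolic substitution used in the paper remains the definitive check.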