Inverse Perspective Transformations for Moving Observers

Christopher A. Arthur

Abstract—Since the ability of computer systems to recognize object boundaries in photographs and video by way of convolutions is established [1], this paper presents the difficulties of interpreting three-dimensional space. A computational argument is given for depth recognition in situations where the camera is in motion while the subject is relatively motionless.

Index Terms—Computational geometry, Differentiation (mathematics), Image object recognition, Image pattern recognition, Image processing, Machine vision

I. INTRODUCTION TO PERSPECTIVE EQUATIONS

Suppose a video camera is situated at $(O + x)$ so that it looks along the $-x$ direction, and its image plane passes through $O$ along $y$, where we take $y$ and $x$ to be orthogonal unit vectors. The image of the point $p$ is $\alpha y$. Given the image of $p$, we would like to determine its position relative to our camera, but we have enough information only to say that $p$ lies somewhere along the vector $(-x + \alpha y)$ through $(O + x)$.
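The setup can be checked numerically: the image coordinate $\alpha$ depends only on the ray from the pinhole at $O + x$, so every point along $(-x + \alpha y)$ produces the same image. A minimal sketch in Python (the helper name and the specific vectors are illustrative choices, not from the paper):

```python
import numpy as np

def image_coord(p, O, x, y):
    """Image coordinate alpha of point p for a camera whose pinhole is at
    O + x, looking along -x, with image plane through O spanned by y."""
    d = p - (O + x)                # vector from the pinhole to the point
    depth = d @ (-x)               # component along the viewing direction
    assert depth > 0, "point must be in front of the camera"
    return (d @ y) / depth         # similar triangles: alpha = lateral / depth

O = np.array([0.0, 0.0])
x = np.array([1.0, 0.0])           # camera sits at O + x and looks along -x
y = np.array([0.0, 1.0])

p = np.array([-2.0, 1.5])
alpha = image_coord(p, O, x, y)

# Any other point on the ray (O + x) + t(-x + alpha*y) images to the same alpha:
q = (O + x) + 5.0 * (-x + alpha * y)
assert abs(image_coord(q, O, x, y) - alpha) < 1e-12
```

This is the ambiguity stated above: a single image constrains $p$ to a ray, not a point.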
In terms of coordinates, if we introduce $\beta = (p - O)\cdot(-x)$, then the expression we want to solve is
\[ p = -\beta\,x + \alpha(1+\beta)\,y + O. \quad (1) \]

Fig. 1: Ideal slit-camera.

Since we are working with video, let us suppose further that our camera is moving and rotating while keeping $p$ in view. After some time $\Delta t$, our camera may have moved through a vector $v$ and rotated by some angle $\Delta\theta$. Let us examine what new information about the location of $p$ this transformation provides. Assuming $p$ to be fixed, equation (1) still holds, although all the parameters change through time:
\[ 0 = \frac{\partial p}{\partial t} = \frac{\partial}{\partial t}\bigl(-\beta\,x + \alpha(1+\beta)\,y + O\bigr). \quad (2) \]

Fig. 2: Ideal, moving slit-camera.

We define our frame in polar coordinates, where $\{e_i\}$ is the basis given by $x$ and $y$ at time $t = 0$.

(Manuscript received August 25, 2008. The author is a mathematics scholar, 536 Cumberland Drive, Allen, Texas, 75002 USA. Keywords: inverse perspective, computer vision, computer graphics, differential geometry, virtual reality, inpainting, POV-Ray. This paper is dedicated to the computer gaming industry.)
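Equations (1) and (2) can be checked numerically before expanding in coordinates: as the camera translates and rotates, $\alpha$, $\beta$, $O$, $x$, and $y$ all vary, yet the right-hand side of (1) keeps returning the same fixed $p$, which is exactly what (2) asserts. A short sketch (the camera path and the point are arbitrary illustrative choices):

```python
import numpy as np

p = np.array([-3.0, 2.0])                      # fixed world point

def reconstruct(t):
    """Evaluate the right-hand side of (1) for a camera that translates
    and rotates along an arbitrary smooth path."""
    theta = 0.3 * t                            # rotation angle at time t
    O = np.array([0.5 * t, -0.2 * t])          # image-plane origin at time t
    x = np.array([np.cos(theta), np.sin(theta)])
    y = np.array([-np.sin(theta), np.cos(theta)])
    beta = (p - O) @ (-x)                      # depth parameter
    alpha = ((p - O) @ y) / (1.0 + beta)       # image coordinate
    return -beta * x + alpha * (1.0 + beta) * y + O

# alpha, beta, O, x, y all change with t, yet (1) always reproduces p:
for t in np.linspace(0.0, 1.0, 11):
    assert np.allclose(reconstruct(t), p, atol=1e-12)
```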
\[ x(t) = \cos\theta\,e_1 + \sin\theta\,e_2, \qquad y(t) = -\sin\theta\,e_1 + \cos\theta\,e_2, \qquad O(t) = u_1(t)\,e_1 + u_2(t)\,e_2. \quad (3) \]

Expanding the left-hand side of (2) gives
\[ \frac{\partial O}{\partial t} - \beta\frac{\partial x}{\partial t} + \alpha(\beta+1)\frac{\partial y}{\partial t} + y(\beta+1)\frac{\partial\alpha}{\partial t} - x\frac{\partial\beta}{\partial t} + y\alpha\frac{\partial\beta}{\partial t}. \]

Let $\theta$ vary with $t$ and $\omega = \partial\theta/\partial t$, and allow the following abbreviations:
\[ U = -\alpha(\beta+1)\omega - \frac{\partial\beta}{\partial t}, \qquad V = -\beta\omega + (\beta+1)\frac{\partial\alpha}{\partial t} + \alpha\frac{\partial\beta}{\partial t}. \quad (4) \]

Substituting in (3) and separating into two equations:
\[ \frac{\partial u_1}{\partial t} = V\sin\theta - U\cos\theta, \qquad \frac{\partial u_2}{\partial t} = -\bigl(U\sin\theta + V\cos\theta\bigr). \quad (5) \]

Taking the initial condition that $\theta = 0$,
\[ -\alpha(\beta+1)\omega - \frac{\partial\beta}{\partial t} + \frac{\partial u_1}{\partial t} = 0, \qquad -\beta\omega + (\beta+1)\frac{\partial\alpha}{\partial t} + \alpha\frac{\partial\beta}{\partial t} + \frac{\partial u_2}{\partial t} = 0. \quad (6) \]

Combining these two equations, eliminating $\partial\beta/\partial t$, and letting $v = \partial O/\partial t$ yields
\[ (\beta+1)\omega\alpha^2 + \beta\Bigl(\omega - \frac{\partial\alpha}{\partial t}\Bigr) = \frac{\partial\alpha}{\partial t} + \alpha\,v\cdot e_1 + v\cdot e_2. \quad (7) \]

We would like to eliminate $\omega$ and $v$ from this equation (the camera rotation and translation, respectively), since they are not generally known for an arbitrary video. If instead of a single point $p$ we use three points $p_1, p_2, p_3$, then we can eliminate $v$ and solve for $\omega$. Let $\sigma_i = (1\,2\,3)^i$ denote the cyclic permutations of the indices; then
\[ A = \sum_{i=0}^{2}\frac{\partial\alpha_{\sigma_i(1)}}{\partial t}\,\bigl(1+\beta_{\sigma_i(1)}\bigr)\bigl(\alpha_{\sigma_i(2)}-\alpha_{\sigma_i(3)}\bigr), \qquad B = \sum_{i=0}^{2}\bigl(1+\alpha_{\sigma_i(1)}^2\bigr)\bigl(1+\beta_{\sigma_i(1)}\bigr)\bigl(\alpha_{\sigma_i(2)}-\alpha_{\sigma_i(3)}\bigr), \]
\[ C = -\sum_{i=0}^{2}\bigl(\alpha_{\sigma_i(2)}-\alpha_{\sigma_i(3)}\bigr), \qquad \omega = \frac{A}{B+C}. \quad (8) \]

II. SPECIAL CASES

A. Panning Camera. Suppose that our camera is free to move through some translation but cannot rotate ($\omega = 0$); then equation (I.7) simplifies to
\[ -\beta\frac{\partial\alpha}{\partial t} = \frac{\partial\alpha}{\partial t} + \alpha\,v\cdot e_1 + v\cdot e_2, \quad (1) \]
which we can solve readily for $\beta$:
\[ \beta = -1 - \frac{\alpha\,v\cdot e_1 + v\cdot e_2}{\partial\alpha/\partial t}. \quad (2) \]
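Equation (II.2) recovers the depth parameter $\beta$ of a point seen from a purely translating camera, using only the image coordinate, its time derivative, and the known velocity. A numerical sanity check (the point, the velocity, and the finite-difference step are illustrative assumptions, not values from the paper):

```python
import numpy as np

p = np.array([-4.0, 1.5])              # fixed world point
v = np.array([0.2, 0.5])               # known translation velocity, no rotation
e1 = np.array([1.0, 0.0])              # the camera keeps its original orientation,
e2 = np.array([0.0, 1.0])              # so the frame {e1, e2} never changes

def alpha_beta(t):
    """Image coordinate and true depth parameter at time t."""
    O = v * t
    beta = (p - O) @ (-e1)
    return ((p - O) @ e2) / (1.0 + beta), beta

t, h = 0.3, 1e-6
alpha, beta_true = alpha_beta(t)
# Central finite difference for the image velocity d(alpha)/dt:
adot = (alpha_beta(t + h)[0] - alpha_beta(t - h)[0]) / (2 * h)

# Depth from image measurements alone, solving (II.1) for beta:
beta_est = -1.0 - (alpha * (v @ e1) + v @ e2) / adot
assert abs(beta_est - beta_true) < 1e-4
```

The recovery fails exactly when $\partial\alpha/\partial t = 0$, i.e. when the translation produces no image motion for that point.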
B. Camera Rotation. Suppose that our camera is pinned to one location but can pivot on its axis, i.e., $v = 0$. Then from equation (I.7) we have
\[ (\beta+1)\omega\alpha^2 + \beta\Bigl(\omega - \frac{\partial\alpha}{\partial t}\Bigr) = \frac{\partial\alpha}{\partial t}. \quad (3) \]
Solving for $\beta$:
\[ \beta = -1 + \frac{\omega}{\omega + \alpha^2\omega - \partial\alpha/\partial t}. \quad (4) \]

Fig. 3: Target at $p$ and image field along $y$.

C. Targeting. Suppose that our camera may move and rotate as necessary, provided that one point $p_0$ stays in the center of the image. We can interpret this constraint as
\[ \frac{\partial\alpha}{\partial t} = 0, \qquad \alpha = 0, \qquad \tan\Delta\theta = \frac{\bigl(O(t+\Delta t) - O(t)\bigr)\cdot y(\theta(t))}{\beta(t) + \bigl(O(t+\Delta t) - O(t)\bigr)\cdot x(\theta(t))}. \]
Differentiating the third equation gives
\[ \lim_{\Delta t\to 0}\frac{\tan\Delta\theta}{\Delta t} = \lim_{\Delta t\to 0}\frac{1}{\Delta t}\cdot\frac{(O_{\Delta t}-O_0)\cdot y}{\beta_0 + O_{\Delta t}\cdot x}, \qquad \frac{\partial\theta}{\partial t}\sec^2\theta\,\Big|_{\theta=0} = \frac{1}{\beta_0}\frac{\partial}{\partial t}(O\cdot y), \]
so that
\[ \omega\beta = v\cdot y. \quad (5) \]
Substituting into equation (I.5) simplifies to
\[ v\cdot x = \frac{\partial\beta}{\partial t}. \quad (6) \]
Resolving to the level of $\{e_i\}$ for general $\theta$ gives
\[ v\cdot e_1 = \frac{\partial\beta}{\partial t}\cos\theta - \beta\omega\sin\theta, \qquad v\cdot e_2 = \beta\omega\cos\theta + \frac{\partial\beta}{\partial t}\sin\theta. \quad (7) \]

D. Scaling. Consider two settings: one with a large object and a fast-moving camera, and one with a small object and a slow-moving camera. Assume that the cameras are otherwise identical, and that the objects are identical except that one is absolutely larger than the other. If we target the two cameras at the centers of their objects and adjust the distance and orientation of the cameras appropriately, then the images of the two objects should be totally identical. In particular, suppose $v_B = k\,v_A$, $(p_{1,B} - p_{2,B}) = k\,(p_{1,A} - p_{2,A})$, and $k\beta_{1,A} = \beta_{1,B}$. Then, using equations (I.5), (II.5), and (II.6),
Fig. 4: Scaling the image. (A) Smaller and closer object, slower motion. (B) Larger and farther object, faster motion.

\[ k = \frac{\beta_{1,B}}{\beta_{1,A}} = \frac{p_{1,B} - p_{2,B}}{p_{1,A} - p_{2,A}} = \frac{\alpha_{1,B}\,(\beta_{1,B}+1)}{\alpha_{1,A}\,(\beta_{1,A}+1)}, \]
\[ k = \frac{\alpha_{1,B}\,v_B\cdot x + v_B\cdot y + (\beta_{1,B}+1)\bigl(\omega\alpha_{1,B}^2 + \partial\alpha_{1,B}/\partial t\bigr)}{\alpha_{1,A}\,v_A\cdot x + v_A\cdot y + (\beta_{1,A}+1)\bigl(\omega\alpha_{1,A}^2 + \partial\alpha_{1,A}/\partial t\bigr)}. \]
Now if we suppose that $\alpha_{1,B} = \alpha_{1,A}$,
\[ k = \frac{\alpha_{1,A}\,k v_A\cdot x + k v_A\cdot y + (k\beta_{1,A}+1)\bigl(\omega\alpha_{1,A}^2 + \partial\alpha_{1,B}/\partial t\bigr)}{\alpha_{1,A}\,v_A\cdot x + v_A\cdot y + (\beta_{1,A}+1)\bigl(\omega\alpha_{1,A}^2 + \partial\alpha_{1,A}/\partial t\bigr)}, \]
\[ k = k\,\frac{\omega\alpha_{1,A}^2 + \partial\alpha_{1,B}/\partial t}{\omega\alpha_{1,A}^2 + \partial\alpha_{1,A}/\partial t}, \qquad \frac{\partial\alpha_{1,B}}{\partial t} = \frac{\partial\alpha_{1,A}}{\partial t}. \]

Thus the image derivatives are equal, so the two situations produce identical videos. The consequence of this fact is that, without knowing a priori the size of objects viewed through our camera, we have no way of knowing how far they are from the camera, regardless of what information we can derive by considering only how the image transforms. However, if all we seek is to know how points in our field of view are situated relative to one another, then it suffices to assume an arbitrary scale for the image. For example, we can assume that our targeted point is initially at unit distance from the camera if we cannot determine this distance otherwise. With three points in view, one of them the target, we can propose two vectors and thus a basis and origin for the view space.

E. Solving the Targeting Scenario. Take one target point and two auxiliary points, and set the target point at unit distance from the camera. Solving for the cosine of the angle between the two vectors from the target point is akin to solving the following system.
The first six equations come from (I.7) applied to each point, and the last two from the vector-angle equation; the system is solved by eliminating every $b_i$ and finally solving for $\xi = \cos\phi$, where
\[ \cos\phi = \frac{v_1\cdot v_2}{\lVert v_1\rVert\,\lVert v_2\rVert} = \xi. \quad (8) \]

Fig. 5: The targeting scenario.

\[ \tan\theta = -\frac{-a_1 + a_2 + (a_4+1)a_6 b_3}{(1-a_4)a_6 + a_4 b_1 + b_1 + a_2 b_3}, \qquad \tan\theta = \frac{(1-a_4)a_6 + a_4 b_1 + b_1 + a_2 b_3}{-a_1 + a_2 + (a_4+1)a_6 b_3}, \]
\[ \tan\theta = -\frac{-a_1 + a_3 + (a_5+1)a_6 b_4}{(1-a_5)a_6 + a_5 b_2 + b_2 + a_3 b_4}, \qquad \tan\theta = \frac{(1-a_5)a_6 + a_5 b_2 + b_2 + a_3 b_4}{-a_1 + a_3 + (a_5+1)a_6 b_4}, \]
\[ \tan\theta = -\frac{a_2 - a_3 + a_6\bigl((a_4+1)b_3 - (a_5+1)b_4\bigr)}{b_1 + a_4(b_1 - a_6) + a_5(a_6 - b_2) - b_2 + a_2 b_3 - a_3 b_4}, \]
\[ \tan\theta = -\frac{a_4(a_6 - b_1) - b_1 + b_2 + a_5(b_2 - a_6) - a_2 b_3 + a_3 b_4}{a_2 - a_3 + a_6\bigl((a_4+1)b_3 - (a_5+1)b_4\bigr)}, \]
\[ \frac{(1-a_4)(1-a_5) + (a_4+1)(a_5+1)b_3 b_4}{\sqrt{(1-a_4)^2 + (a_4+1)^2 b_3^2}\;\sqrt{(1-a_5)^2 + (a_5+1)^2 b_4^2}} = \xi, \]
\[ \frac{a_2 + a_3 + a_1(a_4 + a_5) - \bigl(a_3(a_4+1)b_3 + (a_5+1)\bigl((a_4+1)b_1 + a_2 b_3\bigr)\bigr)b_4}{2a_1 + a_3 a_4 + a_2 a_5 + (a_4+1)(a_5+1)b_2 b_3} = 1. \quad (9) \]

F. Discussion on Coordinates in the View Space. Given two vectors and an origin, we should be able to specify, at least in some relative sense, the location of any visible point with regard to this frame. If the camera is thought to target this origin and undergo some targeting motion (as defined earlier), then we might wish to ask what the angle between the two vectors is.
It would seem an important question for using them to coordinatize the space, since we cannot allow them to be parallel; furthermore, if they are not orthogonal, then we would use this angle as a way to reconstruct the view space in typical Euclidean space with the natural basis.

We find, however, that in the targeting scenario with one target point and two auxiliary points, solving the vector-angle equation $\cos\theta = v_1\cdot v_2/\lVert v_1\rVert\lVert v_2\rVert$ still depends upon the motion of the camera. This would seem to suggest that two non-congruent triangles, each targeted at a vertex, would appear identical given certain choices of camera motion, but this proposal is far from what intuition suggests. For instance, given a regular hexagon and a square, and a camera rotating around each of them, is it possible that from the right distance the two videos would appear to be the same?

Fig. 6: Similar views.

III. USING POV-RAY

A. Software. Using POV-Ray as an optical studio, it is straightforward to determine how linear measure in an image plane decreases with depth. Below is POV-Ray's visualization of a series of identical square frames placed one unit apart directly along the line of sight. The length of the diagonal of each frame, measured in pixels, appears progressively smaller, as shown in the adjacent graph.

Fig. 7: Perspective with software simulation and graph of apparent size.

Inverting the length of the diagonal reveals the inverse proportion:

Fig. 8: Linear relationship.

A more detailed analysis might prove useful. The equation for the straight line given directly above is $y(x) = 1439/4202100 + x/2898$, using the nearest frames to determine the slope rather than an average.
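The linear relationship can be reproduced with a plain pinhole model, mirroring the POV-Ray setup described in the text (camera at $z = -1$, first frame at $z = 1$, image plane at unit distance, 4096-pixel square image). The sketch below projects the corners of identical unit squares placed one unit apart along the line of sight and checks that the inverted diagonal grows linearly with depth; the exact slope and intercept depend on the viewing angle, so only the linearity is asserted:

```python
import numpy as np

cam = np.array([0.0, 0.0, -1.0])         # camera location
direction = np.array([0.0, 0.0, 1.0])    # unit line-of-sight vector
right = np.array([1.0, 0.0, 0.0])        # square image, so right = <1,0,0>
up = np.array([0.0, 1.0, 0.0])
width_px = 4096

def to_pixels(D):
    """Project world point D onto the image plane at unit distance."""
    CD = D - cam
    P = CD / (CD @ direction)            # scale by the inverse of the projection
    return width_px * np.array([P @ right, P @ up])

# Identical unit squares one unit apart along the axis, first frame at z = 1:
depths = np.arange(1.0, 13.0)
diagonals = []
for z in depths:
    lo = to_pixels(np.array([-0.5, -0.5, z]))
    hi = to_pixels(np.array([0.5, 0.5, z]))
    diagonals.append(np.linalg.norm(hi - lo))
inv = 1.0 / np.array(diagonals)

# The inverted diagonal increases by a constant step per frame, as in the graph:
steps = np.diff(inv)
assert np.allclose(steps, steps[0], atol=1e-12)
```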
The first frame is at $z = 1$, the camera is positioned at $z = -1$, and the direction vector of the camera is $\langle 0, 0, 1\rangle$. The image has width and height of 4096 each, and the right vector of the camera is determined by the formula $\langle 1, 0, 0\rangle\cdot w/h$.

A little discussion is in order to explain the precise values for the slope and intercept in the given linear equation. The image plane has local coordinates $[-1/2, +1/2]^2$, which in terms of pixels we multiply by 4096. Using POV-Ray's equation we can compute that the viewing angle is roughly 53 degrees. Why is this important? If the viewing angle were larger, then we should expect the generated image to be compressed along the horizontal, resulting in a smaller distance between successive frames in the image plane. In any case, adapting the above model and computing, we get a result that agrees with POV-Ray. We find the vector $CD$ between the camera and the object, then project it along the line-of-sight (direction) vector. To map the vector onto the image plane, we scale $CD$ by the inverse of the norm of the projection:
\[ P = \frac{-CD}{CD\cdot C}, \]
where the negative sign appears since the vector $C$ to the camera points opposite the line of sight.

Fig. 9: Test pattern model, angle view.

Fig. 10: Test pattern model, camera view.

B. Another Approach. Given a camera with known characteristics, can we recreate a three-dimensional scene with some amount of accuracy, given a set of projections? Suppose $A$ is a bounded region in $\mathbb{R}^3$, and $\chi_B$ the characteristic function of some open subset $B \subseteq A$. If we would like to take a picture of $B$ from some point $c \in \mathbb{R}^3 \setminus A$, then we should decide where to put the image plane $J$, somewhere