Block-based disparity estimation by partial finite ridgelet distortion search (PFRDS)

Mohammad Eslami, Farah Torkamani-Azar
Faculty of Electrical and Computer Engineering, Shahid Beheshti University, G.C., Tehran, Iran

Article info
Article history: Received 17 June 2009; received in revised form 29 July 2009; accepted 13 August 2009; available online 10 September 2009
Keywords: Stereo vision; Finite ridgelet transform; Radon transform; Disparity map

Abstract
In stereo vision applications, computing the disparity map is an important issue. The performance of different approaches depends strongly on the similarity measure employed. In this paper, the finite ridgelet transform (FRIT) is used to define an edge-sensitive block distortion similarity measure. Simulation results show that it outperforms the conventional criteria and is less sensitive to noise, especially at the edges of images. To speed up computation, a new partial search algorithm based on the energy conservation property of the FRIT is proposed.
© 2009 Elsevier Ltd. All rights reserved.

1. Introduction

Various applications such as robot navigation, augmented reality, 3D telecommunication and video conferencing rely on stereo vision to understand 3D space from two coherent images [1–3]. There are three major topics in stereo: correspondence, occlusions and real-time implementation. The primary problems to be solved in computational stereo are calibration, correspondence and reconstruction. The correspondence process of stereo vision matches the two images and computes their disparity map [3,4]. This is mostly done by searching the left and right images ($L$, $R$) for two corresponding points, $m_l = [x_l, y_l]^T$ and $m_r = [x_r, y_r]^T$, that are related to one point $P$ in 3D space. The disparity vector is defined as the displacement between this pair of points in the two images [3], as in Eq. (1).
$$d = m_l - m_r \tag{1}$$

Indeed, the search is constrained to one dimension along the epipolar line. With rectification techniques, the epipolar lines lie either along the scan-lines or perpendicular to them in the transformed images [5]. As a consequence, the disparity vector has to be computed in only one direction (usually the $x$-axis), as in Eq. (2).

$$d = [x_l - x_r,\; 0]^T \tag{2}$$

Most algorithms employ one of two constraints: a local constraint, which considers a mask around the pixel of interest, or a global constraint, which considers whole scan-lines or the entire image. Local methods can be very efficient, but they are sensitive to locally ambiguous regions in images (e.g. occlusion regions or regions with uniform texture). Although global methods are more computationally expensive, they are less sensitive to these problems, since global constraints provide additional support for regions that are difficult to match locally. Other methods rely on both constraints or on different views.

The most common approach to global matching is dynamic programming [6,7], which uses the ordering and smoothness constraints to optimize correspondence in each scan-line. Intrinsic curves [8] and graph cuts [9,10] are two other common global methods.

Local methods fall into three broad categories: area-based or block matching [11,12], feature-based [13,14], and gradient-based or optical flow [15,16]. These methods differ in search strategy or similarity criterion.

Much of the stereo research in the last decade has focused on detecting and measuring occlusion regions in stereo imagery and recovering accurate depth estimates for these regions [17–20].

Block matching methods seek to estimate disparity at a point in one image by comparing a small region about that point with a series of small regions extracted from the other image (the search area). As stated before, the epipolar constraint reduces the search to one dimension.
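The one-dimensional block search described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, images are assumed to be rectified and stored as lists of rows, and a plain sum of absolute differences stands in for the similarity measure.

```python
def sad(left, right, ax, ay, bx, by, half):
    """Sum of absolute differences between two square blocks.

    The blocks have side 2*half + 1 and are centred on (ax, ay) in the
    left image and (bx, by) in the right image.
    """
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[ay + dy][ax + dx] - right[by + dy][bx + dx])
    return total


def match_along_scanline(left, right, x, y, half, max_disp):
    """Return the disparity d = x_l - x_r with the lowest block cost.

    Assumes rectified images, so candidates lie on the same row y.
    """
    best_d, best_cost = 0, None
    for d in range(0, max_disp + 1):
        xr = x - d
        if xr - half < 0:
            break  # the candidate block would leave the image
        cost = sad(left, right, x, y, xr, y, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

Searching only non-negative disparities on a single row is a simplification that follows from Eq. (2); any other block cost can be substituted for `sad` without changing the search loop.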
Three classes of metrics are commonly used for block matching: correlation [21,22], intensity differences [3,4,21] and rank metrics [23]. Two simple and widely employed similarity measures are the sum of absolute differences (SAD) and the sum of squared differences (SSD), which are used in many real-time applications [3,4,21]. For a pair of pixels $A$ and $B$ in blocks of the left and right images, these criteria are:

$$D_{SAD} = \sum_{n \in N} |L(A+n) - R(B+n)| \tag{3}$$

$$D_{SSD} = \sum_{n \in N} |L(A+n) - R(B+n)|^2 \tag{4}$$

where $L(A)$ and $R(A)$ are the intensities of pixel $A$ in the left and right images, and $n$ varies within $N$, the neighborhood of the pixels, for instance a $3 \times 3$ window.

Optics and Lasers in Engineering 48 (2010) 125–131. Journal homepage: www.elsevier.com/locate/optlaseng
0143-8166/$ - see front matter © 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.optlaseng.2009.08.004
Correspondence to: Torkamani-Azar Farah, Department of Communication, Faculty of Electrical and Computer Engineering, Shahid Beheshti University, Evin 19839, Tehran, Iran. Tel.: +982129902286; fax: +982122431804.
E-mail addresses: Moh.eslami@mail.sbu.ac.ir (M. Eslami), f-torkamani@sbu.ac.ir, ftorkamaniazar@yahoo.com (F. Torkamani-Azar).
URL: http://faculties.sbu.ac.ir/~f-torkamani (F. Torkamani-Azar).

Another similarity criterion is the normalized cross correlation (NCC), which is more appropriate when the stereo cameras have photometric differences [3,21,22]. Let $\sigma_l^2$ and $\sigma_r^2$ be the intensity variances of the considered blocks in the $L$ and $R$ images, around pixels $A$ and $B$, and let $\sigma_{lr}^2$ be their cross covariance.
The NCC criterion is then defined as:

$$D_{NCC} = \frac{\sigma_{lr}^2}{\sqrt{\sigma_l^2 \sigma_r^2}} \tag{5}$$

In this paper, we use the finite ridgelet transform (FRIT) to apply different weights to smooth and high-contrast areas. In addition, a partial search method is used to compute the similarity between blocks of the right and left images. Section 2 gives an introduction to the FRIT, Section 3 presents the theory of the partial finite ridgelet search, and Section 4 reports its experimental results.

2. Finite ridgelet transform (FRIT)

The finite ridgelet transform (FRIT) was proposed by Do and Vetterli as an orthonormal version of the ridgelet transform for discrete images [24]. It is employed in various applications such as noise reduction, watermarking and compression [25–27]. The FRIT is based on the finite Radon transform (FRAT) introduced by Bolker [28], and uses a novel ordering of the FRAT coefficients to suppress the periodic effect associated with finite transforms. Taking a one-dimensional wavelet transform of the FRAT coefficients in every direction, in a special way, results in the finite ridgelet transform, which is invertible, non-redundant, and can be computed via fast algorithms (i.e. the orthonormal finite ridgelet transform).

The finite Radon transform (FRAT) is defined as the summation of image pixel intensities over a certain set of lines (not over directions at angles 0, 1, 2, …, 180, as used in the Radon and ridgelet transforms). Note that the image $f$ must be of size $p \times p$, where $p$ is a prime number. In the optimal ordering form of the FRAT, lines are defined as in Eq. (6):

$$\mathrm{Line}(a, b, s):\; ax + by - s = 0, \quad \forall\, x, y \in \{0, 1, \ldots, p-1\} \tag{6}$$

The FRAT is then defined by Eq. (7):

$$r_{a,b}(s) = \mathrm{FRAT}_f(a, b, s) = \frac{1}{\sqrt{p}} \sum_{(x,y) \in \mathrm{Line}(a,b,s)} f(x, y) \tag{7}$$

where $f(x, y)$ is the intensity of pixel $(x, y)$.
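Eqs. (6) and (7) can be illustrated with a short sketch. It rests on assumptions that are not taken from the paper: the line equation is evaluated modulo $p$, as is usual for finite Radon transforms, and the $p+1$ normal vectors $(a, b)$ are the simple set $(k, 1)$ for $k = 0, \ldots, p-1$ plus $(1, 0)$ rather than the optimal ordering. The function name `frat` is hypothetical.

```python
import math


def frat(block):
    """Finite Radon transform of a p x p block, with p prime.

    Returns p + 1 projections (one per direction), each holding p sums
    scaled by 1 / sqrt(p), as in Eq. (7).
    """
    p = len(block)
    # Assumed direction set: normal vectors (k, 1) for k = 0..p-1, plus (1, 0).
    directions = [(k, 1) for k in range(p)] + [(1, 0)]
    scale = 1.0 / math.sqrt(p)
    projections = []
    for a, b in directions:
        sums = [0.0] * p
        for x in range(p):
            for y in range(p):
                # Line index s satisfying a*x + b*y - s = 0 (mod p).
                s = (a * x + b * y) % p
                sums[s] += block[x][y]
        projections.append([scale * v for v in sums])
    return projections
```

Because $p$ is prime, every line in every direction contains exactly $p$ pixels, so each projection adds up to the block total divided by $\sqrt{p}$. Applying a one-dimensional wavelet transform to each projection would then give the FRIT coefficients; that step is not shown here.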
Fig. 1(a) shows the direction lines used to compute the FRAT for a $7 \times 7$ block.

FRIT coefficients are computed by applying a discrete wavelet transform to the FRAT coefficients in each direction. As a consequence, the finite ridgelet transform provides $p$ coefficients for every individual direction ($p+1$ directions in total) [24]. Finally, for a $p \times p$ block, the FRIT leads to a $p \times (p+1)$ matrix. Fig. 1(b) shows the FRIT computation process.

3. The proposed method

The finite ridgelet transform, like the ridgelet transform, is capable of representing line discontinuities. In an image, line-shaped singularities produce a few large coefficients, while randomly distributed singularities are unlikely to produce significant coefficients. This property of the ridgelet transform, which is inherited from the Radon transform, can be used to develop an edge-sensitive similarity measure. Fig. 2 compares the FRIT coefficients of a horizontal-line image with those of a randomly distributed salt-and-pepper noise image. Both images have identical mean values but differ significantly in their FRIT coefficients. While the two blocks are identical in the sense of the SSD criterion, they can be clearly discriminated using FRIT coefficients. There is a peak in the FRIT coefficients corresponding to the edge in Fig. 2(a), but the FRIT coefficients of Fig. 2(b) are almost negligible in all directions. A Matlab toolbox for the finite ridgelet transform provides these computations [29,30].

We therefore consider an edge-sensitive similarity measure between two blocks, based on their distortion in FRIT coefficients, defined as:

$$D_{FRIT} = |m_L - m_R|^2 + \alpha \sum_{k=1}^{p+1} |R_k^L - R_k^R|^q \tag{8}$$

where $R_k^L$ and $R_k^R$ are the $k$th columns of the FRIT coefficient matrices (the $k$th direction of the FRIT) of the left and right image blocks, each of size $p \times 1$. Also, $m_R$ and $m_L$ are the mean values of the right and left image blocks.
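A minimal sketch of Eq. (8) follows, assuming the FRIT coefficients of both blocks are already available as $p+1$ columns of $p$ values. The function name is hypothetical, and reading $|R_k^L - R_k^R|^q$ as the sum of element-wise absolute differences raised to the power $q$ is an assumption. The defaults $\alpha = 100$ and $q = 3$ are the values used in the experiments of Section 4.

```python
def frit_distortion(mean_l, mean_r, coeffs_l, coeffs_r, alpha=100.0, q=3):
    """Edge-sensitive block distortion in the spirit of Eq. (8).

    coeffs_l and coeffs_r hold p + 1 columns (one per direction), each
    with the p FRIT coefficients of the left or right block.
    """
    total = abs(mean_l - mean_r) ** 2
    for col_l, col_r in zip(coeffs_l, coeffs_r):
        # Assumption: |R_k^L - R_k^R|^q is the sum of |difference|^q
        # over the p entries of direction k.
        total += alpha * sum(abs(u - v) ** q for u, v in zip(col_l, col_r))
    return total
```

For example, `frit_distortion(2.0, 1.0, [[1, 0], [0, 3]], [[0, 0], [0, 1]], alpha=2.0)` adds the squared mean difference 1 to 2 × (1 + 8), giving 19.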
The positive parameter $\alpha$ controls the impact of the FRIT coefficients on the total distortion, and the integer $q > 2$ magnifies the large coefficients produced at edge positions. Note that for $\alpha = 0$, $D_{FRIT}$ reduces to $|m_L - m_R|^2$, the squared difference of the block means, and that it approaches $D_{SSD}$ as $q$ approaches 2.

Fig. 1. (a) The directions of the lines used in the FRAT for a $7 \times 7$ block; (b) the block diagram of the FRIT computation.

Computing $D_{FRIT}$ in Eq. (8) involves all FRIT coefficients, a cost that can be reduced efficiently by a partial distortion search. This search strategy is frequently employed with other similarity criteria in the motion estimation literature in order to attain an optimum search with the fewest possible calculations [31–33]. Since the finite ridgelet transform is an energy-preserving transform, we speed up the algorithm by computing the distortion iteratively, and propose the partial finite ridgelet distortion search (PFRDS). This provides a faster and still optimum search for two corresponding blocks in a stereo image pair.

$$D_{FRIT}(k) = D_{FRIT}(k-1) + \alpha\, |R_k^L - R_k^R|^q, \quad 0 < k \le p+1; \qquad D_{FRIT}(0) = |m_L - m_R|^2 \tag{9}$$

Each computation step of $D_{FRIT}(k)$ in Eq. (9) accumulates the distortion of the FRIT coefficients up to the $k$th direction (the first $k$ columns of the FRIT matrix). The exhaustive search strategy of Eq. (8) computes every iteration for every candidate block along the epipolar line. Alternatively, the partial distortion search eliminates unnecessary computations by checking the distortion $D_{FRIT}(k)$ and terminating the iterations of Eq. (9) as soon as the candidate loses the competition with the last winner. Suppose that $D_{min}$ denotes the least distortion, associated with the winner among all previous candidates.
For a new candidate, the iterations of Eq. (9) need to continue only while $D_{FRIT}(k) < D_{min}$. As soon as $D_{FRIT}(k)$ becomes greater than $D_{min}$, the candidate under test is definitely a loser, and no further computation is needed for it. Since PFRDS does not reject any candidate without inspecting it, the final winner is the block most similar to the reference block, found with the minimum required computation.

The loser rejection policy is more effective and faster when the directions with larger distortion are considered first, that is, when the larger FRIT coefficients are considered in the first iterations. The proposed sorting scheme is therefore based on the maximum FRIT coefficients in the different directions. These large coefficients correspond to edges of the block, so with the proposed sorting the directions associated with the edges present in the block are considered first.

Note that for the first candidate the distortion $D_{FRIT}$ must be computed completely, which provides the initial $D_{min}$. Note also that the ordinary order of $k$ ($k = 1, 2, \ldots, p+1$) is changed in the proposed search strategy according to the ridgelet coefficients in the different directions. Fig. 3 depicts the flowchart of the proposed partial finite ridgelet distortion search.

Fig. 2. Two sample images and their FRIT coefficients.

In summary, the partial finite ridgelet distortion search for disparity estimation proceeds as follows:

1. Compute the finite ridgelet coefficients of the reference block ($m_L$ and $R^L$ if the reference block is in the left image).
2. Sort the directions ($k$) with respect to the maximum sum of the FRIT coefficients in $R_k^L$, in every direction.
3. Choose a new candidate on the corresponding epipolar line in the right image, compute the mean of the block, $m_R$, and $R^R$, and compute $D_{FRIT}(0)$ for the new candidate.
4. If $D_{FRIT}(0) > D_{min}$, reject the candidate and go to step 3 with a new candidate; otherwise go to step 5.
5. Order the columns of $R^R$ according to the sorting order of $R^L$ from step 2.
6. Continue the computation of $D_{FRIT}(k)$ from $k = 1$. For each value of $k$:
   a. If $D_{FRIT}(k) < D_{min}$ and $k < p+1$, increase $k$ by 1 and repeat this step.
   b. If $D_{FRIT}(k) \ge D_{min}$ while $k < p+1$, reject the candidate and go to step 3 with a new candidate.
   c. If the total $D_{FRIT} < D_{min}$, update $D_{min}$ with $D_{FRIT}$, as the candidate is now the most similar to the reference block, and go to step 3 with a new candidate.

Note that a losing candidate may be recognized in step 4, after computing only $D_{FRIT}(0)$, or in step 6.b, before $k$ reaches $p+1$. Rejected blocks are thus recognized quickly, which is the loser rejection policy of this algorithm mentioned above.

4. Experimental results

In this paper, the proposed PFRDS was applied to disparity estimation for $17 \times 17$ blocks of the left image by searching the corresponding candidate blocks of the right image. It was applied to the Middlebury stereo database: Books, Wood1, Dolls, Wood2 and Lampshade [34,35]. Each dataset consists of at least 30 image pairs in gray scale with size $650 \times 555$, taken with a parallel camera configuration. Fig. 4 shows the left and right images of Books and Wood2.

In an exhaustive (non-partial) search implementation, computing $D_{FRIT}$ takes more time than the SSD and NCC similarity measures, as it includes a FRIT transformation. Typically, an image of size $650 \times 555$ was partitioned into $17 \times 17$ blocks with 50% overlap.
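The gap between the exhaustive and the partial search comes from the early termination in steps 1 to 6 of Section 3, which can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function name is hypothetical, the FRIT coefficients of the reference and of every candidate are assumed to be precomputed, directions are ordered by the largest absolute reference coefficient (one reading of step 2), and $|R_k^L - R_k^R|^q$ is taken as a sum of element-wise terms.

```python
def pfrds_search(ref_mean, ref_coeffs, candidates, alpha=100.0, q=3):
    """Return (index, distortion) of the best candidate block.

    ref_coeffs: p + 1 columns of FRIT coefficients of the reference block.
    candidates: list of (mean, coeffs) pairs, one per block on the
    epipolar line, with coeffs laid out like ref_coeffs.
    """
    # Step 2 (assumed reading): visit the directions with the largest
    # reference coefficient magnitude first.
    order = sorted(range(len(ref_coeffs)),
                   key=lambda k: max(abs(c) for c in ref_coeffs[k]),
                   reverse=True)
    best_index, d_min = None, float("inf")
    for index, (mean, coeffs) in enumerate(candidates):
        d = abs(ref_mean - mean) ** 2  # D_FRIT(0)
        for k in order:
            if d >= d_min:
                break  # loser: the partial distortion already reaches D_min
            d += alpha * sum(abs(u - v) ** q
                             for u, v in zip(ref_coeffs[k], coeffs[k]))
        if d < d_min:
            best_index, d_min = index, d
    return best_index, d_min
```

Every term added to `d` is non-negative, so a candidate whose partial distortion has reached `d_min` can never win. The function therefore returns the same winner as an exhaustive evaluation of Eq. (8), while the first candidate is always evaluated in full because `d_min` starts at infinity.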
The average time needed by the different algorithms to find the corresponding block of each reference block is shown in Table 1.

These reported times are for searching the whole epipolar line; they can be decreased effectively by limiting the number of candidates and by using more sophisticated search strategies such as hierarchical methods.

In order to compare only the proposed PFRDS with the other similarity criteria (SSD and NCC), all of them are employed in a simple search over the whole epipolar line. However, hierarchical methods for constraining the search area [36], dynamic programming for minimization [6], consistency checks for rejecting outliers [37], and object recognition and prediction for dealing with occlusion [17] would improve performance.

For an edge-sensitive criterion, a larger value of $q$ is required to magnify the larger FRIT coefficients (which correspond to edges). On the other hand, very large values of $q$ are inappropriate, as they may negate the difference of the mean values ($D_0$). In this study, $q = 3$ is therefore used for all experiments. While the parameter $q$ determines the effect of the individual distortions in the FRIT coefficients ($R_k$), the other parameter ($\alpha$) determines the weight of the total distortion of $R_k$ over all directions.

The quantitative measures used to assess the accuracy of disparity estimation are the ratio of the matching region ($r_m$) and the ratio of the comparison region ($r_c$) to the total number of pixels [38]. The matching region refers to the non-occluded areas determined by the stereo matching algorithm. The comparison region refers to that portion of the non-occluded areas for which the estimated disparity equals the available ground truth disparity.

Fig. 3. The flowchart of the proposed method.

Tables 2 and 3 report the $r_m$ and $r_c$ obtained from NCC, SSD and PFRDS with various values of $\alpha$, for the Books and Wood2 datasets, respectively.
Moreover, the ground truth disparity maps and the resulting disparity maps from SSD, NCC and PFRDS are shown in Figs. 5 and 6.

It can be inferred from Fig. 5 and Table 2 that, for the Books dataset, the performance ($r_c$) of the PFRDS method was 7% better than NCC and 11% better than SSD. For the Wood2 dataset (Fig. 6 and Table 3), PFRDS yielded a much better result than SSD (an increase of more than 40% in $r_c$) and a slightly better result than NCC: the difference between PFRDS and NCC was less than 3% in $r_c$ and almost 10% in $r_m$. Inspecting Figs. 5 and 6, PFRDS clearly outperformed NCC and SSD at the edges of the images, but its results were close to those of NCC in smooth regions. Hence, the degree of improvement depended on the amount of edges in the image, as it decreases from the Books to the Wood2 dataset (see Tables 2 and 3). The high-frequency components of these images differ completely: PFRDS surpassed the two other criteria for Books, which contains a large amount of discontinuities and edges, but the results were almost the same for Wood2, which has fewer discontinuities.

Tables 2 and 3 also confirm that increasing the parameter $\alpha$ improved the performance of PFRDS (with respect to $r_c$), which indicates the significance of edges in similarity measurement. Fig. 7 plots the $r_c$ attained versus $\alpha$ for the Books results. A typical value for this parameter, used in the experiments of this study, was $\alpha = 100$. As stated before, although increasing $\alpha$ enlarges edge effects, this does not mean that better performance is always obtained. Consider pictures with periodic edges, or distorted ones whose gray scales still differ (e.g. a case in which many edges are occlusions); in such cases $\alpha$ should not be increased excessively.
In other words, $\alpha$ is a trade-off parameter between gray-scale and edge significance, and it should be set appropriately for the pictures at hand.

Some more results for other image pairs are shown in Table 4.

Another critical choice is the block size $p$. Generally, the accuracy of block-based search for disparity estimation increases as the block size is reduced and the resolution of the disparity estimate increases. But the performance of PFRDS as an edge-sensitive similarity measure depends on the presence of edges in the block, which is less probable with smaller blocks. On the other hand, the FRIT transform can only represent straight line

Fig. 4. Left and right images of the Books and Wood2 stereo pairs.

Fig. 5. Disparity map results for the Books pair: (a) using SSD, (b) using NCC, (c) using PFRDS, $\alpha = 100$, (d) ground truth image.

Table 1. The required time in different algorithms.

Algorithm                                 Required time (s)
SSD                                       0.14
NCC                                       0.47
Proposed method with whole computation    3.43
PFRDS                                     1.05

Table 2. $r_m$ and $r_c$ factors from NCC, SSD and PFRDS with various values of $\alpha$, for the Books image pair.

Method            r_m (%)    r_c (%)
NCC               81.6       76.7
SSD               90.5       67.51
PFRDS, α = 3      92.44      74.52
PFRDS, α = 10     93.7       77.39
PFRDS, α = 20     94.37      80.58
PFRDS, α = 30     94.74      81.97
PFRDS, α = 40     94.95      82.66
PFRDS, α = 60     94.96      82.98
PFRDS, α = 80     95.18      83.65
PFRDS, α = 100    95.2       84.28
PFRDS, α = 120    95.6       84.45
PFRDS, α = 140    95.62      84.61

Table 3. $r_m$ and $r_c$ factors from NCC, SSD and PFRDS with various values of $\alpha$, for the Wood2 image pair.

Method            r_m (%)    r_c (%)
NCC               87.92      80.9
SSD               76.2       40.21
PFRDS, α = 3      91         57.4
PFRDS, α = 50     94.9       72.84
PFRDS, α = 80     95.35      82.1
PFRDS, α = 100    97.2       83.35