Block-based disparity estimation by partial finite ridgelet distortion search (PFRDS)
Mohammad Eslami, Farah Torkamani-Azar
Faculty of Electrical and Computer Engineering, Shahid Beheshti University, G.C., Tehran, Iran
Article info

Article history:
Received 17 June 2009
Received in revised form 29 July 2009
Accepted 13 August 2009
Available online 10 September 2009

Keywords:
Stereo vision
Finite ridgelet transform
Radon transform
Disparity map

Abstract

In stereo vision applications, computing the disparity map is an important issue. The performance of different approaches depends heavily on the similarity measure employed. In this paper the finite ridgelet transform is used to define an edge-sensitive block distortion similarity measure. Simulation results show that it outperforms the conventional criteria and is less sensitive to noise, especially at the edges of images. To speed up computation, a new partial search algorithm based on the energy conservation property of the FRIT is proposed.

© 2009 Elsevier Ltd. All rights reserved.
1. Introduction
Various applications such as robot navigation, augmented reality, 3D telecommunication and video conferencing rely on stereo vision to understand 3D space from two coherent images [1–3]. There are three major topics in stereo: correspondence, occlusions and real-time implementation. The primary problems to be solved in computational stereo are calibration, correspondence and reconstruction. The correspondence process of stereo vision matches the two images and computes their disparity map [3,4]. This is mostly accomplished by exploring the left and right images ($L$, $R$) for two corresponding points, $m_l = [x_l, y_l]^T$ and $m_r = [x_r, y_r]^T$, related to one point $P$ in 3D space. The disparity vector is defined as the displacement between this pair of points in the two images [3], as in Eq. (1).
$$ d = m_l - m_r \qquad (1) $$
Indeed, the search is constrained to 1D along the epipolar line. By using rectification techniques, the epipolar lines lie either along the scan-line or perpendicular to it in the transformed images [5]. As a consequence, computation of the disparity vector reduces to a single direction (usually the $x$-axis), as in Eq. (2).
$$ d = [x_l - x_r,\; 0]^T \qquad (2) $$
Most algorithms employ one of two kinds of constraint: local constraints, which consider a mask around the pixel of interest, or global constraints, which consider whole scan-lines or entire images. Local methods can be very efficient, but they are sensitive to locally ambiguous regions in images (e.g. occlusion regions or regions with uniform texture). Although global methods are more computationally expensive, they are less sensitive to these problems, since global constraints provide additional support for regions that are difficult to match locally. Other methods rely on both kinds of constraint or on different views.

The most common approach to global matching is dynamic programming [6,7], which uses the ordering and smoothness constraints to optimize correspondence in each scan-line. Intrinsic curves [8] and graph cuts [9,10] are two other common global methods.

Local methods fall into three broad categories: area-based or block matching [11,12], feature-based [13,14], and gradient-based or optical flow [15,16]. These methods differ in search strategy or similarity criterion.

Much of the stereo research in the last decade has focused on detecting and measuring occlusion regions in stereo imagery and recovering accurate depth estimates for these regions [17–20].

Block matching methods seek to estimate disparity at a point in one image by comparing a small region about that point with a series of small regions extracted from the other image (the search area). As stated before, the epipolar constraint reduces the search to one dimension. Three classes of metrics are commonly used for block matching: correlation [21,22], intensity differences [3,4,21] and rank metrics [23].

Optics and Lasers in Engineering 48 (2010) 125–131
Contents lists available at ScienceDirect. Journal homepage: www.elsevier.com/locate/optlaseng
0143-8166/$ - see front matter © 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.optlaseng.2009.08.004
Correspondence to: Torkamani-Azar Farah, Department of Communication, Faculty of Electrical and Computer Engineering, Shahid Beheshti University, Evin 19839, Tehran, Iran. Tel.: +982129902286; fax: +982122431804.
E-mail addresses: Moh.eslami@mail.sbu.ac.ir (M. Eslami), f-torkamani@sbu.ac.ir, ftorkamaniazar@yahoo.com (F. Torkamani-Azar).
URL: http://faculties.sbu.ac.ir/~f-torkamani (F. Torkamani-Azar).

Two simple and widely employed similarity measures are the sum of absolute differences (SAD) and the sum of squared differences (SSD), which are used in many real-time applications [3,4,21]. For a pair of pixels $A$ and $B$ in blocks of the left and right images, these criteria are:
$$ D_{SAD} = \sum_{n \in N} \left| L(A+n) - R(B+n) \right| \qquad (3) $$

$$ D_{SSD} = \sum_{n \in N} \left| L(A+n) - R(B+n) \right|^2 \qquad (4) $$
where $L(A)$ and $R(A)$ are the intensities of pixel $A$ in the left and right images, and $n$ varies within $N$, the neighborhood of the pixels, for instance a $3 \times 3$ window.

Another similarity criterion is the normalized cross correlation (NCC), which is more appropriate when the stereo cameras have photometric differences [3,21,22]. Let $\sigma_l^2$ and $\sigma_r^2$ be the intensity variances of the considered blocks in the $L$ and $R$ images, around pixels $A$ and $B$, and let $\sigma_{lr}^2$ be their cross covariance. Then the NCC criterion is defined as:
$$ D_{NCC} = \frac{\sigma_{lr}^2}{\sqrt{\sigma_l^2 \, \sigma_r^2}} \qquad (5) $$
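The three conventional criteria of Eqs. (3)–(5) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function names `sad`, `ssd` and `ncc` are hypothetical, the blocks are assumed to be equally sized 2D arrays, and returning 0 for a perfectly flat block (where Eq. (5) is undefined) is an assumption of this example, not something stated in the paper.

```python
import numpy as np

def sad(left_block, right_block):
    """Sum of absolute differences between two equally sized blocks, Eq. (3)."""
    l = np.asarray(left_block, dtype=float)
    r = np.asarray(right_block, dtype=float)
    return np.abs(l - r).sum()

def ssd(left_block, right_block):
    """Sum of squared differences between two equally sized blocks, Eq. (4)."""
    l = np.asarray(left_block, dtype=float)
    r = np.asarray(right_block, dtype=float)
    return ((l - r) ** 2).sum()

def ncc(left_block, right_block):
    """Cross covariance divided by the root of the product of variances, Eq. (5)."""
    l = np.asarray(left_block, dtype=float)
    r = np.asarray(right_block, dtype=float)
    lc = l - l.mean()
    rc = r - r.mean()
    denom = np.sqrt((lc ** 2).mean() * (rc ** 2).mean())
    if denom == 0.0:
        # Flat block: the ratio is undefined; 0 is returned here by assumption.
        return 0.0
    return (lc * rc).mean() / denom
```

For example, a block and a copy of it with a gain and offset applied give an NCC of 1 while SAD and SSD are non-zero, which matches the remark that NCC is better suited to photometric differences between the cameras.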
In this paper we use the finite ridgelet transform (FRIT) to apply different weights to smooth and high-contrast areas. In addition, a partial search method is used to compute the similarity between blocks of the right and left images. Section 2 gives an introduction to the FRIT, Section 3 presents the theory of the partial finite ridgelet search, and Section 4 reports its experimental results.
2. Finite ridgelet transform (FRIT)
The finite ridgelet transform (FRIT) was proposed by Do and Vetterli as an orthonormal version of the ridgelet transform for discrete images [24]. It is employed in various applications such as noise reduction, watermarking and compression [25–27]. The finite ridgelet transform is based on the finite Radon transform (FRAT) introduced by Bolker [28], and uses a novel ordering of the FRAT coefficients to suppress the periodic effect associated with finite transforms. Taking a one-dimensional wavelet transform of the FRAT coefficients in every direction, in a special way, results in the finite ridgelet transform, which is invertible, non-redundant, and can be computed via fast algorithms (i.e. the orthonormal finite ridgelet transform).

The finite Radon transform (FRAT) is defined as the summation of image pixel intensities over a certain set of lines (not in the directions of angles 0, 1, 2, …, 180 that are used in the Radon and ridgelet transforms). Note that the image $f$ must have size $p \times p$, where $p$ is a prime number. In the optimal ordering form of the FRAT, lines are defined as in Eq. (6):
$$ \mathrm{Line}(a, b, s):\; ax + by - s = 0, \quad \forall\, x, y \in \{0, 1, \ldots, p-1\} \qquad (6) $$
The FRAT is then defined by Eq. (7):

$$ r_{a,b}(s) = \mathrm{FRAT}_f(a, b, s) = \frac{1}{\sqrt{p}} \sum_{(x,y) \in \mathrm{Line}(a,b,s)} f(x, y) \qquad (7) $$
where $f(x, y)$ is the intensity of pixel $(x, y)$. Fig. 1(a) shows the direction lines used to compute the FRAT for a $7 \times 7$ block.

FRIT coefficients are computed by applying a discrete wavelet transform to the FRAT coefficients in each direction. As a consequence, the finite ridgelet transform provides $p$ coefficients for every individual direction ($p+1$ directions in total) [24]. Finally, for a $p \times p$ block, the FRIT leads to a $p \times (p+1)$ matrix. Fig. 1(b) shows the FRIT computation process.
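As a rough illustration of the first stage of this process, the sketch below computes FRAT coefficients for a $p \times p$ block. It makes several assumptions that are not taken from the text: it uses the simple slope/intercept family of lines $y = kx + s \pmod{p}$ plus the lines $x = s$, rather than the optimal ordering of Eq. (6); line membership is evaluated modulo $p$; and the per-direction wavelet step that turns FRAT into FRIT is omitted. The names `frat` and `is_prime` are hypothetical.

```python
import numpy as np

def is_prime(n):
    """Trial-division primality test, adequate for small block sizes."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def frat(block):
    """Finite Radon transform of a p x p block (p prime).

    Returns a p x (p + 1) array: column k holds the p normalized line sums
    of direction k, matching the matrix layout described in the text.
    """
    f = np.asarray(block, dtype=float)
    if f.ndim != 2 or f.shape[0] != f.shape[1] or not is_prime(f.shape[0]):
        raise ValueError("block must be p x p with p prime")
    p = f.shape[0]
    x = np.arange(p)
    r = np.empty((p, p + 1))
    for k in range(p):            # directions with slope k
        for s in range(p):        # intercept s
            y = (k * x + s) % p   # points of the line y = k*x + s (mod p)
            r[s, k] = f[x, y].sum()
    r[:, p] = f.sum(axis=1)       # last direction: the lines x = s
    return r / np.sqrt(p)
```

Because the $p$ lines of any one direction partition the block, every column of the result sums to the same value (the block sum divided by $\sqrt{p}$), which is a quick sanity check on an implementation.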
3. The proposed method
The finite ridgelet transform, like the ridgelet transform, is capable of representing line discontinuities. In an image, line-shaped singularities produce a few large coefficients, while randomly distributed singularities are unlikely to produce significant coefficients. This property of the ridgelet transform, which is inherited from the Radon transform, can be utilized to develop an edge-sensitive similarity measure. Fig. 2 compares the FRIT coefficients associated with a horizontal line image and a randomly distributed salt-and-pepper noise image. Both images have identical mean values but differ significantly in their FRIT coefficients. While the two blocks are identical in the sense of the SSD criterion, they can be clearly discriminated using their FRIT coefficients. There is a peak in the FRIT coefficients corresponding to the edge in Fig. 2(a), but the FRIT coefficients of Fig. 2(b) are almost negligible in all directions. A Matlab toolbox for the finite ridgelet transform provides these computations [29,30].

So, we consider an edge-sensitive similarity measure between two blocks, based on their distortion in FRIT coefficients, defined as:
$$ D_{FRIT} = |m_L - m_R|^2 + \alpha \sum_{k=1}^{p+1} |R_{Lk} - R_{Rk}|^q \qquad (8) $$
where $R_{Lk}$ and $R_{Rk}$ are the $k$th columns of the FRIT coefficient matrices (the $k$th direction of the FRIT) of the left and right image blocks, each of size $p \times 1$. Also, $m_R$ and $m_L$ are the mean values of the right and left image blocks. The positive parameter $\alpha$ controls the impact of the FRIT coefficients on the total distortion, and the integer $q > 2$ magnifies the large coefficients produced at edge positions. Note that $D_{FRIT}$ is the square of the mean absolute difference (when normalized by the number of pixels) for $\alpha = 0$, and that it approaches $D_{SSD}$ as $q$ approaches 2.

Fig. 1. (a) The directions of the lines used in the FRAT for a $7 \times 7$ block; (b) the block diagram of the FRIT computation.

Computation of $D_{FRIT}$ in Eq. (8) involves all FRIT coefficients, which can be efficiently abridged by a partial distortion search. This search strategy is frequently employed with other similarity criteria in the motion estimation literature, in order to attain an optimum search with the fewest possible calculations [31–33]. Since the finite ridgelet transform is an energy-preserving transform, we compute the distortion iteratively to speed up the algorithm, and propose the partial finite ridgelet distortion search (PFRDS). This provides a faster and still optimum search for the two corresponding blocks in a stereo image pair.
$$ D_{FRIT}^{(k)} = D_{FRIT}^{(k-1)} + \alpha\, |R_{Lk} - R_{Rk}|^q, \quad 0 < k \le p+1; \qquad D_{FRIT}^{(0)} = |m_L - m_R|^2 \qquad (9) $$
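Unrolling the recursion in Eq. (9) may help to show why its partial sums can be used for early rejection. The short derivation below relies only on $\alpha > 0$, as stated for Eq. (8), and on the index $k$ running up to $p+1$ as in the sum of Eq. (8):

```latex
\begin{align*}
D_{FRIT}^{(k)} &= |m_L - m_R|^2 + \alpha \sum_{j=1}^{k} |R_{Lj} - R_{Rj}|^q, \\
D_{FRIT}^{(p+1)} &= D_{FRIT} \quad \text{(the full distortion of Eq. (8))}, \\
D_{FRIT}^{(k)} - D_{FRIT}^{(k-1)} &= \alpha\, |R_{Lk} - R_{Rk}|^q \;\ge\; 0 .
\end{align*}
```

The partial distortions are therefore non-decreasing in $k$, so a partial value that already reaches a given bound guarantees that the full distortion reaches it as well.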
Each computation step of $D_{FRIT}^{(k)}$ in Eq. (9) accumulates the distortion of the FRIT coefficients up to the $k$th direction (the first $k$ columns of the FRIT matrix). The exhaustive search strategy of Eq. (8) computes all iterations for every candidate block along the epipolar line. Alternatively, the partial distortion search eliminates unnecessary computations by checking the distortion $D_{FRIT}^{(k)}$ and terminating the iterations of Eq. (9) as soon as the candidate loses the competition with the last winner. Suppose that $D_{min}$ denotes the least distortion, associated with the winner among all previous candidates. For a new candidate, the iterations of Eq. (9) need only continue while $D_{FRIT}^{(k)} < D_{min}$. As soon as $D_{FRIT}^{(k)}$ is no longer smaller than $D_{min}$, the candidate under test is definitely a loser. Consequently, there is no need for further computation. As the PFRDS does not reject any candidate without inspecting it, the final winner has the greatest similarity to the reference block, found with the minimum required computation.

Loser rejection can be more effective and faster when the directions with larger distortion are considered first, that is, when the larger FRIT coefficients are considered in the first iterations. So, the proposed sorting scheme is based on the maximum FRIT coefficients in the different directions. These large coefficients correspond to the edges of the block and, as a consequence of the proposed sorting, the directions associated with the edges present in the block are considered first.

Fig. 2. Two sample images and their FRIT coefficients.

Note that, for the first candidate, the distortion $D_{FRIT}$ must be computed completely, which provides the initial $D_{min}$. Note also that the ordinary order of $k$ ($k = 1, 2, \ldots, p+1$) is changed in the proposed search strategy according to the ridgelet coefficients in the different directions. Fig. 3 depicts the flowchart of the proposed partial finite ridgelet distortion search. In summary, the partial finite ridgelet distortion search for disparity estimation proceeds as follows:

1. Compute the finite ridgelet coefficients of the reference block ($m_L$ and $R_L$ if the reference block is in the left image).
2. Sort the directions ($k$) with respect to the maximum sum of the FRIT coefficients in $R_{Lk}$, in every direction.
3. Choose a new candidate on the corresponding epipolar line in the right image, compute the mean of the block, $m_R$, and $R_R$, and compute $D_{FRIT}^{(0)}$ for the new candidate.
4. If $D_{FRIT}^{(0)} > D_{min}$, reject the candidate and go to step 3 with a new candidate; else go to step 5.
5. Order the columns of $R_R$ according to the sorting order of $R_L$ from step 2.
6. Continue the computation of $D_{FRIT}^{(k)}$ from $k = 1$. For each value of $k$:
   a. If $D_{FRIT}^{(k)} < D_{min}$ and $k < p+1$, increase $k$ by 1 and repeat this step.
   b. If $D_{FRIT}^{(k)} \ge D_{min}$, even though $k < p+1$, reject the candidate and go to step 3 with a new candidate.
   c. If the total $D_{FRIT} < D_{min}$, update $D_{min}$ with $D_{FRIT}$ as the best similarity to the reference block so far, and go to step 3 with a new candidate.

Note that a losing candidate may be recognized either in step 4, after computing only $D_{FRIT}^{(0)}$, or in step 6b, before $k$ reaches $p+1$. Rejected blocks are therefore recognized quickly, which is the loser rejection policy of this algorithm mentioned before.
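The search procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the block means and the $p \times (p+1)$ coefficient matrices are assumed to be precomputed; $|R_{Lk} - R_{Rk}|^q$ is interpreted as the sum of the $q$th powers of the absolute element-wise differences; directions are sorted by their largest absolute reference coefficient; and the names `d_frit` and `pfrds_search` are hypothetical.

```python
import numpy as np

def d_frit(m_l, R_l, m_r, R_r, alpha=100.0, q=3):
    """Full distortion in the spirit of Eq. (8), over all directions at once."""
    diff = np.abs(np.asarray(R_l, dtype=float) - np.asarray(R_r, dtype=float))
    return (m_l - m_r) ** 2 + alpha * np.sum(diff ** q)

def pfrds_search(m_ref, R_ref, candidates, alpha=100.0, q=3):
    """Partial distortion search over candidates given as (mean, R) pairs.

    Returns (index of the winner, its distortion, number of column
    distortions actually evaluated).
    """
    R_ref = np.asarray(R_ref, dtype=float)
    # Step 2 (assumed sorting key): strongest reference directions first.
    order = np.argsort(-np.abs(R_ref).max(axis=0))
    best, d_min, work = -1, np.inf, 0
    for idx, (m_c, R_c) in enumerate(candidates):
        R_c = np.asarray(R_c, dtype=float)
        d = (m_ref - m_c) ** 2                  # step 3: D^(0)
        rejected = d >= d_min                   # step 4: reject on the mean term
        if not rejected:
            for k in order:                     # steps 5-6: sorted accumulation
                d += alpha * np.sum(np.abs(R_ref[:, k] - R_c[:, k]) ** q)
                work += 1
                if d >= d_min:                  # step 6b: early rejection
                    rejected = True
                    break
        if not rejected:                        # step 6c: new winner
            best, d_min = idx, d
    return best, d_min, work
```

Since the accumulated distortion never decreases, the winner found this way is the same as the one found by evaluating `d_frit` for every candidate; only the number of column evaluations (the third return value) changes.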
4. Experimental results
In this paper the proposed PFRDS was applied to disparity estimation of $17 \times 17$ blocks from the left image by searching the corresponding candidate blocks of the right image. It was applied to the Middlebury stereo database: Books, Wood1, Dolls, Wood2 and Lampshade [34,35]. Each dataset consists of at least 30 image pairs in gray scale, of size $650 \times 555$, taken with a parallel camera configuration. Fig. 4 shows the left and right images of Books and Wood2.

In an exhaustive, complete (not partial) search implementation, the computation of $D_{FRIT}$ takes more time than the SSD and NCC similarity measures, as it includes a FRIT transformation. Typically, an image of size $650 \times 555$ was partitioned into $17 \times 17$ blocks with 50% overlap. The average time required to find the corresponding block of each reference block with the different algorithms is shown in Table 1.

These reported times are for searching the whole epipolar line; they can be effectively decreased by limiting the number of candidates and by using more sophisticated search strategies such as hierarchical methods.

In order to compare the proposed PFRDS directly with the other similarity criteria (SSD and NCC), all of them are employed in a simple search strategy over the whole epipolar line. However, hierarchical methods to constrain the search area [36], dynamic programming for minimization [6], a consistency check for rejecting outliers [37] and object recognition and prediction for dealing with occlusion [17] would improve performance.

In order to have an edge-sensitive criterion, a larger value of
$q$ is required to magnify the larger FRIT coefficients (corresponding to edges). On the other hand, very large values of $q$ are inappropriate, as they may negate the difference of mean values ($D_0$). So in this study $q = 3$ is set for all the experiments. While the parameter $q$ determines the effect of individual distortions in the FRIT coefficients ($R_k$), the other parameter ($\alpha$) determines the weight of the total distortion of $R_k$ over all directions.

The quantitative measures employed for inspecting the accuracy of disparity estimation are the ratio of the matching region ($r_m$) and the ratio of the comparison region ($r_c$) to the total number of pixels [38]. The matching region refers to the non-occluded areas determined by the stereo matching algorithm. The comparison region refers to that portion of the non-occluded areas for which the estimated disparity equals the available ground truth disparity.

Fig. 3. The flowchart of the proposed method.

Tables 2 and 3 report the $r_m$ and $r_c$ obtained from NCC, SSD and PFRDS with various values of $\alpha$, applied to the Books and Wood2 datasets, respectively. Moreover, the ground truth disparity maps and the resulting disparity maps from SSD, NCC and PFRDS are shown in Figs. 5 and 6.

It can be inferred from Fig. 5 and Table 2 that the performance
($r_c$) of the PFRDS method was 7% better than NCC and 11% better than SSD for the Books dataset. For the Wood2 dataset (Fig. 6 and Table 3), PFRDS yielded a much better result than SSD (above 40% increase in $r_c$) and a somewhat better result than NCC. The difference between the results of PFRDS and NCC was less than 3% in $r_c$ and almost 10% in $r_m$ for this dataset. Inspecting Figs. 5 and 6, PFRDS clearly outperformed NCC and SSD at the edges of the images, but its results became close to those of NCC in smooth regions. Hence, the degree of improvement depended on the amount of edges in the image, as it decreases from the Books to the Wood2 dataset (refer to Tables 2 and 3). The high-frequency components of these images differed completely: PFRDS surpassed the two other criteria in the case of Books, which contains a large amount of discontinuities and edges, but the results were almost the same in the case of Wood2, which has fewer discontinuities.

Tables 2 and 3 also confirmed that increasing the parameter $\alpha$ improved the performance of PFRDS (with respect to $r_c$), which indicates the significance of edges in similarity measurement. Fig. 7 depicts the attained $r_c$ versus $\alpha$ for the Books results. A typical value for this parameter, used in the experiments of this study, was $\alpha = 100$. As stated before, although increasing $\alpha$ enlarges the edge effects, this does not mean that better performance is always obtained. Consider pictures with periodic edges, or distorted ones whose gray scales are still different (e.g. a case in which many edges are occlusions); in these cases $\alpha$ should not be increased excessively. In other words, $\alpha$ is a trade-off parameter between gray scale and edge significance and should be set appropriately for the pictures.

Some more results for other image pairs are shown in Table 4.

Another critical selection is the block size $p$. Generally, the accuracy of block-based search for disparity estimation increases by reducing the size of the blocks and increasing the resolution of disparity estimation. But the performance of PFRDS as an edge-sensitive similarity measure depends on the presence of edges in the block, which is less probable with smaller blocks. On the other hand, the FRIT transform can only represent straight line
Fig. 4.
Left and right images of stereo pair of Books and Wood2.
Table 3
$r_m$ and $r_c$ factors from NCC, SSD and PFRDS with various values of $\alpha$, for the Wood2 image pair.

Method                   r_m (%)   r_c (%)
NCC                      87.92     80.9
SSD                      76.2      40.21
PFRDS, $\alpha = 3$      91        57.4
PFRDS, $\alpha = 50$     94.9      72.84
PFRDS, $\alpha = 80$     95.35     82.1
PFRDS, $\alpha = 100$    97.2      83.35
Table 2
$r_m$ and $r_c$ factors from NCC, SSD and PFRDS with various values of $\alpha$, for the Books image pair.

Method                   r_m (%)   r_c (%)
NCC                      81.6      76.7
SSD                      90.5      67.51
PFRDS, $\alpha = 3$      92.44     74.52
PFRDS, $\alpha = 10$     93.7      77.39
PFRDS, $\alpha = 20$     94.37     80.58
PFRDS, $\alpha = 30$     94.74     81.97
PFRDS, $\alpha = 40$     94.95     82.66
PFRDS, $\alpha = 60$     94.96     82.98
PFRDS, $\alpha = 80$     95.18     83.65
PFRDS, $\alpha = 100$    95.2      84.28
PFRDS, $\alpha = 120$    95.6      84.45
PFRDS, $\alpha = 140$    95.62     84.61
Fig. 5. Disparity map results of the Books pair: (a) using SSD, (b) using NCC, (c) using PFRDS with $\alpha = 100$, (d) ground truth image.
Table 1
The required time in different algorithms.

                     SSD     NCC     Proposed method with whole computation     PFRDS
Required time (s)    0.14    0.47    3.43                                       1.05