NON-PARAMETRIC NATURAL IMAGE MATTING
Muhammad Sarim, Adrian Hilton, Jean-Yves Guillemaut, Hansung Kim
Centre of Vision, Speech and Signal Processing, University of Surrey, Guildford, GU2 7XH, Surrey, United Kingdom.
{m.farooqui, a.hilton, j.guillemaut, h.kim}@surrey.ac.uk
ABSTRACT
Natural image matting is an extremely challenging image processing problem due to its ill-posed nature. It often requires skilled user interaction to aid definition of foreground and background regions. Current algorithms use these predefined regions to build local foreground and background colour models. In this paper we propose a novel approach which uses non-parametric statistics to model image appearance variations. This technique overcomes the limitations of previous parametric approaches which are purely colour-based and thereby unable to model natural image structure. The proposed technique consists of three successive stages: (i) background colour estimation, (ii) foreground colour estimation, (iii) alpha estimation. Colour estimation uses patch-based matching techniques to efficiently recover the optimum colour by comparison against patches from the known regions. Quantitative evaluation against ground truth demonstrates that the technique produces better results and successfully recovers fine details such as hair where many other algorithms fail.
Index Terms: Alpha matte, composite, trimap, non-parametric statistics
1. INTRODUCTION
Image matting is widely used in video editing to composite foreground objects onto new backgrounds. An image $I$ is considered to be a composite of a foreground image $F$ and a background image $B$. The observed colour of the $i$th pixel in image $I$ can be modeled by the compositing equation

$$I_i = \alpha_i F_i + (1 - \alpha_i) B_i, \qquad (1)$$

where
$F_i$ and $B_i$ are the pure foreground and background colours, while alpha ($\alpha_i \in [0, 1]$) provides their blending proportion to form the composite colour $I_i$. Alpha values range from 0 to 1, where $\alpha = 0$ for background and $\alpha = 1$ for foreground. Mixed pixels at the foreground boundary have intermediate alpha values. The compositing equation is under-constrained as $F_i$, $B_i$ and $\alpha_i$ are unknown. In a three-channel colour space we have three equations to solve for seven unknowns: three for $F_i$, three for $B_i$ and one for $\alpha_i$. The compositing equation can be constrained in a studio environment by using a uniform background, typically green or blue [1]. The assumption that the background colour does not appear in the foreground leads to a trivial solution to the compositing equation 1. Natural images have an arbitrary background and no limitation on the background colour appearing in the foreground. In such images, background and foreground can be constrained by user interaction, normally in the form of a trimap. A trimap is typically a hand-drawn partition of an image into three regions, namely definite foreground, background and unknown. Trimap-based techniques use the local information in the known foreground and background regions to build foreground and background colour models to estimate alpha values for every unknown pixel. A common example is the use of Gaussian mixture models [2, 3, 4].

Previous techniques like [2, 3, 4, 5, 6, 7, 8] use a trimap to solve equation 1 for every pixel in the unknown region by exploiting the local information in the known foreground and background regions. In Corel Knockout [5],
$F$ and $B$ are assumed to be locally smooth and $\alpha$ is estimated by taking the weighted average of local known foreground and background pixels. In [2, 3, 4, 6], local foreground and background pixels are used to build colour distributions. These distributions are then used to estimate the foreground colour, background colour and alpha for every unknown pixel. These techniques tend to suffer when the distributions overlap or when the unknown region is wide. In [7], the alpha matte is estimated by solving a Poisson equation with the matte gradient field, which is obtained by taking the gradient of the compositing equation. If $F$ and $B$ are not smooth in the unknown region, errors may occur in the alpha matte. In such cases local changes to the matte gradient field are required to obtain a satisfactory matte. Techniques like [8] use sparse samples of known foreground and background pixels for every unknown pixel. Only the higher confidence sample pairs, which minimize the matting energy function, are used to estimate $\alpha$, giving robustness against outliers. A propagation-based approach like [9] fits a linear model to the foreground and background colours in a local window, thus defining a quadratic cost function in alpha. Alpha is then estimated by globally minimizing this cost function.

In this paper, we present a novel approach for estimating an alpha matte using non-parametric statistics. Previously, non-parametric statistics have been used to locally represent image statistics for inpainting [10] and view interpolation [11]. They provide a mechanism to represent local image features, colours and textures which attempts to preserve the spatial information of natural images. Given a trimap, we estimate foreground and background colours for every pixel in the unknown region using a patch-based similarity criterion. Initially the background is estimated using an inpainting technique. Foreground colour is estimated for all the pixels in the unknown region which are dissimilar to the constructed background. Foreground colour for each pixel is estimated by finding the most similar patch in the known foreground. Finally an alpha matte is generated using the computed foreground and background colours.

978-1-4244-5654-3/09/$26.00 ©2009 IEEE, ICIP 2009
2. NON-PARAMETRIC IMAGE MATTING
Our technique is split into three main steps: (1) building a background for the unknown region, (2) estimating a foreground colour for every pixel in the unknown region which is different from the constructed background by a predefined distance threshold and (3) generating an alpha matte.
2.1. Background colour estimation
Image infilling similar to [10] is used to construct a background in the unknown region from the known background. Fig 1 shows the complete background estimation process. Initially an image is split into three regions: background $\Phi$ (black), unknown region $\Psi$ (gray) and foreground $\Theta$ (white), as shown in Fig 1a. The contour of $\Psi$ where the background and the unknown region meet is found and the background is evolved inwards. We consider a template $\psi_p$ centred at pixel $p$ on the contour of $\Psi$. The pixels in $\psi_p$ can be split into two sets, $p_\Phi$ of background pixels and $p_\Psi$ of unknown pixels. Let us denote by $\phi$ the set of all possible overlapping patches contained in the background $\Phi$ with the same dimensions as $\psi_p$. The template $\psi_p$ is compared to all the patches in $\phi$. The pixels $p_\Phi$ are used to find the most similar patch $\phi_q$ in the set $\phi$ by

$$\phi_q = \arg\min_{\phi_i \in \phi} \frac{1}{n_{p_\Phi}} d_\Phi(\psi_p, \phi_i), \qquad (2)$$

where the distance $d_\Phi(\psi_p, \phi_i)$ between patches $\psi_p$ and $\phi_i$ is the sum of squared differences (SSD) in the RGB colour space for the pixels $p_\Phi$ in $\psi_p$ and the corresponding pixels in $\phi_i$. The SSD is normalized by the number of known neighboring pixels $n_{p_\Phi}$ in the template $\psi_p$ to ensure the costs are comparable. The pixels in $\phi_q$ corresponding to pixels $p_\Psi$ are copied to fill in the unknown region. The process is iterated until the unknown region is completely filled in, as shown in Fig 1e.
Fig. 1: Background estimation: (a) template comparison, (b) original image, (c) trimap, (d) unfilled region and (e) filled in unknown region
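The template comparison of equation 2 can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' Matlab implementation: the function names `best_patch` and `fill_template`, the array layouts and the exhaustive search over a pre-extracted patch stack are all assumptions made for the example.

```python
import numpy as np

def best_patch(template, known_mask, candidates):
    """Pick the candidate patch closest to the template (sketch of equation 2).

    template:   (h, w, 3) float array, the template psi_p
    known_mask: (h, w) bool array, True at the known background pixels p_Phi
    candidates: (m, h, w, 3) float array, the background patch set phi
    Returns the index of the best patch and the normalized cost of every patch.
    """
    n_known = int(known_mask.sum())
    if n_known == 0:
        raise ValueError("template has no known pixels to compare")
    diff = candidates - template[None]               # (m, h, w, 3)
    sq = (diff ** 2).sum(axis=-1)                    # squared RGB difference per pixel
    ssd = (sq * known_mask[None]).sum(axis=(1, 2))   # SSD over the known pixels only
    cost = ssd / n_known                             # normalization by n_{p_Phi}
    return int(np.argmin(cost)), cost

def fill_template(template, known_mask, patch):
    """Copy the patch colours into the unknown positions p_Psi of the template."""
    filled = template.copy()
    filled[~known_mask] = patch[~known_mask]
    return filled
```

Only the known pixels contribute to the cost, so two templates with different numbers of known pixels remain comparable after the division by `n_known`.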
2.2. Foreground colour estimation
Once the background is estimated, a predefined threshold is applied to mark all the pixels in the unknown region which are different from the background. A process similar to equation 2, with some modification, is applied to these marked pixels to estimate their foreground colour. A template $\psi_p$ is centred on a pixel $p$ in the unknown region. Let us represent all the marked pixels in the template by $p'$. The data-set of patches $\theta$ in the foreground $\Theta$ is constructed in a similar fashion to $\phi$, the background patch data-set in section 2.1. The most similar patch $\theta_q$ is found by comparing the pixels $p'$ in the template $\psi_p$ to the corresponding pixels of the patch $\theta_i$ in the data-set $\theta$ as

$$\theta_q = \arg\min_{\theta_i \in \theta} \frac{1}{n_{p'}} d_\Theta(\psi_p, \theta_i), \qquad (3)$$

where $n_{p'}$ is the number of marked pixels in the template $\psi_p$ used for normalization and the distance $d_\Theta(\psi_p, \theta_i)$ has the same definition as in section 2.1. The partial comparison of the template $\psi_p$ to the patch $\theta_i$ ensures that a foreground structure similar to the one present in the template $\psi_p$ is found in the known foreground region. Noise in the foreground region tends to produce segmentation inaccuracies, so an additional optimization step is introduced.
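The marking step at the start of this section can be illustrated as follows. The paper does not state which distance is thresholded, so this sketch assumes the Euclidean distance in RGB between the observed colour and the estimated background; the function name `mark_foreground_candidates` and the threshold value are hypothetical.

```python
import numpy as np

def mark_foreground_candidates(image, background, unknown_mask, threshold):
    """Mark unknown pixels whose colour differs from the estimated background.

    image, background: (H, W, 3) arrays (observed image and filled-in background)
    unknown_mask:      (H, W) bool array, True inside the unknown region
    threshold:         colour distance above which a pixel is marked
                       (assumed here to be a Euclidean RGB distance)
    """
    diff = image.astype(float) - background.astype(float)
    dist = np.linalg.norm(diff, axis=-1)   # per-pixel RGB distance
    return unknown_mask & (dist > threshold)
```

Pixels left unmarked are the ones treated as pure background in the later alpha estimation step.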
2.2.1. Foreground colour optimization
The normalized sum of squared differences between $\psi_p$ and $\theta_i$ is given by

$$\delta_i = \frac{1}{n_{p'}} d_\Theta(\psi_p, \theta_i). \qquad (4)$$

To optimize the foreground colour for pixel $p$, $\delta$ is sorted such that $\delta_k < \delta_{k+1}$. Let us denote the triplet of most similar patches in the set $\theta$ by $\{\theta_1, \theta_2, \theta_3\}$. The foreground colour $f$ for pixel $p$ is estimated by taking a weighted average of the centre pixels of the triplet,

$$f = \frac{w_1 \theta_{1c} + w_2 \theta_{2c} + w_3 \theta_{3c}}{w_1 + w_2 + w_3}, \qquad (5)$$
where $\{w_1, w_2, w_3\}$ are weights and $\{\theta_{1c}, \theta_{2c}, \theta_{3c}\}$ are the centre pixel values of the patches in the triplet. Weights $w_i$ are defined as the inverse Euclidean distance in the spatial domain between the pixel $p$ and the centre of patch $\theta_i$. In this manner closer patches receive higher weights. The process is iterated until the foreground colour is estimated for all the marked pixels in the unknown region.

Fig. 2: Comparison of different techniques on natural images: (a) Original, (b) Trimap, (c) Knockout, (d) Hillman, (e) Poisson, (f) Closed form, (g) Robust, (h) Non-parametric, (i) Composite
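The weighted average of equation 5 can be sketched as below, assuming the normalized costs of equation 4 have already been computed for every foreground patch. The function name `foreground_colour`, its argument layout and the small guard against a zero distance are assumptions of this example rather than details taken from the paper.

```python
import numpy as np

def foreground_colour(p_xy, patch_centres_xy, patch_centre_colours, costs, k=3):
    """Weighted average of the centre pixels of the k most similar patches.

    p_xy:                 (2,) image position of the pixel p
    patch_centres_xy:     (m, 2) image positions of the patch centres
    patch_centre_colours: (m, 3) RGB colours of the patch centre pixels
    costs:                (m,) normalized SSD delta_i of each patch (equation 4)
    """
    order = np.argsort(costs)[:k]                       # k smallest delta values
    centres = np.asarray(patch_centres_xy, dtype=float)[order]
    colours = np.asarray(patch_centre_colours, dtype=float)[order]
    dist = np.linalg.norm(centres - np.asarray(p_xy, dtype=float), axis=1)
    weights = 1.0 / np.maximum(dist, 1e-12)             # inverse spatial distance
    return (weights[:, None] * colours).sum(axis=0) / weights.sum()
```

With `k=3` this reproduces the triplet used in the text; spatially closer patches dominate the estimate because their weights are larger.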
2.3. Alpha estimation
All unmarked pixels in the unknown region, which are very similar to the background, are given an alpha value of zero. The estimated background and foreground colours $b$ and $f$ are known from equations 2 and 5 respectively. The alpha value of the $i$th marked pixel in the unknown region is computed by rearranging equation 1 as

$$\alpha_i = \frac{c_i - b_i}{f_i - b_i}, \qquad (6)$$

where $c_i$, $b_i$ and $f_i$ are the composite, estimated background and estimated foreground colours respectively for the $i$th marked pixel in the unknown region. Once an alpha matte is computed, the foreground object can be seamlessly composited onto a new background.
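Equation 6 and the final compositing step can be sketched as follows. The paper does not say how the ratio is evaluated for three-channel colours, so this example assumes a least-squares projection of $c_i - b_i$ onto $f_i - b_i$, which reduces to equation 6 for scalar colours. Clipping to $[0, 1]$ and the function names `estimate_alpha` and `composite` are likewise assumptions.

```python
import numpy as np

def estimate_alpha(c, f, b, eps=1e-8):
    """Per-pixel alpha from composite c, foreground f and background b.

    c, f, b: (..., 3) colour arrays. Alpha is the least-squares solution of
    c - b = alpha * (f - b) over the colour channels (an assumed choice).
    """
    c, f, b = (np.asarray(x, dtype=float) for x in (c, f, b))
    fb = f - b
    num = ((c - b) * fb).sum(axis=-1)
    den = (fb * fb).sum(axis=-1)
    alpha = num / np.maximum(den, eps)   # eps avoids division by zero when f == b
    return np.clip(alpha, 0.0, 1.0)

def composite(alpha, f, new_b):
    """Blend the foreground over a new background using equation 1."""
    a = np.asarray(alpha, dtype=float)[..., None]
    return a * np.asarray(f, dtype=float) + (1.0 - a) * np.asarray(new_b, dtype=float)
```

When the estimated foreground and background colours coincide, the pixel carries no information about alpha; the sketch returns zero in that case, which is a convention of the example only.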
3. EVALUATION
We present a comparison of the proposed technique with other well known matting algorithms. We have used two natural images for qualitative comparison and three composite images with known $\alpha$ values for quantitative comparison. The images used were obtained from the data provided by [6, 8]. We have used five different techniques for comparison: (1) Knockout 2 [5], (2) Robust matting (EZmask) [8] (both are commercially available as Photoshop plug-ins), (3) the Hillman method [3], (4) Global Poisson matting [7] and (5) Closed form matting [9]. Although the Poisson and Closed form techniques can be used with limited user interaction in the form of scribbles rather than a trimap, for the sake of fair comparison we have used a trimap.
3.1. Qualitative evaluation
Fig 2 shows two natural images and the alpha mattes computed using the different techniques. For the first image all the techniques produced acceptable results. All the parametric techniques fail to produce a good alpha matte for the second image in the areas where the foreground and background colour distributions overlap or the unknown region is not smooth. The Robust approach provided a better result but has some artifacts. Our technique produced results which are visibly smooth and have no visible artifacts in the new composites.
3.2. Quantitative evaluation
Fig 3 shows three composite images, the ground truth alpha mattes and the alpha mattes estimated using the different techniques. The ground truths are obtained using the triangulation approach explained in [1]. For the first two images, Knockout and Hillman produce good results because of the distinct foreground and background colours, but they fail on the third image because of the complex background. Poisson produced an erroneous matte because it is optimized globally for a complex background. The Robust approach produced mattes with small errors. The Closed form technique produced good results for a simple background but performed poorly with a complex background. All these mattes have visible artifacts compared to the ground truth. Our technique produced consistently better mattes for both simple and complex backgrounds with no visible artifacts.

Fig 4 shows a bar chart representing the mean square error for the three composite images in Fig 3 against the ground truth. The errors are calculated only for the unknown region and alpha values range from 0 to 255. Although MSE is not always correlated with the visual matte quality, it still gives a reasonable error comparison. The run time of our Matlab implementation of the algorithm for the considered images is typically around three minutes and depends on the image size and the size of the unknown region. The run time could be further reduced by optimizing the algorithm.
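The error measure described above, a mean square error restricted to the unknown region with alpha on a 0 to 255 scale, can be sketched as follows. The function name `matte_mse` and the array conventions are assumptions of this example.

```python
import numpy as np

def matte_mse(alpha_est, alpha_gt, unknown_mask):
    """Mean square error between two mattes, over the unknown region only.

    alpha_est, alpha_gt: (H, W) arrays with alpha values in the range 0..255
    unknown_mask:        (H, W) bool array, True inside the unknown region
    """
    est = np.asarray(alpha_est, dtype=float)[unknown_mask]
    gt = np.asarray(alpha_gt, dtype=float)[unknown_mask]
    if est.size == 0:
        raise ValueError("unknown region is empty")
    return float(np.mean((est - gt) ** 2))
```

Restricting the average to the unknown region keeps the known foreground and background pixels, where every trimap-based method is trivially correct, from diluting the error.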
Fig. 3: Comparison of different techniques on composite images: (a) Original, (b) Trimap, (c) Knockout, (d) Hillman, (e) Poisson, (f) Closed form, (g) Robust, (h) Non-parametric, (i) Ground truth

Fig. 4: MSE in alpha against the ground truth for the unknown region
4. CONCLUSION
A novel patch-based non-parametric natural image matting approach is presented. We have utilized an inpainting technique along with patch-based foreground colour estimation. A detailed evaluation shows that our technique has a clear advantage over previous parametric techniques. The algorithm is robust to both complex backgrounds and long foreground strands. Future work will concentrate on developing a more robust matching criterion and incorporating smoothness constraints to further optimize the alpha matte.
5. REFERENCES
[1] A. R. Smith and J. F. Blinn, “Blue screen matting,” in ACM SIGGRAPH ’96: Proceedings of the 23rd annual conference on Computer graphics and interactive techniques, 1996, pp. 259–268.
[2] Y. Y. Chuang, B. Curless, D. H. Salesin, and R. Szeliski, “A bayesian approach to digital matting,” in Proceedings of IEEE CVPR ’01, vol. 2, December 2001, pp. 264–271.
[3] P. Hillman, J. Hannah, and D. Renshaw, “Alpha channel estimation in high resolution images and image sequences,” in IEEE CVPR, 2001, pp. 1063–1068.
[4] M. A. Ruzon and C. Tomasi, “Alpha estimation in natural images,” in CVPR, June 2000, pp. 18–25.
[5] A. Berman, A. Dadourian, and P. Vlahos, “Method of removing from an image the background surrounding a selected object.” U.S. Patent 6,134,346, 2000.
[6] Y. Y. Chuang, A. Agarwala, B. Curless, D. Salesin, and R. Szeliski, “Video matting of complex scenes,” in Proceedings of ACM SIGGRAPH, 2002, pp. 243–248.
[7] J. Sun, J. Jia, C.-K. Tang, and H.-Y. Shum, “Poisson matting,” ACM Transactions on Graphics, vol. 23, no. 3, pp. 315–321, 2004.
[8] J. Wang and M. F. Cohen, “Optimized color sampling for robust matting,” Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, vol. 0, pp. 1–8, 2007.
[9] A. Levin, D. Lischinski, and Y. Weiss, “A closed form solution to natural image matting,” Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, vol. 1, pp. 61–68, 2006.
[10] A. Criminisi, P. Pérez, and K. Toyama, “Object removal by exemplar-based inpainting,” Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, vol. 2, pp. 721–728, 2003.
[11] A. Fitzgibbon, Y. Wexler, and A. Zisserman, “Image based rendering using image based priors,” in International conference on computer vision ICCV, 2003, pp. 1176–1184.