A Multi-size Superpixel Approach for Salient Object Detection based on Multivariate Normal Distribution Estimation

Lei Zhu, Dominik A. Klein, Simone Frintrop, Zhiguo Cao, and Armin B. Cremers

Lei Zhu and Zhiguo Cao are with the National Key Lab of Science and Technology on Multi-spectral Information Processing, School of Automation, Huazhong University of Science and Technology, 430074 Wuhan, China (e-mail: zhulei.iprai@gmail.com, zgcao@mail.hust.edu.cn). Zhiguo Cao is the corresponding author. Dominik A. Klein, Simone Frintrop, and Armin B. Cremers are with the Institute of Computer Science III, University of Bonn, 53117 Bonn, Germany (e-mail: {kleind, frintrop, abc}@iai.uni-bonn.de).

Copyright (c) 2013 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org.

Abstract—This article presents a new method for salient object detection based on a sophisticated appearance comparison of multi-size superpixels. Those superpixels are modeled by multivariate normal distributions in CIE-Lab color space, which are estimated from the pixels they comprise. This fitting facilitates an efficient application of the Wasserstein distance on the Euclidean norm ($W_2$) to measure perceptual similarity between elements. Saliency is computed in two ways: on the one hand, we compute global saliency by probabilistically grouping visually similar superpixels into clusters and rating their compactness. On the other hand, we use the same distance measure to determine local center-surround contrasts between superpixels. Then, an innovative locally constrained random walk technique that considers local similarity between elements balances the saliency ratings inside probable objects and background. The results of our experiments show the robustness and efficiency of our approach against 11 recently published state-of-the-art saliency detection methods on five widely used benchmark datasets.

Index Terms—Saliency detection, multi-size superpixels, Wasserstein distance, center-surround contrasts, cluster compactness, random walk.

EDICS Category: 5. SMR-HPM, 2. SMR-SMD, 4. SMR-Rep, 33. ARS-IIU, 8. TEC-MRS

I. INTRODUCTION

Human vision is usually capable of locating the most salient parts of a scene with a selective attention mechanism [1]. From the perspective of computer vision, salient region detection is still challenging, since the human attention system is not yet fully understood. However, an important attribute that makes a region salient is that it stands out from its surroundings in one or more visual features. In recent years, saliency detection has become a major research area, and many computational attention systems based on this center-surround concept have been built during the last decade [2]. Applications of saliency detection include object detection [3], [4], image retrieval [5], [6], image and video compression [7], [8], as well as image segmentation [9], [10].

The classical approaches to saliency computation stem from the simulation of human attention mechanisms. These approaches compute the saliency of a pixel as the difference between a center and a surround region, both of which are centered at the pixel and can be rectangular or circular [11]–[13]. We therefore call these methods the local saliency approaches. The selection of surrounding regions is always a difficult problem for pixel-based or region-based methods due to the ambiguity of unknown object scales.
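To make the center-surround idea concrete, the following is a minimal sketch, not taken from this paper, that scores each pixel by the difference between the mean of a small center window and the mean of a larger surround window; the window sizes are arbitrary example values.

```python
# Minimal illustration of pixel-wise center-surround contrast (not the
# paper's method): each pixel is scored by how much the mean of a small
# center window deviates from the mean of a larger surround window.
# Window sizes are arbitrary example values.
import numpy as np
from scipy.ndimage import uniform_filter

def center_surround_contrast(gray, center_size=5, surround_size=21):
    gray = gray.astype(float)
    center = uniform_filter(gray, size=center_size)      # local "center" mean
    surround = uniform_filter(gray, size=surround_size)  # local "surround" mean
    return np.abs(center - surround)                     # per-pixel contrast
```

The ambiguity mentioned above is visible here: both window sizes must be fixed in advance, although the scale of the salient object is unknown.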
A reasonable solution is a multi-scale scheme that computes the center-surround response at several different scales [11], [12], [14], [15]. Some existing approaches also explore the local contrast on a single scale. In this case, the surroundings can be chosen as the maximum symmetric surround [16] or as regions of the entire image with spatial weighting [17], [18].

Alternative approaches consider the occurrence frequency of certain features in the whole image, i.e., salient objects are more likely to belong to parts with rare observations in the frequency domain [19], [20]. We call these approaches the global saliency approaches. Zhai et al. [21] evaluate the pixel-level saliency by contrasting each pixel to all others. Achanta et al. [9] directly assign the saliency value of a pixel as its difference from the average color. By abstracting the color components, the global contrast is efficiently computed in [17] at the pixel level. Global saliency is also investigated via the visual organization rule, which can be computationally transformed into rating the color distribution [22].

Different from the methods based on local or global contrast, some researchers work on priors regarding the potential positions of foreground and background, derived mathematically or empirically. Gopalakrishnan et al. [23] represent an image as a graph and search for the most salient nodes and the background nodes using the random walk technique. By analyzing photographic images, Wei et al. [24] found that pixels located on the four boundaries of an image carry background attributes and validated this prior on two popular datasets. Recently, this boundary-prior assumption was investigated in several graph-based saliency models [25]–[28] and achieved impressive results.

In this work, a new segment-based saliency detection method is proposed. We mainly address two problems that are seldom discussed in previous work:

1) Saliency models which take color information as the primary feature often simply compute the region contrast as the Euclidean distance between the average colors of regions, or as a histogram-based contrast. The former is efficient and reasonable, especially when regions are organized as superpixels. However, it might be imprecise when large regions are considered. Conversely, the histogram-based contrast is more precise in this case but suffers from parameter problems such as the number of bins and the selection of the metric space. Instead, we represent the color appearance of superpixels by multivariate normal distributions. This is based on the assumption that the color occurrences of the pixels in each region follow a multivariate normal distribution. This assumption is especially well suited for superpixels, since the clustered pixels have similar properties in the selected feature space. The difference between two superpixels is measured with the Wasserstein distance on the Euclidean norm (the $W_2$ distance), which was first introduced to compute pixel-based saliency in our previous work [29].
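As a small illustration of this representation, the ML estimate of such a superpixel model follows directly from the Lab vectors of its pixels. The sketch below assumes a given CIE-Lab image and a superpixel label map; the variable names are ours, not the paper's.

```python
# Sketch: ML estimate of the multivariate normal appearance model of one
# superpixel in CIE-Lab space. `lab` is an (H, W, 3) Lab image and `labels`
# an (H, W) superpixel label map; both names are assumptions for this example.
import numpy as np

def fit_superpixel_gaussian(lab, labels, sp_id):
    pixels = lab[labels == sp_id].reshape(-1, 3)     # Lab vectors of the superpixel
    mu = pixels.mean(axis=0)                         # ML mean
    sigma = np.cov(pixels, rowvar=False, bias=True)  # ML (biased) covariance, 3x3
    return mu, sigma
```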
Additionally, we propose a fast algorithm to compute the $W_2$ distance on $n$-dimensional ($n \leq 3$) normal distributions.

2) Maintaining a uniform saliency distribution across an object's interior is difficult in local saliency computation based on the center-surround principle. Typically, this problem is alleviated by combining multi-layer saliency maps or by smoothing the single-layer saliency map at the pixel level [18]. We propose a locally constrained random walk procedure to directly refine the local saliency map at the region level and achieve a more balanced rating inside probable proto-objects. On the one hand, this approach can improve the final combination results. On the other hand, compared to Gaussian weight-based up-sampling [18], it avoids spreading erroneous saliency assignments to background regions when inappropriate Gaussian weights controlling the sensitivity to color and position are selected.

Thus, in a nutshell, the main contributions of this paper are:

• A new representation of superpixels by multivariate normal distributions and their comparison with the Wasserstein distance, which is used consistently throughout the approach for local as well as global saliency computation. It combines the advantage of the rich information of probability distributions for representing feature statistics with a computationally efficient method for representation and comparison.

• A novel saliency flow method, which is a locally constrained random walk procedure to refine the local saliency map. It achieves a more balanced rating inside probable proto-objects and improves the performance significantly.

II. RELATED WORK

The detection of visual saliency is one of the two aspects of human visual attention: bottom-up and top-down attention [1], [30]. Bottom-up attention relates to the detection of salient regions by purely analyzing the perceptual data, without any additional information. Top-down attention, on the other hand, considers prior knowledge about a target, the context, or the mental state of the agent. While top-down attention is an important aspect of human attention, prior knowledge is not always available, and many computational methods profit from determining purely bottom-up saliency. Among these application areas are object detection and segmentation, which we consider here. Thus, we concentrate on the following approaches that deal with bottom-up saliency detection.

A. Pixel-based Saliency

The local contrast principle assumes that the more an image region differs from its local surround, the more salient it is. One of the first pixel-based methods to detect saliency in a biologically motivated way was introduced by Itti et al. [11]. Their Neuromorphic Vision Toolkit (iNVT) computes the center-surround contrast at different levels in DoG scale space and searches for the local maximum responses with a Winner-Take-All network. Harel et al. [31] extend the approach of Itti with a graph-based evaluation of saliency. In a more recent approach, Goferman et al. [32] follow several basic principles of human attention and assume that patches which are distinctive in color or pattern are salient. The algorithm proposed by Achanta et al. [16] produces a saliency map at the original scale which preserves the boundaries of salient objects by accumulating the information of the surrounding area of each pixel.
Seo and Milanfar [33] compute the center-surround contrast of each pixel using a local structure descriptor called the LSK, which is robust to noise and variations in luminance. The approach of Liu et al. [14] combines local, regional, and global features in a CRF-based framework. Li et al. [34] propose a method using the conditional entropy under distortion to measure visual saliency, which is also a center-surround scheme.

The pure global approaches assume that the more infrequently features occur in the whole image, the more salient they are. In [19] and [35], Hou et al. assign higher saliency values to those pixels which respond strongly to the rare magnitudes in the amplitude spectrum, and identify the others as redundant components. However, Guo et al. [20] found the image's phase spectrum to be more essential than the amplitude spectrum for obtaining the saliency map. Achanta et al. [9] also assume that the background has lower frequencies, and directly compare each pixel with the entire image in color space.

The global principle only works well if the background is free of uncommon feature manifestations. On the other hand, the local contrast principle involves the difficulty of estimating the scale of a salient object. To avoid this problem, such methods usually define several ranges of a pixel's neighborhood or construct a multi-level scale space of the original image. However, these local methods suffer more from the boundary blurring problem, since on unsuitable scales the foreground/background relation cannot be clearly decided.

B. Segment-based Saliency

Segment-based methods take homogeneous regions rather than pixels as the basic element. Cheng et al. [17] segment the image into regions with the algorithm proposed in [36], and obtain the saliency map by computing the distance between histograms generated by mapping the color vectors of each region into a 3D space. The same pre-segmentation method was also used in [13] and [37]. Instead of computing the dissimilarity between regions directly, Wei et al. [13] obtain the saliency of an image window by computing the cost of composing the window from the remaining image parts.

[Fig. 1 graphic: per size level t = 0, 1, ..., L: basic element extraction, distribution of features, global and local measurements, and fusion of measures over all levels.]

Fig. 1. The overall algorithm flowchart of our method. The structure of the algorithm is exemplarily presented for two scales. (a): each region surrounded by the red curves refers to one superpixel. (b): each superpixel is represented by the multivariate normal distribution estimated from its pixels. Based on the $L_2$-norm Wasserstein distance between every pair of superpixels, the local and global saliency are obtained.
(c): global saliency computation: superpixels are clustered according to their color similarity and exemplar superpixels (cluster centers) are determined. The two images show exemplarily two of the clusters, the corresponding exemplar superpixels, and the cluster saliency scores that measure the spatial distribution of a cluster. (d) and (e): the local saliency is computed by a local contrast approach based on superpixels, which is further refined by a saliency flow step. (h): the final saliency map is obtained by fusing the global and local saliency maps ((f) and (g), respectively) over all scales.

Park et al. [37] repeatedly merge regions with their neighbors according to the similarity of their HSV histograms, and update the saliency of the joint regions with every combination. Ren et al. [38] first extract superpixels from the image, which are further clustered with a GMM, and use the PageRank algorithm to obtain the superpixel-level saliency. Perazzi et al. [18] obtain the region-level saliency map by measuring the color uniqueness and spatial distribution of each extracted superpixel. A finer pixel-level saliency map is produced by comparing each pixel with all superpixels in both color space and location. Wei et al. [24] first proposed the background prior, which assumes that the boundaries of an image effectively represent the background components. Following this idea, Yang et al. [25] consider saliency detection as a graph-based ranking problem and use label propagation to determine the region-level saliency. A similar graph model is employed in [26], which casts saliency detection as a random walk problem in an absorbing Markov chain.

III. MULTI-SIZE SUPERPIXEL-BASED SALIENCY DETECTION

We propose a superpixel-based method for the bottom-up detection of salient image regions. An image is segmented into a compound of visually homogeneous regions at different scale levels to represent its fine details as well as its large-scale structures. On each scale, two complementary approaches for the determination of saliency are employed separately: 1) In a global way, we measure the spatial compactness of similar-looking parts. Superpixels are more salient if they form a more coherent cluster within the image when categorized by their color appearance. 2) In a local way, we compute the center-surround contrast at the superpixel level. The more a superpixel differs from its surrounding ones, the more salient it is. Local contrast approaches usually grasp every pop-out region whose scale fits the current center-surround structure; that is, isolated background regions of an appropriate scale are also emphasized. In our work, the boundary prior [24] is used to eliminate such highlighted background regions. Furthermore, the local saliency map is refined by a locally constrained random walk procedure that dilutes saliency in the background and likewise balances it inside potential objects.

We assume that the appearance of the pixels grouped into one superpixel is well expressed by the associated ML-estimate of a multivariate normal distribution in CIE-Lab space. This representation enables us to efficiently measure the visual difference/similarity between superpixels using the Wasserstein distance on the Euclidean norm [29]. Figure 1 shows a flowchart of our system.
A. Multi-size Superpixel Extraction

We use the SLIC superpixel extraction method introduced in [39], which divides an image into adjacent segments of about the same size, each containing colors as homogeneous as possible. For a given number of superpixels, the image is initially segmented into regularly sized grid cells. Then, iterative K-Means clustering is performed on a feature space that combines CIE-Lab colors and pixel locations. This clustering of nearby, similar-looking pixels refines the cells into superpixels. As mentioned in Section I, we extract superpixels at multiple size levels. This is achieved by repeating the SLIC algorithm with different numbers of desired clusters, thus initializing with a coarser or finer grid. In our method, we increase the number of superpixels in steps of factor 2 between scale levels. The images in Figure 2a show examples of segmentation results with different superpixel sizes. Notice that we ensure a minimal cell size of 100 pixels when initializing the grid, because with fewer pixels it becomes increasingly unlikely to obtain meaningful appearance distributions. In our experiments, we analyzed $L = 3$ scale levels.
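One possible realization of this multi-size extraction is sketched below, with scikit-image's SLIC standing in for the implementation of [39]; the parameter handling mirrors the text (superpixel counts changing by factor 2 per level, minimal cell area of 100 pixels), but the exact SLIC settings are assumptions.

```python
# Sketch of the multi-size superpixel extraction, with scikit-image's SLIC
# standing in for [39]. The finest grid keeps cells of >= 100 pixels; the
# number of superpixels is halved for each coarser level (factor-2 steps).
from skimage.segmentation import slic

def multisize_superpixels(rgb, levels=3, min_cell_area=100):
    h, w = rgb.shape[:2]
    n_finest = (h * w) // min_cell_area                  # minimal cell size constraint
    label_maps = []
    for t in range(levels):
        n_segments = max(n_finest // 2 ** t, 1)          # factor 2 between levels
        label_maps.append(slic(rgb, n_segments=n_segments, convert2lab=True))
    return label_maps                                    # one label map per scale level
```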
B. Superpixel Representation and Superpixel Contrast

We express the occurrence of low-level features in each superpixel by means of multivariate normal distributions. As argued in Section I, the unimodal distribution assumption is appropriate for superpixels. Different to [29], where the feature space is split into a one-dimensional lightness distribution plus a two-dimensional color distribution, we directly use the original three dimensions of CIE-Lab color space. For the conversion from RGB web images, we assume the D65 standard illuminant to be most suitable. For the notation in the following sections, the $i$th superpixel of scale $t$ forms a set

$$\mathcal{S}_i^t = \left\{ \mathcal{N}_S(\mu, \Sigma),\; c_S = \begin{pmatrix} x \\ y \end{pmatrix} \right\}_i^t \qquad (1)$$

comprised of its feature distribution $\mathcal{N}_{S_i^t}$ and its spatial center $c_{S_i^t}$ in image coordinates.¹

Several measures of distribution contrast, such as the KL-divergence [12], the conditional entropy [34], and the Bhattacharyya distance [40], have been employed in previous methods to identify local differences. Recently, Klein and Frintrop [29] applied the Wasserstein distance on the $L_2$-norm between feature distributions gathered from Gaussian-weighted, local integration windows. We continue this idea but, instead, employ the Wasserstein metric to score contrasts between superpixels. The Wasserstein distance on the Euclidean norm in real-valued vector space is defined as

$$W_2(\chi, \upsilon) = \left( \inf_{\gamma \in \Gamma(\chi, \upsilon)} \int_{\mathbb{R}^n \times \mathbb{R}^n} \|X - Y\|^2 \, d\gamma(X, Y) \right)^{\frac{1}{2}}, \qquad (2)$$

where $\chi$ and $\upsilon$ are probability measures on the metric space $(\mathbb{R}^n, L_2)$ and $\Gamma(\chi, \upsilon)$ denotes the set of all measures on $\mathbb{R}^n \times \mathbb{R}^n$ with marginals $\chi$ and $\upsilon$. Briefly worded, the Wasserstein distance represents the minimum cost of transforming one distribution into another, taking into account not only the individual difference at each point of the underlying metric space, but also how far probability mass has to be shifted. In machine vision, the discretized $W_1$ distance is also well known as the Earth Mover's Distance and is widely used to compare histograms.

The calculation of Eq. (2) is very demanding for arbitrary, continuous distributions, but thankfully reduces to a more facile term in the case of normal distributions. As introduced in [41],² an explicit solution for multivariate normal distributions $\mathcal{N}_1(\mu_1, \Sigma_1)$ and $\mathcal{N}_2(\mu_2, \Sigma_2)$ is

$$W_2(\mathcal{N}_1, \mathcal{N}_2) = \left[ \|\mu_1 - \mu_2\|^2 + \mathrm{tr}\!\left( \Sigma_1 + \Sigma_2 - 2\left( \Sigma_1 \Sigma_2 \right)^{\frac{1}{2}} \right) \right]^{\frac{1}{2}} = \left[ \|\mu_1 - \mu_2\|^2 + \mathrm{tr}(\Sigma_1) + \mathrm{tr}(\Sigma_2) - 2\,\mathrm{tr}\!\left( \left( \Sigma_1 \Sigma_2 \right)^{\frac{1}{2}} \right) \right]^{\frac{1}{2}}. \qquad (3)$$

In general, there is no explicit formula for the square root of an arbitrary $n \times n$ matrix for $n > 2$, which would lead to an iterative algorithm for determining $(\Sigma_1 \Sigma_2)^{\frac{1}{2}}$ in Eq. (3). However, noticing the relationship between the trace and the eigenvalues of a matrix, the trace of $(\Sigma_1 \Sigma_2)^{\frac{1}{2}}$ can be represented as

$$\mathrm{tr}\!\left( \left( \Sigma_1 \Sigma_2 \right)^{\frac{1}{2}} \right) = \sum_{k=1}^{n} \lambda_{\Sigma_1 \Sigma_2}(k)^{\frac{1}{2}}, \qquad (4)$$

where $\lambda_{\Sigma_1 \Sigma_2}(k)$ is the $k$th eigenvalue of $\Sigma_1 \Sigma_2$.

Considering an $n = 3$ dimensional space such as CIE-Lab, given a $3 \times 3$ matrix $A$, its characteristic polynomial can be represented as

$$\det(\lambda_A I - A) = \lambda_A^3 - \lambda_A^2\,\mathrm{tr}(A) - \frac{1}{2}\lambda_A\left( \mathrm{tr}(A^2) - \mathrm{tr}^2(A) \right) - \det(A), \qquad (5)$$

where $\lambda_A$ is an eigenvalue of $A$. $\lambda_A$ can be directly determined using a trigonometric solution introduced in [43] by making an affine change from $A$ to $B$ as

$$A = pB + qI. \qquad (6)$$

Thereby, $B$ is a matrix with the same eigenvectors as $A$,

$$\forall\, p \in \mathbb{R} \setminus 0,\; q \in \mathbb{R} \;\Rightarrow\; v_A = v_B, \qquad (7)$$

thus from the definition of eigenvalues it holds that

$$\text{Def., Eqs. (6), (7)} \iff \lambda_A = p \cdot \lambda_B + q, \qquad (8)$$

where $\lambda_B$ is an eigenvalue of $B$.

Choosing $p = \sqrt{\mathrm{tr}\left((A - qI)^2\right)/6}$ and $q = \mathrm{tr}(A)/3$,³ and considering Eqs. (5) to (8), the characteristic equation of $B$ simplifies to

$$\det(\lambda_B I - B) = \lambda_B^3 - 3\lambda_B - \det(B) = 0. \qquad (9)$$

By directly solving Eq. (9), one obtains all three eigenvalues of $B$ as

$$\lambda_B(k) = 2\cos\left( \frac{1}{3}\arccos\left( \frac{\det(B)}{2} \right) + \frac{2k\pi}{3} \right), \qquad (10)$$

where $\lambda_B(k)$ is the $k$th eigenvalue of $B$ with $k = 0, 1, 2$. Thus, Eq. (3) can be applied to quickly compute meaningful appearance distances between two superpixels using Eqs. (4), (6), (8), and (10).

¹ Note that $\mathcal{N}_S$ denotes the normal distribution representing a superpixel, while $\mathcal{N}_C$, introduced in Section III-C, denotes the normal distribution representing a cluster.

² A slightly different term was later introduced in [42], claiming that Eq. (3) is only valid in the case of commuting covariances. However, we could show that both solutions are equivalent, because $\Sigma_1 \Sigma_2 = \sqrt{\Sigma_1}\left(\sqrt{\Sigma_1}\Sigma_2\right)$ has the same characteristic polynomial as $\left(\sqrt{\Sigma_1}\Sigma_2\right)\sqrt{\Sigma_1}$, and thus the same eigenvalues.

³ This choice guarantees the validity of Eqs. (6) and (8) also in the special case $p = 0$, since this would imply $A = qI$, and thus a triple eigenvalue $\lambda_A = q = \mathrm{tr}(qI)/3$.
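The entire computation of Eqs. (3)-(10) fits in a few lines. The sketch below is our reading of it; the numerical clipping and the near-zero guard on $p$ are added for robustness and are not part of the derivation.

```python
# Sketch of the closed-form W2 distance of Eq. (3), using the trigonometric
# eigenvalue solution of Eqs. (5)-(10) for the 3x3 case. Clipping/guards are
# numerical safeguards added for this example, not part of the derivation.
import numpy as np

def eigvals_3x3_trig(A):
    """Eigenvalues of a 3x3 matrix with real eigenvalues, per Eqs. (5)-(10)."""
    q = np.trace(A) / 3.0                                  # q = tr(A)/3
    C = A - q * np.eye(3)
    p = np.sqrt(max(np.trace(C @ C) / 6.0, 0.0))           # p = sqrt(tr((A-qI)^2)/6)
    if p < 1e-12:                                          # A = qI: triple eigenvalue q
        return np.full(3, q)
    B = C / p                                              # affine change of Eq. (6)
    phi = np.arccos(np.clip(np.linalg.det(B) / 2.0, -1.0, 1.0)) / 3.0
    k = np.arange(3)
    return p * 2.0 * np.cos(phi + 2.0 * k * np.pi / 3.0) + q   # Eqs. (8) and (10)

def w2_gaussians(mu1, sigma1, mu2, sigma2):
    """Closed-form W2 distance of Eq. (3) between two 3-D normal distributions."""
    lam = np.maximum(eigvals_3x3_trig(sigma1 @ sigma2), 0.0)   # eigenvalues of S1*S2
    cross = 2.0 * np.sum(np.sqrt(lam))                         # 2*tr((S1*S2)^(1/2)), Eq. (4)
    d2 = ((mu1 - mu2) ** 2).sum() + np.trace(sigma1) + np.trace(sigma2) - cross
    return np.sqrt(max(d2, 0.0))
```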
[Fig. 2 graphic omitted.]

Fig. 2. An example of multi-size superpixel segmentation and the corresponding global saliency maps. (a): from left to right, the initial grid area in superpixel extraction decreases approximately in steps of 2. (b): the images are the obtained global saliency maps corresponding to each scale in (a).

In the following, we use the $W_2$ distance coherently in the different aspects of saliency computation: first, it serves as a similarity measure in the clustering approach for global saliency computation (Section III-C); second, it measures the local contrast of a superpixel to its neighbors (Section III-D); and third, it provides the similarity metric required for the random walk process that enables the saliency flow computation introduced in Section III-E.

C. Global Saliency: The Spatial Distribution of Colors

In natural scenes, the colors of regions belonging to the background are usually more spatially scattered across the whole image than those of salient regions. In other words, the more a color is spread, the less salient it is [22]. To determine the spatial spreading, a further clustering is needed. This is computed much more efficiently on superpixels than would be possible at the pixel level, since there are far fewer elements. Thereby, the spatial distribution of colors can be estimated at a higher cluster-of-superpixels level by comparing the spatial intra-cluster distances. GMMs are widely used to represent the probabilities of color appearance, as in [14], [38], [44]. However, it may be inappropriate to assign a fixed number of clusters to different images, since this number should depend on the image complexity; e.g., a cluttered scene has many more dominant colors than one showing a monotonous background. We employ the APC algorithm (Affinity Propagation Clustering) introduced in [45] to identify clusters. Here, it is not necessary to initialize the cluster centers or the number of clusters.

APC is based on the similarities between elements (superpixels). It tries to minimize squared errors; thus, in our method, we use the similarity $-\left(W_2(\mathcal{N}_{S_i^t}, \mathcal{N}_{S_j^t})\right)^2$ obtained by Eq. (3) between each pair of superpixels $\mathcal{S}_i^t$ and $\mathcal{S}_j^t$. Figure 1(c) exemplarily shows two identified clusters. Compatible with superpixels, the $k$th cluster on scale $t$ forms a set

$$\mathcal{C}_k^t = \left\{ \mathcal{N}_C(\mu, \Sigma),\; c_C \right\}_k^t. \qquad (11)$$

APC selects so-called exemplar superpixels to become cluster centers. Thus, we define the cluster appearance model $\mathcal{N}_C$ to equal that of its corresponding exemplar superpixel. The spatial center of a cluster in image coordinates is computed from a linear combination of superpixel centers weighted by their cluster membership probabilities:

$$c_{C_k^t} = \frac{\sum_{i=1}^{M(t)} P_g(\mathcal{C}_k^t \mid \mathcal{S}_i^t) \cdot c_{S_i^t}}{\sum_{i=1}^{M(t)} P_g(\mathcal{C}_k^t \mid \mathcal{S}_i^t)}, \qquad (12)$$

where $M(t)$ denotes the number of superpixels on scale $t$.

Note that APC has also been employed to group GMMs in [46]. Different from that work, the message exchange inherent in APC is further exploited to facilitate the computation of $P_g(\mathcal{C}_k^t \mid \mathcal{S}_i^t)$. The membership probability of a superpixel to each cluster can be computed from its visual similarity to the exemplar of that cluster. Converting distances to similarities using a Gaussian function has been widely adopted by numerous methods [18], [25], [26], [46], [47]. However, the fall-off rate of the exponential function is often selected empirically.
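As an illustration of this clustering step, scikit-learn's Affinity Propagation can stand in for the APC implementation of [45], fed with the precomputed similarities $-W_2^2$; the `w2_gaussians` helper from the sketch above and the `models` list of $(\mu, \Sigma)$ pairs are assumed.

```python
# Sketch: Affinity Propagation on precomputed -W2^2 similarities between all
# superpixel pairs. scikit-learn's implementation stands in for [45]; it
# likewise needs neither a cluster count nor initial centers.
import numpy as np
from sklearn.cluster import AffinityPropagation

def cluster_superpixels(models):
    n = len(models)
    S = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = w2_gaussians(models[i][0], models[i][1], models[j][0], models[j][1])
            S[i, j] = S[j, i] = -d ** 2            # squared-error similarity
    apc = AffinityPropagation(affinity="precomputed", random_state=0).fit(S)
    return apc.labels_, apc.cluster_centers_indices_  # memberships and exemplars
```

Note that this returns hard cluster assignments; the soft membership probabilities the method actually uses are derived from the responsibility messages, as described next.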
In this section, we take advantage of the messages that are propagated between superpixels to directly determine the membership probabilities [48].

Let $\mathcal{X}_k^t$ denote the exemplar of cluster $\mathcal{C}_k^t$, and let $r(\mathcal{S}_i^t, \mathcal{X}_k^t)$ denote the exchanged message named responsibility, which represents how well-suited superpixel $\mathcal{X}_k^t$ is to serve as the exemplar for superpixel $\mathcal{S}_i^t$. In fact, $r(\mathcal{S}_i^t, \mathcal{X}_k^t)$ implies the logarithmic form of the cluster membership probability [45]. Let $\mathcal{B}^t$ denote the set composed of all non-exemplar superpixels. We first normalize all responsibilities between the superpixels in $\mathcal{B}^t$ and the exemplar $\mathcal{X}_k^t$ to $[-1, 0]$ (denoted $\hat{r}(\mathcal{B}^t, \mathcal{X}_k^t)$) and then exponentially scale them as

$$\hat{r}_e(\mathcal{B}_i^t, \mathcal{X}_k^t) = \exp\left( \frac{\hat{r}(\mathcal{B}_i^t, \mathcal{X}_k^t)}{\mathrm{Var}\left( \hat{r}(\mathcal{B}^t, \mathcal{X}_k^t) \right)} \right), \qquad (13)$$

where $\hat{r}(\mathcal{B}_i^t, \mathcal{X}_k^t)$ refers to the normalized responsibility between the non-exemplar superpixel $\mathcal{B}_i^t$ and the exemplar $\mathcal{X}_k^t$, and $\mathrm{Var}(\cdot)$ refers to the variance. For exemplars, we simply assign their scaled responsibilities as

$$\hat{r}_e(\mathcal{X}_i^t, \mathcal{X}_k^t) = \begin{cases} 1, & \text{if } i = k \\ 0, & \text{otherwise.} \end{cases} \qquad (14)$$

Eqs. (13) and (14) establish the scaled responsibilities between all superpixels and each cluster. Then, the intra-cluster probability of each superpixel can be computed as

$$P_g(\mathcal{C}_k^t \mid \mathcal{S}_i^t) = \frac{\hat{r}_e(\mathcal{S}_i^t, \mathcal{X}_k^t)}{\sum_{k=1}^{K(t)} \hat{r}_e(\mathcal{S}_i^t, \mathcal{X}_k^t)}, \qquad (15)$$

where $K(t)$ is the number of clusters on scale $t$. Next, we compute the probability of being salient for cluster $\mathcal{C}_k^t$. This probability value is obtained by scoring the relative spatial spreading of the superpixels within the cluster:

$$P_g(sal \mid \mathcal{C}_k^t) = 1 - \frac{\sum_{j=1}^{K(t)} \sum_{i=1}^{M(t)} P_g(\mathcal{C}_k^t \mid \mathcal{S}_i^t) \cdot \left\| c_{S_i^t} - c_{C_j^t} \right\|^2}{\sum_{i=1}^{M(t)} P_g(\mathcal{C}_k^t \mid \mathcal{S}_i^t)}, \qquad (16)$$

where $Sal = \{sal, \neg sal\}$ is a binary random variable indicating whether something is salient, that means, whether
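A compact sketch of Eqs. (12)-(16) follows, under explicit assumptions: the raw responsibility matrix `R` from the APC message passing is taken as a given input (scikit-learn does not expose it), `centers` and `exemplars` are assumed inputs as well, and the final rescaling of the spreading score to [0, 1] is one plausible choice matching the normalization noted in Fig. 1(c), not a step confirmed by the text.

```python
# Sketch of Eqs. (12)-(16): scaled responsibilities -> membership
# probabilities -> cluster centers -> spatial-spreading saliency scores.
# `R` (n x K raw responsibilities), `centers` (n x 2 superpixel centers), and
# `exemplars` (K exemplar indices) are assumed inputs for this example.
import numpy as np

def cluster_saliency(R, centers, exemplars):
    n, K = R.shape
    non_ex = np.setdiff1d(np.arange(n), exemplars)     # set B^t of non-exemplars
    r_e = np.zeros_like(R, dtype=float)
    for k in range(K):
        col = R[non_ex, k]
        col = (col - col.max()) / max(col.max() - col.min(), 1e-12)  # scale to [-1, 0]
        r_e[non_ex, k] = np.exp(col / max(col.var(), 1e-12))         # Eq. (13)
        r_e[exemplars[k], k] = 1.0                                   # Eq. (14)
    P = r_e / r_e.sum(axis=1, keepdims=True)                         # Eq. (15)
    c_cluster = (P.T @ centers) / P.sum(axis=0)[:, None]             # Eq. (12)
    d = ((centers[:, None, :] - c_cluster[None, :, :]) ** 2).sum(-1) # ||c_S - c_C||^2
    spread = (P * d.sum(axis=1, keepdims=True)).sum(axis=0) / P.sum(axis=0)
    return 1.0 - spread / spread.max()                               # Eq. (16), rescaled
```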