A Semi-automatic Approach to Photo Identification of Wild Elephants

Alessandro Ardovini 1, Luigi Cinque 1, Francesca Della Rocca 2, and Enver Sangineto 1

1 Computer Science Department, University of Rome "La Sapienza", via Salaria 113, 00198 Rome, Italy
2 Department of Animal and Human Biology, University of Rome "La Sapienza", viale dell'Università 32, 00185 Rome, Italy

Abstract. Zoologists studying elephant populations in wild environments need to recognize different individuals from photos taken in different periods of time. Individuals can be distinguished by the shape of the nicks on their ears. Nevertheless, shape comparison is not trivial due to a highly cluttered background. We propose a method for partial, non-connected curve matching able to compare photos of elephant ears.

1 Introduction

We present in this paper a photo identification system for elephant recognition. Elephant conservation and study need to trace the movements of wild individuals of a given population over time. Currently one of the most common, non-invasive and cheap techniques is (human-made) photo identification. Elephants are distinguished using some biological features, such as the characteristic shape of the nicks present in their ears (e.g., see Figure 1 (a)). We propose a semi-automatic elephant identification system based on a partially defined match between the set of curves representing the nicks of a query photo and the set of curves of the nicks of each database photo.

From a computer vision point of view, the problem is not trivial because of the usually highly cluttered images representing the interesting shapes.
The final result of a common edge detection applied on typically available (low-resolution) images is composed of a non-connected set of nick curves which must be matched with the shapes of the nicks in the database (e.g., see Figure 1 (c)).

A similar problem is dealt with in [1], where the authors use the characteristic shape of the dorsal fin for dolphin identification. The dorsal fin curve is represented by means of an attributed string, and a string matching procedure is used to estimate the similarity between two curves. Nevertheless, representing a real curve by means of a string is an unstable operation in which minor noise can produce very different strings to deal with. Analogously, in [4] the characteristic curves of dolphin fins, sea lion flippers and grey whale flukes are used to recognize individuals of the three species. Also in this case the assumption is a one-to-one curve matching. Both the mentioned systems need human intervention in the image segmentation phase.

J. Martí et al. (Eds.): IbPRIA 2007, Part I, LNCS 4477, pp. 225-232, 2007. © Springer-Verlag Berlin Heidelberg 2007

Fig. 1. (a) Elephant photos. Top row: the nicks are marked in yellow. Sometimes elephant identification using the nick shape is a non-trivial task also for human beings (e.g., bottom row). (b) The two human-input points used for the ear reference system. (c) The edge map of Figure 1 (b).

The problem of matching multiple curves in a cluttered background image is rarely dealt with in the computer vision literature, since most of the existing curve matching approaches implicitly or explicitly assume to compare a pair of isolated curves rather than two sets of curves. Some examples of curve matching methods are: curvature scale space [6], Fourier descriptors [3] and shape signature-based approaches [1], which assume to deal with isolated silhouettes upon a uniform background (which is clearly not our case).
Active contour methods [5] can deal with non-uniform backgrounds, but their iterative energy minimization process can easily be trapped in false local minima in images with a complex background like the ones representing elephants in an uncontrolled environment (e.g., see Figure 1 (c)).

In our system the problem of reliably matching multiple curves in a cluttered background is partially alleviated using human input. The human operator approximately selects the nicks' positions using the mouse. Then the system produces a set of model-image rigid transformation hypotheses which refine the segmentation-detection process, looking for the transformation which minimizes the dissimilarity between the compared nicks. Each transformation hypothesis is evaluated by taking into account both the position of the nicks with respect to the whole ear and the nicks' shapes.

2 Extraction of the Nick Curvature

In this section we show the different phases of the semi-automatic segmentation process performed on each image, either when it is stored off-line in the system's repository or when it is used as an on-line query.

First of all, the user is requested to input some reference points in order to approximately define the elephant's head position, orientation and scale. With the assumption that the ground direction is given by the bottom of the image, only two points are sufficient to define a similarity-invariant reference system for the whole head and to automatically select the right/left profile.
In fact, the user is asked to click on the eye's central position (a) and on a second point (b) on the ear's border which is the farthest from the ear attachment (Figure 1 (b)). When either a or b is not visible due to occlusions or other reasons, the user can click on an approximate estimate of its position (e.g., Figure 1 (b)). Since the definition of b is somewhat arbitrary, and since different users can select different points a and b for the same image, the spatial information so obtained can only be used approximately in order to estimate the nicks' positions with respect to the whole ear (see Section 2).

Once a and b have been selected, the system performs an edge detection (Figure 1 (c)) using the standard Canny edge detector [2], whose binarization threshold can be varied by the user by means of a slider bar provided by the system's interface. Let us call I′ the edge map of the image I. I′ is shown to the user, who is then requested to input the initial (e) and the final (f) endpoint for each nick (see Figures 2 (b) and (c)). The order in which e and f are inserted is not important. Once a pair of endpoints for every nick has been input, the system automatically computes the shortest path in I′ between e and f. If e and f are not connected, or the computed path is not the desired one, the user can add/delete points on I′ using a suitable GUI. This is the only time-consuming human intervention requested by our system, and it is only occasionally necessary for a few points per nick. Moreover, this manual operation takes only a few dozen seconds per nick.

The path between e and f gives the set of edge points defining the shape of the nick curve (Figure 2 (d)), which is subsampled using a fixed number (n) of points (currently, n = 25).
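The edge-path step described above can be sketched as follows. The paper does not specify the path-search algorithm or the pixel connectivity, so this sketch assumes a breadth-first search over an 8-connected binary edge map; the function names and the boolean-array representation of I′ are illustrative, not the authors' implementation.

```python
import numpy as np
from collections import deque

def shortest_edge_path(edge_map, e, f):
    """BFS shortest path between two endpoints, restricted to edge pixels.

    edge_map: 2D boolean array (True = edge pixel); e, f: (row, col) tuples.
    Returns the list of pixels on a shortest path, or None if e and f are
    not connected (the case handled manually through the paper's GUI).
    """
    h, w = edge_map.shape
    prev = {e: None}          # visited set doubling as predecessor map
    queue = deque([e])
    while queue:
        p = queue.popleft()
        if p == f:            # reconstruct the path back to e
            path = []
            while p is not None:
                path.append(p)
                p = prev[p]
            return path[::-1]
        r, c = p
        for dr in (-1, 0, 1):             # 8-connected neighborhood
            for dc in (-1, 0, 1):
                q = (r + dr, c + dc)
                if (0 <= q[0] < h and 0 <= q[1] < w
                        and edge_map[q] and q not in prev):
                    prev[q] = p
                    queue.append(q)
    return None

def subsample(path, n=25):
    """Resample a pixel path to a fixed number of points (n = 25 in the paper)."""
    idx = np.linspace(0, len(path) - 1, n).round().astype(int)
    return [path[i] for i in idx]
```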
Moreover, the endpoints e_i and f_i of a given nick P_i of I are normalized with respect to a symmetry-invariant reference system defined using the reference points a_I and b_I of I. Let x = b_I − a_I and d = ||x||_2. Moreover, let v_1 = (1/d) x (see Figure 1 (b)), and let v_2 be a unitary vector centered in a_I and orthogonal to v_1, whose direction is fixed by sweeping an angle of π/2 clockwise from x for the right ear and counterclockwise for the left ear. If e′ is an endpoint selected by the user, the normalized nick endpoint e is then given by:

e = (1/d) ( v_1^T (e′ − a_I), v_2^T (e′ − a_I) )^T,   (1)

and the same holds for f. The reference system defined by v_1 and v_2, centered in a_I and with unit of length d, is invariant with respect to symmetric transformations of the head appearance.

Finally, if the image is used as query (Q), we indicate with Q_1, ..., Q_N its nicks, each Q_j (1 ≤ j ≤ N) being associated with the endpoints g_j and h_j, obtained using Equation (1) and the corresponding reference points a_Q and b_Q.

3 Curve Sequence Matching

Given a database image I with nicks P_1, ..., P_M and a query image Q with nicks Q_1, ..., Q_N, the problem is to find a portion of I and a portion of Q which are similar, i.e., contain the same nicks. Even if both I and Q represent the same elephant's ear E, we can have N ≠ M. In fact, a portion of E can be partially occluded either in I or in Q due to possible trees or other elephants. Moreover, the lobe and other parts of the ear are sometimes folded and not visible.

We assume in the following that, if both Q and I represent the same ear E of the same elephant, then there is a unique portion V of E which is visible in both I and Q.
This is a very common situation in photos taken by zoologists for (human) elephant identification, and it leads to searching for a unique sequence V of nicks belonging to both Q and I (see Figure 2 (a)).

The aim of the global matching phase is to compare Q_1, ..., Q_N with P_1, ..., P_M, looking for two subsets of k nicks Q_j, Q_{j+1}, ..., Q_{j+k−1} and P_i, P_{i+1}, ..., P_{i+k−1} with similar positions with respect to the ear's reference points, respectively a_I, b_I and a_Q, b_Q (see Section 2). While in this step we are interested only in comparing the positions of Q_j, ..., Q_{j+k−1} and P_i, ..., P_{i+k−1}, in Section 4 we show how the system compares the shapes of the two nick sets.

Note that if M ≤ N then V is delimited by either P_1 or P_M (see Figure 2 (a)). Vice versa, if M > N then V is delimited by either Q_1 or Q_N. The algorithm for the case in which M ≤ N is described below, while the case M > N is dealt with analogously by exchanging Q_1, ..., Q_N with P_1, ..., P_M.

Global Matching (Q_1, ..., Q_N, P_1, ..., P_M)
 1   μ := ∞
     *** Case P_1 is matched with Q_j (j ∈ [1, N]) ***
 2   For each j ∈ [1, N], do:
 3       k := min{M, N − j + 1}   *** This is the size of V ***
 4       If CheckPosition(P_1, ..., P_k, Q_j, ..., Q_{j+k−1}, k), then:
 5           d := Shape_Difference(P_1, ..., P_k, Q_j, ..., Q_{j+k−1}, k)
 6           If μ > d then μ := d
     *** Case P_1 cannot be matched with any Q_j ***
 7   For each j ∈ [1, M − 1], do:
 8       If CheckPosition(P_{M−j+1}, ..., P_M, Q_1, ..., Q_j, j), then:
 9           d := Shape_Difference(P_{M−j+1}, ..., P_M, Q_1, ..., Q_j, j)
10           If μ > d then μ := d
11   Return μ

Referring to Figure 2 (a), the case in which P_1 can be matched with some Q_j is dealt with by the above algorithm in Steps 2-6, with either k = M or k = N − j + 1, depending on whether P_M belongs to V or not.
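The Global Matching pseudocode above can be transcribed into Python roughly as follows. CheckPosition and Shape_Difference (described in Sections 3 and 4) are passed in here as callables, and indices are 0-based, unlike the 1-based pseudocode; this is a sketch, not the authors' implementation.

```python
import math

def global_matching(Q, P, check_position, shape_difference):
    """Match query nicks Q = [Q_1..Q_N] against database nicks
    P = [P_1..P_M], assuming M <= N (the paper handles M > N by
    swapping the roles of P and Q). Returns the minimum dissimilarity
    mu over all position-consistent subsequence matches, or math.inf
    if no such match exists.
    """
    N, M = len(Q), len(P)
    mu = math.inf
    # Case: P_1 is matched with some Q_j (Steps 2-6).
    for j in range(N):
        k = min(M, N - j)                # size of the shared portion V
        if check_position(P[:k], Q[j:j + k]):
            mu = min(mu, shape_difference(P[:k], Q[j:j + k]))
    # Case: P_1 cannot be matched with any Q_j, so V is delimited
    # by Q_1 (Steps 7-10).
    for j in range(1, M):
        if check_position(P[M - j:], Q[:j]):
            mu = min(mu, shape_difference(P[M - j:], Q[:j]))
    return mu
```

The returned μ is the value used to rank the database photo against the query.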
Steps 7-10 deal with the case in which P_1 does not belong to V. The function CheckPosition() is responsible for checking the position consistency of the current ordered subsequence of matched nicks. This is done using the corresponding normalized endpoints (see Section 2) and verifying that, for each pair (P_l, Q_l), l ∈ [1, k], the distances ||e_l − g_l||_2 and ||f_l − h_l||_2 are both lower than a given (empirically fixed) threshold.

Once all the possible subsequence matches have been analyzed, the minimum computed dissimilarity value (μ) is returned in Step 11. This is the value which will be used to rank the current photo I with respect to the other database images. The resulting sequence of database photos, sorted in decreasing order of similarity, is finally shown to the user. Figure 3 shows the first three images ranked by the system in correspondence of the query shown in Figure 1 (b).

4 Shape Comparison

Given a query nick (Q_i) and an image nick (P_i), the shape difference of the corresponding curves can be estimated by subsampling Q_i and P_i using the same number of points (n) and then computing the squared Euclidean distance between pairs of corresponding points (q_j, p_j), such that q_j ∈ Q_i and p_j ∈ P_i (1 ≤ j ≤ n). However, before comparing Q_i and P_i, the two curves have to be aligned, i.e., they must be represented in a symmetry-invariant reference system in order to take into account possible viewpoint changes. Nevertheless, since the shape comparison of two curves must be carefully estimated, we cannot use the reference points a_I, b_I, a_Q and b_Q for this purpose (see Sections 2 and 3). Rather, we use the nick-specific endpoints e_i, f_i, g_i and h_i. Moreover, we have to take into account that a user can select slightly different endpoints for the same nick in different photos of the same animal.
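The point-wise distance just described, applied after both curves have been subsampled to the same n points and aligned in a common reference system, can be sketched as follows; the function name is illustrative.

```python
import numpy as np

def pointwise_shape_difference(curve_q, curve_p):
    """Sum of squared Euclidean distances between corresponding points
    of two curves subsampled with the same number of points n
    (n = 25 in the paper). Assumes the curves are already aligned.
    """
    q = np.asarray(curve_q, dtype=float)
    p = np.asarray(curve_p, dtype=float)
    assert q.shape == p.shape, "curves must be subsampled to the same n"
    return float(((q - p) ** 2).sum())
```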
For this reason we compare the nick defined by the endpoints e and f with the nick defined by a pair of endpoints chosen in the proximity of g and h.

If g_i and h_i are the endpoints of Q_i and Q′ is the edge map of Q, then the neighborhoods N(g_i) and N(h_i), respectively of g_i and h_i, are defined by:

N(x) = { p′ ∈ Q′ : ||x − p′||_2 ≤ λ, and x and p′ are connected },   (2)

where λ is a distance threshold. We omit the algorithm for the effective construction of N(g_i) and N(h_i), since it is a trivial visit of the connected edge point set of Q′ starting from, respectively, g_i and h_i.

The algorithm Shape_Difference below shows how N(g_i) and N(h_i) are used to hypothesize a set of symmetry transformations able to align P_i with the nick Q_i. The transformation hypotheses depend on different choices of the endpoints in the sets N(g_i) and N(h_i), respectively. The algorithm is based on the well-known geometric observation that, given two pairs of points q_1 = (x_1, y_1)^T, q_2 = (x_2, y_2)^T and p_1 = (x′_1, y′_1)^T, p_2 = (x′_2, y′_2)^T, there is a unique (up to a reflection) symmetric 2D transformation T such that p_1 = T(q_1) and p_2 = T(q_2). We represent T by means of a vector of four parameters: T = (t_x, t_y, θ, s)^T, where t_x and t_y are the translation offsets, θ is the rotation and s the scale parameter. The parameters of T are given by the following quadruple:

s = ||p||_2 / ||q||_2,
θ = arccos( q^T p / (||q||_2 · ||p||_2) ),
t_x = x′_1 − s · (x_1 · cos θ − y_1 · sin θ),
t_y = y′_1 − s · (x_1 · sin θ + y_1 · cos θ),   (3)

where q = q_2 − q_1 and p = p_2 − p_1. The details of the algorithm are:
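Equation (3) can be sketched in code as follows. Note that arccos returns only the unsigned angle, which is where the "unique up to a reflection" caveat above comes from; the sketch below follows the equation literally, so it recovers θ correctly only for counterclockwise rotations. Function names are illustrative.

```python
import math

def similarity_from_point_pairs(q1, q2, p1, p2):
    """Recover T = (t_x, t_y, theta, s) with p = s * R(theta) * q + t
    from two point correspondences, following Equation (3).
    q = q2 - q1 and p = p2 - p1 as in the paper.
    """
    qx, qy = q2[0] - q1[0], q2[1] - q1[1]
    px, py = p2[0] - p1[0], p2[1] - p1[1]
    nq, np_ = math.hypot(qx, qy), math.hypot(px, py)
    s = np_ / nq
    theta = math.acos((qx * px + qy * py) / (nq * np_))
    tx = p1[0] - s * (q1[0] * math.cos(theta) - q1[1] * math.sin(theta))
    ty = p1[1] - s * (q1[0] * math.sin(theta) + q1[1] * math.cos(theta))
    return tx, ty, theta, s

def apply_T(T, pt):
    """Apply the recovered similarity transformation to a point."""
    tx, ty, theta, s = T
    x, y = pt
    return (s * (x * math.cos(theta) - y * math.sin(theta)) + tx,
            s * (x * math.sin(theta) + y * math.cos(theta)) + ty)
```

For example, the correspondence (0,0)→(3,4), (1,0)→(3,6) is explained by a scale of 2, a rotation of π/2 and a translation of (3,4).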