A security assistance system combining person tracking with chemical attributes and video event analysis

C. Becher (a), G.L. Foresti (b), P. Kaul (a), W. Koch (c), F.P. Lorenz (c), D. Lubczyk (d), C. Micheloni (b), C. Piciarelli (b), K. Safenreiter (c), C. Siering (d), M. Varela (c), S.R. Waldvogel (d) and M. Wieneke (c)

(a) Department of Natural Sciences, University of Applied Sciences Bonn-Rhein-Sieg, St. Augustin, Germany
(b) Department of Computer Science, Università degli Studi di Udine, Udine, Italy
(c) Department of Sensor Data and Information Fusion, FGAN-FKIE, Wachtberg, Germany
(d) Kekulé-Institute for Organic Chemistry and Biochemistry, University of Bonn, Bonn, Germany

Abstract: Timely recognition of threats can be significantly supported by security assistance systems that work continuously in time and call security personnel in case of anomalous events in the surveillance area. We are describing the concept and the realization of an indoor security assistance system for real-time decision support. The system consists of a computer vision module and a person classification module. The computer vision module provides a video event analysis of the entrance region in front of the demonstrator. After entering the control corridor, the persons are tracked, classified and potential threats are localized inside the demonstrator. Data for person classification are provided by chemical sensors detecting hazardous materials. Due to their limited spatio-temporal resolution, a single chemical sensor cannot localize this material and associate it with a person. We compensate this deficiency by fusing the output of multiple, distributed chemical sensors with kinematical data from laser-range scanners. Considering both the computer vision information and the results of the person classification affords the localization of threats and a timely reaction of security personnel.
Keywords: Event Recognition, Behavior Analysis, Person Tracking, Classification, Probabilistic Multiple Hypothesis Tracking (PMHT), Attributes, Security Assistance, Quartz Micro Balance (QMB), TATP, Explosives

I. INTRODUCTION

Freedom of movement for people as well as freedom to come together safely in open public events or utilities is vital to each citizen. The defence of this freedom against ubiquitous threats requires the development of intelligent security assistance systems that comprise state-of-the-art surveillance technology and work continuously in time. In our work we demonstrate core functions of an indoor security assistance system for real-time decision support that is based on a heterogeneous sensor suite and multiple sensor fusion techniques. Within this system potential threats are classified, tracked and localized in order to focus the attention of the security personnel.

Further author information (send correspondence to M. Wieneke): (a) {christopher.becher, peter.kaul}, (b) {michelon, piccia, foresti}, (c) {w.koch, lorenz, safenreiter, varela, wieneke}, (d) {lubczyk, carsten.siering, waldvogel}

In many security-relevant utilities, there exist well-defined access regions, e.g. stairways, escalators or gangways. The most efficient way to solve the surveillance task is to focus on these access regions and to continuously monitor and analyze the dynamic events that occur when persons enter the utility. We propose to divide the system into two parts: a video-controlled area in front of the access region and a scanner-controlled area covering the actual access region (fig. 1).

Figure 1. Concept of our security assistance system

The video surveillance module provides event recognition and classification of the events into several alarm levels (fig. 1, top right). In the explicit event recognition approach, the system has an explicit knowledge of the events that must be identified, and once an event is detected, it can be properly labeled with a semantic description.
The fundamental part of an explicit recognition system is thus an a priori knowledge base, where all the information about the recognizable events is stored, and the system behaves as a "parser" matching the incoming data with predefined templates found in the knowledge base. Because of the nature of the explicit event recognition task, it is not surprising that most works on this topic are based on stochastic parsers for the identification of known patterns of atomic events, as in the works by Ivanov and Bobick [8], Minnen et al. [9] or Moore and Essa [10]. Similar to a parser-based approach is the work of Vu, Brémond and Thonnat [11], even if in their case the language used to express complex events is not a full grammar, but rather a set of subevents together with temporal and logical constraints on the subevents. Other works are based on more general stochastic models, such as Bayesian networks, as in Hongeng, Nevatia and Brémond [12] or Moënne-Loccoz, Brémond and Thonnat [13]. More recently, in [14] an unsupervised technique for detecting unusual events in a large video set was presented. In our work we focus on event recognition for the purpose of anomaly detection to support the security personnel.

When persons enter the actual access region their movements are continuously recorded by multiple rotating laser-range scanners. Input data for the person classification are provided by chemical sensors detecting hazardous materials, such as explosives. However, because these sensors have a limited spatio-temporal resolution, an individual chemical sensor is unable to localize hazardous material and to associate it with the persons in the surveillance area. Our system realizes an integrative approach that compensates for this deficiency in dynamic scenarios by fusing the output of multiple chemical sensors with the kinematic estimates resulting from multiple person tracking based on the laser data (fig. 1, left).

The incoming laser-range measurements can be assigned to the constructed and successively updated tracks in many ways. Thus, the solution of the assignment problem is crucial for every multiple target tracking algorithm. The traditional approaches to multiple hypothesis tracking rely on the complete enumeration of all possible association interpretations of a series of measurements and avoid an exponential growth of the arising hypothesis trees by various approximations (MHT: Multiple Hypothesis Tracking [22], [23]; (J)PDAF: (Joint) Probabilistic Data Association Filter [20]). A powerful alternative approach is Probabilistic Multiple Hypothesis Tracking (PMHT) [25], [27], [28], [31]. Essentially, PMHT is based on Expectation-Maximization for handling assignment conflicts. Linearity in the number of targets and measurements is the main motivation for a further development and extension of this methodology.

The original formulation of PMHT [25] deals with measurements that are instantaneous observations of the state of a particular model, here the kinematical model of a person. The problem of associating measurements to targets arises because the particular model that caused a measurement is unknown. Thus PMHT forms an estimate of the unknown model states based on state observations with uncertain origin. In practical applications, a sensor may be able to get other information besides the state observations. Davey [24] considers the case where the tracking filter has an estimate of the class of the target that caused each available state observation and extends PMHT to deal with classification measurements (PMHT-c). A classification measurement is treated as an observation of the assignment of the corresponding measurement. One example of such a measurement is the range profile that occurs in high resolution radar. However, in our security scenario there is no fixed assignment between a position measurement and a chemical output.
In this work we will show how PMHT-c can nevertheless be applied for the purpose of person classification.

The dashed parts in figure 1 (right) are not yet realized within our real-time demonstrator. They refer to the fusion of the event recognition with the chemical person classification. At the current stage the computer vision module and the person classification are running separately.

The remainder of this work is organized as follows: In section II the core components of the proposed security assistance system are introduced. First the computer vision module is explained. We then present the state of the art in the development of chemical sensor devices detecting the explosive TATP. In the last part of this section we describe the algorithm that fuses chemical attributes with person tracks and provides the person classification. The classification ability of the algorithm is shown within a simulated scenario. Section III deals with the realization of the demonstrator that was designed to show the operability of the presented concept and to explore the behavior of chemical sensors. We have to point out that the sensors of section II-B are not yet able to identify TATP in an open system. Our current demonstrator is equipped with metal oxide (MOx) sensors detecting hydrocarbons like fuels, alcohols or solvents. Section IV provides a conclusion and a preview of our future work.

II. CORE COMPONENTS

A. Video Event Analysis

The developed computer-vision module is focused on anomalous event detection for human operator support in security-oriented applications.
The acquired video sequence is processed in order to infer a plausible semantic interpretation of the scene, in which the scene itself and its sub-components are labeled with semantic information giving a high-level description of the recognized scenario.

Our solution provides a first level of computation concerning the extraction of moving objects (blobs) from video streams and the computation of relevant features for event analysis purposes. Moving objects are separated from the background (background modeling and foreground segmentation) and their movements are tracked on a 2D top-view map of the monitored scene [15]. Tracking is performed using a Kalman Filter applied on map positions together with a Meanshift-based tracking technique on the image plane [16]. The object classification is performed by employing an adaptive high order neural tree (AHNT) classifier [17] that distinguishes between two main classes: a) person, b) luggage. At each time instant, a set of low-level features (e.g. object class, position, trajectory, mean speed) is extracted and maintained for every foreground object in the scene. Once all the necessary features have been extracted from the video stream, an approach based on explicit modeling of dangerous events is adopted to describe and understand the activities occurring inside the monitored environment. Two different types of events have been considered: simple events, characterized by the motion (and behavior) of a single object, and composite events, characterized by interactions among multiple objects.

In an indoor environment, a simple event is normally represented by a person or a light vehicle moving in the monitored environment.
A simple event $v$ is defined over a temporal interval $[T_s, T_f]$ and contains a set of features $F = \{f_1, \ldots, f_m\}$ belonging to a given object $O_j$ observed over a sequence of $n$ consecutive frames as:

$$v(T_s, T_f) = \{\, f_k \mid f_k \in O_j,\ k \in [1..m] \,\} \qquad (1)$$

Composite events are represented by a set of simple events that are spatially and/or temporally correlated. Hence, a composite event is defined over a wide temporal interval as a graph $G(V, E)$ where the set of vertexes $V = \{v_1(T_s, T_f), \ldots, v_n(T_s, T_f)\}$ is the set of simple events and the set of edges $E$ is the set of the temporal and spatial associations between simple events. In the proposed solution we restrict the event association to a set of compatible simple events. This is achieved by exploiting an Event Correlation Diagram (ECD) that describes the allowed relations between object types, their states and actions. It therefore defines the possible links between different simple events, even when they are generated by different objects. To generate the ECD, the explicitly defined simple events are considered. For each of these, its possible relations with any other defined simple event are analyzed and, if any exists, a link between the two simple events is added to the ECD [18].

To recognize composite events by associating them with a predefined list of events of interest, we use a graph-matching technique [19]. Each event stored in a graph forest (the database of events of interest) is associated with an alarm level describing the degree of importance of that event in a specific context. We identified three different alarm levels, in increasing scale of danger: a) normal, b) suspicious and c) critical. The 'normal' level is associated with all those events representing a complete event detected within a scene that do not pose any threat from a security point of view.
The 'suspicious' level is associated with all those events that are not necessarily dangerous, but that could lead a human operator to identify the threat if the person classification component of our system raised an alarm concerning the same object. In our specific case, a person waiting in front of the control corridor for more than a predefined time is an example of a suspicious event, since generally people have no reason to wait before entering the demonstrator. Hence, if this person is inside the corridor when a chemical alarm is raised, we can assume that he or she is one of the most probable people to look for. Finally, the 'critical' level is associated with all those events that have been explicitly classified as security threats by the developers of the event knowledge base, and that require an immediate intervention by human operators.

Thus, fusing the alarm level raised by the video event analyzer with those raised by the person classifier inside the tunnel would yield more robust information about the current threat level for a human operator.

B. Chemical Sensors

The detection of explosives or explosive related compounds is a challenging task, because most explosives have a very low vapor pressure and do not release enough analyte molecules into the air. An exception is TATP (Triacetone Triperoxide), which is known as a homemade explosive often used by terrorists and which has a relatively high vapor pressure. A novel chemical sensor device based on quartz micro balances (QMB) has been developed to trace TATP. Since no exclusive affinity material was found, an array of at least 3 QMB sensors has to be employed, providing data which are unequivocally interpreted by Principal Component Analysis (PCA).

The development of efficient sensor materials on a trace level relies on selective binding of the corresponding substrate by well defined molecular recognition sites or distinct affinities [1]. QMB technology can be used as a highly sensitive balance which resolves very small mass changes, typically in the order of nanograms.
According to the Sauerbrey equation the frequency shift of a quartz crystal resonator is directly proportional to the added mass [2], [3]. Since the sensitivity is strongly dependent on the fundamental frequency of such resonators, 200 MHz systems were applied in this project. Interactions between thin organic layers deposited on the surface of such a quartz crystal and analytes in headspace result in a mass increment that lowers the fundamental resonance frequency of the oscillating crystal. An inverse modification of its resonance frequency is easily recorded by standard tools [4]. The vapor pressure of TATP (fig. 3) is at ambient conditions in the range of 68 ppm (Vol) [5]. Consequently, enough analyte should be available in the atmosphere to trace TATP.

Figure 3. Triacetone Triperoxide (TATP)

A typical bomb amounts to multi-kilogram quantities of TATP, mostly carried in open plastic bags. Since the developed QMB device is a low cost sensor, a mass application is possible. Furthermore, the signal of the QMB is provided almost immediately after exposure to the analyte, and the sensor device recovers quickly upon removal of the target. These are all excellent properties for an employment within our person classification algorithm presented in the next section.

Extensive screening studies of naturally occurring as well as synthetic compounds have led to three selective affinity materials that are appropriate for the detection of TATP with respect to common interfering components. The employed affinity materials belong to salts of bile acid, cyclodextrins, and phenylene dendrimers. The latter have a high affinity for TATP and other polar organic analytes, whereas the bile acid interacts with compounds by hydrogen bonds (esp. hydrogen peroxide or water). Cyclodextrin derivatives exhibit a strong interaction with polar organic compounds such as acetone.
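Returning to the Sauerbrey relation cited above, a short numerical sketch may help to see why a high fundamental frequency pays off. It uses the common textbook form $\Delta f = -2 f_0^2 \Delta m / (A \sqrt{\rho_q \mu_q})$ with standard constants for AT-cut quartz. The electrode area, the 1 ng mass load and the function name are illustrative assumptions and not values taken from this work.

```python
import math

# Standard material constants of AT-cut quartz (CGS units).
RHO_Q = 2.648      # density in g/cm^3
MU_Q = 2.947e11    # shear modulus in g/(cm*s^2)


def sauerbrey_shift(f0_hz, delta_m_g, area_cm2):
    """Frequency shift in Hz caused by a thin rigid mass load (textbook Sauerbrey form)."""
    return -2.0 * f0_hz ** 2 * delta_m_g / (area_cm2 * math.sqrt(RHO_Q * MU_Q))


# Assumed, purely illustrative load: 1 ng on a 0.05 cm^2 electrode.
DELTA_M = 1e-9   # grams
AREA = 0.05      # cm^2

shift_10mhz = sauerbrey_shift(10e6, DELTA_M, AREA)
shift_200mhz = sauerbrey_shift(200e6, DELTA_M, AREA)

print("shift at 10 MHz  [Hz]:", shift_10mhz)
print("shift at 200 MHz [Hz]:", shift_200mhz)
print("sensitivity ratio    :", shift_200mhz / shift_10mhz)
```

Because the shift scales with the square of the fundamental frequency, the same mass load produces a shift that is larger by a factor of (200/10)^2 = 400 on the 200 MHz resonator than on a 10 MHz one. The negative sign reflects that added mass lowers the resonance frequency.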
Immobilization of the affine coating on QMB discs was accomplished by means of the electrostatic spray method [6]. For tapping the full potential of QMB and getting a practical solution the knowledge of the detection limit is necessary. The test setup has to ensure defined test conditions with respect to temperature and analyte concentration. The experimental setup consists of two parts (fig. 4). Part A is a gas mixing unit that creates a well defined gas mixture. Part B consists of an oven with a multi-sensor array of six QMB inside. Both parts are connected with a short pipe. The feasible TATP concentration range by this method is 3 - 44 ppm (Vol). Principal Component Analysis (PCA) of the data set yields a two-dimensional presentation (fig. 5). The TATP area (orange) is clearly separated from all other competing analytes. An unequivocal detection and identification of TATP was possible even at concentrations as low as 3 ppm (Vol) [7]. However, in the open system TATP cannot be clearly identified yet.

Figure 2. Example of video event understanding. (a) Typical normal event recognized by the system and presented to the human operator with a green light. (b) A suspicious event has been recognized and signaled to the operator with a yellow light. The operator can look in the log of the events at the bottom of the interface to identify the suspicious entity and investigate its behavior further with respect to the other sensors. (c) A critical event has been recognized and signaled to the operator with a red light. The operator can extract the identity of the entity from the event logs and task other people to investigate in depth.

Figure 4. Schematic drawing of calibration setup

Figure 5. Principal Component Analysis

C. Combined Person Tracking and Classification

We assume that $S$ persons are moving in the surveillance area and are observed by multiple laser-range scanners. The sensors generate a measurement series $Z = Z_{1:T}$ for a time interval $[1:T]$ (*).
The sensor output at a scan $t$ consists of the measurement set $z_t$ (containing the measurements of all sensors) and of the number of all measurements $N_t$. Measurements $z_t^n \in \mathbb{R}^2$ with $n \in [1:N_t]$ are assumed to be Cartesian position data. The task of person tracking consists in estimating the kinematic states $X = X_{1:T}$ of the observed persons (the person tracks). The states $x_t^s \in \mathbb{R}^4$ with $s \in [1:S]$ comprise position and velocity. Each person moves according to a discrete-linear model [21]. Difficulties arise from unknown associations $A = A_{1:T}$ of measurements to persons. The associations are modeled as random variables $a_t^n$ that map each measurement $n \in [1:N_t]$ to one of the persons $s \in [1:S]$ by assigning $a_t^n = s$.

(*) $[1:T]$ denotes the integer interval from 1 up to $T$.

1) Multiple Person Tracking: Probabilistic multiple hypothesis tracking (PMHT) is an efficient method to solve the tracking problem. It works on a sliding data window (also called batch) and exploits the information of previous and following scans in each of its kinematic state estimates. For each window position, the method of expectation-maximization (EM) [26] is applied to the underlying data. Based on EM, an iterative algorithm can be derived [25]. Let $l$ be the number of the current iteration. Each iteration consists of two steps.

Starting with the Expectation-Step (E-Step) we calculate posterior assignment weights $p(a_t^n = s \mid z_t^n, x_t^s(l))$ representing the probability that a measurement $z_t^n$ refers to a person $s$. The weights are calculated for all scans of the current data window and for all persons with respect to all measurements of a certain scan. Each weight is governed by the distance between a particular measurement $z_t^n$ and the state estimate $x_t^s(l)$. The weights are then used to form the weighted sum $\bar{z}_t^s(l)$ of all measurements, which leads to one synthetic measurement per person at each scan $t$.
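The E-Step for a single scan can be sketched in a few lines. This is a minimal illustration under simplifying assumptions that are not taken from the paper: equal prior assignment probabilities for all persons, an isotropic measurement covariance with standard deviation `sigma`, and hypothetical function names. The detailed formulae of the actual algorithm are those referenced in the text.

```python
import math


def gaussian_2d(z, mean, sigma):
    """Isotropic 2D Gaussian density (assumed measurement model)."""
    dx, dy = z[0] - mean[0], z[1] - mean[1]
    norm = 2.0 * math.pi * sigma ** 2
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma ** 2)) / norm


def e_step(measurements, positions, sigma=0.3):
    """One-scan E-Step with equal priors (an assumption of this sketch).

    weights[n][s] : posterior probability that measurement n refers to person s
    synthetic[s]  : weighted mean of all measurements for person s
    The sketch assumes every measurement lies within a few sigma of some
    track, so that the normalization sums stay well above zero.
    """
    weights = []
    for z in measurements:
        likelihoods = [gaussian_2d(z, p, sigma) for p in positions]
        total = sum(likelihoods)
        weights.append([lik / total for lik in likelihoods])

    synthetic = []
    for s in range(len(positions)):
        w_sum = sum(w[s] for w in weights)
        x = sum(w[s] * z[0] for w, z in zip(weights, measurements)) / w_sum
        y = sum(w[s] * z[1] for w, z in zip(weights, measurements)) / w_sum
        synthetic.append((x, y))
    return weights, synthetic


# Two persons (current position estimates) and four laser returns of one scan.
positions = [(0.0, 0.0), (2.0, 0.0)]
measurements = [(0.1, 0.0), (-0.1, 0.1), (2.1, -0.1), (1.9, 0.0)]

weights, synthetic = e_step(measurements, positions)
for s, z_bar in enumerate(synthetic):
    print("person", s, "synthetic measurement", z_bar)
```

Each row of `weights` sums to one, so a measurement close to two persons is shared between them instead of being assigned to exactly one track. This soft assignment is what lets PMHT avoid enumerating association hypotheses.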
A corresponding formula exists for the synthetic covariance. During the Maximization-Step (M-Step) each person track is updated by means of a Kalman Smoother that processes the synthetic values. This leads to new, improved state estimates $x_{1:T}^s(l+1)$ for each person $s$.

E-Step and M-Step are repeated until the state estimates no longer change considerably (convergence). After convergence, the prediction $x_{T+1|T}^s$ is calculated for the following window position. When all persons have been processed, the window is shifted by one scan and the iteration process is started for the new window position. Detailed formulae of the PMHT algorithm are presented in [29].

2) Incorporating Classification Information: Originally PMHT-c [24] was designed to take advantage of classification measurements to improve data association and state estimation. In the considered scenarios, the class observation could be utilized to improve tracking, because for each position measurement the corresponding classification output was known. The author deals with pairs of measurements that consist of a kinematical and a classification part. The target class estimates occur as a by-product. High resolution radar is one example of a system where these classification measurements exist. Range profiles from various azimuth angles form a radar image of the target. The location of primary scatterers can be used to classify the target.

Figure 6. Corridor with 5 Chemos (circles) and 2 Lasers (rectangles)

In our system there is no information about the assignment of chemical attributes to laser data. To apply PMHT-c we have to consider the scenario in a different way: In the security scenario there are also pairs of position and classification output, but this position is not provided by the lasers. In fact it is given by the chemical sensor placement and output. So referring to the experimental corridor in figure 6 we have five measurement pairs at each scan $t$.
Each of them consists of the chemical sensor position and its classification output. In the following we explain how to associate the chemical outputs with person tracks by applying PMHT-c and how to find out who is carrying the hazardous material.

In an early version of our algorithm [30] we used the installation place of each chemical sensor as its position information. The closer a person passes a chemical sensor, the higher is the influence of the reported sensor output with respect to the classification of this person. The procedure fails if another person, not carrying an explosive, stays closer to the sensor than the dangerous one. A precise mathematical model of a chemical sensor could yield more useful position information, e.g. an estimated distance of the chemical source based on the amplitude of the chemical signal. In this case the position measurement belonging to a certain classification output lies on a circle whose radius is determined by the value of the signal amplitude (fig. 7).

Let $p^{ch}$ denote the position information of the chemical sensor $ch \in [1:5]$ ($s$ and $t$ omitted for the sake of simplicity). We denote the classification measurement associated with $p^{ch}$ as $o_t^{ch}$ and let the total measurement vector $z_t^{ch} := (p^{ch}, o_t^{ch})^T$ be the collection of the position (state observation) with its associated classification measurement. The classification measurement is a discrete variable that can take a value from five different classes: green stands for No Alert, and yellow, orange, red and dark red symbolize the alert levels from I up to IV (low to high). Our problem can be formalized as follows: Given the estimated kinematic states $X^{l}$ of the $S$ persons we want to estimate their classification.
The desired information is represented by a so called confusion matrix $C = \{c_{is}\}$ where the entry $c_{is}$ is the probability that person $s$ produces class output $i$.

Starting from these modeling assumptions and observations the PMHT-c algorithm [24] can be applied. To get PMHT-c, the expectation-step (E-Step) and the maximization-step (M-Step) of the basic PMHT have to be extended by the classification estimates.

a) Calculate Assignment Weights (E-Step): First we have to calculate the posterior assignment probabilities $w_t^{ch \to s}(l)$. Following [24] we use the update formula

$$w_t^{ch \to s}(l) = \sigma \cdot \mathcal{N}\big(p^{ch};\, H x_t^s(l),\, \mathrm{Cov}\big) \cdot c_{o_t^{ch} s}(l). \qquad (2)$$

These posterior weights reflect the relevance of a chemical output for a person $s$ in the surveillance area. The posterior assignment weights are mainly governed by the Gaussian $\mathcal{N}(p^{ch}; H x_t^s(l), \mathrm{Cov})$, which is a measure for the distance between the position information of sensor $ch$ and the current position estimate of person $s$ (fig. 7, top). $H$ is the observation matrix. The corresponding covariance matrix $\mathrm{Cov}$ reflects the uncertainty of the position information and has to be experimentally determined. $c_{o_t^{ch} s}(l)$ is the current estimate of the matrix entry that associates the output of sensor $ch$ with person $s$. The posterior weights $w_t^{ch \to s}(l)$ are calculated for each sensor $ch$ and each person $s$ at each scan of the current data window.

b) Update Classification Matrix (M-Step): During the M-Step our parameter estimates have to be updated. Besides the parameters for tracking purposes, we have to update the