TSINGHUA SCIENCE AND TECHNOLOGY
ISSN 1007-0214 55/67 pp343-347
Volume 13, Number S1, October 2008

A Target Tracking System for Applications in Hydraulic Engineering

SHEN Qiaonan, AN Xuehui**

Department of Hydraulic and Hydropower Engineering, Tsinghua University, Beijing 100084, China

Received: 2008-05-30
** To whom correspondence should be addressed. E-mail: anxue@mail.tsinghua.edu.cn; Tel: 86-10-62794285

Abstract: A new type of digital video monitoring system (DVMS), named the user-defined target tracking system (UDTTS), was developed based on digital image processing (DIP) technology and the practical demands of construction site management in hydraulic engineering. Management requires the position, speed, and track of moving targets such as humans and vehicles, which can be calculated from their locations in the images at any time. The proposed algorithm depends on the context-sensitive motion information of image sequences, which is much richer than what one or two images provide: it compares the properties of the blobs in the current frame to the trajectories of targets in the previous frames and establishes their correspondence. The processing frame rate is about 10 fps for 240-by-120 pixel images. Experimental results show that the position, direction, and speed measurements have an accuracy level compatible with manual work. The user-defined process makes the UDTTS available to the public whenever appropriate.

Key words: target tracking system; digital image processing; user-defined; consecutive trajectory

Introduction

It is widely recognized that hydraulic construction engineering is an information-intensive and complex industry. Present trends in hydraulic construction engineering have heightened the need for effective and efficient collection, monitoring, and analysis of construction progress data. In recent years, the use of digital video monitoring systems (DVMS) in the surveillance phase of a project has grown rapidly, improving progress control, safety monitoring, and work coordination over the entire project [1]. However, the information within the thousands of digital videos and images stored for a project by the DVMS cannot be obtained automatically.

A large number of components and their features need to be inspected on construction sites [2,3]. Many of these features must be assessed against tight tolerances, requiring that inspections be extremely accurate. At the same time, inspection resources, such as the time that inspectors can spend on site, are limited. Therefore, inspectors can benefit from emerging technologies that improve the efficiency of data collection while on site, and from visualization technologies that improve the effectiveness and efficiency of inspection tasks using these data.

The capability to automatically identify objects in images through various methodologies is a product of technological breakthroughs in the area of digital image processing (DIP) [4,5]. Detection and tracking of targets on a construction site is not only a single-object tracking problem but also a multi-object tracking problem. Numerous approaches [6] for multi-object tracking have been proposed, but it remains a far more difficult and challenging problem. In addition to the normal frame-to-frame following of a salient area, the system must be able to handle occurrences, disappearances, crossings, and other complicated events related to multiple moving targets. Features [7-12] such as color, texture, shape, and motion properties are used for tracking.

In this study, a new type of DVMS named the user-defined target tracking system (UDTTS) was proposed and developed based on DIP technology and the practical demands of construction site management in hydraulic engineering. A new algorithm was proposed for multi-object tracking, depending on blob properties and context-sensitive motion information.

1 System Overview

The system, called the UDTTS, includes four parts: the user-defined process, data preprocessing, moving object detection, and tracking. The input data is a video file or a stream of images captured by a stationary digital video camera mounted on a horizontal gantry or on a tripod in a static position at the construction site.

1.1 User-defined process

The system can handle many aspects of management through the user-defined process. Users can define an application, such as vehicle flow, human flow, or grinding variables, in four steps. Images including the targets and the static background must be provided to the UDTTS. First, generate the initial background model when the background image is input; second, define a target on a target image captured on the construction site; third, define the controlling conditions that the target must satisfy; finally, define an output format. The definition of an application is then finished.
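To make the four definition steps concrete, the following C++ sketch shows one possible way to hold a user-defined application. The paper does not give data structures, so every name here (GrayImage, ControlConditions, ApplicationDefinition, and the field names and default values) is a hypothetical illustration; the threshold fields anticipate the parameters of Eqs. (1)-(3) in Section 2.3.

```cpp
// Hypothetical sketch of a UDTTS application definition; names and default
// values are illustrative assumptions, not taken from the paper.
#include <string>
#include <vector>

struct GrayImage {                 // single-channel 8-bit image
    int width = 0, height = 0;
    std::vector<unsigned char> pixels;   // row-major, width * height entries
};

struct ControlConditions {         // conditions a target must satisfy (step 3)
    int    minBlobArea  = 20;      // discard regions smaller than this
    double maxMoveDist  = 15.0;    // X_d in Eq. (1), pixels per interval
    double speedRatioX  = 0.5;     // x_VX in Eq. (2), in (0, 1)
    double speedRatioY  = 0.5;     // x_VY in Eq. (2), in (0, 1)
    double maxTurnAngle = 45.0;    // theta_1 in Eq. (3), degrees, in (-90, 90)
};

struct ApplicationDefinition {
    GrayImage         background;    // step 1: initial background model
    GrayImage         targetSample;  // step 2: target defined on a site image
    ControlConditions conditions;    // step 3: controlling conditions
    std::string       outputFormat;  // step 4: e.g., a CSV trajectory report
};
```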
1.2 Application analysis

Moving targets such as vehicles, humans, and other objects on a construction site have variable colors, sizes, shapes, speeds, and directions. These features can be utilized to detect and track them. As shown in Fig. 1, an application can be worked out from a target's trajectory, which consists of its positions at sequential times. The problem is how to determine the positions of a target at any time from the stream of color images. In the UDTTS, after the user-defined process, the video captured on the construction site is input for processing. The procedure performs several image processing tasks to detect and track moving objects in the scene. The result can be output in the user-defined format.

Fig. 1  Application analysis

2 Tracking Method

The purpose of the tracking part is to detect moving objects in the video stream and collect appropriate data about their routes. Tracking is usually performed in the context of higher-level applications that require the location and/or shape of the object in every frame. Typically, assumptions are made to constrain the tracking problem in the context of a particular application. In its simplest form, tracking can be defined as the problem of estimating the trajectory of an object in the image plane as it moves around a scene.

The task of detecting and tracking moving objects in video deals with the problems of extracting moving objects (foreground-background separation) and generating the corresponding persistent trajectories. In the case of multiple objects in the scene, the tracking task is equivalent to solving the correspondence problem. At each frame, a set of trajectories and a set of measured objects (blobs) are available. Each object is identified by finding the matching trajectory.

2.1 Detection of moving objects

Detection of moving objects in video streams is the first relevant step of information extraction in many computer vision applications. Aside from the intrinsic usefulness of being able to segment video streams into moving and background components, detecting moving objects provides a focus of attention for recognition, classification, and activity analysis, making these later steps more efficient.

At the hardware level, color images are usually captured, stored, and displayed using elementary R, G, B component images. The color images read from the frame grabber are transformed to gray-scale images, preserving only the luminance information, in order to reduce the computational load and to guarantee an adequate frame rate (around 10 fps) for tracking. Each incoming frame goes through four successive image processing stages in which the raw intensity data is reduced to a compact set of features that can be used by the matching method. These four stages are gray-scale transformation, background subtraction, threshold segmentation, and connected component labeling, as shown in Fig. 2.

Fig. 2  The digital image processing steps

Motion detection starts by computing a pixel-based absolute difference between each incoming frame and the static background frame provided by the users. Pixels are assumed to contain motion if the absolute difference exceeds a predefined threshold level. As a result, a binary image is formed in which active pixels are labeled with "1" and non-active ones with "0".

The figures directly extracted from the resulting binary image are typically fragmented into multiple segments. To avoid this, a morphological closing operation with a 3-by-3 square kernel is applied to the image. As a result, small gaps between the isolated segments are erased and the regions are merged.

After closing, we use connected component analysis [13] followed by region area detection. Regions with an area smaller than a predefined threshold are discarded. The position and area of each blob are detected in the local model of the individual frame. After detection, the objects in the local model of a single frame must be integrated into the trajectories in the world model of all frames through the matching method.
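As an illustration of the detection stages just described (background differencing, threshold segmentation, and 3-by-3 morphological closing), here is a minimal C++ sketch. It assumes frames have already been converted to gray scale and stored row-major; the function names and the treatment of border pixels are our own choices, not the paper's.

```cpp
// Minimal sketch of the per-frame detection stages of Section 2.1;
// a sketch under stated assumptions, not the paper's implementation.
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Binary motion mask: 1 where |frame - background| exceeds the threshold.
// Both images are gray-scale, row-major, and of identical size.
std::vector<unsigned char> subtractBackground(
        const std::vector<unsigned char>& frame,
        const std::vector<unsigned char>& background,
        int threshold) {
    std::vector<unsigned char> mask(frame.size());
    for (std::size_t i = 0; i < frame.size(); ++i)
        mask[i] = std::abs(int(frame[i]) - int(background[i])) > threshold ? 1 : 0;
    return mask;
}

// One pass of 3x3 dilation (dilate == true) or erosion over a binary mask;
// out-of-bounds neighbors are simply skipped at the image border.
static std::vector<unsigned char> morph3x3(const std::vector<unsigned char>& in,
                                           int w, int h, bool dilate) {
    std::vector<unsigned char> out(in.size());
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            unsigned char v = dilate ? 0 : 1;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx) {
                    int yy = y + dy, xx = x + dx;
                    if (yy < 0 || yy >= h || xx < 0 || xx >= w) continue;
                    v = dilate ? std::max(v, in[yy * w + xx])
                               : std::min(v, in[yy * w + xx]);
                }
            out[y * w + x] = v;
        }
    return out;
}

// Morphological closing with a 3-by-3 square kernel: dilation then erosion,
// which merges small gaps between fragmented segments.
std::vector<unsigned char> close3x3(const std::vector<unsigned char>& mask,
                                    int w, int h) {
    return morph3x3(morph3x3(mask, w, h, true), w, h, false);
}
```

Connected component labeling and the region-area filter would then run on the closed mask to produce the blob list.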
2.2 Tracking of moving objects

Tracking is needed to determine the object correspondence between frames. In our approach, the main tracked feature is the object trajectory, which is consecutive across frame sequences. Since the speed of objects on a construction site is not too high, we assume that the blob in the current frame and its corresponding trajectory in the previous frames overlap. The object centroid and dynamic information are used for tracking. The speed and direction of the object, generated from the previous trajectory, are stored in the world model of all frames. They are also useful features for matching.

In general, frequent visual overlaps between objects cause difficulties for a tracking system. Since blob generation for moving objects is based on connected component analysis, touching objects generate a single merged object, where pixel classification, i.e., to which original blob individual pixels belong, is hard to resolve. This leads to the problem that, in a merged state, individual tracks cannot be updated. To overcome this problem, we propose a solution using a technique that generates plausible trajectories of the objects in a merged state by performing matching between objects entering and leaving the merged state. The matching is based on the kinematic smoothness constraint. The method is presented in Section 2.3.

In the first frame, each blob generates a trajectory with the following attributes: area, speed, direction, and status. The consecutive judgement, described in Section 2.3, is used for matching. The scheme of the tracking algorithm is outlined as follows, with a sketch of the per-frame scheme given after the steps.

Step 1  If a blob is exactly matched to one existing trajectory, the trajectory properties (area, speed, direction, and status) are updated.

Step 2  If a blob matches two trajectories, crossing happens. Set the status of these trajectories to crossing, then do not process them until splitting happens.

Step 3  If a trajectory matches two blobs, splitting happens. Find the partner trajectory and compare both trajectories to the two blobs. Update the properties of the two trajectories.

Step 4  If a non-matched blob is found, a new trajectory is generated.

Step 5  If a non-matched trajectory is detected, either exiting or a failure of the blob detection has happened. If the trajectory tends to move out of the view, exiting is the likely cause; otherwise, leave it to be processed in the next frame.
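Below is a minimal C++ sketch of how Steps 1-5 might be organized per frame. The Blob and Trajectory structures, the single distance test standing in for the full consecutive judgement, and the omission of the splitting and exiting logic (Steps 3 and 5) are all simplifying assumptions of ours, not the paper's implementation.

```cpp
// Simplified per-frame correspondence following Steps 1-5 of Section 2.2.
#include <cmath>
#include <cstddef>
#include <vector>

struct Blob { double cx = 0, cy = 0, area = 0; };   // centroid and area

struct Trajectory {
    enum Status { NORMAL, CROSSING, EXITED };
    std::vector<Blob> history;      // one matched blob per frame
    Status status = NORMAL;
};

// Stand-in for the consecutive judgement of Section 2.3: the blob and the
// trajectory are "consecutive" if the new centroid lies within maxDist of
// the trajectory's last centroid.
bool isConsecutive(const Trajectory& t, const Blob& b, double maxDist) {
    const Blob& last = t.history.back();
    return std::hypot(b.cx - last.cx, b.cy - last.cy) <= maxDist;
}

void trackFrame(std::vector<Trajectory>& tracks,
                const std::vector<Blob>& blobs, double maxDist) {
    const std::size_t existing = tracks.size();  // ignore tracks added below
    for (const Blob& b : blobs) {
        std::vector<std::size_t> matched;
        for (std::size_t i = 0; i < existing; ++i)
            if (tracks[i].status != Trajectory::EXITED &&
                isConsecutive(tracks[i], b, maxDist))
                matched.push_back(i);

        if (matched.empty()) {
            // Step 4: a non-matched blob starts a new trajectory.
            Trajectory t;
            t.history.push_back(b);
            tracks.push_back(t);
        } else if (matched.size() == 1) {
            // Step 1: a unique match updates the trajectory. (Step 3,
            // splitting, would be detected when one trajectory collects
            // two blobs in a frame; it is omitted in this sketch.)
            tracks[matched[0]].history.push_back(b);
        } else {
            // Step 2: one blob matching several trajectories means crossing;
            // freeze them until splitting happens.
            for (std::size_t i : matched)
                tracks[i].status = Trajectory::CROSSING;
        }
    }
    // Step 5: trajectories that matched no blob are candidates for exiting
    // (boundary test, or deferral to the next frame); omitted here.
}
```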
2.3 Consecutive judgement

Consecutive judgement: as shown in Fig. 3, if a blob (solid line) in the current frame and a trajectory (dotted line) overlap, we say they are consecutive; otherwise, they are inconsecutive.

Fig. 3  Consecutive and inconsecutive trajectory

In the case of an inconsecutive trajectory, three features (maximum distance, limited speed, and correlative direction) are used for matching (conditions shown in Fig. 4).

Fig. 4  Inconsecutive trajectory conditions

If a trajectory has been generated by only one blob, speed and direction are not yet effective values. The distance d_{ij} between the current blob centroid and the previous blob centroid should fulfill the condition

d_{ij} \le X_d    (1)

where X_d denotes the maximum distance an object can move in a certain interval, and i, j are the frame numbers.

If a trajectory has been generated by more than two blobs, speed and direction can be used for matching. The blob and the trajectory match each other if the current speed and the direction correlation are within the acceptable ranges, i.e.,

(1 - x_{VX}) V_x^{n-1} \le V_x^n \le (1 + x_{VX}) V_x^{n-1}
(1 - x_{VY}) V_y^{n-1} \le V_y^n \le (1 + x_{VY}) V_y^{n-1}    (2)

where V_x is the speed along the X-axis, V_y is the speed along the Y-axis, n is the frame number, and x_{VX} and x_{VY} are predefined ratios in (0, 1); and

\cos\theta_1 \le \cos\theta \le 1    (3)

where \theta is the angle between the current direction and the previous one, and \theta_1 is a predefined angle in (-90°, 90°). Otherwise, they do not match.

As described above, when blobs overlap, the observation of a single merged blob does not allow reconstructing the trajectories of the original entering blobs. We simply add the merged blob to these trajectories for the later consecutive judgement, and record the frame number i and the time at which the crossing happens. When splitting happens at frame k, direction consistency and correlative speed are used to match the blobs and the trajectories based on the kinematic smoothness constraint. In the case of entering or exiting, the blob must be near the boundary of the processing area.
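To make the three matching conditions concrete, the following C++ sketch evaluates Eqs. (1)-(3) directly. Variable names mirror the paper's symbols (X_d, x_VX, x_VY, theta_1); the Vec2 helper and the guard against a zero-length direction vector are our additions, and Eq. (2) is applied to the speed components exactly as reconstructed above.

```cpp
// Matching conditions for an inconsecutive trajectory, Eqs. (1)-(3).
#include <cmath>

struct Vec2 { double x = 0, y = 0; };

const double kPi = 3.14159265358979323846;

// Eq. (1): the displacement between consecutive blob centroids is bounded
// by X_d, the maximum distance an object can move in one interval.
bool distanceOk(const Vec2& prevCentroid, const Vec2& curCentroid, double Xd) {
    return std::hypot(curCentroid.x - prevCentroid.x,
                      curCentroid.y - prevCentroid.y) <= Xd;
}

// Eq. (2): each speed component must stay within (1 +/- ratio) of its
// previous value, with the ratios x_VX and x_VY predefined in (0, 1).
bool speedOk(const Vec2& vPrev, const Vec2& vCur, double xVX, double xVY) {
    return (1.0 - xVX) * vPrev.x <= vCur.x && vCur.x <= (1.0 + xVX) * vPrev.x &&
           (1.0 - xVY) * vPrev.y <= vCur.y && vCur.y <= (1.0 + xVY) * vPrev.y;
}

// Eq. (3): cos(theta_1) <= cos(theta) <= 1, i.e., the angle theta between the
// current and previous direction must not exceed the predefined theta_1.
bool directionOk(const Vec2& dPrev, const Vec2& dCur, double theta1Deg) {
    double dot  = dPrev.x * dCur.x + dPrev.y * dCur.y;
    double norm = std::hypot(dPrev.x, dPrev.y) * std::hypot(dCur.x, dCur.y);
    if (norm == 0.0) return false;          // degenerate direction vector
    return dot / norm >= std::cos(theta1Deg * kPi / 180.0);
}
```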
3 An Example

The tracking system UDTTS has been applied to two video files captured at the Xiangjiaba dam to track vehicles. One of the test sequences contains a single object; the other contains multiple objects with entering, exiting, and crossing events. The static background is provided to define the processing area (the rectangle in Fig. 5), and the target's area is obtained before processing. The main parameters of the algorithm implementation were: Windows XP, VC++ 6.0, an AMD Athlon 2.01-GHz CPU, and 1.00 GB of memory. The processing frame rate is about 8 fps for an image size of 240 by 120. The accuracy and stability of the system depend on these initially predefined parameters.

The second sequence contains 4 vehicles generating 1 entering, 4 exiting, and 3 crossing events. The tracking results, such as the centroid sequence and the trajectory of each vehicle, are shown in Fig. 5. Four frames, at the 1st, 3rd, 7th, and 26th seconds, are listed on the left. A crossing event between vehicles 2 and 3 occurs at T_2; vehicles 1 and 3 cross at T_3, as do vehicles 1 and 4. Vehicle 3 is moving out of the processing area at T_4. Vehicle 1 disappears at T_4, and vehicle 2 has moved out of the processing area since T_3. Vehicle 4 appears after T_3 and finally leaves the processing area. A summary of the observed events is given in Table 1.

Table 1  Critical events processed by the tracking method

Items               Entering events   Exiting events   Crossing events
Test results               1                 4                3
Actual situations          1                 4                3