A General Review of Human Face Detection Including a Study of Neural Networks and Haar Feature-based Cascade Classifier in Face Detection

2014 International Symposium on Biometrics and Security Technologies (ISBAST)

Ali Sharifara, Mohd Shafry Mohd Rahim and Yasaman Anisi
Department of Computer Graphics and Multimedia, Faculty of Computing
Universiti Teknologi Malaysia, 81310, Johor, Malaysia
{a.sharifara, shafryr, yasaman.anisi}@gmail.com

Abstract—Face detection is an active research area in computer vision and pattern recognition, and has been especially so during the past several years. It also plays a vital role in surveillance systems, as it is the first step in face recognition systems. The high degree of variation in the appearance of human faces makes face detection a complex problem in computer vision. Face detection systems aim to decrease the false positive rate and increase the accuracy of detecting faces, especially in images with complex backgrounds. The main aim of this paper is to present an up-to-date review of face detection methods, including feature-based, appearance-based, knowledge-based, and template matching approaches. The study also presents the effect of applying Haar-like features together with neural networks. We conclude the paper with some discussion of how the work can be taken further.

Keywords—face detection; feature based face detection; human face detection; haar-like features; neural networks.

I. INTRODUCTION

Over the last decades, the development of technology has facilitated the improvement of real-time vision modules that interact with individuals. Object detection is one of the computer technologies connected to image processing and computer vision; it deals with detecting instances of objects of a specified class, such as human faces, buildings, trees, and cars. The objects can be taken from digital images or video frames.
The basic aim of face detection algorithms is to determine whether there is any face in an image [1]. In other words, face detection is a task in which faces shown in pictures or video are searched for automatically. Face detection is one of the domains of object detection; many methods have been proposed, all of which aim to detect face(s) in a given image or in real-time surveillance systems, with different accuracy and false detection rates. Most researchers use machine learning as their main tool for detecting faces in both static images and video. During the past several years, the face detection problem has received considerable attention due to the range of its applications in commerce and law enforcement. Moreover, in recent years many pattern recognition and heuristic-based methods have been proposed for detecting human faces in images and videos [2]. Face detection is the first stage of many face processing systems, including face recognition, automatic focusing in cameras, automatic face blurring in pictures, pedestrian and driver drowsiness detection in cars, criminal identification, and access control [3]. A challenging issue in face detection is the inherent diversity of faces in shape, texture, and colour, and the possible presence of a beard, moustache, and/or glasses. Furthermore, the photographing conditions can cause additional differences, such as varying lighting conditions, head pose, and facial expression. In addition, most face detection algorithms can be extended to recognize other objects such as cars, humans, and pedestrians [4].

II. BACKGROUND

Face detection is one of the demanding issues in image processing; it aims to handle all feasible appearance variations caused by changes in illumination, occlusion, facial features, etc. [5]. Furthermore, face detection algorithms have to detect faces which appear at different scales and poses.
In the last decade, in spite of all these difficulties, superb progress has been made and many systems have shown remarkable performance. The recent advances in these algorithms have also made important contributions to detecting other objects such as buildings, pedestrians, and cars. Face detection algorithms must tolerate several factors, including posture, the existence or lack of structural elements, facial expression, occlusion, image orientation, illumination, and the speed and time of computation. In the next section, the factors which can affect the result of face detection algorithms are reviewed: head pose, facial expression, image orientation, occlusion, and illumination.

978-1-4799-6444-4/14/$31.00 ©2014 IEEE

A. Head Pose

The location of a face in an image can vary due to in-plane rotation and frontal, profile, half-profile, or upside-down views. Also, some facial features such as the eyes or nose may be partially or completely blocked. Figure 1 shows different pose views (from the GTAV Face Database).

Fig. 1. Different pose views

B. Facial Expression

Facial expression is one of the most influential and immediate means for human beings to convey their emotions and meanings. Facial expression relates to the appearance of the face, such as anger or happiness, which directly affects the individual's face [6].

C. Image Orientation

Depending on the nature of the image, faces may appear upright, upside-down, rotated, or mirrored from left to right, the latter resembling an attempt to read a sign in a mirror.

D. Occlusion

Occlusion is the obstruction of face(s) in images, where a face can be covered partly or wholly by other objects. For instance, a face in an image can be partially or fully covered by other people's faces. Figure 2 illustrates some occluded faces (from the GTAV Face Database), which can affect the result of face detection.

Fig. 2. Sample occluded face images

E.
Illumination

Illumination is a crucial factor in determining the quality of images and can strongly affect the evaluation of the image and, consequently, the detected faces. This factor relates to the lighting and the angle of the light present in the image. Figure 3 shows the same face under different illumination.

Fig. 3. Same face under different illumination

F. Computation Time and Speed

Execution time is another important factor for real-time applications, and face detection is no exception. This parameter is critical especially when the algorithm is supposed to work in real time. Numerous studies have been proposed in face detection to enhance performance and accuracy under the challenges mentioned above. Normally, each study concentrates on a specific issue: for instance, some research in face detection has addressed the expression challenge, while other work has targeted execution time. The vast range of applications of face detection motivates researchers to discover methods for enhancing the accuracy and execution time of face detection systems. In addition, many approaches have been proposed to improve the accuracy as well as the performance.

III. GENERAL FACE DETECTION METHODS

Generally, face detection approaches can be divided into four main categories: feature-based, appearance-based, knowledge-based, and template matching methods. Figure 4 depicts the general methods of face detection.

Fig. 4. General face detection methods

A. Feature-Based Methods

There are many features in a human face which can distinguish a face from other objects. This approach depends on extracting features of the face which do not undergo changes in the images, such as occlusion, illumination, and pose.
Some features that can be used to detect face(s) include skin colour, nose, ears, eyes, and mouth. Furthermore, some studies have shown that skin colour is an excellent feature for distinguishing faces from other objects, since different people have different skin colours; this is even more pronounced when race is also a metric of evaluation [7]. Additionally, human faces have particular textures which can be used to differentiate between faces and other objects. Moreover, edges of features can help to separate objects from the face, and blobs and streaks can assist in discovering objects in a given image. One of the challenges for feature extraction methods is feature "restoration". This can happen when the algorithm attempts to retrieve features that are invisible due to large variations, e.g. head pose, when matching a profile image against a frontal image [8]. Feature-based methods have some advantages: they are rotation independent, scale independent, and their execution time is very short compared to other methods [9]. Feature-based methods include facial features, skin colour, texture, and multiple features.

B. Appearance-Based Methods

The appearance-based method is another type of face detection method. In this type, face detection is treated as a two-class pattern recognition problem. This method uses classification on features computed from the pixel values within a search window; for example, MCT (Modified Census Transform) and Haar-like features are some of the feature types that have been used. In addition, this is a learning-based method, and the classifier must be created by statistical learning over an enormous set of examples. For instance, adaptive boosting (AdaBoost) combines several weak classifiers to create a cascade classifier which is multi-stage and effective [10].
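As a minimal sketch of skin-colour-based face candidate detection (a generic rule-of-thumb RGB threshold, not the implementation of any cited work; the exact constants below are illustrative assumptions):

```python
import numpy as np

def skin_mask(rgb: np.ndarray) -> np.ndarray:
    """Return a boolean mask of likely skin pixels for an HxWx3 uint8 RGB image.

    Uses a classic rule-based RGB threshold; the constants are
    illustrative, not tuned for any particular dataset.
    """
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    spread = rgb.max(axis=-1).astype(int) - rgb.min(axis=-1).astype(int)
    return (
        (r > 95) & (g > 40) & (b > 20)   # bright enough in each channel
        & (spread > 15)                  # skin is not grey
        & (np.abs(r - g) > 15)           # red clearly dominates green
        & (r > g) & (r > b)              # red is the strongest channel
    )

# Example: one skin-like pixel and three non-skin pixels.
img = np.array([[[200, 120, 90], [0, 0, 0]],
                [[50, 200, 50], [90, 90, 90]]], dtype=np.uint8)
print(skin_mask(img))
```

Connected regions of the resulting mask would then serve as face candidates for further verification.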
In other words, there are weak and strong classifiers: the weak classifiers first check each window, and windows that pass the required threshold value are passed on to stronger classifiers; the algorithm continues this scheme until the end of the cascade. The advantage of this method is that non-face windows are rejected early, so execution time decreases while accuracy increases [11]. Figure 5 illustrates the AdaBoost classifier, which proceeds from weak to strong classifiers.

C. Knowledge-Based Methods

Another approach is the knowledge-based method. Here, simple rules are defined for detecting faces in an image, and the rules can be extended to detect face(s) against a complicated background. The rules can encode features of a face, for example: there are two eyes, one nose, one mouth, and other facial features. Facial features in an input image are extracted first, and face candidates are then recognized based on the defined rules [12]. Notably, the method uses the positions of, and the distances between, the selected features. The challenging issue in this approach is how human knowledge can be translated into rules that apply to all faces under different conditions [13].

Fig. 5. AdaBoost cascade classifiers

D. Template-Based Methods

Template matching methods are commonly used to obtain the regions most likely to be a human face. A template is an instance of an object or of features of a face. Template-based methods use the relation between the pattern of the input image and a predefined pattern of the face or its features. This kind of approach uses predefined templates for detection; it compares sub-regions against the predefined templates using pixel intensities. Figure 6 illustrates a sample of defined templates which can be used in template-based methods.

Fig. 6.
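The weak-to-strong cascade with early rejection described above can be sketched as follows; the stage functions and thresholds are hypothetical placeholders, not the actual trained classifiers:

```python
from typing import Callable, List

# A stage scores a window; the window survives only if the score passes
# that stage's threshold. Early stages are cheap and reject most
# non-face windows quickly, so later (stronger) stages run rarely.
Stage = Callable[[list], float]

def cascade_classify(window, stages: List[Stage], thresholds: List[float]) -> bool:
    """Return True only if the window passes every stage in order."""
    for stage, thr in zip(stages, thresholds):
        if stage(window) < thr:
            return False  # early rejection: stop paying for later stages
    return True

# Toy example: "windows" are lists of numbers, stages are simple statistics.
stages = [lambda w: sum(w), lambda w: max(w)]
thresholds = [10.0, 5.0]
print(cascade_classify([4, 4, 6], stages, thresholds))  # sum=14, max=6 -> True
print(cascade_classify([1, 1, 1], stages, thresholds))  # sum=3 < 10 -> False
```

The key design property is that the per-window cost is dominated by the first stage, which is why the cascade scheme reduces overall execution time.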
Sample of defined template images

Template-based methods are simple to implement, but some assumptions must be made in advance. For example, the face must be seen from a frontal view without any occlusion, and the face(s) must be the same size as the predefined templates. In other words, this kind of algorithm is dependent on size, scale, and rotation. Their computations are also very expensive, which decreases the performance of the method. Figure 7 depicts some samples of aligned face images which are ready for use with template-based methods.

Fig. 7. Sample of aligned face images

E. Template-Based Methods for Side-View Faces

Most face detection algorithms concentrate only on frontal-view face detection, so faces seen from other views are missed. Hence, recent studies aim to solve the problem of face detection even for particular angles of the face [14]. Side-view faces are difficult in terms of deriving a mathematical model of their shape, whereas frontal faces have an oval shape, which helps to define a general template to compare against the stored templates. Side-view detection thus remains one of the demanding issues in template-based algorithms. Although some methods have been proposed to solve this problem, none is yet precise enough to give satisfactory results. Figure 8 illustrates region splitting and a template for a side-view face.

Fig. 8. Region splitting for side-view faces

In this section, the four main types of face detection have been reviewed. The next section describes the well-known Haar-like feature method, and the final section introduces a hybrid of the Haar classifier algorithm with a neural network.

HAAR-LIKE FEATURES

One of the standout face detection methods is Haar-like features. Although the method was introduced in 2001, it is still widely used in face detection algorithms.
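As an illustrative sketch of pixel-intensity template matching (a generic sum-of-squared-differences scan under the frontal, fixed-size assumptions above, not the specific method of any cited work):

```python
import numpy as np

def best_match(img: np.ndarray, tpl: np.ndarray):
    """Slide tpl over img and return the top-left position minimising the
    sum of squared pixel differences (SSD), plus that SSD value."""
    H, W = img.shape
    h, w = tpl.shape
    best, best_pos = None, None
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            ssd = ((img[y:y + h, x:x + w] - tpl) ** 2).sum()
            if best is None or ssd < best:
                best, best_pos = ssd, (x, y)
    return best_pos, float(best)

img = np.zeros((5, 5))
img[2:4, 1:3] = 7.0        # a bright 2x2 pattern planted at (x=1, y=2)
tpl = np.full((2, 2), 7.0) # template identical to the planted pattern
print(best_match(img, tpl))  # ((1, 2), 0.0)
```

This also makes the scaling problem concrete: the nested scan costs O(H·W·h·w) per template, which is why template-based methods are described as computationally expensive.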
In this method, firstly, the values of all pixels lying in the black rectangles of the greyscale image are accumulated. Then, this total is subtracted from the total of the white rectangles. Finally, the result is compared to a defined threshold, and if the criterion is met, the feature is considered a hit. In general, each Haar-feature computation may need to access every single pixel in the feature's area; this step can be bypassed by applying integral images, in which the value at each pixel equals the sum of the grey values above and to the left of it in the image [13]. Therefore, the sum over any rectangle can be calculated with only four pixel lookups in the integral image. Figure 9 depicts some examples of Haar-like features.

Fig. 9. Some examples of Haar-like features

A. How Haar-Like Features Work

The fundamental idea behind Haar-like features is to use machine learning: a cascade function is trained on numerous positive and negative images, and after training, the classifier is used to locate objects within other images (the positive and negative images are those which do and do not contain a human face, respectively). The classifier is then able to detect objects in other images by extracting features from them. For instance, for a 24x24 window, there are about 160,000 features which would need to be evaluated by this method. Computing each single feature requires the sum of the pixels under the white and black rectangles. To solve this problem, the researchers applied integral images, which simplify the computation of pixel sums: a constant number of operations is used per rectangle, regardless of the size of the image. It should be noted that among all of the selected and calculated features, many are useless or unsuitable.
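The white-minus-black evaluation described above can be sketched for a simple two-rectangle feature; the rectangle layout and threshold here are illustrative assumptions, not a feature from any trained detector:

```python
import numpy as np

def haar_two_rect(img: np.ndarray, x: int, y: int, w: int, h: int) -> int:
    """Value of a horizontal two-rectangle Haar-like feature at (x, y):
    the left half is the white region, the right half the black region.
    Value = sum(white pixels) - sum(black pixels)."""
    half = w // 2
    white = img[y:y + h, x:x + half].sum()
    black = img[y:y + h, x + half:x + w].sum()
    return int(white - black)

# Toy 4x4 greyscale "image": bright left half, dark right half (a strong edge).
img = np.array([[9, 9, 1, 1]] * 4)
value = haar_two_rect(img, 0, 0, 4, 4)
print(value)               # (8 pixels * 9) - (8 pixels * 1) = 72 - 8 = 64
threshold = 30             # illustrative threshold
print(value > threshold)   # the feature "hits" on this edge
```

A brute-force sum like this touches every pixel in the feature area; the integral image described next reduces each rectangle sum to four lookups.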
As can be seen from Figure 10, which shows feature selection on a face, the first selected feature focuses on the eye region, which is darker than the region of the nose and forehead, and the second selected feature focuses on the nose bridge between the two eyes (the darker region). Hence, in this example, applying the same windows to the cheeks, lips, or other parts would be unsuitable.

Fig. 10. Sample of applying Haar-like features on a face

A Haar-like feature can be calculated with the following equation:

\[ \text{feature} = \sum_{i=1}^{N} w_i \cdot \mathrm{RecSum}_i(x, y, w, h) \]

where RecSum(x, y, w, h) is the summation of intensity in any given upright or rotated rectangle enclosed in a detection window, and x, y, w, h are the coordinates, dimensions, and rotation of that rectangle, respectively. Haar wavelets are represented as box classifiers, which are used to extract face features by means of the integral image described in the next section.

B. Integral Images

One of the main contributions of the authors was the use of summed-area tables, which they called integral images. The integral image is used for quick feature computation; it is a summary of the pixel values of the original image. As can be seen from Figure 11, each location (x, y) in the integral image holds the sum of the pixel values above and to the left of (x, y) [15]. This can be computed rapidly in one pass through the image.

Fig. 11. Finding the sum of the shaded rectangular area

For instance, let A, B, C, D be the values of the integral image at the corners of a rectangle; then the sum of the original image values within the rectangle can be computed as

SUM = A - B - C + D

and only three additions are required for any size of rectangle. The feature value is then

Value = Σ(pixels in white area) - Σ(pixels in black area)

To conclude, an integral image holds sums of the input image pixel values and is mostly used to speed up box filter operations.
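The one-pass construction and the four-lookup rectangle sum can be sketched directly (a minimal NumPy illustration, not the paper's code; corner naming follows the SUM = A - B - C + D relation above):

```python
import numpy as np

def integral_image(img: np.ndarray) -> np.ndarray:
    """Integral image: ii[y, x] = sum of img over all rows <= y and cols <= x."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii: np.ndarray, x: int, y: int, w: int, h: int) -> int:
    """Sum of the original image over the w x h rectangle with top-left
    corner (x, y), using at most four lookups in the integral image."""
    d = ii[y + h - 1, x + w - 1]                      # bottom-right corner
    b = ii[y - 1, x + w - 1] if y > 0 else 0          # above the rectangle
    c = ii[y + h - 1, x - 1] if x > 0 else 0          # left of the rectangle
    a = ii[y - 1, x - 1] if y > 0 and x > 0 else 0    # above-left overlap
    return int(d - b - c + a)

img = np.arange(1, 17).reshape(4, 4)  # 4x4 image with values 1..16
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2))       # 6 + 7 + 10 + 11 = 34
print(rect_sum(ii, 1, 1, 2, 2) == img[1:3, 1:3].sum())  # True
```

The cost of `rect_sum` is constant regardless of rectangle size, which is what makes evaluating tens of thousands of Haar-like features per window feasible.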
However, the memory needed for integral image data can be a problem in embedded situations with limited hardware. Hence, an efficient method of storing the integral image is essential.

IV. NEURAL NETWORK IN FACE DETECTION

Some research has shown that neural networks can improve the performance of face detection, but at the cost of increased execution time [16]. On the other hand, Haar-like features give a good detection rate but contribute a high false detection rate. Hence, both good accuracy and good execution time are needed. For this reason, a method has been proposed that combines Haar cascade classifiers with a neural network. Figure 12 illustrates the method, a hybrid of a Haar cascade and a neural network as the face classifier.

Fig. 12. Combination of Haar and neural network

A simple kind of neural network, a multi-layer feed-forward network, is used. Figure 13 depicts the flow of the neural network classifier, which consists of three main steps: first, it takes face candidates from the Haar-like feature stage; then it finds the rough location of the centre point of each face; finally, it checks each window using the neural network classifier.

Fig. 13. Neural network classifier

Figure 14 illustrates the general architecture of the neural network, which contains three main layers: input, hidden, and output [12]. A back-propagation network is the main part of the second classifier of the method. According to Figure 14, face candidates from the cascade Haar classifier are cropped and fed as the second classifier's input, which decreases the number of calculations in the search for potential face candidates. Finally, the task of the neural network is to produce the final output of this second classifier. The assumption is that a 16 by 16 pixel window discards no visual information that would increase the precision of the detection, while it reduces processing time.
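A minimal sketch of the second stage of such a hybrid pipeline (all weights, layer sizes, and the threshold here are hypothetical placeholders; the paper's trained network is not reproduced): a 16x16 candidate patch from the Haar cascade is flattened and passed through a small feed-forward network whose output decides face vs. non-face.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical multi-layer feed-forward classifier: 256 -> 8 -> 1.
# In a real system these weights would come from back-propagation training.
W1 = rng.normal(scale=0.1, size=(256, 8))
b1 = np.zeros(8)
W2 = rng.normal(scale=0.1, size=(8, 1))
b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def is_face(patch_16x16: np.ndarray, threshold: float = 0.5) -> bool:
    """Second-stage check on a Haar-cascade candidate: flatten the
    16x16 greyscale patch, run the feed-forward pass, and threshold."""
    x = patch_16x16.reshape(-1) / 255.0   # 256-dim input layer
    h = np.tanh(x @ W1 + b1)              # hidden layer
    score = sigmoid(h @ W2 + b2)[0]       # output layer in (0, 1)
    return bool(score > threshold)

# Candidates produced by the (not shown) Haar cascade would be filtered so:
candidate = rng.integers(0, 256, size=(16, 16)).astype(float)
print(is_face(candidate))  # True or False, depending on the random weights
```

The division of labour matches the text: the cheap cascade discards most windows, and the (slower) network only runs on the few surviving 16x16 candidates.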