Viola & Jones Object Detection Algorithm
The Viola-Jones Face Detection Algorithm [1] uses Haar features as shown in Figure 1. The steps in the algorithm are:
- Integral Image Calculation
- Feature Computation
- Adaboost Feature Selection
- Classifier Cascade
1. Integral Image Calculation
Consider a 4x4 image with pixel intensities as shown in Figure 1.1:

5 2 5 2
3 6 3 6
5 2 5 2
3 6 3 6
The integral image of the above image is calculated as follows. For each pixel, the value in the integral image is the sum of all pixel intensities above and to the left of that pixel, including the pixel itself.
Consider the top-left 2x2 block of the image:

5 2
3 6

- The value of the first pixel remains the same: 5.
- The value at the first row, second column changes from 2 to 5+2 = 7.
- In place of 6 (second row, second column) we get 6+5+2+3 = 16.
We calculate this for every pixel in the image, and the resultant image is called the integral image. For our example it is:

 5  7 12 14
 8 16 24 32
13 23 36 46
16 32 48 64
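As a quick sanity check, the integral image can be computed with two cumulative sums. The short sketch below assumes NumPy, which is not used in the original post:

import numpy as np

# Pixel intensities from Figure 1.1
image = np.array([[5, 2, 5, 2],
                  [3, 6, 3, 6],
                  [5, 2, 5, 2],
                  [3, 6, 3, 6]])

# integral[y, x] = sum of all pixels above and to the left of (y, x), inclusive
integral = image.cumsum(axis=0).cumsum(axis=1)
print(integral)   # matches the integral image shown above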
2. Feature Computation
Using the integral image, the sum of pixels inside any rectangular region can be computed from just the four integral-image values at its corners, no matter how large the rectangle is. For the example feature:
Sum of pixels under the black region = 5+32-(7+16) = 14 (same as 6+2+6 = 14 in the given image).
Sum of pixels under the white region = 7+48-(12+32) = 11 (same as 3+5+3 = 11 in the given image).
The feature value is the difference between the black and white sums.
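The corner arithmetic above can be reproduced with a small sketch. rect_sum below is a hypothetical helper (not from the paper), and the region coordinates are my reading of the numbers above (a two-rectangle feature over rows 2-4, columns 2 and 3):

import numpy as np

def rect_sum(ii, top, left, bottom, right):
    # Sum of image pixels in the inclusive rectangle [top..bottom, left..right],
    # computed from at most four corner values of the integral image ii.
    total = ii[bottom, right]
    if top > 0:
        total -= ii[top - 1, right]
    if left > 0:
        total -= ii[bottom, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return total

image = np.array([[5, 2, 5, 2],
                  [3, 6, 3, 6],
                  [5, 2, 5, 2],
                  [3, 6, 3, 6]])
ii = image.cumsum(axis=0).cumsum(axis=1)

black = rect_sum(ii, 1, 1, 3, 1)   # rows 2-4, column 2 -> 32 - 7 - 16 + 5 = 14
white = rect_sum(ii, 1, 2, 3, 2)   # rows 2-4, column 3 -> 48 - 12 - 32 + 7 = 11
feature_value = black - white      # 14 - 11 = 3
print(black, white, feature_value)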
3. AdaBoost
AdaBoost is used to train a strong classifier, which is a linear combination of weak classifiers. It also decides whether a feature is relevant or not. The steps in AdaBoost are:
- Take a training set of positive and negative examples (e.g., face and non-face images).
- Initially, each positive training image is given a weight of 1/(2p) and each negative training image a weight of 1/(2n), where p and n are the numbers of positive and negative images respectively.
- All 180,000 Haar features (weak classifiers) are run on the training images.
- For each feature, a threshold is determined (as in a one-node decision tree) such that any image above the threshold is classified as a face and any image below it as a non-face.
- Now, the error rate of each weak classifier is calculated as the sum of the weights of the images it misclassifies. Of the 180,000 error rates, choose the weak classifier with the lowest error rate.
The chosen weak classifier is added to the strong classifier. Now, the weights of misclassified images are increased relative to the weights of correctly classified images, and the weights are normalized. Steps 3 to 5 are then repeated: in each round, all 180,000 Haar features are run on the images with the updated weights, and one weak classifier is selected and added to the linear combination. The result after all rounds is the final strong classifier. The output of each weak classifier is 1 or 0, classifying the image as a face or a non-face.
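For readers who prefer code, here is a heavily simplified sketch of the AdaBoost loop described above. The feature extraction and the threshold search are placeholder assumptions; only the weighting scheme follows the paper:

import numpy as np

def train_adaboost(feature_values, labels, num_rounds):
    # feature_values: (num_features, num_images) array of Haar feature responses
    #                 (hypothetical input; computing them is not shown here).
    # labels: 1 for face images, 0 for non-face images.
    num_features, num_images = feature_values.shape
    n_pos, n_neg = labels.sum(), (1 - labels).sum()
    # Initial weights: 1/(2p) for positives, 1/(2n) for negatives.
    weights = np.where(labels == 1, 1.0 / (2 * n_pos), 1.0 / (2 * n_neg))
    strong = []  # list of (feature index, threshold, polarity, alpha)

    for _ in range(num_rounds):
        weights = weights / weights.sum()             # normalize the weights
        best = None
        for f in range(num_features):
            threshold = feature_values[f].mean()      # crude threshold choice (simplification)
            for polarity in (1, -1):
                pred = (polarity * feature_values[f] > polarity * threshold).astype(int)
                error = weights[pred != labels].sum() # weighted error rate
                if best is None or error < best[0]:
                    best = (error, f, threshold, polarity, pred)
        error, f, threshold, polarity, pred = best
        beta = error / (1.0 - error + 1e-12)
        alpha = np.log(1.0 / (beta + 1e-12))          # vote weight of this weak classifier
        weights = weights * beta ** (pred == labels)  # shrink weights of correctly classified images
        strong.append((f, threshold, polarity, alpha))
    return strong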
4. Cascading Of Stages
After all the rounds of AdaBoost, we have a strong classifier that is a linear combination of the selected weak classifiers (let's say 2,000 of them). Instead of running all 2,000 weak classifiers on every 24x24 window of the test image, we build a cascade of classifiers. This reduces the computation cost, because Stage 1 immediately rejects windows that are non-faces.
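A sketch of how such a cascade is evaluated on a single window (the stage functions here are hypothetical callables, not an existing API):

def cascade_classify(window, stages):
    # `stages` is a list of strong classifiers that return True when the window looks like a face.
    for stage in stages:
        if not stage(window):
            return False   # rejected immediately; the remaining stages are skipped
    return True            # the window survived every stage and is reported as a face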
To train a cascade, we must choose:
- The number of stages (strong classifiers) in the cascade.
- The number of weak classifiers in each strong classifier (which is decided by AdaBoost).
For this we do manual tweaking, a heuristic procedure for training the cascade:
- Select a maximum acceptable false positive rate per stage.
- Select a minimum acceptable true positive rate per stage.
- The threshold for each strong classifier is decided by AdaBoost.
Let the user select a target overall false positive rate for all the stages.
Until the target overall false positive rate is met:
    Add a new stage.
    Until the maximum acceptable false positive rate and minimum acceptable true positive rate for the stage are met:
        Keep adding weak classifiers and train the strong classifier using AdaBoost.
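The heuristic above can also be written as a rough training loop. Everything named here (train_stage_adaboost, evaluate_rates) is a hypothetical placeholder for the steps described in the text:

def train_cascade(pos_images, neg_images, target_fpr, max_stage_fpr, min_stage_tpr,
                  train_stage_adaboost, evaluate_rates):
    # train_stage_adaboost adds one weak classifier to the current stage (via AdaBoost);
    # evaluate_rates measures the stage's false/true positive rates on a validation set.
    stages = []
    overall_fpr = 1.0
    while overall_fpr > target_fpr:              # until the overall target is met
        stage = []                               # add a new, empty stage
        stage_fpr, stage_tpr = 1.0, 0.0
        while stage_fpr > max_stage_fpr or stage_tpr < min_stage_tpr:
            stage = train_stage_adaboost(stage, pos_images, neg_images)
            stage_fpr, stage_tpr = evaluate_rates(stage, pos_images, neg_images)
        stages.append(stage)
        overall_fpr *= stage_fpr                 # each stage multiplies down the false positive rate
        # In the full method, negatives that pass the cascade so far become the
        # negative training set for the next stage (not shown in this sketch).
    return stages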
You can watch the video at the link below:
https://www.youtube.com/watch?v=WfdYYNamHZ8
References
[1] Viola, Paul, and Michael Jones. "Rapid object detection using a boosted cascade of simple features." Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), Vol. 1. IEEE, 2001.