Fire Detection on Video Using ViBe Algorithm and LBP-TOP

In this research, we built a system to detect fire using the ViBe (Visual Background Extractor) algorithm to extract dynamic targets. The ViBe algorithm is well suited to detecting moving targets such as burning flames. We combined the ViBe algorithm with three-frame differencing to obtain better results on moving objects. The HSI color space model was applied after the moving objects were obtained. We used Local Binary Pattern-Three Orthogonal Planes (LBP-TOP) to extract the features, which were then classified with a Support Vector Machine (SVM). Our results show that the proposed system was able to detect fire using LBP-TOP and the ViBe algorithm with an average accuracy of 88.10%, and the best accuracy was 90.37%. The parameters used to achieve this accuracy in the feature extraction process were T=120, Radius=2, and frame gap=15, and the threshold value for three-frame differencing was 25.


Introduction
Fire is one of the disasters that cause very high losses, both financial and in human lives. Most fire accidents happen in crowded neighborhoods. Such accidents can be prevented through early detection of the fire. Today, surveillance cameras (CCTV) are installed in almost all buildings and houses. Running a fire detection system on existing surveillance cameras is more cost-efficient than installing new dedicated fire sensors. With the increasing use of CCTV surveillance systems for security monitoring in industrial, public, and other environments, many people consider video-based fire detection a form of early fire detector [1]. Early flame detection analyzes the color, shape, flicker frequency, and motion characteristics of the flame image [2].
Many methods have been used by researchers to detect fire in video [1]-[7]. One of the earliest is the fire detection system proposed by H. Yamagishi and J. Yamaguchi, which is based on the HSV color space and a neural network and uses hue and saturation to identify the fire region. However, due to its high computational complexity, it does not run in real time [1]. The research in [6] used Local Binary Pattern-Three Orthogonal Planes (LBP-TOP) and the Grey-Level Co-occurrence Matrix (GLCM), but the resulting method is not sensitive to object movement and cannot fully detect the object visually. The research in [7] used hidden Markov models and showed that they detect flames efficiently. However, it may produce false alarms because the method relies only on color information and ordinary motion detection; these can be reduced by using separate Markov models. Another method, based on background subtraction, is proposed in [5]. It requires the background image to be declared first, and it has the disadvantage that it cannot deal with sudden, drastic lighting changes. In this research we overcame some of these drawbacks by implementing the ViBe algorithm. ViBe is better at detecting moving targets such as burning flames, but its output still contains noise [8]. To solve that problem, we combined the ViBe algorithm with three-frame differencing to reduce noise, improve accuracy, strengthen the morphological image processing, and reduce voids in the recognition results [8].
The ViBe algorithm was proposed by Olivier Barnich and Marc Van Droogenbroeck [8]. ViBe stands for "Visual Background Extractor"; it is a foreground detection algorithm that uses a random strategy to estimate the background from samples. ViBe has advantages in detection speed and background model updating over the optical flow method and the Gaussian model algorithm [9]. It also has a high initialization speed, low memory consumption, and low resource utilization, which benefits surveillance cameras that usually have low-performance hardware. In the experiments reported by the authors of ViBe in [9], ViBe outperformed seven other background subtraction algorithms and was also the fastest among them.
The research in [6] stated that using Local Binary Pattern (LBP) for feature extraction is very sensitive to noise and has a very high computational cost. Volume Local Binary Pattern (VLBP) is also computationally complex and hard to extend. LBP-TOP is a better option than the original LBP with optical flow and than VLBP, because both of those methods have high computational complexity. Therefore, LBP-TOP was used in this research. Another fire detection study [10] processed test video with fire color detection, region of interest (ROI) extraction, LBP-TOP, and K-NN classification. In that experiment, LBP-TOP gave 92% detection accuracy. However, some errors occurred in the early ROI stage. Therefore, another color segmentation method is preferable.
As the first step of our research, we converted the color space from RGB to HSI. This improved saturation identification, and the average saturation of the image was used instead of the original fixed threshold, so that flame identification adapts to light and dark environments. Then, we combined the ViBe algorithm with the three-frame differencing method to reduce the noise and the other problems stated above. Next, LBP-TOP was applied for feature extraction, and the features were classified using an SVM. The previous fire detection research in [8] used color segmentation, background subtraction or moving target recognition, and a roundness test on the target shape. It used RGB-to-HSI color rules and the ViBe algorithm, and it served as the reference for this research. Although that method detected fire efficiently, the roundness test had disadvantages: other objects with colors similar to flames, such as car lights, can interfere, and they are hard to distinguish from flames when they move. In this research, the roundness test was replaced with LBP-TOP feature extraction.
The support vector machine (SVM) is a well-established classification algorithm. SVM performs classification by constructing a hyperplane and can be applied to both linear and nonlinear classification problems [11]. According to the research in [12], SVM classification achieved better accuracy with LBP-TOP than with LBP: the testing accuracy was 96.54% for LBP-TOP compared with 94% for LBP.
This paper is structured as follows. Section 2 describes the methods used in our research: HSI color conversion, background subtraction using the ViBe algorithm, feature extraction using LBP-TOP, and classification using SVM. Section 3 presents the results of our experiments with these methods. Section 4 gives our conclusions.
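The sample-based idea behind ViBe can be illustrated with a minimal NumPy sketch. This is a simplified illustration and not the implementation used in this research: the class name `ViBeSketch`, the jitter-based initialization, and the default values (20 samples, matching radius 20, 2 required matches, update probability 1/16) are assumptions chosen for the example, and the neighbour propagation step of the full algorithm is omitted.

```python
import numpy as np


class ViBeSketch:
    """Simplified per-pixel sample model in the spirit of ViBe (grayscale only)."""

    def __init__(self, first_frame, n_samples=20, radius=20,
                 min_matches=2, subsample=16, seed=0):
        self.radius = radius
        self.min_matches = min_matches
        self.subsample = subsample
        self.rng = np.random.default_rng(seed)
        base = first_frame.astype(np.int16)
        # Assumption: seed each pixel's samples with the first frame plus small
        # random jitter (the full algorithm samples spatial neighbours instead).
        jitter = self.rng.integers(-10, 11, size=(n_samples,) + base.shape)
        self.samples = np.clip(base + jitter, 0, 255).astype(np.int16)

    def apply(self, frame):
        """Return a boolean foreground mask and update the model in place."""
        frame = frame.astype(np.int16)
        # A pixel is background if enough stored samples lie close to its value.
        close = np.abs(self.samples - frame) < self.radius
        foreground = close.sum(axis=0) < self.min_matches
        # Random update: each background pixel overwrites one randomly chosen
        # sample with probability 1 / subsample.
        lucky = self.rng.integers(0, self.subsample, size=frame.shape) == 0
        ys, xs = np.nonzero(~foreground & lucky)
        slots = self.rng.integers(0, self.samples.shape[0], size=ys.size)
        self.samples[slots, ys, xs] = frame[ys, xs]
        return foreground
```

Because the model stores only a small set of samples per pixel and updates them at random, initialization needs a single frame and memory use stays low, which matches the advantages described above.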
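The color conversion step can be sketched as follows, assuming the standard geometric RGB-to-HSI formulas and an image stored as a float array in [0, 1]. The function names `rgb_to_hsi` and `saturation_mask` are hypothetical, and the mask shows only the adaptive idea of comparing each pixel against the average saturation of the image; the full set of HSI color rules used in this research is not reproduced here.

```python
import numpy as np


def rgb_to_hsi(rgb):
    """Convert an (..., 3) RGB array in [0, 1] to hue (degrees), saturation, intensity."""
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    intensity = (r + g + b) / 3.0
    minimum = np.minimum(np.minimum(r, g), b)
    # S = 1 - min(R, G, B) / I, defined as 0 for black pixels.
    ratio = np.divide(minimum, intensity,
                      out=np.ones_like(intensity), where=intensity > 0)
    saturation = 1.0 - ratio
    # Hue from the angle between the pixel and the red axis.
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b))
    cos_theta = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    theta = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
    hue = np.where(b <= g, theta, 360.0 - theta)
    return hue, saturation, intensity


def saturation_mask(rgb):
    """Assumed adaptive rule: keep pixels at or above the image's mean saturation."""
    _, saturation, _ = rgb_to_hsi(rgb)
    return saturation >= saturation.mean()
```

Using the image mean rather than a fixed constant lets the cut-off move with the scene, which is the motivation given above for adapting to light and dark environments.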

Research Method
In this research, the objective is to detect fire when it appears on camera using the ViBe algorithm and LBP-TOP. The system design consists of four stages: moving object detection using three-frame differencing combined with the ViBe (Visual Background Extractor) algorithm, color segmentation using the HSI color space model, feature extraction using Local Binary Pattern-Three Orthogonal Planes, and classification of the features using a Support Vector Machine (SVM). The classification stage is split into two parts, training and testing. Before testing can be carried out, a training process is required to train the system. The training process is described in Figure 1 and the testing process in Figure 2.
The first step of the ViBe stage is background subtraction using the three-frame differencing method. The process of the ViBe algorithm can be seen in Figure 3. Three-frame differencing processes three frames, in this case k-1, k, and k+1. Consecutive frames in this triple are separated by a gap of five frames. After the three frames are initialized, the difference between k-1 and k and the difference between k and k+1 are computed. An AND operation is applied to those two differences to obtain the three-frame difference result. The ViBe algorithm is applied to the difference result of frame k-1 and frame k, and its output is combined with the three-frame difference result using the AND operator. The process of three-frame differencing with the ViBe algorithm can be seen in Figure 4.
Figure 5 shows the effect of the frame gap. At a gap of 5 frames, less moving area is detected than at the other gaps. With a shorter gap, objects change less between the compared frames, so the detected region is smaller and more accurate. However, processing is slower because more frames are processed.
At a gap of 15 frames, the detected area is wider because more motion accumulates within 15 frames. The advantage of the 15-frame gap is that processing is faster than with the 5-frame gap. However, the wider detected area means that more objects moving outside the fire region are also detected.
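The three-frame differencing step and its combination with a foreground mask can be sketched as below. This is a minimal illustration assuming grayscale frames as NumPy arrays; the function names are hypothetical, the strict `>` comparison is an assumption, and `foreground_mask` stands for the output of any ViBe implementation. The default threshold of 25 is the value reported in this research.

```python
import numpy as np


def frame_triple(k, gap):
    """Indices of frames k-1, k, k+1 when consecutive frames are `gap` apart."""
    return k - gap, k, k + gap


def three_frame_difference(prev_frame, cur_frame, next_frame, threshold=25):
    """AND of the two thresholded absolute differences around the middle frame."""
    prev_f, cur_f, next_f = (f.astype(np.int16)
                             for f in (prev_frame, cur_frame, next_frame))
    moved_before = np.abs(cur_f - prev_f) > threshold
    moved_after = np.abs(next_f - cur_f) > threshold
    return moved_before & moved_after


def combine_with_foreground(diff_mask, foreground_mask):
    """AND the three-frame difference with a ViBe-style foreground mask."""
    return diff_mask & foreground_mask.astype(bool)
```

The AND keeps only pixels that changed both before and after frame k, so the trail an object leaves behind in a single difference image is suppressed.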

LBP-TOP
The dynamic target selected by the background subtraction model in the previous step is passed to feature extraction. Local Binary Pattern-Three Orthogonal Planes (LBP-TOP) is used as the feature extraction method. LBP-TOP requires volume data as input. Therefore, the frames from the previous frame to the next frame of the three-frame difference are stacked into a volume [14]. Figure 6 illustrates the process of LBP-TOP, and Figure 7 shows examples of the XT, XY, and YT planes, respectively. From the volume, focusing on the area of the flames, three planes are taken: the XY plane, the XT plane, and the YT plane. The XY plane is 2D data taken along the x-axis and the y-axis at the center value of the t-axis and represents spatial information. The XT plane is 2D data taken along the x-axis and t-axis at the center value of the y-axis, and the YT plane is 2D data taken along the y-axis and t-axis at the mean of the x-axis; they represent temporal information along the row (for XT) and the column (for YT). Each plane is converted to grayscale because LBP accepts grayscale input. The process of LBP-TOP is described in Figure 8.
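The plane extraction and histogram steps can be sketched as follows. This is a simplified illustration and not the exact descriptor used in this research: it assumes a grayscale volume of shape (T, H, W), takes all three planes at the center index of the remaining axis, and samples 8 neighbours at integer offsets of the given radius without interpolation. The function names are hypothetical.

```python
import numpy as np


def lbp_histogram(plane, radius=2):
    """Normalized 256-bin histogram of 8-neighbour LBP codes for one 2D plane."""
    plane = np.asarray(plane, dtype=np.int32)
    h, w = plane.shape
    r = radius
    if h <= 2 * r or w <= 2 * r:
        raise ValueError("plane is too small for this radius")
    center = plane[r:h - r, r:w - r]
    offsets = [(-r, -r), (-r, 0), (-r, r), (0, r),
               (r, r), (r, 0), (r, -r), (0, -r)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = plane[r + dy:h - r + dy, r + dx:w - r + dx]
        # Set this bit where the neighbour is at least as bright as the center.
        codes |= (neighbour >= center).astype(np.int32) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()


def lbp_top_features(volume, radius=2):
    """Concatenate LBP histograms of the XY, XT and YT planes of a (T, H, W) volume."""
    volume = np.asarray(volume)
    t, h, w = volume.shape
    xy = volume[t // 2, :, :]   # spatial plane at the center frame
    xt = volume[:, h // 2, :]   # temporal plane along the center row
    yt = volume[:, :, w // 2]   # temporal plane along the center column
    return np.concatenate([lbp_histogram(p, radius) for p in (xy, xt, yt)])
```

With this layout the feature vector has 3 x 256 = 768 entries, one histogram per plane, and the T parameter corresponds to the first dimension of `volume`.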

SVM Classification
Support Vector Machine (SVM) is selected as the classifier for the next process. SVM is known for high performance and accurate classification results with a limited training dataset [15]. SVM can be simply explained as an attempt to find the best hyperplane in the input space that separates two classes. It finds this hyperplane by maximizing the margin between the two classes. A hyperplane is a separator between two different data classes; in a one-dimensional space it is a point, and in a two-dimensional space it is a line [11]. Figure 9 illustrates the hyperplane constructed by the SVM algorithm. In this research, we used a linear kernel for the SVM because it does not need a transformation of the input, which makes classification faster than with other kernel functions.
Figure 9. Support Vector Machine Hyperplane [11]
In this step, the process is split into two parts, training and testing. Both training and classification with the SVM model use the dot product between two data vectors x, computed through a kernel function K. Figure 10 shows the process of training an SVM model.
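The role of the kernel function K in classification can be illustrated with a small sketch of the standard SVM decision function, f(x) = sum_i c_i K(x_i, x) + b, where c_i is the product of the dual coefficient and the label of support vector x_i. The function names and the toy support vectors, coefficients, and bias below are made up for illustration and are not values from this research.

```python
import numpy as np


def linear_kernel(x, z):
    """Linear kernel: the plain dot product, with no input transformation."""
    return float(np.dot(x, z))


def svm_decision(x, support_vectors, signed_coefs, bias, kernel=linear_kernel):
    """Evaluate f(x) = sum_i c_i * K(x_i, x) + b for a trained SVM."""
    total = sum(c * kernel(sv, x) for sv, c in zip(support_vectors, signed_coefs))
    return total + bias


def svm_predict(x, support_vectors, signed_coefs, bias):
    """Return +1 (for example, fire) or -1 (non-fire) from the sign of f(x)."""
    return 1 if svm_decision(x, support_vectors, signed_coefs, bias) >= 0 else -1


# Toy model: one support vector per class, symmetric about the origin.
toy_vectors = [np.array([1.0, 1.0]), np.array([-1.0, -1.0])]
toy_coefs = [0.25, -0.25]
toy_bias = 0.0
```

With a linear kernel the sum collapses to a single weight vector w, so f(x) = w . x + b, which is why classification is cheap compared with nonlinear kernels.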

Results and Discussions
As the evaluation metric, we used accuracy. Accuracy is the number of True Positive (TP) and True Negative (TN) cases divided by the total number of cases, whether classified correctly or not. A True Positive is a case in which a real fire is correctly detected by our system as fire, and a True Negative is a case in which a non-fire object is not detected by our system as fire.
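The accuracy definition above corresponds to (TP + TN) / (TP + TN + FP + FN), where FP and FN are the false positive and false negative counts. A minimal sketch follows; the function name and the counts in the usage note are illustrative only and are not results from this research.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all cases that were classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no cases to evaluate")
    return (tp + tn) / total
```

For example, with 45 true positives, 40 true negatives, 10 false positives, and 5 false negatives, `accuracy(45, 40, 10, 5)` gives 85 correct cases out of 100.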

Experiment
Experiment I, background subtraction threshold. The results can be seen in Figure 11.
Figure 11. Background Subtraction Threshold
In Figure 11, with a threshold of 10, many areas are detected, so many unnecessary objects are included in the mask. With a threshold of 30, the detected area is smaller, but some important objects such as fire are not detected. In this study, the threshold used is 25, at which fire objects are still detected properly and unnecessary objects are filtered out.
Experiment II, ViBe algorithm, three-frame differencing, and ViBe combined with three-frame differencing. The results are shown in Figure 12. They show that the output of the ViBe algorithm alone is still incomplete. Therefore, the three-frame differencing method is combined with it in this study.
Experiment III, HSI color rules [14]. For the wider range as shown in Figure 13,
From the table above, the best result is obtained by LBP-TOP with a radius of 2 and T = 120, where T is the number of stacked frames used as planes in LBP-TOP. These parameters were therefore chosen for the next scenario test.
2. Scenario II, fire detection system using the parameters chosen in the previous scenario test. The results of Scenario II can be seen in Table 2.

Analysis of Test Result
From the overall system test results above, the average accuracy for videos containing fire objects is 88.10%. The highest accuracy, 95.97%, was obtained on Video 2, an ideal fire video shot indoors during the day with few fire-like colors. The lowest accuracy, 75.15%, was obtained on Video 5, an outdoor fire video with many objects whose colors are similar to fire and many objects moving around the fire. The decrease in accuracy is caused by moving objects whose colors pass the fire color rule; for example, an arm was falsely detected as fire in this scenario test. Figure 14 shows examples of true and false detection results.

Conclusion
In this research, we have built a system to detect fire using the ViBe algorithm and LBP-TOP for feature extraction. The system detected fire objects quite well, with an average accuracy of 88.10%. The best accuracy for LBP-TOP in the test results was 90.37%. The parameters used to achieve this accuracy in the feature extraction process were T=120, Radius=2, and frame gap=15, and the threshold value for three-frame differencing was 25. The three-frame differencing method combined with the ViBe algorithm gave good results for detecting moving objects, especially fire objects. With suitable morphological processing, a more intact moving object could be obtained. The C value of the SVM classifier had almost no effect on the accuracy of this system: when the C value was varied in testing, the change in accuracy was very small.
In this system, the minimum contour size passed to LBP-TOP feature extraction was 125 pixels, with the video datasets resized to a resolution of 480x270 pixels. This keeps the size of the fire more balanced across videos. However, the system sometimes cannot detect large fires that appear in the video. One of the reasons is that the moving object detection process cannot detect movement in the center of