Page 82 - Proceedings of the 12th Scientific Conference - Information Technology and Applications in Various Fields (CITA 2023)


                     detectors are end-to-end CNN-based models that reach good mean average precision,
                     around 73%, on benchmarks of high-quality images. However, these models still
                     produce a large number of false positives in low-quality videos such as surveillance
                     videos. In [2], human posture estimation is proposed to minimize false negatives in
                     weapon detection in video surveillance. Applying CNN-based object detection
                     models to weapon detection in video surveillance still produces many false negatives.
                     In this context, most existing work has focused on a single weapon type, primarily
                     firearms, and has improved detection through different pre- and post-processing methods.
                       In 2020, Khanh Tram et al. [3] proposed using the YOLO-V3 model for pistol
                     detection and achieved accuracy in the range of 63-78%. They then
                     extended the model to the knife detection problem [4]. In [5], they added a data
                     preprocessing method to the training of the object detection model, which
                     improved accuracy to 83%. Khanh Tram et al. [6] proposed an Edge AI model
                     for detecting hot weapons from surveillance cameras. However, these
                     previous studies report low accuracy with the YOLO-V3 model: object detection is
                     still discrete and inconsistent, with high computation time and slow processing
                     speed. Accuracy is also low in cases of hidden objects, blurred images, and only
                     one point.
                       In this paper, we aim to improve the processing efficiency and accuracy of the hot
                     weapon detection model using the latest versions of YOLO (V5, V7, and V8), which


                     but also works well at detecting small targets. We then compare and evaluate the
                     results and propose the best learning model for this problem.



                     2      Experiments


                     This section gives an overview of the YOLO model and of the specific versions
                     (YOLO-V3, V5, V7, and V8) that the paper uses in training and evaluation.

                     2.1   YOLO Model

                     YOLO. YOLO (You Only Look Once) is an algorithm that uses neural networks to
                     provide real-time object detection. It is popular because of its speed and
                     accuracy, and it has been used in a variety of applications to detect traffic signals,
                     people, parking meters, and animals. YOLO showed remarkable speed at launch and
                     has since been developed through many versions, including YOLO-V1, V2, V3,
                     V4, V5, V6, V7, and the latest, V8.
                       The input image is divided into an S×S grid, which allows each cell to be
                     evaluated separately to detect bounding boxes and their corresponding confidence
                     scores. The image passes through a network of convolutional, pooling, and fully
                     connected layers to produce the output.
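
                     The grid division described above can be illustrated with a minimal sketch. It is
                     not the paper's implementation: the function names are hypothetical, and the
                     output layout (B boxes per cell with 5 values each, plus C class scores) follows
                     the original YOLO-V1 design, which is an assumption here since the paper does not
                     state it. Later YOLO versions use different output heads.

```python
def grid_cell(cx, cy, img_w, img_h, s):
    """Return (row, col) of the S x S grid cell containing a box center.

    cx, cy are the box-center coordinates in pixels (hypothetical inputs).
    """
    # Scale the center to grid units, then clamp so a center lying exactly
    # on the right or bottom image edge still maps to the last cell.
    col = min(int(cx / img_w * s), s - 1)
    row = min(int(cy / img_h * s), s - 1)
    return row, col


def cell_offsets(cx, cy, img_w, img_h, s):
    """Return the center position relative to its cell's top-left corner."""
    row, col = grid_cell(cx, cy, img_w, img_h, s)
    # Offsets are expressed in cell units (0 = left/top edge of the cell).
    return cx / img_w * s - col, cy / img_h * s - row


def output_size(s, b, c):
    """Number of output values under the assumed YOLO-V1 layout.

    Each of the S*S cells predicts b boxes (x, y, w, h, confidence)
    plus c class scores.
    """
    return s * s * (b * 5 + c)


if __name__ == "__main__":
    # Assumed example values: a 448x448 image and a 7x7 grid.
    print(grid_cell(224, 224, 448, 448, 7))
    print(cell_offsets(224, 224, 448, 448, 7))
    print(output_size(7, 2, 20))
```

                     In this sketch, the cell that contains an object's center is the one responsible
                     for predicting that object, which is why the center is mapped to a single cell.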












                     CITA 2023                                                   ISBN: 978-604-80-8083-9