Page 82 - Proceedings of the 12th Scientific Conference on Information Technology and Its Applications in Various Fields (CITA 2023)


detectors are end-to-end CNN-based models that reach good mean average precision,
around 73%, on benchmarks of high-quality images. However, these models still
produce a large number of false positives in low-quality videos such as surveillance
videos. In [2], human posture estimation is proposed to minimize false negatives in
weapon detection in video surveillance. Applying CNN-based object detection
models to weapon detection in video surveillance still produces many false negatives.
In this context, most existing work has focused on a single weapon type, primarily
firearms, and has improved detection by using different pre- and post-processing methods.
In 2020, Khanh Tram et al. [3] proposed using the YOLO-V3 model for pistol
detection and achieved accuracy in the range of 63-78%. They then
extended the model to the knife detection problem [4]. In [5], they added a data
preprocessing method to increase the accuracy of the trained object detection model,
improving accuracy to 83%. Khanh Tram et al. [6] proposed an Edge AI model
for detecting hot weapons from surveillance cameras. However,
these previous studies, which rely on the YOLO-V3 model, have low accuracy: object
detection is still discrete and inconsistent, computation time is high, and processing
is slow. Accuracy is also low in cases of hidden objects, blurred images, or only one point.
In this paper, we aim to improve the processing efficiency and accuracy of the hot
weapon detection model using the latest versions of YOLO (V5, V7, and V8), which

but also works well at detecting small targets. We then compare and evaluate the results
and propose the best learning model for this problem.

2 Experiments

The following gives an overview of the YOLO model and of the specific YOLO versions
(V3, V5, V7, and V8) that this paper uses for training and evaluation.

2.1 YOLO Model

YOLO. YOLO (You Only Look Once) is an algorithm that uses neural networks to
provide real-time object detection. The algorithm is popular because of its speed and
accuracy. It has been used in a variety of applications to detect traffic signals,
people, parking meters, and animals. At launch, YOLO showed remarkable speed. Since
then, it has been developed through many versions, including YOLO-V1, V2, V3,
V4, V5, V6, V7, and the latest, V8.
The input image is divided into an S x S grid, which allows each grid cell to be
evaluated separately to detect bounding boxes and their corresponding confidence
scores. The input image passes through a network of convolutional, pooling, and fully
connected layers to produce the output.
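
The S x S grid division can be illustrated with a minimal sketch. It assumes the convention of the original YOLO formulation, in which the grid cell containing the center of a bounding box is responsible for that box. The function names `grid_cell` and `cell_offsets` and the default `S=7` are illustrative assumptions, not taken from this paper's implementation.

```python
def grid_cell(cx, cy, img_w, img_h, S=7):
    """Return (row, col) of the S x S grid cell that contains the point (cx, cy).

    cx, cy are pixel coordinates of a bounding-box center.
    S=7 is an assumed grid size used only for illustration.
    """
    if not (0 <= cx <= img_w and 0 <= cy <= img_h):
        raise ValueError("box center must lie inside the image")
    # Scale pixel coordinates to grid units, then clamp so that a center
    # exactly on the right or bottom edge falls into the last cell.
    col = min(int(cx * S / img_w), S - 1)
    row = min(int(cy * S / img_h), S - 1)
    return row, col


def cell_offsets(cx, cy, img_w, img_h, S=7):
    """Return the center's offset inside its cell, in grid units."""
    row, col = grid_cell(cx, cy, img_w, img_h, S)
    x_off = cx * S / img_w - col
    y_off = cy * S / img_h - row
    return x_off, y_off


if __name__ == "__main__":
    # A box centered in the middle of a 640 x 480 image.
    print(grid_cell(320, 240, 640, 480))
    print(cell_offsets(320, 240, 640, 480))
```

In this sketch each box is assigned to exactly one cell, so the network only needs to predict the box center relative to that cell rather than relative to the whole image.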

CITA 2023 ISBN: 978-604-80-8083-9
