Page 178 - Proceedings of the 12th Scientific Conference - Information Technology and Applications in Various Fields (CITA 2023)
The Adam optimizer is used, and the loss function is categorical cross-entropy for multi-class weather image classification.
Apart from feature extraction, we also use this CNN architecture to classify the weather images. This part uses the ReLU (Rectified Linear Unit) activation for all layers, except for the output layer, where the softmax function is used to classify the weather images (see Fig. 2).
Fig. 2. CNN architecture for feature extraction and classification
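The behavior of the output layer described above can be illustrated in isolation. The following is a minimal NumPy sketch of the softmax activation and the categorical cross-entropy loss; the logits, number of classes, and one-hot labels are toy values for illustration only:

```python
import numpy as np

def softmax(z):
    # Subtract the row-wise max before exponentiating for numerical stability.
    e = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    # y_true: one-hot labels; y_pred: softmax probabilities.
    return -np.mean(np.sum(y_true * np.log(y_pred + eps), axis=-1))

# Toy logits for 2 samples over 4 hypothetical weather classes.
logits = np.array([[2.0, 0.5, 0.1, -1.0],
                   [0.2, 3.0, 0.1, 0.0]])
probs = softmax(logits)
y_true = np.array([[1, 0, 0, 0],
                   [0, 1, 0, 0]])
loss = categorical_cross_entropy(y_true, probs)
```

Each row of `probs` sums to 1, and the predicted class is the index of the largest probability, which is how the output layer assigns a weather class to each image.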
At the second stage, we employ the XGBoost classifier to identify the weather images. XGBoost (Extreme Gradient Boosting) is a highly efficient machine learning algorithm that combines techniques for adjusting the error weights of weaker models to create a stronger model. The XGBoost algorithm is based on decision trees and gradient boosting to obtain an optimal model. New trees are generated sequentially; each one minimizes the error of the previous tree by relearning its residuals, performing error correction to obtain a better ensemble. XGBoost was originally introduced by Chen and Guestrin (2016) to improve the performance and speed of decision trees according to the gradient-boosting principle [9].
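The sequential residual-fitting idea described above can be sketched in a few lines. The following is a minimal gradient-boosting loop using one-feature decision stumps and squared loss; it is not the full XGBoost algorithm (regularization, second-order gradients, and column sampling are omitted), and all names and hyperparameters here are illustrative:

```python
import numpy as np

def fit_stump(x, residual):
    # Find the threshold split of a single feature whose piecewise-constant
    # (leaf-mean) prediction best fits the current residual.
    best = None
    for t in np.unique(x):
        left, right = residual[x <= t], residual[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        pred = np.where(x <= t, left.mean(), right.mean())
        err = float(((residual - pred) ** 2).sum())
        if best is None or err < best[0]:
            best = (err, t, left.mean(), right.mean())
    _, t, lv, rv = best
    return lambda z, t=t, lv=lv, rv=rv: np.where(z <= t, lv, rv)

def gradient_boost(x, y, n_trees=20, lr=0.3):
    # Each new stump is fit to the residual (the negative gradient of the
    # squared loss) of the current ensemble, then added with a learning rate.
    pred = np.full_like(y, y.mean())
    trees = []
    for _ in range(n_trees):
        residual = y - pred
        tree = fit_stump(x, residual)
        pred = pred + lr * tree(x)
        trees.append(tree)
    return lambda z: y.mean() + lr * sum(t(z) for t in trees)

# Toy 1-D regression problem: each new stump corrects the previous trees.
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * x)
model = gradient_boost(x, y)
```

Because each stump is the least-squares fit to the current residual, every boosting step strictly reduces the training error, which is the "error correction" behavior described above.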
Following the description given by Chen and Guestrin [9], XGBoost works as follows:
For a given dataset with $n$ samples and $m$ features $\mathcal{D} = \{(x_i, y_i)\}$ ($|\mathcal{D}| = n$, $x_i \in \mathbb{R}^m$, $y_i \in \mathbb{R}$), a tree ensemble model uses $K$ additive functions to predict the output:

$$\hat{y}_i = \phi(x_i) = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in \mathcal{F},$$

where $\mathcal{F} = \{f(x) = w_{q(x)}\}$ ($q : \mathbb{R}^m \to \{1, \dots, T\}$, $w \in \mathbb{R}^T$) is the space of regression trees (also known as CART). Here $q$ represents the structure of each tree, mapping a data sample to the corresponding leaf index, and $T$ is the number of leaves on the tree. Each $f_k$ corresponds to an independent tree structure $q$ and leaf weights $w$.
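The $(q, w)$ representation of a regression tree can be made concrete with a toy example: $q$ maps a sample to a leaf index, $w$ holds one weight per leaf, and the tree's prediction is $f(x) = w_{q(x)}$. The thresholds and weights below are illustrative values only, not taken from the paper's model:

```python
def q(x):
    # Tree structure q: two splits on a single feature give T = 3 leaves,
    # and q maps a sample to its leaf index.
    if x < 0.3:
        return 0
    elif x < 0.7:
        return 1
    return 2

# Leaf weights w in R^T with T = 3 (illustrative values).
w = [-0.5, 0.1, 0.8]

def f(x):
    # Tree prediction: look up the weight of the leaf the sample falls into.
    return w[q(x)]
```

An ensemble prediction then simply sums $K$ such trees, $\hat{y} = \sum_k f_k(x)$, as in the equation above.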
CITA 2023 ISBN: 978-604-80-8083-9