Page 35 - Proceedings of the 12th Scientific Conference on Information Technology and Its Applications in Various Fields (CITA 2023)

Bao Ngoc Vi, Cao Truong Tran, Chi Cong Nguyen 19

4 Experiments

In this section, we validate the performance of EGAIN using multiple real-world
datasets. In the first set of experiments, we quantitatively evaluate the imputation
performance of EGAIN on several UCI datasets [20] (Wine, Blood, HepatitisC, Shill
and iBeacon) and compare it with GAIN. These are all complete datasets, so to create
missing data we randomly remove some of the data points (MCAR, missing completely
at random). The missing rate is the percentage of data points removed. In the second
set of experiments, we evaluate the performance of EGAIN at different missing rates.
The discriminator is updated after each evolutionary step to continually provide
adaptive losses that drive the population of generators to evolve and produce better
solutions. Next, our network and the evolutionary step are presented in detail.
We conduct each experiment 20 times, and within each experiment we randomly split
the data into training and test sets with a 90:10 ratio. We report the average RMSE on
the test set along with its standard deviation across the 20 experiments.
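
The protocol above can be sketched as follows. This is a minimal illustration, not the authors' code: it uses only the Python standard library, draws the MCAR mask independently per entry (so the missing rate holds in expectation), assumes the RMSE is computed over the removed test entries only, and uses column-mean imputation as a stand-in for a trained imputer. The function names and the synthetic data are hypothetical.

```python
import math
import random
import statistics


def mcar_mask(n_rows, n_cols, missing_rate, rng):
    """Return a mask where True marks an entry removed completely at random."""
    return [[rng.random() < missing_rate for _ in range(n_cols)]
            for _ in range(n_rows)]


def column_means(rows, mask):
    """Mean of the observed entries of each column (0.0 if none is observed)."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r, m in zip(rows, mask) if not m[j]]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return means


def run_once(data, missing_rate, rng):
    """One repetition: 90:10 split, MCAR removal, stand-in imputation, test RMSE."""
    rows = data[:]
    rng.shuffle(rows)
    n_train = int(round(0.9 * len(rows)))
    train, test = rows[:n_train], rows[n_train:]
    n_cols = len(rows[0])
    train_mask = mcar_mask(len(train), n_cols, missing_rate, rng)
    test_mask = mcar_mask(len(test), n_cols, missing_rate, rng)
    means = column_means(train, train_mask)  # assumption: stand-in imputer
    sq_errors = [(row[j] - means[j]) ** 2
                 for row, m in zip(test, test_mask)
                 for j in range(n_cols) if m[j]]
    return math.sqrt(sum(sq_errors) / len(sq_errors)) if sq_errors else 0.0


def evaluate(data, missing_rate=0.2, n_runs=20, seed=0):
    """Average and standard deviation of the test RMSE over n_runs repetitions."""
    rng = random.Random(seed)
    scores = [run_once(data, missing_rate, rng) for _ in range(n_runs)]
    return statistics.mean(scores), statistics.stdev(scores)


# Synthetic data in [0, 1]; not one of the UCI datasets used in the paper.
_rng = random.Random(42)
toy = [[_rng.random() for _ in range(4)] for _ in range(200)]
avg_rmse, std_rmse = evaluate(toy)
print(f"RMSE: {avg_rmse:.4f} ± {std_rmse:.4f}")
```

Replacing `column_means` with a trained model would leave the split, masking and aggregation steps unchanged.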

4.1 Quantitative analysis of EGAIN

Table 1 reports the RMSE (and its standard deviation) for EGAIN, GAIN and two
traditional imputation methods, Mean imputation and KNN imputation. The missing
rate in these experiments is set to 20%. As can be seen from the table, EGAIN
outperforms GAIN on all five datasets and achieves the lowest RMSE on Wine, Blood
and Shill, while Mean imputation has the lowest RMSE on HepatitisC and iBeacon.

Table 1. Imputation performance in terms of RMSE (average ± standard deviation)

Imputation method   Wine              Blood             HepatitisC        Shill             iBeacon
EGAIN               0.2371 ± 0.0206   0.1922 ± 0.0210   0.4081 ± 0.0207   0.3147 ± 0.0197   0.2641 ± 0.0074
GAIN                0.2422 ± 0.0369   0.2059 ± 0.0369   0.4524 ± 0.0063   0.3228 ± 0.0161   0.2760 ± 0.0290
Mean                0.2586            0.2237            0.3716            0.3376            0.2602
KNN                 0.2602            0.2184            0.4090            0.3681            0.2914
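
For readers unfamiliar with the two traditional baselines, the sketch below shows one common way to implement Mean and KNN imputation. The paper does not state the KNN settings, so the value of `k`, the distance (Euclidean over jointly observed features, rescaled by the overlap) and the fallback to the column mean are assumptions made for illustration; the function names are hypothetical.

```python
import math


def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def _distance(a, b):
    """Euclidean distance over features observed in both rows, rescaled by overlap."""
    shared = [(x - y) ** 2 for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum(shared) * len(a) / len(shared))


def knn_impute(rows, k=3):
    """Fill each None with the average of that feature over the k nearest rows
    that observe it; fall back to the column mean when no neighbour qualifies."""
    fallback = mean_impute(rows)
    result = []
    for i, row in enumerate(rows):
        filled = list(row)
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = sorted(
                (_distance(row, other), other[j])
                for t, other in enumerate(rows)
                if t != i and other[j] is not None
            )
            nearest = [v for d, v in candidates[:k] if d < math.inf]
            filled[j] = sum(nearest) / len(nearest) if nearest else fallback[i][j]
        result.append(filled)
    return result


# Toy data with two clusters; None marks a missing entry.
data = [
    [0.10, 0.20, 0.10],
    [0.20, 0.10, None],
    [0.15, 0.15, 0.20],
    [0.90, 0.80, 0.90],
    [0.80, 0.90, 1.00],
]
print(mean_impute(data)[1][2])
print(knn_impute(data, k=2)[1][2])
```

On this toy input the mean baseline averages across both clusters, whereas the KNN baseline draws only on the two rows closest to the incomplete one.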

4.2 EGAIN at different missing rates

To better understand EGAIN, we conduct several experiments in which we vary the
missing rate. Figure 1 shows the performance (RMSE) of EGAIN under these settings
in comparison with the other methods. Although the performance of every algorithm
degrades as the missing rate increases, EGAIN consistently outperforms the
benchmarks across the entire range of missing rates.
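
A missing-rate sweep of this kind can be sketched as below. The set of rates, the synthetic data and the pluggable `impute` callable are assumptions for illustration only; the rates actually used for Figure 1 are not listed in the text, and a simple column-mean imputer stands in for the compared methods.

```python
import math
import random


def column_mean_impute(rows):
    """Stand-in imputer: fill each None with the observed mean of its column."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def rmse_under_mcar(data, missing_rate, impute, rng):
    """Remove entries MCAR, impute them, and return the RMSE on the removed entries."""
    masked = [[None if rng.random() < missing_rate else v for v in row]
              for row in data]
    imputed = impute(masked)
    errors = [(t - p) ** 2
              for true_row, mask_row, pred_row in zip(data, masked, imputed)
              for t, m, p in zip(true_row, mask_row, pred_row) if m is None]
    return math.sqrt(sum(errors) / len(errors)) if errors else 0.0


def sweep_missing_rates(data, rates, impute, seed=0):
    """Map each missing rate to the RMSE obtained by the given imputer."""
    rng = random.Random(seed)
    return {rate: rmse_under_mcar(data, rate, impute, rng) for rate in rates}


# Synthetic data in [0, 1] and hypothetical rates, for illustration only.
_rng = random.Random(7)
toy = [[_rng.random() for _ in range(4)] for _ in range(200)]
rates = [0.1, 0.2, 0.4, 0.6, 0.8]
results = sweep_missing_rates(toy, rates, column_mean_impute)
for rate, score in results.items():
    print(f"missing rate {rate:.0%}: RMSE {score:.4f}")
```

Any imputer that maps rows containing `None` to fully filled rows can be passed as `impute`, so several methods can be compared on the same grid of rates.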

ISBN: 978-604-80-8083-9 CITA 2023
