Proceedings of the 12th Scientific Conference - Information Technology and Applications in Various Fields (CITA 2023)

Duy Tran, Thang Le, Khoa Tran, Hoang Le, Cuong Do, Thanh Ha


                     team can gain access to models pre-trained on facial datasets, performance is likely
                     to improve.
                       Another potential avenue for future work is to explore other transformer variants, such
                     as Swin Transformer, which has shown promising results in other computer vision tasks,
                     and DeiT (Data-efficient Image Transformers). It would be interesting to investigate
                     whether these variants could achieve higher accuracy for face beauty evaluation than ViT [13].
                       A further area of future research is ensembling, that is, combining the predictions of
                     several models. Ensembling has been shown to improve the accuracy of deep learning
                     models by drawing on the strengths of each member. It would be interesting to
                     investigate whether ensembling ViT with other models, such as ResNet or VGG, could
                     achieve even higher accuracy for face beauty evaluation.
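
As an illustration of the ensembling idea, the following is a minimal sketch of score-level ensembling by weighted averaging. It is not a method evaluated in this work: the function name `ensemble_scores`, the model labels, the weights, and the example scores are all hypothetical, and it assumes each model outputs one real-valued beauty score per image.

```python
from typing import Dict, List, Optional


def ensemble_scores(
    predictions: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Combine per-model beauty-score predictions by (weighted) averaging."""
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    names = list(predictions)
    n_images = len(predictions[names[0]])
    if any(len(predictions[name]) != n_images for name in names):
        raise ValueError("all models must score the same number of images")
    if weights is None:
        # Default: every model contributes equally (plain averaging).
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted mean of the model scores, computed image by image.
    return [
        sum(weights[name] * predictions[name][i] for name in names) / total
        for i in range(n_images)
    ]


# Hypothetical (made-up) scores for three images from three models.
example = {
    "vit": [3.0, 4.0, 2.0],
    "resnet": [3.5, 3.5, 2.5],
    "vgg": [2.5, 4.5, 3.0],
}
uniform = ensemble_scores(example)
weighted = ensemble_scores(example, {"vit": 2.0, "resnet": 1.0, "vgg": 1.0})
```

In practice the weights would have to be chosen on held-out validation data, for example in proportion to each model's validation performance. Averaging scores is only one option; stacking a small regressor on top of the model outputs is another.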
                       Overall, there are several promising directions for future research in this area, and
                     we hope that our work will inspire further investigation into the use of transformer-
                     based models for evaluating human beauty.

                     References



                      1.  Xu, L., Xiang, J., Yuan, X.: Transferring Rich Deep Features for Facial Beauty Prediction
                          (Version 1) (2018).
                      2.  Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end
                          object detection with transformers. In: European Conference on Computer Vision, LNCS,
                          pp. 213-229. Springer, Cham (2020).
                      3.  OpenAI: GPT-4 Technical Report. arXiv (2023). Available: https://arxiv.org/abs/2303.08774.
                      4.  Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T.,
                          Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., Houlsby, N.: An Image
                          is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv (2020).
                          Available: https://arxiv.org/abs/2010.11929.
                      5.  Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end
                          object detection with transformers. In: European Conference on Computer Vision, LNCS,
                          pp. 213-229. Springer, Cham (2020).
                      6.  Russakovsky, O., Deng, J., Su, H., Krause, J., Sathe, S., Ma, S., Huang, Z., Karpathy, A.,
                          Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet Large Scale Visual
                          Recognition Challenge (Version 3) (2014).
                      7.  Iyer, J., K T, R., Nersisson, R., Zhuang, Z., Joseph Raj, A.N., Refayee, I.: Machine
                          Learning-Based Facial Beauty Prediction and Analysis of Frontal Facial Images Using
                          Facial Landmarks and Traditional Image Descriptors. In: López Rubio, E. (ed.) Computational
                          Intelligence and Neuroscience, vol. 2021, pp. 1-14. Hindawi Limited (2021).
                      8.  Xiao, Q., Wu, Y., Wang, D., Yang, Y-L., Jin, X.: Beauty3DFaceNet: Deep geometry and
                          texture fusion for 3D facial attractiveness prediction. In: Computers & Graphics, vol. 98,
                          pp. 11-18. Elsevier BV (2021).
                      9.  Pramerdorfer, C., Kampel, M.: Facial Expression Recognition using Convolutional
                          Neural Networks: State of the Art (Version 1) (2016).
                      10. Liang, L., Lin, L., Jin, L., Xie, D., Li, M.: SCUT-FBP5500: A Diverse Benchmark Dataset for
                          Multi-Paradigm Facial Beauty Prediction (Version 1) (2018).



                     ISBN: 978-604-80-8083-9                                                  CITA 2023