Page 67 - Proceedings of the 12th Scientific Conference on Information Technology and its Applications in Various Fields (CITA 2023)

Duy Tran, Thang Le, Khoa Tran, Hoang Le, Cuong Do, Thanh Ha 51

team can have access to models pre-trained on facial datasets, the performance is likely
to be better.
Another potential avenue for future work is to explore other transformer variants, such
as Swin Transformer, which has shown promising results in other computer vision tasks.
It would also be interesting to investigate whether DeiT (Data-efficient Image Transformers)
could achieve higher accuracy than ViT for facial beauty evaluation [13].
A further area of future research is ensembling, which has been shown to be an
effective way to improve the accuracy of deep learning models by combining the
strengths of multiple models. It would be interesting to investigate whether
ensembling ViT with convolutional models, such as ResNet or VGG, could achieve even
higher accuracy for facial beauty evaluation.
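As an illustration of the ensembling idea, the following minimal sketch combines per-image beauty scores from several models by weighted averaging. It is not the method used in this paper: the function name `ensemble_scores`, the model keys, and the example scores are all hypothetical, and weighted averaging is only one of several possible ensembling strategies.

```python
def ensemble_scores(per_model_scores, weights=None):
    """Combine per-image scores from several models by weighted averaging.

    per_model_scores: dict mapping a model name to a list of predicted
                      beauty scores, one per image (hypothetical format).
    weights:          optional dict mapping a model name to a non-negative
                      weight; defaults to equal weights.
    """
    model_names = list(per_model_scores)
    if not model_names:
        raise ValueError("at least one model is required")

    n_images = len(per_model_scores[model_names[0]])
    if any(len(per_model_scores[m]) != n_images for m in model_names):
        raise ValueError("all models must score the same number of images")

    if weights is None:
        weights = {m: 1.0 for m in model_names}
    total = sum(weights[m] for m in model_names)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # Weighted mean of the model predictions for each image.
    return [
        sum(weights[m] * per_model_scores[m][i] for m in model_names) / total
        for i in range(n_images)
    ]


# Hypothetical predictions for two images (illustrative values only).
scores = {"vit": [3.0, 4.0], "resnet": [2.0, 4.0]}
equal = ensemble_scores(scores)
weighted = ensemble_scores(scores, weights={"vit": 3.0, "resnet": 1.0})
```

In practice the weights could be tuned on a validation set, for example by giving more weight to the model with the lower validation error.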
Overall, there are several promising directions for future research in this area, and
we hope that our work will inspire further investigation into the use of transformer-
based models for evaluating human beauty.

References

1. Xu, L., Xiang, J., Yuan, X.: Transferring Rich Deep Features for Facial Beauty Prediction
(Version 1) (2018).
2. Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end
object detection with transformers. In: European Conference on Computer Vision, LNCS,
pp. 213-229. Springer, Cham (2020).
3. OpenAI: GPT-4 Technical Report. arXiv (2023). Available: https://arxiv.org/abs/2303.08774.
4. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T.,
Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., Houlsby, N.: An Image
is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv (2020).
Available: https://arxiv.org/abs/2010.11929.
5. Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end
object detection with transformers. In: European Conference on Computer Vision, LNCS,
pp. 213-229. Springer, Cham (2020).
6. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A.,
Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet Large Scale Visual
Recognition Challenge (Version 3) (2014).
7. Iyer, J., K T, R., Nersisson, R., Zhuang, Z., Joseph Raj, A.N., Refayee, I.: Machine
Learning-Based Facial Beauty Prediction and Analysis of Frontal Facial Images Using
Facial Landmarks and Traditional Image Descriptors. In: López Rubio, E. (ed.) Computational
Intelligence and Neuroscience, vol. 2021, pp. 1-14. Hindawi Limited (2021).
8. Xiao, Q., Wu, Y., Wang, D., Yang, Y-L., Jin, X.: Beauty3DFaceNet: Deep geometry and
texture fusion for 3D facial attractiveness prediction. In: Computers & Graphics, vol. 98,
pp. 11-18. Elsevier BV (2021).
9. Pramerdorfer, C., Kampel, M.: Facial Expression Recognition using Convolutional
Neural Networks: State of the Art (Version 1) (2016).
10. Liang, L., Lin, L., Jin, L., Xie, D., Li, M.: SCUT-FBP5500: A Diverse Benchmark Dataset for
Multi-Paradigm Facial Beauty Prediction (Version 1) (2018).

ISBN: 978-604-80-8083-9 CITA 2023
