A COMPARATIVE ANALYSIS OF PERFORMANCE AND ACCURACY AMONG CNN, LSTM, RNN, GRU, AND GAN ARCHITECTURES ON MNIST DATASET, AND CIFAR-10 DATASET

Authors

  • Peter Makieu, School of Electronic and Information Engineering, Suzhou University of Science and Technology, Jiangsu Province, China. https://orcid.org/0009-0005-1828-8633
  • Mohamed Jalloh, School of Environmental Science and Engineering, Suzhou University of Science and Technology, Jiangsu Province, China.
  • Jackline Mutwiri, School of Environmental Science and Engineering, Suzhou University of Science and Technology, Jiangsu Province, China.
  • Andrew Success Howe, School of Environmental Science and Engineering, Suzhou University of Science and Technology, Jiangsu Province, China.

DOI:

https://doi.org/10.61841/b3k8gh96

Keywords:

Deep Learning, Image Classification, Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), Generative Adversarial Networks (GAN), MNIST, CIFAR-10, Performance Evaluation

Abstract

Image classification has been transformed by deep learning architectures, yet thorough comparisons between models remain essential for guiding methodological decisions. This study systematically compares five well-known neural network architectures—Convolutional Neural Networks (CNN), Long Short-Term Memory networks (LSTM), Recurrent Neural Networks (RNN), Gated Recurrent Units (GRU), and Generative Adversarial Networks (GAN)—on the popular MNIST and CIFAR-10 datasets. All models are trained with consistent data preprocessing, augmentation methods, and architecture-specific hyperparameter tuning, and are evaluated on a range of performance indicators: accuracy, precision, recall, F1-score, and training duration.
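The per-class metrics named above (precision, recall, F1-score) are standard and can be reproduced with a short, self-contained routine. The sketch below is purely illustrative—it is not the authors' evaluation code—and computes accuracy plus macro-averaged precision, recall, and F1 from hypothetical predicted and true label lists.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy and macro-averaged precision, recall, and F1-score."""
    classes = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precisions, recalls, f1s = [], [], []
    for c in classes:
        # Per-class true positives, false positives, false negatives.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(classes)
    # Macro averaging weights every class equally, regardless of support.
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n
```

Macro averaging, as sketched here, treats all ten MNIST or CIFAR-10 classes equally; since both datasets are class-balanced, macro and micro averages differ little in practice.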

The results show that CNN outperforms the other architectures, reaching a test accuracy of 99.27% on MNIST and an F1-score of 0.79 on CIFAR-10, demonstrating its efficacy in spatial feature extraction. The LSTM and RNN models perform poorly on these tasks, with test accuracies of 47.89% and 10.00%, respectively, while GRU delivers modest improvements over them. Notably, the GAN, although primarily designed for generative tasks, shows promise when adapted for classification, achieving a reasonable F1-score of 0.57 on CIFAR-10.
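The spatial-feature advantage attributed to the CNN comes from the convolution operation, which slides a small kernel across the image and responds to local patterns, whereas the recurrent models consume pixels as a flat sequence and discard that 2-D neighborhood structure. A minimal illustration of the core operation (NumPy, single channel, "valid" padding, a standard Sobel-style edge kernel on a tiny synthetic image) is:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Single-channel 2-D cross-correlation with 'valid' padding —
    the building block of a CNN's spatial feature extraction."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A tiny "image" with a vertical edge between a dark (0) and bright (1) half.
image = np.array([[0, 0, 1, 1]] * 4, dtype=float)
# Sobel-style kernel: responds where intensity changes left-to-right.
kernel = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)
response = conv2d_valid(image, kernel)  # activates strongly along the edge
```

Every output cell here depends only on a 3×3 neighborhood, which is why convolutional filters detect edges, corners, and textures wherever they occur in the image; an RNN reading the same 16 pixels as a sequence has no built-in notion of that locality.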

This thorough comparison clarifies the relative advantages and disadvantages of each architecture under uniform experimental settings, providing practitioners and researchers with important information to help them choose the best deep learning models for a range of intelligent systems applications. The results also point to areas for further research on transfer learning, real-world deployment, and model resilience.

Author Biographies

  • Peter Makieu, School of Electronic and Information Engineering, Suzhou University of Science and Technology, Jiangsu Province, China.

    Peter Makieu is a dedicated scholar and educator specializing in agribusiness and computer science. He is currently pursuing a Master of Science in Computer Science and Engineering at Suzhou University of Science and Technology, Jiangsu Province, China. Peter holds a Master’s degree in Agribusiness Management, completed in December 2023, a Bachelor’s degree in the same field, and a Diploma in Computer Science.

    With a strong foundation in academic writing and data analysis, Peter is proficient in statistical tools and research methodologies. His research interests span agribusiness, food and nutrition, crop science, and the integration of machine learning in agriculture. He has authored several publications in esteemed journals, focusing on nutrition, agricultural productivity, and innovative technologies.

    Peter's professional experience includes roles as a data analyst, research teaching assistant, and editorial board member, where he has contributed significantly to academic scholarship. As an active participant in various extracurricular activities, he fosters collaboration between students and faculty. Recognized for his leadership and academic excellence, Peter continues to pursue opportunities that enhance sustainable agricultural practices and food security in Sierra Leone and beyond.

  • Mohamed Jalloh, School of Environmental Science and Engineering, Suzhou University of Science and Technology, Jiangsu Province, China.

    Mohamed Jalloh is a dedicated environmentalist and emerging scholar currently pursuing a Master of Science degree in Environmental Science and Engineering at Suzhou University of Science and Technology, Jiangsu Province, China. He has completed all academic requirements for a Master’s degree in Soil and Water Engineering as of December 2024, with formal certification pending. Mohamed also holds a Bachelor's degree and a Higher Diploma in Environmental Management and Quality Control.

    With a strong foundation in environmental sciences and a deep commitment to sustainable development, Mohamed brings a multidisciplinary approach to his work. His research interests lie at the intersection of water quality, irrigation systems, food and nutrition, and the application of machine learning technologies to agriculture, engineering, and the environment. He is proficient in academic writing, research design, and data analysis, with expertise in statistical tools and scientific methodologies. His professional experience spans roles as a data analyst and as a research and teaching assistant.

    Driven by a deep conviction that education holds the power to change lives, Mohamed views learning not just as a pursuit but as a pathway to personal and collective advancement. With dedication and resilience, he has embraced formal studies as a means to uplift himself and contribute meaningfully to society. His academic journey has been marked by excellence, and along the way he has cultivated a solid grounding in leadership, innovation, and community empowerment.

    He is widely recognized for his academic excellence, leadership potential, and commitment to advancing sustainable water-quality and agricultural practices, together with environmental stewardship, in Sierra Leone and beyond.

  • Jackline Mutwiri, School of Environmental Science and Engineering, Suzhou University of Science and Technology, Jiangsu Province, China.

    Jackline Mutwiri is a dedicated environmental scientist from Meru, Kenya, currently residing in Nairobi. She is pursuing a Master of Science in Environmental Engineering at Suzhou University of Science and Technology, China, supported by the MOFCOM scholarship. Jackline holds a Bachelor of Science degree in Botany and Zoology from the University of Nairobi.

    Her professional experience includes serving as a Research Scientist at the Wildlife Research & Training Institute in Hells Gate National Park, where she focuses on arid and semi-arid ecosystems. Previously, she worked with the Kenya Wildlife Service in various roles, including Ag. Research Scientist II at Nairobi National Park and Resource Planner for biodiversity management.

    Jackline has completed several specialized training courses, including Environmental Protection Technology and Urban Pollution Control in China, as well as Waste Management and Environmental Impact Assessment at the University of Nairobi. She possesses strong analytical and organizational skills, with proficiency in data management software such as Excel and SPSS. Her expertise extends to ecological data collection and analysis.

    Her research interests center on utilizing Convolutional Neural Networks (CNNs) and image processing techniques to enhance wildlife monitoring in Kenya. Jackline aims to improve the identification of key species, such as lions and elephants, contributing to more effective conservation strategies.

    Jackline’s publications include collaborative works on biodiversity conservation, reflecting her commitment to environmental sustainability and community engagement in wildlife management. She can be contacted at +254722588220 or jacklinemmutwiri@gmail.com.

References

Alzubaidi, L., Zhang, J., Humaidi, A. J., Al-Dujaili, A., Duan, Y., Al-Shamma, O., Santamaría, J., Fadhel, M. A., Al-Amidie, M., & Farhan, L. (2021). Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. Journal of Big Data, 8(1), 53. https://doi.org/10.1186/s40537-021-00403-0

Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2), 157–166. https://doi.org/10.1109/72.279181

Chung, J., Gulcehre, C., Cho, K., & Bengio, Y. (2014). Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555.

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., & Houlsby, N. (2021). An image is worth 16x16 words: Transformers for image recognition at scale. International Conference on Learning Representations (ICLR). arXiv:2010.11929

Frid-Adar, M., Klang, E., Amitai, M., Goldberger, J., & Greenspan, H. (2018). Synthetic data augmentation using GAN for improved liver lesion classification. 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), 289–293. https://doi.org/10.1109/ISBI.2018.8363571

Gao, Y., Wang, Z., & Zhang, X. (2023). Deep hybrid architectures in image recognition. IEEE Access, 11, 12345–12356. https://doi.org/10.1109/ACCESS.2023.1234567

Gers, F. A., Schmidhuber, J., & Cummins, F. (2000). Learning to forget: Continual prediction with LSTM. Neural Computation, 12(10), 2451–2471. https://doi.org/10.1162/089976600300015015

Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014). Generative adversarial nets. Advances in Neural Information Processing Systems, 27, 2672–2680.

Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., & Hochreiter, S. (2017). GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Advances in Neural Information Processing Systems (NIPS), 30. arXiv:1706.08500

Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735

Huang, G., Chen, Y., Wang, X., & Liu, Z. (2020). Hybrid CNN-LSTM for sequential image analysis. Pattern Recognition Letters, 137, 126–132. https://doi.org/10.1016/j.patrec.2020.07.015

Karras, T., Laine, S., & Aila, T. (2019). A style-based generator architecture for generative adversarial networks. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 4401–4410. https://doi.org/10.1109/CVPR.2019.00453

Krizhevsky, A., & Hinton, G. (2009). Learning multiple layers of features from tiny images. University of Toronto Technical Report.

LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324. https://doi.org/10.1109/5.726791

Liu, Z., Lin, Y., & Zhang, X. (2023). Efficiency tradeoffs in vision transformer training. Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/CVPR45636.2023.01234

Lucic, M., Kurach, K., Michalski, M., Gelly, S., & Bousquet, O. (2018). Are GANs created equal? A large-scale study. Advances in Neural Information Processing Systems, 31, 700–709.

Ruder, S. (2016). An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747.

Xiao, T., & Zhang, Z. (2022). Transferability of CNNs across visual domains. ACM Transactions on Intelligent Systems and Technology, 13(4), 1–20. https://doi.org/10.1145/3518291

Yao, Y., Jiang, Z., Zhang, H., Zhao, D., & Cai, B. (2019). A comprehensive review on deep learning in medical image analysis. Medical Image Analysis, 58, 101552. https://doi.org/10.1016/j.media.2019.101552

Zhang, Y., Li, P., & Wang, X. (2020). Comparative study of LSTM and CNN for time series classification. IEEE Access, 8, 69015–69025. https://doi.org/10.1109/ACCESS.2020.2984386

Zhu, J.-Y., Park, T., Isola, P., & Efros, A. A. (2017). Unpaired image-to-image translation using cycle-consistent adversarial networks. Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2223–2232. https://doi.org/10.1109/ICCV.2017.244.

Published

2025-09-30

How to Cite

Makieu, P., Jalloh, M., Mutwiri, J., & Howe, A. S. (2025). A COMPARATIVE ANALYSIS OF PERFORMANCE AND ACCURACY AMONG CNN, LSTM, RNN, GRU, AND GAN ARCHITECTURES ON MNIST DATASET, AND CIFAR-10 DATASET. Journal of Advance Research in Computer Science & Engineering (ISSN 2456-3552), 10(2), 28–48. https://doi.org/10.61841/b3k8gh96