A Hybrid Deep Learning Model for Efficient Anomaly Detection in Video Surveillance Systems

Authors
  • Abba M. BALA

    Department of Computer Science, Faculty of Computing, Northwest University, Kano, Kano State, Nigeria

    Author

  • Abdurrauf G. SHARIFAI

    Department of Computer Science, Faculty of Computing, Northwest University, Kano, Kano State, Nigeria

    Author

  • Umar S. HARUNA

    Department of Cybersecurity, Faculty of Computing, Northwest University, Kano, Kano State, Nigeria

    Author

Keywords:
CNN, ViT, Swin-Transformer, VAD, Attention Mechanism, Feature Fusion.
Abstract

Traditional CNNs often struggle to capture long-range dependencies and contextual relationships within data, which limits their effectiveness in complex tasks such as anomaly detection. To overcome this limitation, we propose an attention-enhanced hybrid model that combines the strengths of CNNs with a Transformer architecture. The model draws on the feature extraction capabilities of four distinct CNN architectures: VGG16, DenseNet121, ResNet50, and MobileNetV2. The extracted features are fused and passed through a Swin Transformer, whose attention mechanism captures long-range dependencies within the data and focuses on the most relevant regions of the input. The approach is evaluated on the UCF-Crime benchmark dataset using performance metrics such as ROC-AUC and achieves an accuracy of 99.2%, surpassing existing state-of-the-art methods. Moreover, the model’s ability to handle complex video data and extract semantically rich features highlights its potential for real-time surveillance applications, where timely and accurate anomaly detection is critical.
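
The fuse-then-attend idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the features are fused, so the sketch assumes one pooled feature vector per backbone (with the widths those networks produce after their last convolutional stage), a hypothetical linear projection to a shared width `TOKEN_DIM`, and plain single-head self-attention over the four resulting tokens as a stand-in for the Swin Transformer's windowed attention. Random numbers replace real CNN features and learned weights.

```python
import math
import random

# Pooled feature widths after the last convolutional stage of each backbone.
# Assumption: the paper may tap different layers; these are the usual widths.
BACKBONE_DIMS = {"VGG16": 512, "DenseNet121": 1024, "ResNet50": 2048, "MobileNetV2": 1280}
TOKEN_DIM = 8  # hypothetical shared token width, chosen only for illustration


def project(vec, weights):
    """Linear map: weights holds TOKEN_DIM rows, each as long as vec."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product attention with identity Q/K/V maps.

    Stand-in for Swin's windowed attention, which is not reproduced here.
    """
    scale = math.sqrt(len(tokens[0]))
    outputs, weights = [], []
    for query in tokens:
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in tokens]
        attn = softmax(scores)
        weights.append(attn)
        # Each output token is an attention-weighted mix of all tokens.
        outputs.append([sum(a * tok[j] for a, tok in zip(attn, tokens))
                        for j in range(len(query))])
    return outputs, weights


def fuse(features, projections):
    """Project each backbone's features to a token, attend, then concatenate."""
    tokens = [project(features[name], projections[name]) for name in BACKBONE_DIMS]
    attended, weights = self_attention(tokens)
    fused = [x for tok in attended for x in tok]
    return fused, weights


rng = random.Random(0)
# Placeholder features and projection weights (random, not learned).
features = {n: [rng.gauss(0, 1) for _ in range(d)] for n, d in BACKBONE_DIMS.items()}
projections = {n: [[rng.gauss(0, 1) / math.sqrt(d) for _ in range(d)]
                   for _ in range(TOKEN_DIM)]
               for n, d in BACKBONE_DIMS.items()}

fused, weights = fuse(features, projections)
print(len(fused))  # 4 tokens x TOKEN_DIM = 32
```

Each row of `weights` shows how strongly one backbone's token attends to the others, which is the sense in which attention lets the fused representation emphasise the most relevant inputs. In a trained model the projections and attention parameters would be learned, and a classification head would follow the fused vector.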

Published
02-03-2026
Section
Articles
License

Copyright (c) 2026 FUDMA Journal of Engineering and Technology

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

How to Cite

A Hybrid Deep Learning Model for Efficient Anomaly Detection in Video Surveillance Systems. (2026). FUDMA Journal of Engineering and Technology, 2(1), 41-50. https://doi.org/10.33003/q1mej406
