Abstract
Real-time fine-grained human activity recognition (HAR) remains challenging because of rapid spatial–temporal variations, subtle motion differences, and dynamic environmental conditions. To address this difficulty, we propose NovAc-DL, a unified deep learning framework designed to accurately classify short human-like actions, specifically “pour” and “stir”, from sequential video data. The framework integrates adaptive time-distributed convolutional encoding with temporal reasoning modules to enable robust recognition under realistic robotic-interaction conditions. A balanced dataset of 2000 videos was curated and processed through a consistent spatiotemporal pipeline. Three architectures were systematically evaluated: a long-term recurrent convolutional network (LRCN), a time-distributed convolutional neural network (CNN-TD), and a convolutional long short-term memory network (ConvLSTM). CNN-TD achieved the best performance, reaching 98.68% accuracy with the lowest test loss (0.0236) and outperforming the other models in convergence speed, generalization, and computational efficiency. Grad-CAM visualizations further confirm that NovAc-DL reliably attends to motion-salient regions relevant to pouring and stirring gestures. These results establish NovAc-DL as a high-precision, real-time-capable solution for deployment in healthcare monitoring, industrial automation, and collaborative robotics.
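To illustrate what "time-distributed convolutional encoding" means in general, the following is a minimal pure-Python sketch and not the authors' CNN-TD implementation. It applies one shared per-frame encoder (convolution, ReLU, global average pooling) to every frame of a clip, then collapses the time axis. All names (`conv2d_valid`, `encode_frame`, `time_distributed`, `temporal_average`), the toy kernels, and the 3×3 frames are hypothetical, and plain temporal averaging is assumed here only as a stand-in for the temporal reasoning modules mentioned in the abstract.

```python
from typing import Callable, List

Frame = List[List[float]]  # one grayscale frame as rows of pixel values


def conv2d_valid(frame: Frame, kernel: Frame) -> Frame:
    """2-D cross-correlation with no padding ('valid' mode)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(frame) - kh + 1
    out_w = len(frame[0]) - kw + 1
    return [
        [
            sum(frame[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def encode_frame(frame: Frame, kernels: List[Frame]) -> List[float]:
    """Per-frame encoder: conv -> ReLU -> global average pool (one value per kernel)."""
    features = []
    for kernel in kernels:
        fmap = conv2d_valid(frame, kernel)
        activated = [max(0.0, v) for row in fmap for v in row]
        features.append(sum(activated) / len(activated))
    return features


def time_distributed(encoder: Callable[[Frame], List[float]],
                     clip: List[Frame]) -> List[List[float]]:
    """Apply the SAME encoder (shared weights) independently to every frame."""
    return [encoder(frame) for frame in clip]


def temporal_average(sequence: List[List[float]]) -> List[float]:
    """Collapse the time axis by averaging the per-frame feature vectors."""
    steps = len(sequence)
    return [sum(step[k] for step in sequence) / steps
            for k in range(len(sequence[0]))]


# Toy kernels (assumed for illustration): a box filter and a horizontal-gradient filter.
KERNELS = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[1.0, -1.0], [1.0, -1.0]],
]

# A three-frame "clip" of constant 3x3 frames with intensities 1, 0, and 2.
clip = [[[value] * 3 for _ in range(3)] for value in (1.0, 0.0, 2.0)]

per_frame = time_distributed(lambda f: encode_frame(f, KERNELS), clip)
clip_features = temporal_average(per_frame)
```

In a real model the kernels would be learned, and `clip_features` (or the full `per_frame` sequence) would feed a classifier that separates the two action classes.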
| Original language | English |
|---|---|
| Article number | 11 |
| Journal | Big Data and Cognitive Computing |
| Volume | 10 |
| Issue number | 1 |
| DOIs | |
| State | Published - Jan 2026 |
Keywords
- convolutional long short-term memory
- deep learning
- gradient-weighted class activation mapping
- human activity recognition
- long-term recurrent convolutional network
- time-distributed convolutional neural network
- visual tracking
Fingerprint
Dive into the research topics of 'NovAc-DL: Novel Activity Recognition Based on Deep Learning in the Real-Time Environment'. Together they form a unique fingerprint.