The Enhanced Human Activity Recognition: A Context Based Two-Stream Framework for Learning Temporal and Spatial Features

12 pages • Published: August 6, 2024

Abstract

Human Activity Recognition (HAR) from video signals is increasingly crucial for surveillance, healthcare, robotics, and augmented reality applications. Accurately identifying human actions is vital in our data-driven world, posing a significant technological challenge. This study introduces a comprehensive methodology for HAR, starting with the preprocessing of video frames using context-awareness. The context-aware frames are then fed into a two-stream framework, extracting spatial and temporal features in a complementary manner. The spatial stream analyzes visual features from individual video frames, while the temporal stream focuses on dynamic aspects, capturing intricate motion patterns. This separation allows for a detailed analysis of video data, aligning with human perception of activities. The subsequent stage involves a late binding mechanism, enabling optimal interaction between spatial and temporal streams. Integration in a dense layer allows the model to harness interactions between these information streams, significantly improving recognition accuracy. Rigorous experimental validation confirms the efficacy and reliability of the proposed approach in diverse scenarios using real-world datasets HMDB51 and UCF50. The results demonstrate high accuracy, precision, recall, and F-measure for the combined spatial and temporal model compared to individual streams. This research contributes to advancing HAR technology, improving how computers interpret and recognize human activities in videos for practical and beneficial applications.

Keyphrases: context aware frames, human activity recognition, late binding, spatial stream, temporal stream, two stream framework

In: Rajakumar G (editor).
Proceedings of 6th International Conference on Smart Systems and Inventive Technology, vol 19, pages 82-93.
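
The late binding step described in the abstract, where spatial and temporal features are integrated in a dense layer, can be illustrated with a minimal sketch. This is not the authors' implementation: the feature sizes, the random weights, and the names `dense`, `softmax`, and `late_fusion` are hypothetical, and concatenation followed by a single dense layer with softmax is only one common way such a fusion could be realized. It uses only the Python standard library.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dense(weights, bias, x):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def late_fusion(spatial_feat, temporal_feat, weights, bias):
    """Concatenate the two stream outputs, then classify with a dense layer."""
    fused = list(spatial_feat) + list(temporal_feat)
    return softmax(dense(weights, bias, fused))


# Hypothetical sizes, chosen only to keep the example small.
SPATIAL_DIM, TEMPORAL_DIM, NUM_CLASSES = 4, 3, 5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Stand-ins for a trained dense layer and for per-stream feature vectors.
weights = [[rng.uniform(-0.5, 0.5) for _ in range(SPATIAL_DIM + TEMPORAL_DIM)]
           for _ in range(NUM_CLASSES)]
bias = [0.0] * NUM_CLASSES
spatial_feat = [rng.random() for _ in range(SPATIAL_DIM)]
temporal_feat = [rng.random() for _ in range(TEMPORAL_DIM)]

probs = late_fusion(spatial_feat, temporal_feat, weights, bias)
predicted = max(range(NUM_CLASSES), key=probs.__getitem__)
print("class probabilities:", [round(p, 3) for p in probs])
print("predicted class index:", predicted)
```

Because each row of the dense layer's weights spans both feature vectors, every class score depends on the spatial and the temporal stream together, which is the kind of cross-stream interaction the abstract attributes to the integration step.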