Md Shamimul Islam1, Jalal Uddin Md Akbar2, Md Mahedi Hasan3, N HM Arafat4, Sohaib Abdullah1, Saydul Akbar Murad5
1 Dept. of CSE, Asian University of Bangladesh, Dhaka, Bangladesh.
2 Dept. of CSE, International Islamic University Chittagong, Bangladesh.
3 IICT, BUET, Dhaka, Bangladesh.
4 Dept. of Computer Science and Technology, Henan Polytechnic University, China.
5 Faculty of Computing, Universiti Malaysia Pahang, Malaysia.
Abstract
Smart surveillance systems can play a significant role in detecting sexual harassment in real time, helping law enforcement reduce such incidents. Detecting sexual harassment from video in real time is a complex computer vision task because of factors such as variation in clothing or carried objects, illumination changes, partial occlusion, low resolution, and viewing-angle variation. Owing to advances in convolutional neural networks (CNNs) and Long Short-Term Memory (LSTM) networks, human action recognition has achieved great success in recent years. In this work, we build a video dataset, the Sexual Harassment Video (SHV) dataset, which consists of harassment and non-harassment videos collected from YouTube. We also build a CNN-LSTM network to detect sexual harassment, in which the CNN extracts spatial features and the LSTM, a recurrent neural network (RNN), extracts temporal features. State-of-the-art pretrained models are also employed as spatial feature extractors, followed by an LSTM and three dense layers, to classify harassment activities. Moreover, to assess the robustness of the proposed model, we conducted several experiments on two other benchmark datasets, the Hockey Fight dataset and the Movie Violence dataset, and achieved state-of-the-art accuracy.
Index Terms: sexual harassment, surveillance systems, deep learning, CNN-LSTM, action recognition.
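As an illustration of the CNN-LSTM pipeline summarized in the abstract, the following is a minimal NumPy sketch of a forward pass only: per-frame spatial features, an LSTM over the frame sequence, and three dense layers ending in a harassment probability. It is not the authors' implementation. The convolution-plus-pooling function is a toy stand-in for the CNN or pretrained backbone, the weights are random and untrained, and all names (`frame_features`, `lstm_step`, `classify_clip`, `init_params`) and layer sizes are assumptions chosen for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def frame_features(frame, kernels):
    """Toy stand-in for the CNN: one filter response per kernel,
    followed by ReLU and global average pooling."""
    kh, kw = kernels.shape[1:]
    h, w = frame.shape
    feats = []
    for k in kernels:
        out = np.zeros((h - kh + 1, w - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(frame[i:i + kh, j:j + kw] * k)
        feats.append(np.maximum(out, 0.0).mean())
    return np.array(feats)


def lstm_step(x, h, c, W, U, b):
    """One LSTM time step with input, forget, candidate, and output gates."""
    z = W @ x + U @ h + b
    i, f, g, o = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    g = np.tanh(g)
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c


def init_params(n_kernels=4, hidden=8, seed=0):
    """Random, untrained weights; sizes are arbitrary assumptions."""
    rng = np.random.default_rng(seed)
    sizes = [hidden, 16, 8, 1]  # three dense layers: 8 -> 16 -> 8 -> 1
    dense = [
        (0.1 * rng.standard_normal((sizes[n + 1], sizes[n])), np.zeros(sizes[n + 1]))
        for n in range(3)
    ]
    return {
        "kernels": 0.1 * rng.standard_normal((n_kernels, 3, 3)),
        "W": 0.1 * rng.standard_normal((4 * hidden, n_kernels)),
        "U": 0.1 * rng.standard_normal((4 * hidden, hidden)),
        "b": np.zeros(4 * hidden),
        "dense": dense,
        "hidden": hidden,
    }


def classify_clip(clip, params):
    """clip: array of shape (frames, height, width), grayscale.
    Returns a probability-like score in (0, 1)."""
    h = np.zeros(params["hidden"])
    c = np.zeros(params["hidden"])
    for frame in clip:
        x = frame_features(frame, params["kernels"])  # spatial features
        h, c = lstm_step(x, h, c, params["W"], params["U"], params["b"])  # temporal
    a = h
    for Wd, bd in params["dense"][:-1]:
        a = np.maximum(Wd @ a + bd, 0.0)  # ReLU dense layers
    Wd, bd = params["dense"][-1]
    return float(sigmoid(Wd @ a + bd)[0])


params = init_params()
clip = np.random.default_rng(1).random((5, 12, 12))  # 5 synthetic frames
score = classify_clip(clip, params)
print(f"harassment score (untrained): {score:.3f}")
```

In practice the spatial extractor would be a trained or pretrained CNN and the whole network would be trained end to end on labeled clips; the sketch only shows how the spatial and temporal stages connect.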