Page 77 - 2025S
70 UEC Int’l Mini-Conference No.54
Anomaly Classification with Scene
Description in Public Roads
Jose Emilio Vera Cordero (2), Mariko Nakano (2), Hiroki Takahashi (1)
1. UNIVERSITY OF ELECTRO-COMMUNICATIONS - TOKYO, JAPAN
2. NATIONAL POLYTECHNIC INSTITUTE - MEXICO CITY, MEXICO
Introduction
Problem statement:
Anomalous events are those that deviate from common behavior and are therefore infrequent.
In urban environments, there is a high probability that these infrequent events will be harmful to those involved.
Classifying these events can be beneficial, as it could inform authorities on how to respond to them.
UCF-Crime:
Consists of surveillance videos that cover 13 real-world anomalies: Abuse, Arrest, Arson, Assault, Burglary, Explosion, Fighting, Road Accident, Robbery, Shooting, Shoplifting, Stealing, and Vandalism.
Is weakly supervised, so the authors provide temporal annotations of the anomalies only for the test set videos.
Background
Binary classification:
The UCF-Crime dataset is most often treated as a binary classification problem, so most works only distinguish abnormal events from normal ones. Multi-class classification, however, can provide more information about the detected anomalies.
State of the art in multi-class classification on the UCF-Crime dataset:
Method                   Video-Level Accuracy (%)
Sultani et al. (C3D)     23
Sultani et al. (TCNN)    28.40
Wu et al.                41.43
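The difference between the binary and the multi-class settings can be illustrated with a small sketch of how video-level accuracy is computed in each case. This is only an illustration: the label name `Normal`, the helper names, and the toy labels are assumptions made for this example and are not results from any of the cited works.

```python
from typing import List, Sequence

NORMAL = "Normal"  # assumed label name for non-anomalous videos


def video_level_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Percentage of videos whose predicted class equals the ground truth."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


def to_binary(labels: Sequence[str]) -> List[str]:
    """Collapse every anomaly class into a single 'Abnormal' label."""
    return [NORMAL if label == NORMAL else "Abnormal" for label in labels]


# Toy labels invented for illustration only.
truth = ["Arson", "Robbery", "Normal", "Fighting"]
pred = ["Arson", "Burglary", "Normal", "Assault"]

multi_acc = video_level_accuracy(truth, pred)
binary_acc = video_level_accuracy(to_binary(truth), to_binary(pred))
print(multi_acc, binary_acc)  # 50.0 100.0
```

In this toy case every anomalous video is flagged as abnormal, so the binary score is perfect, yet half of the anomalies are assigned the wrong class. That gap is the extra information the multi-class setting exposes.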
Fig. 1. Examples of different anomalies from videos in the UCF-Crime dataset. (Panel labels: Abuse, Car Accident.)
Proposal
UCA limitations:
Compare the UCA dataset time segments with the segments detected by the anomaly detector.
Analyze the sentences used in UCA to extract only the descriptions of abnormal events.
Video and textual description fusion:
The descriptions should provide semantic understanding of abnormal events, which could help to better classify them.
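One possible way to compare UCA time segments with the segments found by an anomaly detector, and to keep only the sentences that describe abnormal events, is a temporal intersection-over-union (IoU) test. The sketch below is an assumption, not the authors' method: it supposes the detector outputs one score per frame, that a fixed threshold marks a frame as abnormal, and that an IoU of at least 0.3 counts as a match. All function names, the threshold values, and the sample annotations are hypothetical.

```python
from typing import List, Optional, Tuple

Segment = Tuple[float, float]  # (start, end) in seconds


def segments_from_scores(scores: List[float], threshold: float, fps: float) -> List[Segment]:
    """Group consecutive frames with score >= threshold into time segments."""
    segments: List[Segment] = []
    start: Optional[int] = None
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i  # a new abnormal run begins
        elif score < threshold and start is not None:
            segments.append((start / fps, i / fps))  # the run ended at frame i
            start = None
    if start is not None:  # the run reaches the end of the video
        segments.append((start / fps, len(scores) / fps))
    return segments


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def abnormal_sentences(annotations: List[Tuple[Segment, str]],
                       detected: List[Segment],
                       min_iou: float = 0.3) -> List[str]:
    """Keep annotated sentences whose segment overlaps a detected segment."""
    return [text for seg, text in annotations
            if any(temporal_iou(seg, d) >= min_iou for d in detected)]


# Hypothetical per-frame scores at 1 frame per second.
detected = segments_from_scores([0.1, 0.2, 0.9, 0.8, 0.7, 0.1], threshold=0.5, fps=1.0)

# Hypothetical UCA-style annotations: (segment, sentence).
annotations = [
    ((0.0, 2.0), "A man walks into the store."),
    ((2.0, 6.0), "The man takes out a gun."),
]
kept = abnormal_sentences(annotations, detected)
print(detected, kept)
```

The IoU threshold controls how strictly an annotated segment must agree with the detector before its sentence is treated as describing an abnormal event. A suitable value would have to be chosen empirically.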
Fig. 2. UCF-Crime anomalous segments extraction. (Diagram labels: Video, I3D Feature Extraction, Anomaly Detector, Abnormal Frame, time.)
UCA dataset:
Provides descriptions for the events in the UCF-Crime dataset videos.
Provides the time at which each event occurs in each video.
Fig. 4. Anomaly Classification. (Diagram labels: Abnormal Frame Image, Abnormal Event Description, CLIP, Abnormal Class.)
Expected results
Fuse visual and textual features.
Improve the classification accuracy of the anomalies proposed in the UCF-Crime dataset using video and textual features.
Improve the explainability of anomaly detections through classification and description.
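The poster does not specify how the visual and textual features are fused. As a minimal sketch of one common option, the example below L2-normalizes each feature vector, concatenates them (late fusion), and passes the result through a linear layer with a softmax. The feature vectors are random stand-ins for real embeddings (for instance from an image or text encoder such as CLIP), the weights are untrained, and the class subset and all names are hypothetical.

```python
import math
import random
from typing import List


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(visual: List[float], textual: List[float]) -> List[float]:
    """Late fusion: concatenate the two L2-normalized feature vectors."""
    return l2_normalize(visual) + l2_normalize(textual)


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(fused: List[float], weights: List[List[float]], biases: List[float]) -> List[float]:
    """Linear layer followed by softmax over the anomaly classes."""
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


CLASSES = ["Arson", "Fighting", "Robbery"]  # small subset, for illustration only

rng = random.Random(0)
visual_feat = [rng.uniform(-1, 1) for _ in range(8)]  # stand-in for a video embedding
text_feat = [rng.uniform(-1, 1) for _ in range(4)]    # stand-in for a description embedding

fused = fuse(visual_feat, text_feat)
weights = [[rng.uniform(-0.1, 0.1) for _ in fused] for _ in CLASSES]  # untrained weights
biases = [0.0] * len(CLASSES)

probs = classify(fused, weights, biases)
predicted = CLASSES[probs.index(max(probs))]
print(len(fused), predicted)
```

Normalizing each modality before concatenation keeps one feature type from dominating the other purely because of its scale. Other fusion schemes, such as weighted sums or attention, would fit the same interface.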
Start point: 1:34.04
End point: 1:45.06
Sentence: The man in red took out his gun and shot at the striped man. At the same time, the four people sitting on the bench in the front stood up and hid in the corner of the room.
Fig. 3. UCA annotation examples.
References
[1] W. Sultani, C. Chen, and M. Shah, "Real-World Anomaly Detection in Surveillance Videos," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 6479-6488.
[2] T. Yuan et al., "Towards Surveillance Video-and-Language Understanding: New Dataset, Baselines, and Challenges," 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 2024, pp. 22052-22061, doi: 10.1109/CVPR52733.2024.02082.
[3] P. Wu, X. Zhou, G. Pang, Y. Sun, J. Liu, P. Wang, and Y. Zhang, "Open-Vocabulary Video Anomaly Detection," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 18297-18307.