Next-Generation Video AI: Unlocking Efficiency with Psychovisual Perception Models

Dr Anastasia Mozhaeva, Eastern Institute of Technology
Associate Professor Patrice Delmas, Auckland University
BCS Student Steven Rogers, Eastern Institute of Technology

Access to the complete annotated dataset is available upon request from
AMozhaeva@eit.ac.nz


Abstract:

Although our world depends heavily on computer vision, and despite Artificial Intelligence's impressive success across vision tasks, heavy computation and memory costs remain a barrier. Performing inference with deep learning models on video is especially challenging because of the substantial computational resources required for reliable recognition. Moreover, while machine learning has revolutionised computer vision, current systems still do not perceive the world as humans do. Videos inherently contain redundant information that human observers do not consciously perceive, and models penalised for identifying explicitly unlabeled yet irrelevant information risk misclassification. The fundamental challenge in video processing is therefore determining which visual stimuli are reliably perceived by human observers. This work presents a novel approach to improving predictive vision AI algorithms: reducing training video dataset redundancy by 15% by leveraging insights into human visual perception.
This project will create a core of visual knowledge necessary to train Visual Artificial Intelligence.
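To make the redundancy-reduction idea concrete, the sketch below filters a video's frames by dropping those whose change from the last kept frame falls below a perceptibility threshold. The mean-absolute-difference proxy and the `threshold` value are illustrative assumptions, not the psychovisual perception model developed in this work, which would replace this crude measure with one grounded in human visual perception data.

```python
import numpy as np

def is_perceptually_redundant(prev, curr, threshold=0.02):
    """Crude proxy for perceptual redundancy: the mean absolute
    difference of normalised grayscale frames. The threshold value
    is a hypothetical placeholder, not a calibrated perceptual limit."""
    diff = np.abs(prev.astype(np.float64) - curr.astype(np.float64)) / 255.0
    return diff.mean() < threshold

def filter_redundant_frames(frames, threshold=0.02):
    """Keep the first frame, then only frames that differ
    perceptibly from the most recently kept frame."""
    kept = [frames[0]]
    for frame in frames[1:]:
        if not is_perceptually_redundant(kept[-1], frame, threshold):
            kept.append(frame)
    return kept

# Example: ten near-identical frames followed by one scene change.
rng = np.random.default_rng(0)
base = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
frames = [base.copy() for _ in range(10)] + [255 - base]
print(len(filter_redundant_frames(frames)))  # 2
```

Applied to a training dataset, such a filter shrinks the number of frames a model must process while preserving the visual events a human observer would actually notice.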

Code and data: [GitHub]