Research Project

Next-Generation Video AI: Unlocking Efficiency with Psychovisual Perception Models

This project develops perception-aware methods for visual artificial intelligence, using models of human visual perception to reduce video dataset redundancy and improve the efficiency of deep learning-based video analysis.

Visual AI, Deep Learning, Psychovisual Models, Video Processing, Data Efficiency

Key Results

15% reduction in training video dataset redundancy
3.5% improvement in model accuracy on the optimised dataset
Faster training convergence with reduced computational requirements

Abstract

Modern computer vision systems achieve strong performance, yet video-based deep learning still requires significant computational power, memory, and data throughput. Much of the video data that current AI systems process is visually redundant: it carries detail that human observers do not consciously perceive. Processing this detail increases computational cost and limits the efficiency of real-time applications.


This project introduces a psychovisual approach to Visual Artificial Intelligence by integrating models of human visual perception into AI-driven video processing workflows. The proposed methodology reduces redundancy within training video datasets by approximately 15%, improves deep learning model accuracy by up to 3.5%, accelerates convergence during training, and decreases computational requirements for inference.


The research demonstrates how perception-aware processing can support more efficient and scalable AI systems for real-time vision applications, particularly in embedded, robotic, and autonomous environments where bandwidth, energy, and hardware resources are constrained. This work contributes to the development of human-inspired visual AI systems that combine deep learning with psychovisual modelling principles.

Research Focus

Technical Contribution

  • Psychovisual dataset optimisation to remove visually redundant video information.
  • Deep learning efficiency through perception-aware preprocessing and training data reduction.
  • Visual AI performance improvement with reduced computational load and faster convergence.
  • Human-inspired modelling for more efficient video understanding systems.
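
As an illustration of the first contribution, the sketch below drops video frames whose change from the last kept frame would plausibly go unnoticed by a viewer. It is a minimal example under stated assumptions and not the published method: the function names, the fixed just-noticeable-difference threshold `jnd` (in luma units), and the `min_ratio` cut-off are all hypothetical choices made for this example. A real psychovisual model would typically also account for effects such as contrast sensitivity and masking.

```python
from typing import List, Sequence

# A frame is a 2-D grid of luma values in the range [0, 255].
Frame = Sequence[Sequence[float]]


def visible_change_ratio(prev: Frame, curr: Frame, jnd: float = 3.0) -> float:
    """Fraction of pixels whose luma change exceeds the assumed JND threshold."""
    total = 0
    visible = 0
    for row_prev, row_curr in zip(prev, curr):
        for p, c in zip(row_prev, row_curr):
            total += 1
            if abs(c - p) > jnd:
                visible += 1
    return visible / total if total else 0.0


def prune_redundant_frames(
    frames: List[Frame], jnd: float = 3.0, min_ratio: float = 0.01
) -> List[int]:
    """Return the indices of frames to keep.

    Each frame is compared against the most recently kept frame, so slow
    drift accumulates until it becomes visible and a new frame is kept.
    """
    if not frames:
        return []
    kept = [0]  # always keep the first frame as the reference
    for i in range(1, len(frames)):
        ratio = visible_change_ratio(frames[kept[-1]], frames[i], jnd)
        if ratio >= min_ratio:
            kept.append(i)
    return kept


if __name__ == "__main__":
    flat = [[100.0] * 4 for _ in range(4)]
    slightly_brighter = [[101.0] * 4 for _ in range(4)]  # below the assumed JND
    changed = [[100.0] * 4 for _ in range(3)] + [[160.0] * 4]  # one row changes
    print(prune_redundant_frames([flat, slightly_brighter, changed, changed]))
    # prints [0, 2]
```

Comparing against the last kept frame, rather than the immediately preceding one, prevents a long run of individually imperceptible changes from being discarded in full.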

Application Context

  • Embedded AI for resource-constrained vision systems.
  • Robotics and autonomous platforms requiring efficient real-time video analysis.
  • Video-based recognition where computational cost and memory demand remain major barriers.
  • Future visual AI systems that process information closer to the way humans perceive visual scenes.

Publication, Code and Data

Publication

This project has been published as “Visual Artificial Intelligence: Unlocking Efficiency with Psychovisual Models”.

Dataset Access

The complete annotated dataset is available on request from amozhaeva@eit.ac.nz.