Special Session @ IJCNN 2025

Overview

Vision Transformers (ViTs) have emerged as a breakthrough architecture in the field of computer vision, achieving state-of-the-art results across a variety of tasks. However, ViTs are data-hungry, often requiring large datasets to achieve optimal performance. This has led to challenges in data-scarce scenarios, where collecting and labelling large amounts of training data is impractical. The "Data-efficient Vision Transformers" special session aims to explore new techniques, challenges, and applications that address the issue of data efficiency in Vision Transformers.

This special session will gather experts from academia, industry, and research institutions to discuss cutting-edge solutions for improving the data efficiency of ViTs, such as novel training setups and architectures, self-supervised learning, transfer learning, and augmentation strategies. The goal is to bridge the gap between the high-performance potential of ViTs and their real-world applicability in scenarios with limited training data.

Objectives

The special session will aim to:

Identify challenges in deploying Vision Transformers (ViTs) with limited training data.
Present recent advances and research findings on data-efficient vision transformer techniques.
Foster collaboration between researchers and industry professionals working on ViTs in data-scarce environments.
Highlight real-world applications of data-efficient ViTs and showcase successful case studies.
Encourage open discussions on potential solutions, future directions, and research opportunities.
Demonstrate the broader relevance of this special session by appealing to audiences beyond the typical scope of IJCNN, including professionals from healthcare, autonomous systems, smart cities, and more.

Scope and Topics

As the field of artificial intelligence continues to advance, the deployment of deep learning models, especially Vision Transformers (ViTs), is expanding into sectors such as healthcare, autonomous vehicles, satellite imaging, and more. However, many of these applications lack sufficient labelled data, which poses a major barrier to achieving high accuracy.

This special session will provide a platform to discuss innovative strategies that mitigate data scarcity challenges in Vision Transformers, including but not limited to:

Data-efficient training of vision transformers – Techniques that reduce the dependence on large datasets for ViTs.
Self-supervised and unsupervised learning – Exploring methods to pre-train vision transformers using unlabelled data.
Few-shot and transfer learning – Approaches for transferring knowledge across different tasks and domains with minimal labelled data.
Model compression and optimisation – Techniques to make vision transformers more lightweight and computationally efficient.
Applications in resource-constrained environments – ViT-based solutions in healthcare, agriculture, smart cities, and autonomous driving with limited data.
Benchmarking and evaluation – Analysing performance metrics and benchmarks for data-efficient ViTs.
Generative models for data augmentation – Using the latest generative models (e.g., Stable Diffusion) to create synthetic data for training.

Target Audience

This special session will attract:

Researchers and practitioners: Those in computer vision and deep learning, particularly working with vision transformers.
AI professionals and data scientists: Individuals interested in developing efficient models for data-scarce environments.
Industry professionals: Stakeholders looking for practical applications of vision transformers in fields like healthcare, autonomous systems, and IoT.
Graduate students and early-career researchers: Those exploring new frontiers in neural network-based computer vision.

Submission Guidelines

Please follow the instructions carefully for submitting your paper to IJCNN 2025:

Double-Blind Reviewing: Ensure anonymity by removing author names and avoiding first-person references to your own prior work. Papers revealing author identities may be rejected.
Submission System: All papers must be submitted through the IEEE IJCNN 2025 Microsoft CMT submission system. For special session papers, please select the title 'Data-efficient Vision Transformers: Challenges & Applications' as the primary topic in the list of research topics in the submission system.
AI-Generated Text Disclosure: If using AI-generated text, disclose it in the acknowledgements and cite the AI system used.
Plagiarism Check: Papers will be checked for plagiarism, and suspected cases may be desk-rejected.
Paper Length: Each paper is limited to 8 pages. Up to 2 additional pages are allowed for a fee of $100 per extra page.

For more details, visit https://2025.ijcnn.org/authors/initial-author-instructions.

Important Dates

30 January 2025: Extended Paper Submission Deadline
15 January 2025 (superseded by the extension above): Original Paper Submission Deadline
31 March 2025: Paper Acceptance Notification
1 May 2025: Camera-Ready Paper Submission Deadline
1 May 2025: Early Registration Deadline
TBA: Special Session: Data-efficient Vision Transformers

Organizers

Dr. Haider Raza

University of Essex, UK

Dr. Raza is a Senior Lecturer with a strong background in AI, deep learning, and computer vision. His research focuses on developing AI solutions for healthcare, autonomous systems, and digital technology. He has published extensively on efficient AI models and has a track record of organising successful special sessions and conferences.

Dr. Muhammad Haris Khan

MBZUAI, UAE

Dr. Khan is an Assistant Professor at MBZUAI with expertise in computer vision, Vision Transformers, and data-efficient learning methods. His research addresses the challenges of model generalizability to new domains, data and label scarcity, and efficiency in AI models.

Prof. John Q Gan

University of Essex, UK

Prof. Gan is a professor of artificial intelligence with extensive experience in deep learning and its applications in image and video classification and in medical image analysis and understanding.

Mohsin Ali

University of Essex, UK

Mohsin Ali is a PhD scholar at the University of Essex, specializing in computer vision with a focus on Vision Transformers (ViTs) and explainable AI. His research aims to enhance the interpretability and efficiency of deep learning models in real-world applications.

Contact Information

For inquiries regarding the special session, feel free to contact:

Dr. Haider Raza (Email: h.raza@essex.ac.uk)
Mohsin Ali (Email: ma22159@essex.ac.uk)

This special session is co-located with IJCNN 2025. More details can be found on the official conference website: https://2025.ijcnn.org/

Institutes

University of Essex

Colchester, United Kingdom

MBZUAI

Abu Dhabi, United Arab Emirates

We believe this special session will offer significant insights and foster discussions on the future of Vision Transformers in data-efficient settings, aligning with the goals of IJCNN 2025. We look forward to contributing to the conference with this engaging and timely topic.
