[Event at CIG] [CFP] VINALDO: 3rd edition of International workshop on Machine vision and NLP for Document Analysis

Tue Mar 31 12:18:11 CEST 2026

*VINALDO: 3rd edition of International workshop on Machine vision and NLP
for Document Analysis*
*Website:*
https://sites.google.com/view/vinaldo3rdeditionofinternation/call-for-papers

*In conjunction with ICDAR 2026 <https://icdar2026.org/> - Aug 30 - Sep 04,
2026 • Vienna, Austria*

Document understanding is essential in areas such as invoice extraction,
medical record analysis, and legal document processing. While many
workshops focus on pure vision-based tasks (OCR, layout analysis) or pure
NLP tasks, VINALDO emphasizes the synergistic integration of computer
vision and natural language processing for structured information
extraction and semantic understanding of documents.

This third edition of VINALDO highlights structured knowledge extraction
from documents using multimodal approaches, with a focus on:

   - Knowledge Graphs (KGs) built from visual and textual cues
   - Integration of Large Language Models (LLMs) with visual document
     understanding
   - Multimodal representation learning for semantic retrieval

Our goal is to move beyond traditional document analysis by exploring how
vision and language jointly enable structured, relational understanding,
particularly in complex documents like invoices, forms, and reports.

Novelty for this edition:

Following the success of VINALDO 2023
<https://sites.google.com/view/vinaldo-workshop-icdar-2023/home> and VINALDO
2024 <https://sites.google.com/view/vinaldo-workshop-icdar-2024/home>, this
third edition of the VINALDO workshop encourages the description of
novel problems or applications for document analysis in the area of
information retrieval that have emerged in recent years. The previous
edition, VINALDO 2024, highlighted a particular topic, namely “Knowledge
Graphs and Multimodal approaches”.

In this new edition, we aim to encourage novel and recent research on
document analysis including, but not limited to, approaches that intersect
with areas such as Large Language Models (LLMs), Knowledge Graphs (KGs),
and Natural Language Processing (NLP). The VINALDO workshop focuses on
the joint exploitation of visual and textual information for document
understanding, while remaining open to a wide range of methods and
perspectives.

In particular, we highlight the growing importance of structured
representations such as Knowledge Graphs extracted from document context,
which are still underexplored despite their relevance across many
application domains. We therefore welcome contributions that explore the
combination of computer vision, NLP, and structured knowledge
representations, as well as works that integrate NLP and vision techniques
in innovative ways.

We also encourage submissions that introduce new datasets, benchmarks, or
real-world applications related to document analysis. Overall, the VINALDO
workshop aims to bring together researchers and practitioners from academia,
industry, and applied research to exchange ideas, share experiences, and
discuss ongoing challenges and advances in document analysis at the
intersection of Computer Vision and NLP.

We invite submissions from researchers and practitioners all over the
world, from both academia and industry, working in the areas of document
and textual analysis. Topics of interest include, but are not limited to,
the following:

   - Multimodal Knowledge Graph Construction from documents
   - Vision-Language Models for Document Understanding
   - Joint Entity and Relation Extraction from visual and textual content
   - Structured Document Understanding with LLMs
   - Multimodal Document Representation Learning
   - Graph-Based Spatial and Semantic Reasoning in documents
   - Integration of Knowledge Graphs and Vision Transformers
   - Multimodal Invoice and Form Analysis
   - Cross-modal Retrieval in Document Collections
   - Benchmarks and Datasets for Multimodal Document Understanding

Note: Topics that are purely vision-based (e.g., OCR, table detection,
handwriting recognition) or purely NLP-based are better suited to other
ICDAR workshops. VINALDO focuses on their intersection.

*Submission via EasyChair: https://easychair.org/cfp/VINALDO2026*

*Important Dates*

Submission Deadline: May 5, 2026, at 11:59 pm AoE

Decisions Announced: May 15, 2026, at 11:59 pm AoE

Camera Ready Deadline: July 1, 2026

Contact:

Rim Hantach <rim.hantach at gmail.com>

Rafika Boutalbi <boutalbi.rafika at gmail.com>

Karima Boutalbi <karima.boutalbi1 at gmail.com>