Introducing Donut: The OCR-Free Document Understanding Transformer Revolutionising Visual Document Understanding

Research Paper Summary

5 min read · Jan 10, 2023

In this blog post, we take a deep dive into the paper "OCR-free Document Understanding Transformer".

Outline

Introduction: OCR-Free Document Understanding Transformer (Donut)
Synthetic Document Generator: Generating Data for Pre-training
Pre-training of Donut Model
Results and Performance
Conclusion

Introduction: OCR-Free Document Understanding Transformer (Donut)

Understanding document images such as invoices is a core but challenging problem. Current Visual Document Understanding (VDU) methods outsource the reading of text to off-the-shelf OCR engines and focus on the understanding task using the OCR output. This can lead to high computational cost, lack of…

Written by Prakhar Mishra