Introducing Donut: The OCR-Free Document Understanding Transformer Revolutionising Visual Document Understanding
Research Paper Summary
In this blog post, we take a deep dive into the paper "OCR-free Document Understanding Transformer".
Outline
Introduction: OCR-Free Document Understanding Transformer (Donut)
Synthetic Document Generator: Generating Data for Pre-training
Pre-training of Donut Model
Results and Performance
Conclusion
Introduction: OCR-Free Document Understanding Transformer (Donut)
Understanding document images such as invoices is a core but challenging task. Current Visual Document Understanding (VDU) methods outsource the reading of text to off-the-shelf OCR engines and then focus on the understanding task using the OCR output. This can lead to high computational cost, lack of…