OCRBench is a comprehensive evaluation benchmark designed to assess the OCR capabilities of Large Multimodal Models. It comprises five components: Text Recognition, Scene Text-Centric VQA, Document-Oriented VQA, Key Information Extraction, and Handwritten Mathematical Expression Recognition. The benchmark includes 1000 question-answer pairs, and every answer has been manually verified and corrected to ensure a more precise evaluation. More details can be found in the OCRBench README.
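
As a rough illustration of how results over the five components listed above might be tallied, here is a minimal Python sketch. It is not the official OCRBench scoring script: the record layout (`component`, `prediction`, `answers`) and the containment-based matching rule are assumptions made for this example only.

```python
from collections import defaultdict

# The five OCRBench components named in the description above.
COMPONENTS = [
    "Text Recognition",
    "Scene Text-Centric VQA",
    "Document-Oriented VQA",
    "Key Information Extraction",
    "Handwritten Mathematical Expression Recognition",
]


def is_correct(prediction, answers):
    """Assumed rule: a prediction counts as correct if it contains any
    reference answer (case-insensitive, surrounding whitespace ignored)."""
    pred = prediction.strip().lower()
    return any(ans.strip().lower() in pred for ans in answers)


def tally(records):
    """Count correct predictions per component.

    Each record is a hypothetical dict with the keys
    'component', 'prediction', and 'answers' (a list of strings).
    """
    scores = defaultdict(int)
    for rec in records:
        if is_correct(rec["prediction"], rec["answers"]):
            scores[rec["component"]] += 1
    # Report every component, including those with zero correct answers.
    return {name: scores[name] for name in COMPONENTS}


# Toy records, invented for illustration (not taken from the benchmark).
records = [
    {"component": "Text Recognition",
     "prediction": "The word is HELLO.", "answers": ["hello"]},
    {"component": "Key Information Extraction",
     "prediction": "Total: 12.50", "answers": ["12.50"]},
    {"component": "Document-Oriented VQA",
     "prediction": "2019", "answers": ["2020"]},
]

per_component = tally(records)
total = sum(per_component.values())
print(per_component)
print("total correct:", total)
```

With one point per correct pair, a full run over the 1000 question-answer pairs would give a total between 0 and 1000 under this assumed rule. For the actual evaluation procedure, refer to the OCRBench README.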

OCRBench v2 is a large-scale bilingual text-centric benchmark. It currently offers the most comprehensive set of tasks (4× more than the previous multi-scene benchmark, OCRBench), the widest coverage of scenarios (31 diverse scenarios, including street scenes, receipts, formulas, and diagrams), and thorough evaluation metrics. It contains 10,000 human-verified question-answer pairs, with a high proportion of difficult samples. More details can be found in the OCRBench v2 README.

News

2024.12.31 🚀 OCRBench v2 is released.
2024.12.11 🚀 OCRBench has been accepted by Science China Information Sciences.
2024.5.19 🚀 We release DTVQA to explore the capabilities of Large Multimodal Models on dense text.
2024.5.01 🚀 Thanks to SWHL for releasing ChineseOCRBench.
2024.3.26 🚀 OCRBench is now supported in lmms-eval.
2024.3.12 🚀 We plan to construct OCRBench v2 to include more OCR tasks and data. Any contribution will be appreciated.
2024.2.25 🚀 OCRBench is now supported in VLMEvalKit.

Other Related Multilingual Datasets

Data	Link	Description
EST-VQA Dataset (CVPR 2020, English and Chinese)	Link	On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering.
Swahili Dataset (ICDAR 2024)	Link	The First Swahili Language Scene Text Detection and Recognition Dataset.
Urdu Dataset (ICDAR 2024)	Link	Dataset and Benchmark for Urdu Natural Scenes Text Detection, Recognition and Visual Question Answering.
MTVQA (9 languages)	Link	MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering.
EVOBC (Oracle Bone Script Evolution Dataset)	Link	We systematically collected ancient characters from authoritative texts and websites spanning six historical stages.
HUST-OBC (Oracle Bone Script Character Dataset)	Link	For deciphering oracle bone script characters.

Citation

If you wish to refer to the baseline results published here, please use the following BibTeX entry:

@article{Liu_2024,
   title={OCRBench: on the hidden mystery of OCR in large multimodal models},
   volume={67},
   ISSN={1869-1919},
   url={http://dx.doi.org/10.1007/s11432-024-4235-6},
   DOI={10.1007/s11432-024-4235-6},
   number={12},
   journal={Science China Information Sciences},
   publisher={Springer Science and Business Media LLC},
   author={Liu, Yuliang and Li, Zhang and Huang, Mingxin and Yang, Biao and Yu, Wenwen and Li, Chunyuan and Yin, Xu-Cheng and Liu, Cheng-Lin and Jin, Lianwen and Bai, Xiang},
   year={2024},
   month=dec }
