From b115d52e67bf5e51852c02187af281394fe115b8 Mon Sep 17 00:00:00 2001
From: qywh2023 <134821122+qywh2023@users.noreply.github.com>
Date: Fri, 20 Jun 2025 21:02:29 +0800
Subject: [PATCH] Update README.md

---
 README.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/README.md b/README.md
index 91c27b8..2a87f67 100644
--- a/README.md
+++ b/README.md
@@ -26,6 +26,12 @@ OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Vis
+
+> **OCRBench: On the Hidden Mystery of OCR in Large Multimodal Models**
+> Yuliang Liu, Zhang Li, Mingxin Huang, Biao Yang, Wenwen Yu, Chunyuan Li, Xucheng Yin, Cheng-lin Liu, Lianwen Jin, Xiang Bai
+[arXiv](https://arxiv.org/abs/2305.07895)
+[GitHub](https://github.com/qywh2023/OCRbench/blob/main/OCRBench/README.md)
+
**OCRBench** is a comprehensive evaluation benchmark designed to assess the OCR capabilities of Large Multimodal Models. It comprises five components: Text Recognition, SceneText-Centric VQA, Document-Oriented VQA, Key Information Extraction, and Handwritten Mathematical Expression Recognition. The benchmark includes 1000 question-answer pairs, and all the answers undergo manual verification and correction to ensure a more precise evaluation. More details can be found in [OCRBench README](./OCRBench/README.md).