# OwlEval

We have compiled examples and their corresponding questions from recent open-source work and organized them into OwlEval. Below we introduce OwlEval and its data format.

## Data Format

### questions

`questions.jsonl` contains the case images and information about their corresponding questions. Each row contains the following fields:

- `image`: the file name of the picture
- `question_id`: the question ID (there are 82 questions in total)
- `question`: the question text
- `type`: whether the question is single-turn or multi-turn

For example:

```json
{"image": "1.jpg", "question_id": 1, "question": "What is funny about this image? Describe it panel by panel.", "type": ["single"]}
```

### answer

This folder contains each model's response to each question, organized into six JSONL files:

- `llava_13b_answer.jsonl`
- `minigpt4_13b_answer.jsonl`
- `MMreact_answer.jsonl`
- `mPLUG_Owl_7b_answer.jsonl`
- `BLIP2_13b_answer.jsonl`
- `openflamingo_answer.jsonl`

Each `answer/xxx.jsonl` contains the following fields:

- `image`: the file name of the picture
- `question_id`: the question ID (there are 82 questions in total)
- `question`: the question text
- `answer`: the reply given by the model
- `model_id`: the ID of the model that generated the answer

For example:

```json
{"image": "10.jpg", "question_id": 15, "question": "How many bedrooms are there in this floor plan?", "answer": "There are three bedrooms in this floor plan.", "model_id": "llava-13b"}
```

### cases

This folder contains the 50 evaluation pictures: 21 from MiniGPT-4, 13 from MM-REACT, 9 from BLIP-2, 3 from GPT-4, and 4 collected by us.
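
### Loading the data

As a minimal sketch (assuming `questions.jsonl` and the `answer/` folder sit in the working directory, as laid out above), the following Python snippet loads the questions and one model's answers, then pairs them by `question_id`:

```python
import json

def load_jsonl(path):
    """Read a JSONL file into a list of dicts (one dict per line)."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]

questions = load_jsonl("questions.jsonl")
# Any of the six answer files listed above works here.
answers = load_jsonl("answer/llava_13b_answer.jsonl")

# Index answers by question_id so each question can be matched to its reply.
answer_by_qid = {a["question_id"]: a for a in answers}

for q in questions:
    a = answer_by_qid.get(q["question_id"])
    if a is None:
        continue  # skip questions this model did not answer
    print(f'[{q["image"]}] ({"/".join(q["type"])})')
    print("Q:", q["question"])
    print("A:", a["answer"], f'({a["model_id"]})')
```

The same loop can be repeated over all six answer files to compare models side by side on each of the 82 questions.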