# OwlEval

We have compiled examples and their corresponding questions from recent open-source work and organized them into OwlEval. Below we introduce OwlEval and its data format.

## Data Format

### questions

`questions.jsonl` contains the case images and information about their corresponding questions. Each row contains the following fields:

- `image`: the file name of the picture
- `question_id`: the question ID (there are 82 questions in total)
- `question`: the question text
- `type`: whether the question is single-turn or multi-turn

For example:

```json
{"image": "1.jpg", "question_id": 1, "question": "What is funny about this image? Describe it panel by panel.", "type": ["single"]}
```

### answer

This folder contains each model's response to each question, organized into six JSONL files:

- `llava_13b_answer.jsonl`
- `minigpt4_13b_answer.jsonl`
- `MMreact_answer.jsonl`
- `mPLUG_Owl_7b_answer.jsonl`
- `BLIP2_13b_answer.jsonl`
- `openflamingo_answer.jsonl`

Each `answer/xxx.jsonl` contains the following fields:

- `image`: the file name of the picture
- `question_id`: the question ID (there are 82 questions in total)
- `question`: the question text
- `answer`: the reply given by the model
- `model_id`: the ID of the model that generated the answer

For example:

```json
{"image": "10.jpg", "question_id": 15, "question": "How many bedrooms are there in this floor plan?", "answer": "There are three bedrooms in this floor plan.", "model_id": "llava-13b"}
```

### cases

This folder contains the 50 evaluation pictures: 21 from MiniGPT-4, 13 from MM-REACT, 9 from BLIP-2, 3 from GPT-4, and 4 collected by us.
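
### Loading the data

As a minimal sketch (assuming `questions.jsonl` and the `answer/` folder sit in the working directory, as laid out above), the following Python snippet loads the questions and one model's answers, then pairs them by `question_id`:

```python
import json

def load_jsonl(path):
    """Read a JSONL file into a list of dicts (one dict per line)."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]

questions = load_jsonl("questions.jsonl")
# Any of the six answer files listed above works here.
answers = load_jsonl("answer/llava_13b_answer.jsonl")

# Index answers by question_id so each question can be matched to its reply.
answer_by_qid = {a["question_id"]: a for a in answers}

for q in questions:
    a = answer_by_qid.get(q["question_id"])
    if a is None:
        continue  # skip questions this model did not answer
    print(f'[{q["image"]}] ({"/".join(q["type"])})')
    print("Q:", q["question"])
    print("A:", a["answer"], f'({a["model_id"]})')
```

The same loop can be repeated over all six answer files to compare models side by side on each of the 82 questions.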