add
@@ -0,0 +1,18 @@
A harbor filled with lots of boats next to a building.
A bicycle parked in front of several boats at a dock.
A red bicycle in front of a line of docked white yachts
A bike sits before boats which sit before a long building.
A bicycle is a convenient means of land transportation when you live on a boat.

bicycle: [0.287, 0.641, 0.507, 0.874]
bicycle: [0.566, 0.667, 0.63, 0.731]
boat: [0.318, 0.579, 0.575, 0.724]
boat: [0.704, 0.607, 0.818, 0.727]
boat: [0.818, 0.601, 0.942, 0.744]
boat: [0.002, 0.53, 0.243, 0.71]
boat: [0.541, 0.611, 0.668, 0.731]
person: [0.778, 0.527, 0.797, 0.57]
cup: [0.708, 0.733, 0.724, 0.758]
boat: [0.236, 0.532, 0.404, 0.64]
boat: [0.81, 0.632, 0.836, 0.676]
boat: [0.957, 0.526, 1.0, 0.752]
@@ -0,0 +1,3 @@
It is a harbor filled with numerous boats of various sizes docked next to a long building. Among the boats, there are a few white yachts lined up, standing out from the rest. There is a red bicycle prominently parked in front of the line of docked boats, serving as a convenient means of land transportation for those living on the boats. Another bicycle can be seen further back in the scene, near the middle of the harbor.

A person is visible near the right side of the harbor, possibly enjoying the view or attending to their boat. Additionally, there is a cup placed on a surface near the middle of the scene.
@@ -0,0 +1,18 @@
A group of people standing outside of a black vehicle with various luggage.
Luggage surrounds a vehicle in an underground parking area
People try to fit all of their luggage in an SUV.
The sport utility vehicle is parked in the public garage, being packed for a trip
Some people with luggage near a van that is transporting it.

person: [0.681, 0.242, 0.774, 0.694]
person: [0.63, 0.222, 0.686, 0.516]
person: [0.444, 0.233, 0.487, 0.34]
backpack: [0.384, 0.696, 0.485, 0.914]
backpack: [0.755, 0.413, 0.846, 0.692]
suitcase: [0.758, 0.413, 0.845, 0.69]
suitcase: [0.1, 0.497, 0.173, 0.579]
bicycle: [0.282, 0.363, 0.327, 0.442]
car: [0.786, 0.25, 0.848, 0.322]
car: [0.783, 0.27, 0.827, 0.335]
car: [0.86, 0.254, 0.891, 0.3]
car: [0.261, 0.101, 0.787, 0.626]
@@ -0,0 +1,5 @@
The image is an underground parking area with a black sport utility vehicle (SUV) parked. There are three people in the scene, with one person standing closer to the left side of the vehicle, another person in the middle, and the third person on the right side. They are all working together to pack their luggage into the SUV for a trip.

In the parking area, various luggage items are scattered around the vehicle. There are two backpacks, one located near the left rear wheel and the other closer to the right side of the vehicle. Additionally, there are two suitcases, one on the right side of the car and another further away near the center of the parking area. A bicycle can also be seen on the left side of the vehicle.

Other cars are parked around the main SUV, with one car positioned behind it and slightly to the left, another behind and slightly to the right, and the third car further behind on the right side.
@@ -0,0 +1,15 @@
A man holds a Wii-mote above his head while another looks on.
A guy and his friend are playing Nintendo Wii.
A young man is holding a video game remote over his head.
two men standing in a room while one plays with a wii mote
Some guys standing and playing a video game.

couch: [0.697, 0.759, 0.995, 1.0]
dining table: [0.426, 0.755, 1.0, 0.987]
person: [0.082, 0.252, 0.342, 1.0]
person: [0.399, 0.085, 0.742, 0.982]
remote: [0.477, 0.135, 0.516, 0.187]
sink: [0.016, 0.501, 0.063, 0.52]
potted plant: [0.798, 0.384, 0.888, 0.645]
refrigerator: [0.305, 0.389, 0.414, 0.547]
chair: [0.72, 0.509, 0.858, 0.725]
@@ -0,0 +1,3 @@
The image shows two men standing in a room, engaged in playing a video game on a Nintendo Wii console. One of the men is holding a Wii remote above his head with enthusiasm, while the other man looks on, likely enjoying the friendly competition.

The room appears to be a living space with a couch located in the background and a dining table nearby. A potted plant can be seen placed close to the couch, and a chair is situated in the middle of the room. The room also features a kitchen area with a sink and a refrigerator visible in the background.
@@ -0,0 +1,7 @@
You are an AI visual assistant that can analyze a single image. You receive five sentences, each describing the same image you are observing. In addition, specific object locations within the image are given, along with detailed coordinates. These coordinates are in the form of bounding boxes, represented as (x1, y1, x2, y2) with floating numbers ranging from 0 to 1. These values correspond to the top left x, top left y, bottom right x, and bottom right y.

Using the provided caption and bounding box information, describe the scene in a detailed manner.

Instead of directly mentioning the bounding box coordinates, utilize this data to explain the scene using natural language. Include details like object counts, position of the objects, relative position between the objects.

When using the information from the caption and coordinates, directly explain the scene, and do not mention that the information source is the caption or the bounding box. Always answer as if you are directly looking at the image.
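
For readers who want to see how the `label: [x1, y1, x2, y2]` lines in the files above relate to the "object counts" and "position of the objects" the prompt asks for, here is a minimal Python sketch. It is not part of this commit. The function names (`parse_box_line`, `coarse_position`, `summarize`) and the split of the image into thirds are assumptions made for illustration only; the coordinate convention (normalized to 0..1, top-left origin) is taken from the prompt text.

```python
import re
from collections import Counter

# Matches lines such as "potted plant: [0.798, 0.384, 0.888, 0.645]".
BOX_LINE = re.compile(r"^\s*([^:\[\]]+?)\s*:\s*\[([^\]]*)\]\s*$")


def parse_box_line(line):
    """Parse one 'label: [x1, y1, x2, y2]' line into (label, box)."""
    match = BOX_LINE.match(line)
    if match is None:
        raise ValueError(f"not a bounding-box line: {line!r}")
    label = match.group(1)
    coords = [float(value) for value in match.group(2).split(",")]
    if len(coords) != 4:
        raise ValueError(f"expected 4 coordinates, got {len(coords)}")
    x1, y1, x2, y2 = coords
    # Coordinates are normalized; (x1, y1) is top left, (x2, y2) bottom right.
    if not (0.0 <= x1 <= x2 <= 1.0 and 0.0 <= y1 <= y2 <= 1.0):
        raise ValueError(f"coordinates out of range or out of order: {coords}")
    return label, (x1, y1, x2, y2)


def coarse_position(box):
    """Name the region containing the box center (thirds are an assumption)."""
    x1, y1, x2, y2 = box
    cx = (x1 + x2) / 2
    cy = (y1 + y2) / 2
    horizontal = "left" if cx < 1 / 3 else "right" if cx > 2 / 3 else "center"
    # y grows downward because the origin is the top-left corner.
    vertical = "top" if cy < 1 / 3 else "bottom" if cy > 2 / 3 else "middle"
    return f"{vertical} {horizontal}"


def summarize(lines):
    """Return per-label counts and a (label, position) pair for every box."""
    boxes = [parse_box_line(line) for line in lines if line.strip()]
    counts = Counter(label for label, _ in boxes)
    positions = [(label, coarse_position(box)) for label, box in boxes]
    return counts, positions


if __name__ == "__main__":
    sample = [
        "couch: [0.697, 0.759, 0.995, 1.0]",
        "person: [0.082, 0.252, 0.342, 1.0]",
        "person: [0.399, 0.085, 0.742, 0.982]",
        "remote: [0.477, 0.135, 0.516, 0.187]",
        "potted plant: [0.798, 0.384, 0.888, 0.645]",
    ]
    counts, positions = summarize(sample)
    print(dict(counts))
    for label, position in positions:
        print(f"{label}: {position}")
```

A summary like this only illustrates the kind of information the boxes carry. In the files above, the scene descriptions are free-form text that combines the boxes with the five captions, so a fixed grid of regions is at best a rough approximation.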