UniVG-R1 Demo

Use our provided examples or upload your own local images for universal visual grounding.

Examples

Reasoning

Instruction: Locate the one appropriate object in Image-2 that can rotate the object in Image-1. Find it and locate it in the second image.

Correspondence

Instruction: You are now presented with two objects. For the area marked by the red bounding box in the first image, identify and locate the corresponding area in the second image that serves a similar function or shares a similar meaning.

Difference

Instruction: Compare these two images carefully and give me the coordinates of their real difference in the second image. Find it and locate it in the second image.

Refer Grounding

Instruction: Find where the object in Image-1 appears in Image-2 and locate it.

Group Grounding

Instruction: Please find the bounding box coordinates for the area described by: <|object_ref_start|>a white truck with a crane on top<|object_ref_end|>.
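The instruction above wraps the free-text description in `<|object_ref_start|>` and `<|object_ref_end|>` tokens. A minimal sketch of building such a prompt for your own description follows; the helper name `grounding_instruction` is hypothetical and not part of the demo, and the template is simply copied from the example above.

```python
def grounding_instruction(description: str) -> str:
    """Wrap a description in the object-reference tokens shown in the example.

    Hypothetical helper: the wording mirrors the example instruction above.
    """
    description = description.strip()
    return (
        "Please find the bounding box coordinates for the area described by: "
        f"<|object_ref_start|>{description}<|object_ref_end|>."
    )


if __name__ == "__main__":
    print(grounding_instruction("a white truck with a crane on top"))
```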

Region Locating

Instruction: You are given a source image followed by several of its regions. Please locate the 1st region picture in the source image.

Multi View

Instruction: These images share one object in common (the object marked with a red bounding box in the first image: <|box_start|>(439,57),(689,999)<|box_end|>). Recognize and locate this object in the 2nd image.
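The box in this instruction is written as `<|box_start|>(x1,y1),(x2,y2)<|box_end|>`. The sketch below shows one way to extract such boxes from a string and rescale them to pixels. The function names are hypothetical, and the 0 to 1000 coordinate scale is an assumption based on the values in the example, not something this page states; adjust `scale` if the model uses a different convention.

```python
import re

# Matches <|box_start|>(x1,y1),(x2,y2)<|box_end|> with integer coordinates.
BOX_PATTERN = re.compile(
    r"<\|box_start\|>\((\d+),(\d+)\),\((\d+),(\d+)\)<\|box_end\|>"
)


def parse_boxes(text):
    """Return every box in `text` as an (x1, y1, x2, y2) tuple of ints."""
    return [tuple(int(v) for v in m.groups()) for m in BOX_PATTERN.finditer(text)]


def to_pixels(box, width, height, scale=1000):
    """Rescale a box to pixel coordinates.

    Assumes coordinates are normalized to the range 0..scale (an assumption).
    """
    x1, y1, x2, y2 = box
    return (
        x1 * width / scale,
        y1 * height / scale,
        x2 * width / scale,
        y2 * height / scale,
    )


if __name__ == "__main__":
    prompt = "first image: <|box_start|>(439,57),(689,999)<|box_end|>"
    for box in parse_boxes(prompt):
        print(box, to_pixels(box, width=2000, height=1000))
```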

Common Object

Instruction: These images share one object in common. Recognize and locate this object in the 2nd image.