# Grounded-SAM-2
Grounded SAM 2: Ground and Track Anything with Grounding DINO, Grounding DINO 1.5 and SAM 2


## Contents
- [Installation](#installation)
- [Grounded-SAM-2 Demo](#grounded-sam-2-demo)
  - [Grounded-SAM-2 Image Demo (with Grounding DINO)](#grounded-sam-2-image-demo-with-grounding-dino)
  - [Grounded-SAM-2 Image Demo (with Grounding DINO 1.5 & 1.6)](#grounded-sam-2-image-demo-with-grounding-dino-15--16)

## Installation

Grounding DINO uses a `Deformable Attention` operator that must be compiled with CUDA, so first check that your CUDA environment variables are set correctly (see [Grounding DINO Installation](https://github.com/IDEA-Research/GroundingDINO?tab=readme-ov-file#hammer_and_wrench-install) for details). If you are building a local GPU environment for Grounding DINO to run Grounded SAM 2, you can set the environment variable manually:

```bash
export CUDA_HOME=/path/to/cuda-12.1/
```
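
Before compiling, it can help to confirm that the path actually points at a CUDA toolkit. The following is a minimal sketch, not part of this repository: `check_cuda_home` is a hypothetical helper, and it assumes the standard toolkit layout in which the `nvcc` compiler lives at `$CUDA_HOME/bin/nvcc`.

```bash
# Hypothetical helper: report whether a CUDA toolkit directory looks usable.
# Uses the first argument if given, otherwise the CUDA_HOME environment variable.
check_cuda_home() {
    local cuda_home="${1:-${CUDA_HOME:-}}"
    if [ -z "$cuda_home" ]; then
        echo "CUDA_HOME is not set"
        return 1
    fi
    # Assumption: the toolkit ships its compiler at <CUDA_HOME>/bin/nvcc.
    if [ ! -x "$cuda_home/bin/nvcc" ]; then
        echo "nvcc not found under $cuda_home/bin"
        return 1
    fi
    echo "CUDA toolkit found at $cuda_home"
    return 0
}

check_cuda_home || echo "Fix CUDA_HOME before installing Grounding DINO"
```

If the check fails, adjust the `export` line above to match the CUDA version installed on your machine.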

Install `Segment Anything 2`:

```bash
pip install -e .
```

Install `Grounding DINO`:

```bash
pip install --no-build-isolation -e grounding_dino
```

Pin the `supervision` library to version `0.6.0`, because our visualization code relies on its original API (we will update the code to be compatible with the latest version of `supervision` in a future release):

```bash
pip install supervision==0.6.0
```
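
To confirm that the pin took effect, a quick check along these lines may be useful. This is only a sketch: `version_matches` is a hypothetical helper, not part of this repository, and the snippet assumes a `python` interpreter with `importlib.metadata` (Python 3.8 or newer) is on your `PATH`.

```bash
# Hypothetical helper: succeed only when the two version strings are identical.
# $1 = expected version, $2 = installed version
version_matches() {
    [ "$1" = "$2" ]
}

# Read the installed version of supervision; fall back to "none" if it cannot be read.
installed=$(python -c "from importlib.metadata import version; print(version('supervision'))" 2>/dev/null || echo "none")

if version_matches "0.6.0" "$installed"; then
    echo "supervision $installed is installed as expected"
else
    echo "supervision is $installed, expected 0.6.0"
fi
```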

Download the pretrained `SAM 2` checkpoints:

```bash
cd checkpoints
bash download_ckpts.sh
```

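Download the pretrained `Grounding DINO` checkpoints (if you are still inside `checkpoints` from the previous step, return to the repository root first):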

```bash
cd gdino_checkpoints
wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha2/groundingdino_swinb_cogcoor.pth
```
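
An interrupted download can leave a missing or empty file that only shows up later as a model-loading error. The sketch below checks for the two Grounding DINO files named above. `missing_checkpoints` is a hypothetical helper, not part of this repository, and the call assumes you run it from the repository root.

```bash
# Hypothetical helper: report expected checkpoint files that are missing or empty.
# Usage: missing_checkpoints <directory> <file>...
missing_checkpoints() {
    local dir="$1"
    shift
    local rc=0
    local name
    for name in "$@"; do
        # -s is true only for files that exist and are non-empty
        if [ ! -s "$dir/$name" ]; then
            echo "missing: $dir/$name"
            rc=1
        fi
    done
    return "$rc"
}

# Assumption: the current directory is the repository root.
missing_checkpoints gdino_checkpoints \
    groundingdino_swint_ogc.pth \
    groundingdino_swinb_cogcoor.pth \
    || echo "Some Grounding DINO checkpoints are missing; re-run the wget commands"
```

The same helper can be pointed at the `checkpoints` directory once you know which `SAM 2` files `download_ckpts.sh` fetched.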

## Grounded-SAM-2 Demo
### Grounded-SAM-2 Image Demo (with Grounding DINO)
`Grounding DINO` is already supported in [Hugging Face](https://huggingface.co/IDEA-Research/grounding-dino-tiny), so we provide two ways to run the `Grounded-SAM-2` model:
- Use the Hugging Face API to run Grounding DINO inference (simple and clear)

```bash
python grounded_sam2_hf_model_demo.py
```

- Load a local pretrained Grounding DINO checkpoint and run inference with the original Grounding DINO API (make sure you have already downloaded the pretrained checkpoint)

```bash
python grounded_sam2_local_demo.py
```

### Grounded-SAM-2 Image Demo (with Grounding DINO 1.5 & 1.6)

We have released our most capable open-set detection models, [Grounding DINO 1.5 & 1.6](https://github.com/IDEA-Research/Grounding-DINO-1.5-API), which can be combined with SAM 2 for stronger open-set detection and segmentation. Request an API token first, then run Grounded-SAM-2 with Grounding DINO 1.5 as follows:

Install the latest DDS Cloud API SDK:

```bash
pip install dds-cloudapi-sdk
```

Request your API token from our official website: [request API token](https://deepdataspace.com/request_api). Then run the demo:

```bash
python grounded_sam2_gd1.5_demo.py
```