# SAM 2 Demo
Welcome to the SAM 2 Demo! This project consists of a frontend built with React, TypeScript, and Vite, and a backend service using Python, Flask, and Strawberry GraphQL. Both components can run in Docker containers or locally on MPS (Metal Performance Shaders) or CPU. However, running the backend service on MPS or CPU may result in significantly lower performance (FPS).
## Prerequisites
Before you begin, ensure you have the following installed on your system:
- Docker and Docker Compose
- [OPTIONAL] Node.js and Yarn for running the frontend locally
- [OPTIONAL] Anaconda for running the backend locally
### Installing Docker
To install Docker, follow these steps:
1. Go to the [Docker website](https://www.docker.com/get-started)
2. Follow the installation instructions for your operating system.
### [OPTIONAL] Installing Node.js and Yarn
To install Node.js and Yarn, follow these steps:
1. Go to the [Node.js website](https://nodejs.org/en/download/).
2. Follow the installation instructions for your operating system.
3. Once Node.js is installed, open a terminal or command prompt and run the following command to install Yarn:
```bash
npm install -g yarn
```
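After installing, a quick check that both tools are on your `PATH` (a plain-shell sketch using `command -v`, nothing project-specific):

```bash
# Report whether node and yarn are installed and where they live.
for tool in node yarn; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: $(command -v "$tool")"
  else
    echo "$tool: not found"
  fi
done
```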
### [OPTIONAL] Installing Anaconda
To install Anaconda, follow these steps:
1. Go to the [Anaconda website](https://www.anaconda.com/products/distribution).
2. Follow the installation instructions for your operating system.
## Quick Start
To get both the frontend and backend running quickly using Docker, you can use the following command:
```bash
docker compose up --build
```
> [!WARNING]
> On macOS, Docker containers only support running on CPU. MPS is not supported through Docker. If you want to run the demo backend service on MPS, you will need to run it locally (see "Running the Backend Locally" below).

This will build and start both services. You can access them at:
- **Frontend:** [http://localhost:7262](http://localhost:7262)
- **Backend:** [http://localhost:7263/graphql](http://localhost:7263/graphql)
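Once the containers are up, you can check whether anything is listening on the demo ports. This sketch uses bash's `/dev/tcp` redirection (bash-specific; in other shells each probe simply falls through to "free"):

```bash
# Probe each demo port on localhost; "in use" means something is listening.
for port in 7262 7263; do
  if (exec 3<>"/dev/tcp/127.0.0.1/$port") 2>/dev/null; then
    echo "port $port: in use"
  else
    echo "port $port: free"
  fi
done
```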
## Running Backend with MPS Support
MPS (Metal Performance Shaders) is not supported with Docker. To use MPS, you need to run the backend on your local machine.
### Setting Up Your Environment
1. **Create Conda environment**
Create a new Conda environment for this project by running the following command, or reuse an existing conda environment you have set up for SAM 2:
```bash
conda create --name sam2-demo python=3.10 --yes
```
This will create a new environment named `sam2-demo` with Python 3.10 as the interpreter.
2. **Activate the Conda environment:**
```bash
conda activate sam2-demo
```
3. **Install ffmpeg**
```bash
conda install -c conda-forge ffmpeg
```
4. **Install SAM 2 demo dependencies:**
Install project dependencies by running the following command in the SAM 2 checkout root directory:
```bash
pip install -e '.[interactive-demo]'
```
### Running the Backend Locally
Download the SAM 2 checkpoints:
```bash
(cd ./checkpoints && ./download_ckpts.sh)
```
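To confirm the download produced model weights, a small hedged check (the exact checkpoint filenames are determined by `download_ckpts.sh` and may change between releases; the `.pt` extension is assumed here):

```bash
# Look for any PyTorch checkpoint files in ./checkpoints.
if ls ./checkpoints/*.pt >/dev/null 2>&1; then
  echo "checkpoints present"
else
  echo "no checkpoints found in ./checkpoints"
fi
```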
First navigate to the backend server directory:
```bash
cd demo/backend/server/
```
Then start the backend with MPS support:
```bash
PYTORCH_ENABLE_MPS_FALLBACK=1 \
APP_ROOT="$(pwd)/../../../" \
API_URL=http://localhost:7263 \
MODEL_SIZE=base_plus \
DATA_PATH="$(pwd)/../../data" \
DEFAULT_VIDEO_PATH=gallery/05_default_juggle.mp4 \
gunicorn \
--worker-class gthread app:app \
--workers 1 \
--threads 2 \
--bind 0.0.0.0:7263 \
--timeout 60
```
Options for the `MODEL_SIZE` environment variable are "tiny", "small", "base_plus" (default), and "large".
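A small guard like the following (a sketch, not part of the demo scripts) can catch a mistyped `MODEL_SIZE` before you launch the server:

```bash
# Validate MODEL_SIZE against the supported values, defaulting to base_plus.
MODEL_SIZE="${MODEL_SIZE:-base_plus}"
case "$MODEL_SIZE" in
  tiny|small|base_plus|large)
    echo "Using MODEL_SIZE=$MODEL_SIZE"
    ;;
  *)
    echo "Invalid MODEL_SIZE: $MODEL_SIZE (expected tiny, small, base_plus, or large)" >&2
    exit 1
    ;;
esac
```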
> [!WARNING]
> Running the backend service on MPS devices can cause fatal crashes with the Gunicorn worker due to insufficient MPS memory. Try switching to CPU devices by setting the `SAM2_DEMO_FORCE_CPU_DEVICE=1` environment variable.
### Starting the Frontend
If you wish to run the frontend separately (useful for development), follow these steps:
1. **Navigate to demo frontend directory:**
```bash
cd demo/frontend
```
2. **Install dependencies:**
```bash
yarn install
```
3. **Start the development server:**
```bash
yarn dev --port 7262
```
This will start the frontend development server on [http://localhost:7262](http://localhost:7262).
## Docker Tips
- To rebuild the Docker containers (useful if you've made changes to the Dockerfile or dependencies):
```bash
docker compose up --build
```
- To stop the Docker containers:
```bash
docker compose down
```
## Contributing
Contributions are welcome! Please read our contributing guidelines to get started.
## License
See the LICENSE file for details.
---
By following these instructions, you should have a fully functional development environment for both the frontend and backend of the SAM 2 Demo. Happy coding!