Grounded-SAM-2

Author	SHA1	Message	Date
Ronghang Hu	393ae336a7	SAM 2 Update 12/11/2024 -- full model compilation for a major VOS speedup and a new SAM2VideoPredictor to better handle multi-object tracking (#486) This PR provides new features and updates for SAM 2: - We now support `torch.compile` of the entire SAM 2 model on videos, which can be turned on by setting `vos_optimized=True` in `build_sam2_video_predictor` (it uses the new `SAM2VideoPredictorVOS` predictor class in `sam2/sam2_video_predictor.py`). * Compared to the previous setting (which only compiles the image encoder backbone), the new full model compilation gives a major speedup in inference FPS. * In the VOS prediction script `tools/vos_inference.py`, you can specify this option via the `--use_vos_optimized_video_predictor` flag. * Note that turning on this flag might introduce a small variance in the predictions due to numerical differences caused by `torch.compile` of the full model. * PyTorch 2.5.1 is the minimum version for full support of this feature. (Earlier PyTorch versions might run into compilation errors in some cases.) Therefore, we have updated the minimum PyTorch version to 2.5.1 accordingly in the installation scripts. - We also update the implementation of the `SAM2VideoPredictor` class for the SAM 2 video prediction in `sam2/sam2_video_predictor.py`, which allows for independent per-object inference. Specifically, in the new `SAM2VideoPredictor`: * Now we handle the inference of each object independently (as if we are opening a separate session for each object) while sharing their backbone features. * This change allows us to relax the assumption of prompting for multi-object tracking. Previously (due to the batching behavior in inference), if a video frame receives clicks for only a subset of objects, the rest of the (non-prompted) objects are assumed to be non-existent in this frame (i.e., in such frames, the user is telling SAM 2 that the rest of the objects don't appear). Now, if a frame receives clicks for only a subset of objects, we do not make any assumptions about the remaining (non-prompted) objects (i.e., now each object is handled independently and is not affected by how other objects are prompted). As a result, this change allows adding new objects after tracking starts (which was previously a restriction on usage). * We believe that the new version is a more natural inference behavior and therefore switched to it as the default behavior. The previous implementation of `SAM2VideoPredictor` is backed up in `sam2/sam2_video_predictor_legacy.py`. All the VOS inference results using `tools/vos_inference.py` should remain the same after this change to the `SAM2VideoPredictor` class.	2024-12-11 15:00:55 -08:00
Ronghang Hu	29267c8e39	[doc] Check and raise an error if the user is running Python from the parent directory of the sam2 repo (#359) If the user has "sam2/sam2" in their path, they are likely importing the repo itself as "sam2" rather than importing the "sam2" Python package (i.e. "sam2/sam2" directory). This typically happens because the user is running Python from the parent directory that contains the sam2 repo they cloned. In general, the user should not run Python from the parent dir that the repo is cloned into (the same is true for e.g. the NumPy repo, which contains names like `numpy/numpy` where the module and the repo have the same name), as the user encountered in https://github.com/facebookresearch/sam2/issues/346. (close https://github.com/facebookresearch/sam2/issues/346)	2024-10-05 00:34:06 -07:00
Haitham Khedr	52198ead0e	Merge pull request #2 from kit1980/patch-1 Use `weights_only` for loading	2024-10-01 22:32:58 +02:00
Haitham Khedr	aa9b8722d0	SAM2.1 SAM2.1 checkpoints + training code + Demo	2024-09-29 05:49:56 +00:00
Sergii Dymchenko	0f6515ae85	Merge branch 'main' into patch-1	2024-08-26 15:49:40 -07:00
Haitham Khedr	1191677e1e	Fix HF image predictor	2024-08-12 23:41:41 +00:00
Niels	8f15c6255a	Format using ufmt	2024-08-06 22:43:35 +02:00
Niels	e815f70a38	Address comment	2024-08-06 08:32:36 +02:00
Niels	6aeee34775	Make huggingface_hub soft dependency	2024-08-05 09:37:53 +02:00
Niels	0c28c630c2	Do not load config from the hub	2024-08-03 14:45:20 +02:00
Niels	3af4e82263	Add model_id_to_filenames	2024-08-03 14:18:23 +02:00
Niels	b72a8a97f0	First draft	2024-08-03 12:57:05 +02:00
Sergii Dymchenko	658aaba327	Use `weights_only` for loading sam2/build_sam.py:81:14: TOR102 [*] `torch.load` without `weights_only` parameter is unsafe. Explicitly set `weights_only` to False only if you trust the data you load and full pickle functionality is needed, otherwise set `weights_only=True`. Found with https://github.com/pytorch-labs/torchfix/	2024-07-29 16:54:54 -07:00
Haitham Khedr	0c5f8c5432	Initial commit	2024-07-29 21:54:20 +00:00
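
The prompting change described in commit 393ae336a7 can be illustrated with a toy model. The sketch below does not use or reproduce any SAM 2 code; the function name `frame_assumptions` and the status strings are hypothetical, chosen only to contrast how a frame with clicks for a subset of objects treats the remaining objects under the legacy (batched) and the new (per-object) behavior, as the commit message describes them.

```python
from typing import Dict, Set


def frame_assumptions(
    all_objects: Set[int], prompted: Set[int], independent: bool = True
) -> Dict[int, str]:
    """Toy model of how one prompted frame treats each tracked object.

    independent=False mimics the legacy behavior described in the commit:
    objects without clicks on this frame are assumed not to appear.
    independent=True mimics the new behavior: nothing is assumed about
    objects that received no clicks on this frame.
    """
    status = {}
    for obj_id in sorted(all_objects):
        if obj_id in prompted:
            status[obj_id] = "prompted"
        elif independent:
            status[obj_id] = "no assumption"
        else:
            status[obj_id] = "assumed absent"
    return status


# Three tracked objects, but only object 1 receives clicks on this frame.
legacy = frame_assumptions({1, 2, 3}, {1}, independent=False)
current = frame_assumptions({1, 2, 3}, {1}, independent=True)
print(legacy)
print(current)
```

In this toy model, only the legacy mode marks objects 2 and 3 as absent. That difference is why, per the commit message, new objects can be added after tracking has started.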

14 Commits
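
The check described in commit 29267c8e39 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repository's exact code: the helper names `looks_like_repo_checkout` and `ensure_package_not_shadowed` and the error text are hypothetical. The idea is that if the directory Python resolved as `sam2` itself contains a nested `sam2` directory (i.e. "sam2/sam2" exists), the repo checkout was most likely imported instead of the package.

```python
import os


def looks_like_repo_checkout(resolved_dir: str) -> bool:
    """Return True if the directory resolved as `sam2` contains a nested
    `sam2` directory, which suggests the repo itself was imported."""
    return os.path.isdir(os.path.join(resolved_dir, "sam2"))


def ensure_package_not_shadowed(resolved_dir: str) -> None:
    """Raise an error when the import appears to point at the repo checkout
    rather than the `sam2` Python package (hypothetical helper)."""
    if looks_like_repo_checkout(resolved_dir):
        raise RuntimeError(
            "It looks like Python is running from the parent directory of "
            "the sam2 repo, so the repo is shadowing the sam2 package. "
            "Run Python from another directory instead."
        )
```

In practice the argument would be the first entry of the imported package's `__path__`. A caller could run the check once at import time so the problem surfaces as a clear error rather than a confusing attribute lookup failure later.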