Commit Graph

25 Commits

Author SHA1 Message Date
Ronghang Hu
393ae336a7 SAM 2 Update 12/11/2024 -- full model compilation for a major VOS speedup and a new SAM2VideoPredictor to better handle multi-object tracking (#486)
This PR provides new features and updates for SAM 2:

- We now support `torch.compile` of the entire SAM 2 model on videos, which can be turned on by setting `vos_optimized=True` in `build_sam2_video_predictor` (it uses the new `SAM2VideoPredictorVOS` predictor class in `sam2/sam2_video_predictor.py`).
  * Compared to the previous setting (which only compiles the image encoder backbone), the new full model compilation gives a major speedup in inference FPS.
  * In the VOS prediction script `tools/vos_inference.py`, you can specify this option in `tools/vos_inference.py` via the `--use_vos_optimized_video_predictor` flag.
  * Note that turning on this flag might introduce a small variance in the predictions due to numerical differences caused by `torch.compile` of the full model.
  * **PyTorch 2.5.1 is the minimum version for full support of this feature**. (Earlier PyTorch versions might run into compilation errors in some cases.) Therefore, we have updated the minimum PyTorch version to 2.5.1 accordingly in the installation scripts.
- We also update the implementation of the `SAM2VideoPredictor` class for the SAM 2 video prediction in `sam2/sam2_video_predictor.py`, which allows for independent per-object inference. Specifically, in the new `SAM2VideoPredictor`:
  * Now **we handle the inference of each object independently** (as if we are opening a separate session for each object) while sharing their backbone features.
  * This change allows us to relax the assumption of prompting for multi-object tracking. Previously (due to the batching behavior in inference), if a video frame receives clicks for only a subset of objects, the rest of the (non-prompted) objects are assumed to be non-existent in this frame (i.e., in such frames, the user is telling SAM 2 that the rest of the objects don't appear). Now, if a frame receives clicks for only a subset of objects, we do not make any assumptions about the remaining (non-prompted) objects (i.e., now each object is handled independently and is not affected by how other objects are prompted). As a result, **we allow adding new objects after tracking starts** after this change (which was previously a restriction on usage).
  * We believe that the new version is a more natural inference behavior and therefore switched to it as the default behavior. The previous implementation of `SAM2VideoPredictor` is backed up to in `sam2/sam2_video_predictor_legacy.py`. All the VOS inference results using `tools/vos_inference.py` should remain the same after this change to the `SAM2VideoPredictor` class.
2024-12-11 15:00:55 -08:00
Ronghang Hu
29267c8e39 [doc] Check and raise an error if the user is running Python from the parent directory of the sam2 repo (#359)
If the user has "sam2/sam2" in their path, they are likey importing the repo itself as "sam2" rather than importing the "sam2" python package (i.e. "sam2/sam2" directory). This typically happens because the user is running Python from the parent directory that contains the sam2 repo they cloned.

In general, the user should not run Python from the parent dir when the repo is cloned into (same is true for e.g. Numpy repo that contains names like `numpy/numpy` where the module and the repo have the same name), as the user encountered in https://github.com/facebookresearch/sam2/issues/346.

(close https://github.com/facebookresearch/sam2/issues/346)
2024-10-05 00:34:06 -07:00
Ronghang Hu
98fcb164bf Update links after renaming the repo from segment-anything-2 to sam2 (#341)
This PR update repo links after we renamed the repo from `segment-anything-2` to `sam2`. It also changes `NAME` in setup.py to `SAM-2` (which is already the named used in pip setup since python packages don't allow whitespace)
2024-09-30 20:27:44 -07:00
Ronghang Hu
05d9e57fb3 [docs] add a release note and new installation instructions for SAM 2.1 (#338) 2024-09-30 09:55:58 -07:00
Ronghang Hu
429a2c7360 minor update README.md 2024-09-28 23:32:25 -07:00
Haitham Khedr
aa9b8722d0 SAM2.1
SAM2.1 checkpoints + training code + Demo
2024-09-29 05:49:56 +00:00
Ronghang Hu
dce7b5446f improving warning message and adding further tips for installation (#204) 2024-08-12 11:37:41 -07:00
Ronghang Hu
d421e0b040 add Colab support to the notebooks; pack config files in sam2_configs package during installation (#176) 2024-08-08 11:03:22 -07:00
Ronghang Hu
6ecb5ff8d0 Add interface for box prompt in SAM 2 video predictor (#174)
This PR adds an example to provide box prompt in SAM 2 as inputs to the `add_new_points_or_box` API (renamed from`add_new_points`, which is kept for backward compatibility). If `box` is provided, we add it as the first two points with labels 2 and 3, along with the user-provided points (consistent with how SAM 2 is trained).

The video predictor notebook `notebooks/video_predictor_example.ipynb` is updated to include segmenting from box prompt as an example.
2024-08-07 11:54:30 -07:00
Niels
9b58611e24 Address comment 2024-08-07 17:48:12 +02:00
Niels
322aa3e7e5 Revert code snippet 2024-08-06 22:57:07 +02:00
Niels
27a167c004 Update README 2024-08-06 22:41:32 +02:00
Niels
fbf7e3a664 Add link 2024-08-05 22:12:15 +02:00
Niels
e9503c96fe Move HF to separate section 2024-08-05 22:10:57 +02:00
Niels
c3393d8b5f Include original code snippet 2024-08-05 22:08:54 +02:00
Niels
e93be7f6aa Update README 2024-08-05 09:43:04 +02:00
Niels
cb48213066 Update links 2024-08-05 09:41:40 +02:00
Ronghang Hu
57bc94b739 Merge pull request #119 from facebookresearch/ronghanghu/installation_faq
[doc] add `INSTALL.md` as an installation FAQ page
2024-08-02 15:07:25 -07:00
Ronghang Hu
b744a3c084 [doc] add INSTALL.md as an installation FAQ page 2024-08-02 22:05:46 +00:00
Haitham Khedr
59550d4deb Update README.md 2024-08-02 12:58:23 -07:00
Haitham Khedr
de4db16676 Update README.md 2024-08-02 12:56:06 -07:00
Danwand
fa2796bb47 Change git repo url from SSH to HTTPS
The change is made for researchers to easly clone the project to check out from systems and platforms with SSH not in sync with github. Eg : Google Colab, Remote GPU Servers etc
2024-07-31 13:55:06 +05:30
CharlesCNorton
e62ec497b8 Fix: Hyphenate to "model-in-the-loop"
The phrase "model-in-the-loop" is now hyphenated to align with standard practices in technical literature, where hyphenation of compound adjectives clarifies their function as a single descriptor.
2024-07-30 08:35:59 -04:00
CharlesCNorton
662fd3d90e Fix typo in README: "Aything" corrected to "Anything"
Corrected a typo in the Segment Anything Video Dataset section of the README file. The word "Aything" has been updated to "Anything."
2024-07-29 22:15:41 -04:00
Haitham Khedr
0c5f8c5432 Initial commit 2024-07-29 21:54:20 +00:00