Grounded-SAM-2

Author	SHA1	Message	Date
Ronghang Hu	2b90b9f5ce	remove `.pin_memory()` in `obj_pos` of `SAM2Base` to resolve an error in MPS (#495 ) In this PR, we remove `.pin_memory()` in `obj_pos` of `SAM2Base` to resolve an error in MPS. Investigations show that `.pin_memory()` causes an error of `Attempted to set the storage of a tensor on device "cpu" to a storage on different device "mps:0"`, as originally reported in https://github.com/facebookresearch/sam2/issues/487. (close https://github.com/facebookresearch/sam2/issues/487)	2024-12-15 16:47:17 -08:00
Ronghang Hu	722d1d1511	patch for the case of `offload_state_to_cpu=True` in the new `SAM2VideoPredictor` (#490 ) This PR adds a patch for the case of `offload_state_to_cpu=True` where `pred_masks` might have been offloaded to the CPU device (close https://github.com/facebookresearch/sam2/issues/489)	2024-12-12 15:12:13 -08:00
Ronghang Hu	393ae336a7	SAM 2 Update 12/11/2024 -- full model compilation for a major VOS speedup and a new SAM2VideoPredictor to better handle multi-object tracking (#486 ) This PR provides new features and updates for SAM 2: - We now support `torch.compile` of the entire SAM 2 model on videos, which can be turned on by setting `vos_optimized=True` in `build_sam2_video_predictor` (it uses the new `SAM2VideoPredictorVOS` predictor class in `sam2/sam2_video_predictor.py`). * Compared to the previous setting (which only compiles the image encoder backbone), the new full model compilation gives a major speedup in inference FPS. * In the VOS prediction script `tools/vos_inference.py`, you can specify this option via the `--use_vos_optimized_video_predictor` flag. * Note that turning on this flag might introduce a small variance in the predictions due to numerical differences caused by `torch.compile` of the full model. * PyTorch 2.5.1 is the minimum version for full support of this feature. (Earlier PyTorch versions might run into compilation errors in some cases.) Therefore, we have updated the minimum PyTorch version to 2.5.1 accordingly in the installation scripts. - We also update the implementation of the `SAM2VideoPredictor` class for the SAM 2 video prediction in `sam2/sam2_video_predictor.py`, which allows for independent per-object inference. Specifically, in the new `SAM2VideoPredictor`: * Now we handle the inference of each object independently (as if we are opening a separate session for each object) while sharing their backbone features. * This change allows us to relax the assumption of prompting for multi-object tracking. Previously (due to the batching behavior in inference), if a video frame receives clicks for only a subset of objects, the rest of the (non-prompted) objects are assumed to be non-existent in this frame (i.e., in such frames, the user is telling SAM 2 that the rest of the objects don't appear). Now, if a frame receives clicks for only a subset of objects, we do not make any assumptions about the remaining (non-prompted) objects (i.e., now each object is handled independently and is not affected by how other objects are prompted). As a result of this change, we allow adding new objects after tracking starts (which was previously a restriction on usage). * We believe that the new version is a more natural inference behavior and therefore switched to it as the default behavior. The previous implementation of `SAM2VideoPredictor` is backed up in `sam2/sam2_video_predictor_legacy.py`. All the VOS inference results using `tools/vos_inference.py` should remain the same after this change to the `SAM2VideoPredictor` class.	2024-12-11 15:00:55 -08:00
Haitham Khedr	c2ec8e14a1	remove unused paths (#384 )	2024-10-14 10:40:54 -04:00
Roman Rädle	c98aa6bea3	Merge pull request #364 from facebookresearch/pr364 [sam2][demo][1/x] Fix file upload Summary: The Strawberry GraphQL library recently disabled multipart requests by default. This resulted in a video upload request returning "Unsupported content type" instead of uploading the video, processing it, and returning the video path. This issue was raised in #361. A forward fix is to add `multipart_uploads_enabled=True` to the endpoint view. Test Plan: Tested locally with cURL and upload succeeds Request ``` curl http://localhost:7263/graphql \ -F operations='{ "query": "mutation($file: Upload!){ uploadVideo(file: $file) { path } }", "variables": { "file": null } }' \ -F map='{ "file": ["variables.file"] }' \ -F file=@video.mov ``` Response ``` {"data": {"uploadVideo": {"path": "uploads/<HASH>.mp4"}}} ```	2024-10-08 15:28:14 -07:00
Roman Rädle	ff9704fc0e	[sam2][demo][1/x] Fix file upload Summary: The Strawberry GraphQL library recently disabled multipart requests by default. This resulted in a video upload request returning "Unsupported content type" instead of uploading the video, processing it, and returning the video path. This issue was raised in #361. A forward fix is to add `multipart_uploads_enabled=True` to the endpoint view. Test Plan: Tested locally with cURL and upload succeeds Request ``` curl http://localhost:7263/graphql \ -F operations='{ "query": "mutation($file: Upload!){ uploadVideo(file: $file) { path } }", "variables": { "file": null } }' \ -F map='{ "file": ["variables.file"] }' \ -F file=@video.mov ``` Response ``` {"data": {"uploadVideo": {"path": "uploads/<HASH>.mp4"}}} ```	2024-10-08 14:58:28 -07:00
Ronghang Hu	29267c8e39	[doc] Check and raise an error if the user is running Python from the parent directory of the sam2 repo (#359 ) If the user has "sam2/sam2" in their path, they are likely importing the repo itself as "sam2" rather than importing the "sam2" python package (i.e. "sam2/sam2" directory). This typically happens because the user is running Python from the parent directory that contains the sam2 repo they cloned. In general, the user should not run Python from the parent dir that the repo is cloned into (same is true for e.g. Numpy repo that contains names like `numpy/numpy` where the module and the repo have the same name), as the user encountered in https://github.com/facebookresearch/sam2/issues/346. (close https://github.com/facebookresearch/sam2/issues/346)	2024-10-05 00:34:06 -07:00
Ronghang Hu	e22521832f	[demo] add GPU to resources (#355 ) This small PR adds GPU specification in `docker-compose.yaml` for the SAM 2 interactive webdemo, following https://docs.docker.com/compose/how-tos/gpu-support/#example-of-a-compose-file-for-running-a-service-with-access-to-1-gpu-device. It fixes a GPU access error as reported in https://github.com/facebookresearch/sam2/issues/354. (close https://github.com/facebookresearch/sam2/issues/354)	2024-10-03 16:48:56 -07:00
Haitham Khedr	8bf0920e66	Add MANIFEST.in (#353 )	2024-10-03 10:40:13 -07:00
Haitham Khedr	52198ead0e	Merge pull request #2 from kit1980/patch-1 Use `weights_only` for loading	2024-10-01 22:32:58 +02:00
Ronghang Hu	98fcb164bf	Update links after renaming the repo from `segment-anything-2` to `sam2` (#341 ) This PR updates repo links after we renamed the repo from `segment-anything-2` to `sam2`. It also changes `NAME` in setup.py to `SAM-2` (which is already the name used in pip setup since python packages don't allow whitespace)	2024-09-30 20:27:44 -07:00
Ronghang Hu	05d9e57fb3	[docs] add a release note and new installation instructions for SAM 2.1 (#338 )	2024-09-30 09:55:58 -07:00
Ronghang Hu	429a2c7360	minor update README.md	2024-09-28 23:32:25 -07:00
Chay Ryali	3a7889d905	Merge pull request #335 from facebookresearch/sam2.1 SAM 2.1	2024-09-28 23:01:29 -07:00
Haitham Khedr	aa9b8722d0	SAM2.1 SAM2.1 checkpoints + training code + Demo	2024-09-29 05:49:56 +00:00
Sergii Dymchenko	0f6515ae85	Merge branch 'main' into patch-1	2024-08-26 15:49:40 -07:00
Ronghang Hu	7e1596c0b6	open `README.md` with unicode (to support Hugging Face emoji); fix various typos (#218 ) (close #217, #66, #67, #69, #91, #126, #127, #145)	2024-08-14 09:06:25 -07:00
Haitham Khedr	0db838b117	Merge pull request #205 from facebookresearch/haitham/fix_hf_image_predictor Fix HF image predictor	2024-08-12 17:04:04 -07:00
Haitham Khedr	fd5125b97a	accept kwargs in auto_mask_generator	2024-08-13 00:02:36 +00:00
Haitham Khedr	1191677e1e	Fix HF image predictor	2024-08-12 23:41:41 +00:00
Ronghang Hu	dce7b5446f	improving warning message and adding further tips for installation (#204 )	2024-08-12 11:37:41 -07:00
Ronghang Hu	1034ee2a1a	better support for non-CUDA devices (CPU, MPS) (#192 )	2024-08-12 10:46:50 -07:00
Chay Ryali	778e112740	Merge pull request #167 from arun477/patch-1 remove unused attributes from hieradet.py	2024-08-09 10:31:56 -07:00
Arun	8f607e2de1	Merge branch 'main' into patch-1	2024-08-09 11:14:43 +05:30
Arun	46945a2122	Update hieradet.py ufmt formatting fixed.	2024-08-09 11:14:11 +05:30
Ronghang Hu	d421e0b040	add Colab support to the notebooks; pack config files in `sam2_configs` package during installation (#176 )	2024-08-08 11:03:22 -07:00
Arun	102ddb8899	Merge branch 'main' into patch-1	2024-08-08 09:59:47 +05:30
Ronghang Hu	6186d1529a	also catch errors during installation in case `CUDAExtension` cannot be loaded (#175 ) Previously we only caught build errors in `BuildExtension` in https://github.com/facebookresearch/segment-anything-2/pull/155. However, in some cases, the `CUDAExtension` instance might not load. So in this PR, we also catch such errors for `CUDAExtension`.	2024-08-07 12:26:11 -07:00
Ronghang Hu	6ecb5ff8d0	Add interface for box prompt in SAM 2 video predictor (#174 ) This PR adds an example of providing a box prompt in SAM 2 as input to the `add_new_points_or_box` API (renamed from `add_new_points`, which is kept for backward compatibility). If `box` is provided, we add it as the first two points with labels 2 and 3, along with the user-provided points (consistent with how SAM 2 is trained). The video predictor notebook `notebooks/video_predictor_example.ipynb` is updated to include segmenting from a box prompt as an example.	2024-08-07 11:54:30 -07:00
Arun	086daf0641	Merge branch 'main' into patch-1	2024-08-07 21:50:26 +05:30
Haitham Khedr	6ba4c65cb2	Merge pull request #128 from NielsRogge/add_hf Integrate with Hugging Face	2024-08-07 08:54:49 -07:00
Niels	9b58611e24	Address comment	2024-08-07 17:48:12 +02:00
Arun	6ec8560436	Update hieradet.py Not used: `head_dim = dim_out // num_heads` and `self.scale = head_dim**-0.5`. `F.scaled_dot_product_attention` takes care of this automatically.	2024-08-07 11:35:46 +05:30
Niels	43c385c263	Update docstrings	2024-08-06 23:00:26 +02:00
Niels	322aa3e7e5	Revert code snippet	2024-08-06 22:57:07 +02:00
Diego Garcia	511199d7a9	Updated INSTALL.md with CUDA_HOME-related troubleshooting (#140 ) This is referring to https://github.com/facebookresearch/segment-anything-2/issues/137 , which in itself refers to a common problem during installation, mentioned on https://github.com/facebookresearch/segment-anything-2/issues/19 . Some users may encounter significant trouble installing the project, running into the error `OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.`. Simply adding the `--no-build-isolation` flag to the pip install, e.g. `pip install --no-build-isolation -e .`, usually solves this problem. However, this fix is not mentioned anywhere within the readmes or installation troubleshooting docs. This PR adds this recommendation into the INSTALL.md file under the "My installation failed with `CUDA_HOME environment variable is not set` " section, ensuring that more users are aware of this potential fix. Examples of users experiencing related difficulties when installing: https://github.com/facebookresearch/segment-anything-2/issues/19 https://github.com/facebookresearch/segment-anything-2/issues/41 https://github.com/facebookresearch/segment-anything-2/issues/99 https://github.com/facebookresearch/segment-anything-2/issues/133	2024-08-06 13:45:15 -07:00
Niels	8f15c6255a	Format using ufmt	2024-08-06 22:43:35 +02:00
jhj0517	0bac418736	Update INSTALL.md (#156 ) This PR suggests a way to resolve the error of `unsupported Microsoft Visual Studio version!` in INSTALL.md. Adding `-allow-unsupported-compiler` argument for the `nvcc` worked. Editing [setup.py](https://github.com/facebookresearch/segment-anything-2/blob/main/setup.py) is required to add the `-allow-unsupported-compiler` argument for `nvcc`. ```python def get_extensions(): srcs = ["sam2/csrc/connected_components.cu"] compile_args = { "cxx": [], "nvcc": [ "-DCUDA_HAS_FP16=1", "-D__CUDA_NO_HALF_OPERATORS__", "-D__CUDA_NO_HALF_CONVERSIONS__", "-D__CUDA_NO_HALF2_OPERATORS__", "-allow-unsupported-compiler" # Add this argument ], } ext_modules = [CUDAExtension("sam2._C", srcs, extra_compile_args=compile_args)] return ext_modules ```	2024-08-06 13:43:12 -07:00
Niels	27a167c004	Update README	2024-08-06 22:41:32 +02:00
Ronghang Hu	6f7e700c37	Make it optional to build CUDA extension for SAM 2; also fallback to all available kernels if Flash Attention fails (#155 ) In this PR, we make it optional to build the SAM 2 CUDA extension, having observed that many users encounter difficulties with the CUDA compilation step. 1. During installation, we catch build errors and print a warning message. We also allow explicitly turning off the CUDA extension building with `SAM2_BUILD_CUDA=0`. 2. At runtime, we catch CUDA kernel errors from connected components and print a warning on skipping the post processing step. We also fall back to all available kernels if the Flash Attention kernel fails.	2024-08-06 10:52:01 -07:00
Niels	a36edf1e01	Clean up	2024-08-06 08:34:42 +02:00
Niels	e815f70a38	Address comment	2024-08-06 08:32:36 +02:00
Niels	fbf7e3a664	Add link	2024-08-05 22:12:15 +02:00
Niels	e9503c96fe	Move HF to separate section	2024-08-05 22:10:57 +02:00
Niels	c3393d8b5f	Include original code snippet	2024-08-05 22:08:54 +02:00
Haitham Khedr	0230c5ff93	Merge pull request #152 from haithamkhedr/main Configure a workflow for format checking	2024-08-05 10:02:32 -07:00
Haitham Khedr	5e3d6ca6b5	Merge pull request #1 from haithamkhedr/CI	2024-08-05 09:36:54 -07:00
Haitham Khedr	3b0fd9e4a9	Update workflow	2024-08-05 09:28:28 -07:00
Haitham Khedr	acd3939f88	Add workflow	2024-08-05 09:16:29 -07:00
Niels	841cc1f015	Update docstring	2024-08-05 09:44:03 +02:00
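
The box prompt handling described in commit 6ecb5ff8d0 (a box is added as the first two points with labels 2 and 3, ahead of any user-provided points) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the repository's code: the helper name `box_to_prompt_points`, the `(x_min, y_min, x_max, y_max)` box layout, and the choice of which corner receives label 2 versus label 3 are all assumptions made for this example.

```python
def box_to_prompt_points(box, points=None, labels=None):
    """Hypothetical helper: express a box as two labeled corner points.

    Assumes `box` is (x_min, y_min, x_max, y_max). The two corners are
    placed first, with labels 2 and 3, followed by any user-provided
    points and their labels. The corner-to-label assignment is an
    assumption of this sketch.
    """
    if len(box) != 4:
        raise ValueError("box must have exactly four values")
    points = list(points) if points is not None else []
    labels = list(labels) if labels is not None else []
    if len(points) != len(labels):
        raise ValueError("points and labels must have the same length")

    x_min, y_min, x_max, y_max = (float(v) for v in box)
    box_points = [(x_min, y_min), (x_max, y_max)]
    box_labels = [2, 3]

    # Box corners come first, then the user's clicks in their original order.
    all_points = box_points + [(float(x), float(y)) for x, y in points]
    all_labels = box_labels + [int(label) for label in labels]
    return all_points, all_labels


if __name__ == "__main__":
    coords, point_labels = box_to_prompt_points(
        (10, 20, 110, 220), points=[(60, 120)], labels=[1]
    )
    print(coords)
    print(point_labels)
```

Keeping the box corners at the front of the point list mirrors the ordering the commit message describes; everything else here is illustrative only.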

76 Commits