diff --git a/README.md b/README.md index 21104c7..bdc88e2 100644 --- a/README.md +++ b/README.md @@ -58,6 +58,22 @@ Then SAM 2 can be used in a few lines as follows for image and video prediction. SAM 2 has all the capabilities of [SAM](https://github.com/facebookresearch/segment-anything) on static images, and we provide image prediction APIs that closely resemble SAM for image use cases. The `SAM2ImagePredictor` class has an easy interface for image prompting. +```python +import torch +from sam2.build_sam import build_sam2 +from sam2.sam2_image_predictor import SAM2ImagePredictor + +checkpoint = "./checkpoints/sam2_hiera_large.pt" +model_cfg = "sam2_hiera_l.yaml" +predictor = SAM2ImagePredictor(build_sam2(model_cfg, checkpoint)) + +with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16): + predictor.set_image() + masks, _, _ = predictor.predict() +``` + +or from Hugging Face, as follows: + ```python import torch from sam2.sam2_image_predictor import SAM2ImagePredictor @@ -94,6 +110,19 @@ with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16): ... ``` +or from Hugging Face, as follows: + +```python +import torch +from sam2.sam2_video_predictor import SAM2VideoPredictor + +predictor = SAM2VideoPredictor.from_pretrained("facebook/sam2-hiera-large") + +with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16): + predictor.set_image() + masks, _, _ = predictor.predict() +``` + Please refer to the examples in [video_predictor_example.ipynb](./notebooks/video_predictor_example.ipynb) for details on how to add prompts, make refinements, and track multiple objects in videos. ## Model Description