add Colab support to the notebooks; pack config files in sam2_configs
package during installation (#176)
This commit is contained in:
@@ -29,14 +29,83 @@
|
||||
"- propagating clicks (or box) to get _masklets_ throughout the video\n",
|
||||
"- segmenting and tracking multiple objects at the same time\n",
|
||||
"\n",
|
||||
"We use the terms _segment_ or _mask_ to refer to the model prediction for an object on a single frame, and _masklet_ to refer to the spatio-temporal masks across the entire video. \n",
|
||||
"We use the terms _segment_ or _mask_ to refer to the model prediction for an object on a single frame, and _masklet_ to refer to the spatio-temporal masks across the entire video. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "a887b90f-6576-4ef8-964e-76d3a156ccb6",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"<a target=\"_blank\" href=\"https://colab.research.google.com/github/facebookresearch/segment-anything-2/blob/main/notebooks/video_predictor_example.ipynb\">\n",
|
||||
" <img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/>\n",
|
||||
"</a>"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "26616201-06df-435b-98fd-ad17c373bb4a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Environment Set-up"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "8491a127-4c01-48f5-9dc5-f148a9417fdf",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"If running locally using jupyter, first install `segment-anything-2` in your environment using the [installation instructions](https://github.com/facebookresearch/segment-anything-2#installation) in the repository.\n",
|
||||
"\n",
|
||||
"If running locally using jupyter, first install `segment-anything-2` in your environment using the [installation instructions](https://github.com/facebookresearch/segment-anything-2#installation) in the repository."
|
||||
"If running from Google Colab, set `using_colab=True` below and run the cell. In Colab, be sure to select 'GPU' under 'Edit'->'Notebook Settings'->'Hardware accelerator'. Note that it's recommended to use **A100 or L4 GPUs when running in Colab** (T4 GPUs might also work, but could be slow and might run out of memory in some cases)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "f74c53be-aab1-46b9-8c0b-068b52ef5948",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"using_colab = False"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "d824a4b2-71f3-4da3-bfc7-3249625e6730",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"if using_colab:\n",
|
||||
" import torch\n",
|
||||
" import torchvision\n",
|
||||
" print(\"PyTorch version:\", torch.__version__)\n",
|
||||
" print(\"Torchvision version:\", torchvision.__version__)\n",
|
||||
" print(\"CUDA is available:\", torch.cuda.is_available())\n",
|
||||
" import sys\n",
|
||||
" !{sys.executable} -m pip install opencv-python matplotlib\n",
|
||||
" !{sys.executable} -m pip install 'git+https://github.com/facebookresearch/segment-anything-2.git'\n",
|
||||
"\n",
|
||||
" !mkdir -p videos\n",
|
||||
" !wget -P videos https://dl.fbaipublicfiles.com/segment_anything_2/assets/bedroom.zip\n",
|
||||
" !unzip -d videos videos/bedroom.zip\n",
|
||||
"\n",
|
||||
" !mkdir -p ../checkpoints/\n",
|
||||
" !wget -P ../checkpoints/ https://dl.fbaipublicfiles.com/segment_anything_2/072824/sam2_hiera_large.pt"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "22e6aa9d-487f-4207-b657-8cff0902343e",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Set-up"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "e5318a85-5bf7-4880-b2b3-15e4db24d796",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -50,7 +119,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"execution_count": 5,
|
||||
"id": "08ba49d8-8c22-4eba-a2ab-46eee839287f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -74,7 +143,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"execution_count": 6,
|
||||
"id": "f5f3245e-b4d6-418b-a42a-a67e0b3b5aec",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -89,7 +158,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"execution_count": 7,
|
||||
"id": "1a5320fe-06d7-45b8-b888-ae00799d07fa",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -143,17 +212,17 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"execution_count": 8,
|
||||
"id": "b94c87ca-fd1a-4011-9609-e8be1cbe3230",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"<matplotlib.image.AxesImage at 0x7f884825eef0>"
|
||||
"<matplotlib.image.AxesImage at 0x7fdeec360250>"
|
||||
]
|
||||
},
|
||||
"execution_count": 6,
|
||||
"execution_count": 8,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
},
|
||||
@@ -206,7 +275,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"execution_count": 9,
|
||||
"id": "8967aed3-eb82-4866-b8df-0f4743255c2c",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -214,7 +283,7 @@
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"frame loading (JPEG): 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [00:05<00:00, 33.78it/s]\n"
|
||||
"frame loading (JPEG): 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [00:05<00:00, 35.92it/s]\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
@@ -242,7 +311,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"execution_count": 10,
|
||||
"id": "d2646a1d-3401-438c-a653-55e0e56b7d9d",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -272,7 +341,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 9,
|
||||
"execution_count": 11,
|
||||
"id": "3e749bab-0f36-4173-bf8d-0c20cd5214b3",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -333,7 +402,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 10,
|
||||
"execution_count": 12,
|
||||
"id": "e1ab3ec7-2537-4158-bf98-3d0977d8908d",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -399,7 +468,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 11,
|
||||
"execution_count": 13,
|
||||
"id": "ab45e932-b0d5-4983-9718-6ee77d1ac31b",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -407,7 +476,7 @@
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"propagate in video: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [00:08<00:00, 23.85it/s]\n"
|
||||
"propagate in video: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [00:08<00:00, 23.76it/s]\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -591,7 +660,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 12,
|
||||
"execution_count": 14,
|
||||
"id": "1a572ea9-5b7e-479c-b30c-93c38b121131",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -664,7 +733,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 13,
|
||||
"execution_count": 15,
|
||||
"id": "baa96690-4a38-4a24-aa17-fd2f4db0e232",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -672,7 +741,7 @@
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"propagate in video: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [00:08<00:00, 23.94it/s]\n"
|
||||
"propagate in video: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [00:08<00:00, 23.93it/s]\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -862,7 +931,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 14,
|
||||
"execution_count": 16,
|
||||
"id": "6dbe9183-abbb-4283-b0cb-d24f3d7beb34",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -882,7 +951,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 15,
|
||||
"execution_count": 17,
|
||||
"id": "1cbfb273-4e14-495b-bd89-87a8baf52ae7",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -932,7 +1001,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 16,
|
||||
"execution_count": 18,
|
||||
"id": "54906315-ab4c-4088-b866-4c22134d5b66",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -986,7 +1055,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 17,
|
||||
"execution_count": 19,
|
||||
"id": "9cd90557-a0dc-442e-b091-9c74c831bef8",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -994,7 +1063,7 @@
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"propagate in video: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [00:08<00:00, 24.05it/s]\n"
|
||||
"propagate in video: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [00:08<00:00, 23.71it/s]\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -1158,6 +1227,14 @@
|
||||
" show_mask(out_mask, plt.gca(), obj_id=out_obj_id)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "e023f91f-0cc5-4980-ae8e-a13c5749112b",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Note that in addition to clicks or boxes, SAM 2 also supports directly using a **mask prompt** as input via the `add_new_mask` method in the `SAM2VideoPredictor` class. This can be helpful in e.g. semi-supervised VOS evaluations (see [tools/vos_inference.py](https://github.com/facebookresearch/segment-anything-2/blob/main/tools/vos_inference.py) for an example)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "da018be8-a4ae-4943-b1ff-702c2b89cb68",
|
||||
@@ -1176,7 +1253,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 18,
|
||||
"execution_count": 20,
|
||||
"id": "29b874c8-9f39-42d3-a667-54a0bd696410",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -1204,7 +1281,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 19,
|
||||
"execution_count": 21,
|
||||
"id": "e22d896d-3cd5-4fa0-9230-f33e217035dc",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -1224,7 +1301,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 20,
|
||||
"execution_count": 22,
|
||||
"id": "d13432fc-f467-44d8-adfe-3e0c488046b7",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -1276,7 +1353,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 21,
|
||||
"execution_count": 23,
|
||||
"id": "95ecf61d-662b-4f98-ae62-46557b219842",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -1334,7 +1411,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 22,
|
||||
"execution_count": 24,
|
||||
"id": "86ca1bde-62a4-40e6-98e4-15606441e52f",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -1407,7 +1484,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 23,
|
||||
"execution_count": 25,
|
||||
"id": "17737191-d62b-4611-b2c6-6d0418a9ab74",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -1415,7 +1492,7 @@
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"propagate in video: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [00:10<00:00, 19.93it/s]\n"
|
||||
"propagate in video: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [00:10<00:00, 19.77it/s]\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
Reference in New Issue
Block a user