SAM 2

SAM 2 is Meta AI's latest segmentation foundation model, extending the Segment Anything Model to both images and video. It is built on a simple transformer architecture with streaming memory, designed for real-time video segmentation.

Visit github.com →

Questions & Answers

What is SAM 2?
SAM 2 (Segment Anything Model 2) is a foundation model developed by Meta AI for promptable visual segmentation across both images and videos. It extends the capabilities of the original SAM model to efficiently handle video data.
Who can benefit from using SAM 2?
Researchers, developers, and practitioners working on computer vision tasks involving object segmentation in both static images and real-time video streams can benefit from SAM 2's capabilities.
How does SAM 2 improve upon previous segmentation models?
SAM 2 extends the original Segment Anything Model by explicitly incorporating video segmentation capabilities. It uses a simple transformer architecture with streaming memory, optimized for real-time video processing, and was trained on the large SA-V dataset.
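The streaming-memory idea above can be illustrated with a toy sketch. This is not SAM 2's actual code (the real model uses a transformer with cross-attention to a bank of encoded memory features); it only shows the control flow: each frame is processed once, conditioned on a small rolling bank of recent predictions, so cost per frame stays constant however long the video is. The function and its averaging "conditioning" step are hypothetical stand-ins.

```python
from collections import deque

def segment_video(frames, prompt_mask, memory_size=7):
    """Toy sketch of streaming-memory video segmentation (illustrative
    only, not the SAM 2 implementation). Each frame is a list of pixel
    scores in [0, 1]; masks are lists of 0/1 values."""
    memory = deque(maxlen=memory_size)  # rolling memory bank of past masks
    memory.append(prompt_mask)          # the user's prompt seeds the memory
    results = []
    for frame in frames:
        # Stand-in for attending to memory features: average the stored
        # masks to get a per-pixel context score.
        context = [sum(vals) / len(memory) for vals in zip(*memory)]
        mask = [1 if f * c > 0.25 else 0 for f, c in zip(frame, context)]
        results.append(mask)
        memory.append(mask)  # this frame's prediction informs later frames
    return results
```

Because the memory bank has a fixed maximum size, the per-frame cost does not grow with video length, which is the property that makes streaming processing feasible.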
When should I consider using SAM 2 for a project?
You should consider using SAM 2 when your project requires robust, promptable object segmentation in either still images or dynamic video sequences, especially for tasks demanding real-time performance.
What are the core technical requirements for SAM 2?
SAM 2 requires Python 3.10 or newer, PyTorch 2.5.1 or newer, and torchvision 0.20.1 or newer. For GPU inference and custom CUDA kernel compilation, a compatible CUDA toolkit is recommended.