SAMannot: A Memory-Efficient, Local, Open-Source Framework for Interactive Video Instance Segmentation Based on SAM2

Open Access | Apr 2026

References

  1. Jocher G, Qiu J. Ultralytics YOLO11 (Version 11.0.0) [Computer software]. GitHub; 2024. https://github.com/ultralytics/ultralytics (Accessed 16 January 2026).
  2. Lauer J, Zhou M, Ye S, Menegas W, Schneider S, Nath T, Rahman MM, Di Santo V, Soberanes D, Feng G, Murthy VN, Lauder G, Dulac C, Mathis M, Mathis A. Multi-animal pose estimation, identification and tracking with DeepLabCut. Nature Methods. 2022;19:496–504. DOI: 10.1038/s41592-022-01443-0
  3. Dwyer B, Nelson J, Hansen T, et al. Roboflow (version 1.0) [software]. Computer vision platform; 2025.
  4. Encord Technologies, Inc. Encord [Computer software]. (n.d.). https://encord.com (Accessed 16 January 2026).
  5. Tensor Matics, Inc. Labellerr [Computer software]. (n.d.). https://www.labellerr.com/ (Accessed 16 January 2026).
  6. Kirillov A, Mintun E, Ravi N, Mao H, Rolland C, Gustafson L, Xiao T, Whitehead S, Berg AC, Lo W-Y, Dollár P, Girshick R. Segment anything; 2023. DOI: 10.1109/ICCV51070.2023.00371
  7. Computer Vision Annotation Tool (CVAT) (Version 2.25.0) [Computer software]. https://cvat.ai/ (Accessed 16 January 2026).
  8. Kanbertay O, Vogg R, Karakoc E, Kappeler PM, Fichtel C, Ecker AS. Silvi: Simple interface for labeling video interactions; 2025. Accessed 21 December 2025.
  9. Dutta A, Zisserman A. The VIA annotation software for images, audio and video. In Proceedings of the 27th ACM International Conference on Multimedia, MM '19. New York, NY, USA: ACM; 2019. DOI: 10.1145/3343031.3350535
  10. Ravi N, Gabeur V, Hu Y-T, Hu R, Ryali C, Ma T, Khedr H, Rädle R, Rolland C, Gustafson L, Mintun E, Pan J, Alwala KV, Carion N, Wu C-Y, Girshick R, Dollár P, Feichtenhofer C. SAM 2: Segment anything in images and videos. arXiv preprint; 2024.
  11. Python Software Foundation. tkinter — Python interface to Tcl/Tk. Python 3 Documentation. Accessed 20 December 2025.
  12. Everingham M, Van Gool L, Williams CKI, Winn J, Zisserman A. The PASCAL visual object classes (VOC) challenge. International Journal of Computer Vision. 2010;88(2):303–338. DOI: 10.1007/s11263-009-0275-4
  13. aza1200. Are there any method for reducing gpu memory overhead? (issue #196). GitHub issue in facebookresearch/sam2; August 2024. Accessed 21 December 2025.
  14. aendrs. Sam2 for segmenting a 2 hour video? (issue #264). GitHub issue in facebookresearch/sam2; August 2024. Accessed 21 December 2025.
  15. Perazzi F, Pont-Tuset J, McWilliams B, Van Gool L, Gross M, Sorkine-Hornung A. A benchmark dataset and evaluation methodology for video object segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2016. DOI: 10.1109/CVPR.2016.85
  16. Pont-Tuset J, Perazzi F, Caelles S, Arbeláez P, Sorkine-Hornung A, Van Gool L. The 2017 DAVIS challenge on video object segmentation. arXiv:1704.00675; 2017.
  17. Hong L, Liu Z, Chen W, Tan C, Feng Y, Zhou X, Guo P, Li J, Chen Z, Gao S, Zhang W, Zhang W. LVOS: A benchmark for large-scale long-term video object segmentation. arXiv preprint arXiv:2404.19326; 2024. DOI: 10.1109/ICCV51070.2023.01240
  18. Van Rijsbergen CJ. Information retrieval. 2nd ed. Oxford: Butterworth-Heinemann; 1979.
  19. Rand WM. Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association. 1971;66(336):846–850. DOI: 10.1080/01621459.1971.10482356
  20. NVIDIA Corporation. NVIDIA System Management Interface (nvidia-smi) (Version 535.183.01) [Computer software]; 2024.
  21. Carion N, Gustafson L, Hu Y-T, Debnath S, Hu R, Suris D, Ryali C, Alwala KV, Khedr H, Huang A, Lei J, Ma T, Guo B, Kalla A, Marks M, Greer J, Wang M, Sun P, Rädle R, Afouras T, Mavroudi E, Xu K, Wu T-H, Zhou Y, Momeni L, Hazra R, Ding S, Vaze S, Porcher F, Li F, Li S, Kamath A, Cheng HK, Dollár P, Ravi N, Saenko K, Zhang P, Feichtenhofer C. SAM 3: Segment anything with concepts; 2025.
DOI: https://doi.org/10.5334/jors.680 | Journal eISSN: 2049-9647
Language: English
Submitted on: Jan 16, 2026
Accepted on: Mar 26, 2026
Published on: Apr 20, 2026
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Gergely Dinya, András Gelencsér, Krisztina Kupán, Clemens Küpper, Kristóf Karacs, Anna Gelencsér-Horváth, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.