当前位置: 首页 > article >正文

Segment Anything论文阅读笔记

Abstract

Segment Anything (SA) project: a new task, model, and dataset for image segmentation.

we built the largest segmentation dataset to date (by far:迄今为止), with over 1 billion masks on 11M licensed and privacy respecting images. The model is designed and trained to be promptable, so it can transfer zero-shot to new image distributions and tasks.The Segment Anything Model (SAM) and corresponding dataset (SA-1B) releasing at SA to foster research into foundation models for computer vision.

Introduction

 Large language models pre-trained on web-scale datasets are revolutionizing NLP (彻底改变)with strong zero-shot and few-shot generalization. These “foundation models” can generalize to tasks and data distributions beyond those seen during training. (zero-shot and few-shot generalization零样本和少样本泛化

Foundation models have also been explored in computer vision ,albeit to a lesser extent. (尽管程度较小)

Our goal is to build a foundation model for image segmentation. That is, we seek to develop a promptable model and pre-train it on a broad dataset using a task that enables powerful generalization. With this model, we aim to solve a range of downstream segmentation problems on new data distributions using prompt engineering. 

The success of this plan hinges on(取决于) three components: task, model, and data. To develop them, we address the following questions about image segmentation:

1. What task will enable zero-shot generalization?

2. What is the corresponding model architecture?

3. What data can power this task and model?

These questions are entangled and require a comprehen- sive solution.(错综复杂需要一个综合的解决方案。)

Surprisingly, we find that a simple design satisfies all three constraints: a powerful image encoder computes an image embedding, a prompt encoder embeds prompts, and then the two information sources are combined in a lightweight mask decoder that predicts segmentation masks. We refer to this model as the Segment Anything Model, or SAM .

data engine has three stages:

  1. assisted-manual
  2. semi-automatic
  3. and fully automatic

2. Segment Anything Task


http://www.kler.cn/a/9675.html

相关文章:

  • 【RabbitMQ】08-延迟消息
  • 模型压缩相关技术概念澄清(量化/剪枝/知识蒸馏)
  • rust模式和匹配
  • 关于Oracle数据库密码复杂度检查的一些概念
  • 希尔排序(C语言)
  • 【SpringBoot】18 上传文件到数据库(Thymeleaf + MySQL)
  • HummerRisk 使用教程:操作审计
  • Qt·核心机制
  • 商汤科技推出“日日新SenseNova”,大模型体系赋能人工智能新未来
  • Elasticsearch:ESQL 简介 — 一种用于灵活、迭代分析的新查询语言
  • 使用模板窗口生成测试数据
  • TypeScript由浅到深(上篇)
  • 工程管理系统软件 自主研发,工程行业适用
  • 【国内chatgpt最全使用方法合集】(总有一个适合你)
  • GaussDB行存储表列存储表相关
  • 本地安装WSL的发行版后,导出到另一台计算机安装的办法
  • 自然语言处理(七): Deep Learning for NLP: Recurrent Networks
  • Python第三方库安装
  • 人脑体内扩散张量分布MRI的新框架
  • Diffusion模型系列文章
  • midjourney注册教程
  • 浏览器表单自动填充调研
  • 企业资源规划(ERP)监控工具
  • Python 进阶指南(编程轻松进阶):一、处理错误和寻求帮助
  • AttributeError: ‘HowNetDict‘ object has no attribute ‘en_map‘ 解决方法
  • 医疗耗材缺陷视觉检测的应用