[News] HELIX AI Robot "S1" Explained: 4 New AI Automation Breakthroughs That Surpass Tesla
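This video covers the second demo of Figure AI's newly released Helix AI system, which includes four new technical breakthroughs and a special bonus. The transcript follows: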
These AI robots can now move even faster than humans, as Figure AI has just released the second demo of its newest Helix AI system, packed with four new tech breakthroughs plus a special bonus. But how close to AGI are they now, and what can they do as a result? To answer that, Figure tested Helix's System 1 (S1) visuomotor control in several real-world scenarios, demoing the following four new tech breakthroughs.
Number One: Implicit Stereo Vision
The first important intelligence feature of Helix S1 is its adoption of implicit stereo vision, giving the system a rich three-dimensional understanding of its environment. Unlike its predecessor, which relied on monocular visual input, the upgraded S1 uses a stereo vision backbone paired with a multiscale feature extraction network. This architectural shift allows the robot to merge visual data from two cameras into a cohesive, depth-aware picture before processing it through a cross-attention Transformer. The result is the robot's exceptional ability to perceive fine environmental details while also maintaining a broader understanding of the scene, such as the layout of a bustling conveyor belt in a factory.
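As a rough illustration of why two cameras carry depth information at all, here is the classic explicit stereo relation between disparity and depth. This is textbook geometry, not Figure's method: "implicit" stereo vision suggests that Helix learns depth cues inside the network rather than computing disparity directly. The function name and the camera numbers are hypothetical.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Classic pinhole stereo relation: depth = focal_length * baseline / disparity.

    focal_length_px: camera focal length in pixels (hypothetical value below)
    baseline_m:      distance between the two cameras in meters (hypothetical)
    disparity_px:    horizontal shift of the same point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px


if __name__ == "__main__":
    # A nearby package shifts more between the two views than a distant one.
    near = depth_from_disparity(focal_length_px=500.0, baseline_m=0.1, disparity_px=100.0)
    far = depth_from_disparity(focal_length_px=500.0, baseline_m=0.1, disparity_px=25.0)
    print(f"near point: {near:.2f} m, far point: {far:.2f} m")
```

A monocular camera has no disparity to work with, which is one way to see why a stereo input can help with grasping packages of varying sizes.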
To prove this, the robot's stereo vision capability was put to the test in Figure's logistics demo, with packages of varying sizes, shapes, and weights streaming across a conveyor belt, requiring precise depth perception to grasp and reorient them correctly. The data from these tests showed a 60% increase in throughput compared to non-stereo baselines, with the system even generalizing to flat envelopes that it wasn't explicitly trained on. This humanlike spatial awareness serves as a key visual foundation for the next breakthrough that Helix has to offer.
Number Two: Multiscale Visual Representation
This second major advancement builds on top of Helix's stereo vision foundation, allowing Helix S1 to capture both granular details and high-level contextual cues simultaneously. Rather than processing images from each camera independently, the system fuses stereo inputs in a multiscale stereo network, producing a compact set of visual tokens that feed into the Transformer without increasing its overall computational load. This balance of detail and context enables more reliable control, whether the robot is picking up a tiny parcel or navigating a crowded workspace.
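The idea of a compact, fixed-size set of tokens that mixes coarse context with finer detail can be sketched with spatial-pyramid-style average pooling. This is a stand-in technique chosen for illustration, not Figure's architecture, which the transcript does not describe in detail. The function names, the grid sizes, and the use of a single scalar per cell (real features would be vectors) are all assumptions.

```python
def average_pool(feature_map, grid):
    """Average a 2D feature map into grid x grid cells, returned in row-major order."""
    rows, cols = len(feature_map), len(feature_map[0])
    if rows < grid or cols < grid:
        raise ValueError("feature map is smaller than the pooling grid")
    tokens = []
    for gr in range(grid):
        r0, r1 = gr * rows // grid, (gr + 1) * rows // grid
        for gc in range(grid):
            c0, c1 = gc * cols // grid, (gc + 1) * cols // grid
            cell = [feature_map[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            tokens.append(sum(cell) / len(cell))
    return tokens


def multiscale_tokens(feature_map, grids=(1, 2, 4)):
    """Concatenate pooled tokens from coarse (1x1) to fine (4x4) grids.

    The token count depends only on `grids` (1 + 4 + 16 = 21 by default),
    not on the resolution of the input map.
    """
    tokens = []
    for grid in grids:
        tokens.extend(average_pool(feature_map, grid))
    return tokens


if __name__ == "__main__":
    small = [[r * 8 + c for c in range(8)] for r in range(8)]
    large = [[r * 16 + c for c in range(16)] for r in range(16)]
    print(len(multiscale_tokens(small)), len(multiscale_tokens(large)))
```

Because the number of tokens stays fixed as the input grows, the downstream model's workload does not grow with image resolution, which is the property the narration highlights.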
In the logistics use case, this capability is invaluable because packages on a conveyor belt often vary wildly, with some items being small and rigid while others might be large and deformable, forcing the robot to decide the optimal grasp point and method on the fly. The new multiscale approach ensures Helix doesn't lose sight of the bigger picture while zeroing in on critical specifics like a shipping label's orientation. The results show that the addition of multiscale feature extraction significantly boosts the effective throughput, referred to as TF, a metric that compares the robot's package handling speed to human package handling speed. All of this enables the next great leap in robot intelligence.
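The TF metric can be sketched as a simple ratio of handling rates. The transcript only says that TF compares robot speed to human speed and that a value above one means the robot is faster, so the exact formula, the function name, and the package counts below are assumptions.

```python
def effective_throughput(robot_packages, robot_seconds, human_packages, human_seconds):
    """Ratio of the robot's package rate to the human demonstrators' rate.

    Assumed definition: 1.0 means equal speed, above 1.0 means the robot is faster.
    """
    if min(robot_seconds, human_seconds, human_packages) <= 0:
        raise ValueError("durations and the human package count must be positive")
    robot_rate = robot_packages / robot_seconds
    human_rate = human_packages / human_seconds
    return robot_rate / human_rate


if __name__ == "__main__":
    # Hypothetical numbers: 110 packages vs. 100 packages over the same 10 minutes.
    tf = effective_throughput(110, 600, 100, 600)
    print(f"TF = {tf:.2f}")
```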
Number Three: Learned Visual Proprioception
This third breakthrough tackles another long-standing challenge in robotics: successfully scaling a single policy across multiple humanoid robots. This is difficult partly because hardware variations, like slight differences in sensor calibration or joint responses, can disrupt performance when a policy trained on one robot is applied to another. While manual calibration used to be the traditional fix, it just doesn't scale to an entire fleet. Helix S1 sidesteps this with a self-calibration system that estimates the six-dimensional poses of its end effectors, such as its hands, using only onboard visual input, with no external tools required.
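A six-dimensional pose combines three position values with three orientation angles. The sketch below shows one hypothetical way such estimates could expose a per-robot calibration error: compare where the hand should be according to the robot's nominal model with where the cameras see it, then average the difference. The `Pose6D` class, the function name, and the sample values are illustrative assumptions, and the transcript does not say how Helix uses its pose estimates internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose6D:
    """Six-dimensional pose: position in meters plus orientation in radians."""
    x: float
    y: float
    z: float
    roll: float
    pitch: float
    yaw: float


def mean_position_offset(expected, observed):
    """Average (dx, dy, dz) between visually observed and expected hand positions.

    Only the translational part is handled here; averaging orientation angles
    needs proper rotation math and is left out of this sketch.
    """
    if not expected or len(expected) != len(observed):
        raise ValueError("need two non-empty pose lists of equal length")
    n = len(expected)
    dx = sum(o.x - e.x for e, o in zip(expected, observed)) / n
    dy = sum(o.y - e.y for e, o in zip(expected, observed)) / n
    dz = sum(o.z - e.z for e, o in zip(expected, observed)) / n
    return (dx, dy, dz)


if __name__ == "__main__":
    expected = [Pose6D(0.50, 0.00, 0.30, 0, 0, 0), Pose6D(0.40, 0.10, 0.30, 0, 0, 0)]
    observed = [Pose6D(0.51, 0.00, 0.30, 0, 0, 0), Pose6D(0.41, 0.10, 0.30, 0, 0, 0)]
    print(mean_position_offset(expected, observed))
```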
As a result, this feature allows Figure to deploy the same policy, trained on a single robot, across multiple units with minimal performance drop. Despite small hardware quirks, the robots demonstrate consistent dexterity, flipping packages to expose labels and transferring them between conveyors at high speed. Importantly, this cross-robot transfer capability slashes downtime and calibration costs, making large-scale deployments feasible. But what comes next pushes Helix into superhuman territory.
Number Four: Sport Mode
This fourth breakthrough catapults Helix S1 into faster-than-human territory with a simple yet ingenious test-time technique: Figure speeds up the robot's actions by resampling its output action chunks, which are sequences of movements generated at 200 Hz. For example, a chunk representing a 1-second trajectory can be compressed to 0.8 seconds and executed at the original rate, yielding a 20% speed boost without retraining the model. In testing, Figure pushed this to a 50% speed-up, achieving a TF value greater than one, meaning the robot outpaced its human demonstrators.
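The resampling step can be sketched as follows. At 200 Hz, a 1-second chunk holds 200 actions; compressing it to 0.8 seconds leaves 160 actions, which still play back at 200 Hz. Linear interpolation is an assumption here, since the transcript only says the chunk is resampled, and the function and constant names are hypothetical.

```python
CONTROL_RATE_HZ = 200  # action rate stated in the transcript


def resample_chunk(chunk, time_scale):
    """Linearly resample an action chunk so it spans `time_scale` of its original duration.

    chunk:      list of actions, each a list of floats (e.g. joint targets)
    time_scale: 0.8 compresses a 1-second chunk to 0.8 seconds
    The first and last actions are preserved, so the trajectory covers the
    same motion in fewer control steps.
    """
    if not 0.0 < time_scale <= 1.0:
        raise ValueError("time_scale must be in (0, 1]")
    n_in = len(chunk)
    if n_in < 2:
        raise ValueError("need at least two actions to interpolate")
    n_out = max(2, round(n_in * time_scale))
    resampled = []
    for i in range(n_out):
        # Fractional position of output step i on the input index axis.
        pos = i * (n_in - 1) / (n_out - 1)
        lo = int(pos)
        hi = min(lo + 1, n_in - 1)
        frac = pos - lo
        resampled.append([a + frac * (b - a) for a, b in zip(chunk[lo], chunk[hi])])
    return resampled


def duration_seconds(chunk):
    """Playback time of a chunk when executed at the fixed control rate."""
    return len(chunk) / CONTROL_RATE_HZ


if __name__ == "__main__":
    # Hypothetical 1-second chunk: one joint ramping from 0.0 to 1.0.
    original = [[step / 199.0] for step in range(200)]
    faster = resample_chunk(original, time_scale=0.8)
    print(duration_seconds(original), duration_seconds(faster))
```

Because only the policy's output is altered, nothing about the model itself changes, which matches the narration's point that no retraining is needed.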
Sport mode enables Helix's AI to handle packages with incredible efficiency, maintaining high success rates even as throughput soared. However, the demo also showed that while a 50% speed-up maximizes performance, pushing beyond this upper threshold sacrifices precision and requires frequent resets. Still, the demo highlights a tantalizing prospect: humanoid robots that don't just match human speed but surpass it, all while retaining the dexterity needed for complex tasks like label orientation.
Why It Matters
And here's why it all matters so much. These advancements (stereo vision, multiscale representation, visual proprioception, and sport mode) aren't just incremental upgrades but a paradigm shift. By solving core challenges like depth perception, contextual awareness, fleet scalability, and speed, Figure has unlocked a future where humanoid robots can seamlessly integrate into human workflows. The logistics demo is just the beginning. These generic improvements to Helix S1 will enhance every use case Figure pursues, from manufacturing to healthcare.
And the numbers back up the hype. A TF above 1.1 means Helix is already more than 10% faster than its human trainers in some scenarios, with sport mode pushing that edge further. What's more, the stereo model's 60% throughput jump, coupled with the proprioception module's cross-robot consistency, signals that this technology is ready for prime time.
Pricing and Future Outlook
As for price, Figure AI's CEO, Brett Adcock, suggests their AI robots are already generating revenue, possibly at a premium for early adopters like BMW. For broader markets such as logistics or home use, economies of scale could eventually push the price closer to $50,000 as production ramps up, aligning with Adcock's vision of deploying 100,000 robots in the next four years. Another pricing strategy may be a subscription-based model, which could make humanoid robots even more accessible.
This could allow businesses and consumers to pay a monthly fee in a leasing structure instead of purchasing outright. Along this trajectory, Figure might follow the robotics-as-a-service model, where it would essentially rent out robots for tasks like warehouse sorting or package handling, with pricing based on hours of use or tasks completed. Beyond this, Figure would also potentially maintain ownership of the data, which it could monetize and then pass the savings down to users, lowering the entry barrier for smaller firms and consumers while ensuring more predictable revenue for Figure.
And if BMW and other enterprise clients are already seeing value, Figure could even introduce tiered pricing, offering premium AI capabilities, faster updates, or dedicated support at higher subscription levels. Anyway, tell us in the comments below how much you'd be willing to pay for one of these robots, and click here to listen to Figure's CEO talk about his plans for 2025 and beyond.