
Anthropic Makes 'Fetch' Happen As New Claude Models Beat Human Teams on Robotics Planning Tasks

Anthropic reran its "Project Fetch" robotics test and found that its newer Claude models could outperform the previous generation. In its latest write-up, Anthropic detailed three trials using Claude Opus 4.7 inside Claude Code, with a researcher mainly connecting a laptop to the robot, entering the initial prompt, and approving commands and task transitions. The original experiment, run in August 2025, compared Anthropic employees with Claude support against a group limited to web research and their own problem-solving.

Anthropic said Opus 4.7, running without human help, finished the subset of tested objectives roughly 20 times faster than the quickest human team, on tasks participants had completed less than a year earlier. The company also said that for any step at least one human team completed in phase one, Opus 4.7 finished that same step at least 10 times faster.

According to Anthropic, Opus 4.7 moved quickly through choices that slowed humans down in 2025, such as deciding how to interface with the robot's sensors, and much of its code worked on the first attempt. The model produced far less code than the Claude-assisted human team while still matching or exceeding both teams' outcomes on the tasks tested.

Anthropic cautioned that the results do not constitute a robotics breakthrough, noting that the model still had trouble with the "fetching" portion, which required precisely guiding a beach ball back to a start area using environmental feedback. Opus 4.7 could position the robot to attempt the push, but the motion control was not accurate enough to complete the job, Anthropic explained. "This doesn't mean that LLMs have now solved robotics. Far from it. The latest Claude models still struggled with using the robot to precisely move the beach ball, the 'fetching' part of Project Fetch," the company wrote.

According to Anthropic, the improvements were not driven by robotics-targeted work. Instead, they stemmed from broader model scaling, reflecting a familiar progression: models first assist people, then people assist models, and eventually models can complete more work on their own.

Anthropic stated that "more research is needed to understand the models' ability to make these physical tools more bespoke, whether by writing control policies tailored to particular tasks or by designing robotic systems."

"There may be substantial barriers to this more generalized vision of physically capable and adaptable language models. But as we have seen, apparently, large distances in model capability can be traversed quickly. Models building their own software tools might have seemed outlandish not long ago, but it is happening. It would be unwise to rule out the same trajectory in hardware," the company wrote.

Source: Benzinga
