
60 Minutes of Perfection: Figure AI's Manufacturing Exam

It's no longer just about walking. Figure 02 has completed an 11-month pilot at BMW, handling 90,000 parts without a break. The era of the "useful humanoid" has officially begun.


Figure 02 humanoid robot installing automotive parts on a BMW assembly line

For the last three years, the humanoid robot industry has been a contest of viral videos. Can it do a backflip? Can it make coffee? Can it fold a shirt? These demonstrations were impressive, but they were also carefully staged, heavily edited, and ultimately devoid of economic reality.

On December 12, 2025, that era ended.

Figure AI and BMW Manufacturing released the results of a rigorous, 11-month pilot program at the Spartanburg, South Carolina plant. The headline isn’t that the robot worked. The headline is that it was boring.

There were no backflips. There were no dances. There was just a Figure 02 robot standing at a chassis line, inserting sheet metal parts into a fixture with sub-millimeter precision, 24 hours a day, for weeks on end. It handled over 90,000 specific placements. It didn’t take a smoke break. It didn’t complain about repetitive strain injury.

This is the “Exam” that every robotics company has been dreading, and Figure just aced it.

The Metrics of Reality

The pilot focused on a specific, high-frequency task: inserting sheet metal parts into a chassis fixture. This is a task that is notoriously difficult for traditional automation because the chassis varies slightly in position, and the parts are flimsy and difficult to grip.

Unlike a rigid robotic arm, which expects the world to be perfect, a humanoid must adapt to an imperfect world. The data released by BMW confirms that Figure 02 achieved:

  • Placement Accuracy: >99% success rate per shift.
  • Cycle Time: 84 seconds per multi-part operation.
  • Intervention Rate: Zero human interventions required during standard operational shifts.

The 5mm Tolerance

The most critical metric in the report is the placement tolerance. The robot was required to place parts within a 5mm tolerance window.

$$\text{Tolerance} \le 5\,\text{mm}$$

To a human, 5mm is huge. To a robot relying on computer vision in a factory with changing lighting conditions (shadows from overhead conveyors, welding sparks), 5mm is a canyon of uncertainty. The fact that Figure 02 could consistently hit this target using only onboard neural networks, without external motion capture markers, validates the “End-to-End” neural network approach.

The Neural Architecture: Generative AI Models in Action

The operational logic of Figure 02 represents a fundamental departure from deterministic programming. Instead of running on “code” in the traditional sense, the robot executes actions via a Vision-Action Model (VAM) trained end-to-end.

In a classical robotic paradigm, a controls engineer would write explicit logic: if (sensor_A > 5) { move_arm(x, y, z) }

This approach is brittle. If the sensor reading drifts to 4.9, the code fails.

In contrast, Figure 02 operates like a Generative Pre-trained Transformer (GPT) for physical action. The robot’s onboard cameras feed a stream of visual tokens into a transformer model. Simultaneously, the model receives a high-level semantic instruction such as “Place the bracket.” The system processes these inputs and predicts the next token in the sequence. However, instead of predicting the next word in a sentence, it predicts the next joint velocity command.
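As a rough illustration of what “predicting the next joint velocity” looks like in code, here is a toy pixels-to-actions policy. Every dimension, the single-camera setup, and the instruction-embedding shortcut are assumptions for the sketch; Figure has not published its actual architecture.

```python
# Toy "pixels-to-actions" policy in PyTorch. All sizes, the single camera,
# and the learned instruction embedding are illustrative assumptions.
import torch
import torch.nn as nn

class ToyVisionActionModel(nn.Module):
    def __init__(self, d_model=256, n_joints=16, n_instructions=32):
        super().__init__()
        # Tokenize the camera frame: 16x16 patches become "visual tokens".
        self.patchify = nn.Conv2d(3, d_model, kernel_size=16, stride=16)
        # A learned embedding stands in for the semantic instruction,
        # e.g. "Place the bracket" (a real system would embed language).
        self.instruction = nn.Embedding(n_instructions, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)
        # Where GPT has a next-word head, this head emits the next
        # joint-velocity command.
        self.action_head = nn.Linear(d_model, n_joints)

    def forward(self, image, instruction_id):
        tokens = self.patchify(image).flatten(2).transpose(1, 2)  # (B, N, d)
        instr = self.instruction(instruction_id).unsqueeze(1)     # (B, 1, d)
        out = self.backbone(torch.cat([instr, tokens], dim=1))
        return self.action_head(out[:, 0])  # (B, n_joints) joint velocities

policy = ToyVisionActionModel()
frame = torch.rand(1, 3, 224, 224)             # one camera frame
velocities = policy(frame, torch.tensor([3]))  # instruction id 3
print(velocities.shape)                        # torch.Size([1, 16])
```

The key point is the shape of the interface: pixels and an instruction in, a continuous motor command out, with no hand-written geometry in between.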

This “pixels-to-actions” architecture allows the robot to generalize. If the part is rotated 10 degrees, the neural network doesn’t error out; it simply predicts a slightly different hand trajectory, just as a human would intuitively adjust their grip. A traditional robot would simply crash.

VAM Training Pipeline Details

The model isn’t just watching video; it’s ingesting a multimodal stream. The “reasoning” part of the VAM allows it to plan. When given a command, the model decomposes the task into a latent plan. This plan isn’t a hard-coded sequence but a probabilistic cloud of potential actions. As the robot moves, it constantly re-evaluates the probability of success for the next 100 milliseconds of movement. This 10Hz control loop is what allows it to correct for micro-slippages or unexpected vibrations on the chassis line. The “reasoning” layer sits above the “motor control” layer, effectively acting as a prefrontal cortex guiding the motor cortex.
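Here is a minimal sketch of that 10Hz loop. The `camera`, `arm`, and `policy` objects are hypothetical stand-ins for interfaces Figure has not made public:

```python
# Hedged sketch of the 10 Hz re-planning loop described above. The
# camera/arm/policy interfaces are hypothetical, not a real API.
import time

CONTROL_PERIOD_S = 0.1  # 10 Hz: one decision per 100 ms of movement

def control_loop(policy, camera, arm, instruction, p_threshold=0.5):
    while not arm.task_done():
        tick = time.monotonic()
        frame = camera.read()
        # The model proposes the next 100 ms of motion together with its
        # own estimated probability that the motion still succeeds.
        action, p_success = policy.predict(frame, instruction)
        if p_success < p_threshold:
            # The latent plan no longer matches reality (slippage,
            # vibration on the line): hold still and re-plan.
            arm.hold_position()
            policy.replan(frame, instruction)
        else:
            arm.apply_joint_velocities(action)
        # Sleep away the remainder of the 100 ms tick.
        time.sleep(max(0.0, CONTROL_PERIOD_S - (time.monotonic() - tick)))
```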

This is why the “Sim-to-Real” gap was the biggest hurdle.

Crossing the Sim-to-Real Chasm

Reinforcement Learning (RL) lets a robot accumulate the equivalent of thousands of years of practice in simulation in just a few days. However, simulations are notoriously “clean.” Friction is constant, lighting is uniform, and physics engines rarely capture the chaotic reality of a factory floor.
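The arithmetic behind that claim is simple to check. With illustrative figures (neither the environment count nor the speed-up is a published number), say 10,000 parallel simulated environments each running at 100× real time:

$$10{,}000 \times 100 = 10^{6}\times \text{real time} \quad\Rightarrow\quad 3\ \text{days of wall clock} \approx 8{,}200\ \text{years of experience}$$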

When you take a brain trained in “The Matrix” and put it in a dirty, noisy BMW factory, it usually suffers from distribution shift: essentially a digital seizure where the sensors don’t match the training data. The BMW pilot shows that Figure has closed this gap with Domain Randomization. By training the model on millions of variations of lighting, noise, and friction in simulation, the real world just looks like “another variation” to the neural net. This “Sim-to-Real” transfer is the holy grail of general-purpose robotics, allowing the fleet to learn from failure in the cloud without breaking expensive hardware in reality.
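In code, domain randomization is almost embarrassingly simple: every training episode draws its world parameters from wide distributions. The ranges and the `make_sim` factory below are assumptions for the sketch:

```python
# Sketch of domain randomization: each episode trains in a freshly
# perturbed world, so the real factory later looks like just another
# sample. Ranges and the make_sim()/policy interfaces are assumptions.
import random

def randomized_params():
    return {
        "light_intensity": random.uniform(0.2, 3.0),   # deep shadow to welding glare
        "friction":        random.uniform(0.3, 1.2),   # gripper-to-part contact
        "camera_noise":    random.uniform(0.0, 0.05),  # pixel noise std-dev
        "part_offset_mm":  random.uniform(-5.0, 5.0),  # chassis position drift
    }

def train(policy, make_sim, episodes=1_000_000):
    for _ in range(episodes):
        sim = make_sim(**randomized_params())   # a new world every episode
        rollout = sim.run_episode(policy)       # act, observe, collect reward
        policy.update(rollout)                  # RL update, e.g. PPO
```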

The Economics of “General Purpose”

Why use a $150,000 humanoid when a $30,000 KUKA arm could do this?

This is the central question that skeptics (and short sellers) have raised since the project’s inception. The answer lies in the specific constraints of the BMW pilot setup. The Figure 02 was deployed into an existing human workspace.

  • No Cages: The robot worked alongside people, without the expensive yellow safety cages that define traditional automation.
  • No Re-tooling: The chassis line wasn’t rebuilt for the robot. The robot adapted to the line.
  • Flexibility: If the line changes next week, the humanoid learns the new task via a software update. A dedicated arm would need to be physically unbolted, moved, and reprogrammed by a specialized integrator.

This is the definition of Brownfield Automation. It allows manufacturers to automate plants that were built in the 1990s without tearing them down. The “Robot Tax” (the hidden cost of modifying the factory environment to suit the robot) drops to near zero. Furthermore, the robot is not an asset tied to a specific product line. When the BMW X5 lifecycle ends, the robot walks to the X7 line. A fixed arm goes to the scrapyard.

The Cost of Labor Parity

At $150,000, assuming a 5-year lifespan and modest maintenance, the hourly cost of a Figure 02 is roughly $10-$15/hour. The fully burdened cost of a UAW worker in Spartanburg often exceeds $65/hour. The ROI calculation has shifted from “experimental technology” to “aggressive capital expenditure.” With 24/7 uptime (swapping batteries rather than shifts), a single robot effectively replaces 3 or 4 human shifts, pushing the payback period under 12 months.
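For readers who want to check the math, here is the back-of-envelope version. The maintenance figure is an assumption chosen to land in the $10-$15/hour range cited above; the rest follows from the numbers in the text:

```python
# Back-of-envelope labor-parity math. ANNUAL_MAINTENANCE is an assumed
# figure (service, spares, compute, power); the rest comes from the text.
ROBOT_PRICE = 150_000           # USD purchase price
LIFESPAN_YEARS = 5
ANNUAL_MAINTENANCE = 75_000     # assumption, not a disclosed number
HOURS_PER_YEAR = 24 * 365       # 24/7 uptime via battery swaps
HUMAN_RATE = 65                 # USD/hour, fully burdened cost

hourly = (ROBOT_PRICE / LIFESPAN_YEARS + ANNUAL_MAINTENANCE) / HOURS_PER_YEAR
print(f"robot cost: ${hourly:.2f}/hour")   # ~$11.99/hour

savings_per_year = (HUMAN_RATE - hourly) * HOURS_PER_YEAR
print(f"payback: {ROBOT_PRICE / savings_per_year * 12:.1f} months")  # ~3.9
```

Even with a pessimistic maintenance assumption, the payback stays comfortably under the 12 months the article cites.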

Forward-Looking Analysis: The 2026 Ramp

The Spartanburg success is the green light for mass deployment. Industry analysts expect three major shifts in 2026:

1. The “Whites of Their Eyes” Phase

Robots will move from “caged pilots” to “collaborative distinct zones.” They won’t be shoulder-to-shoulder with humans yet (safety regulations lag behind technology), but they will share the same air.

2. The Compute Bottleneck

Figure 02 carries a massive onboard computer to run its VAM. As these fleets scale, the demand for edge-inference chips (like Nvidia’s Jetson Thor) will skyrocket. The limiting factor for deployment won’t be the robot hardware; it will be the availability of localized compute.

3. The Union Response

The UAW and German works councils have been quiet about these pilots because they were “experimental.” Now that the experiment is a success, the labor conversation will shift from “if” to “how many.” Expect contracts in 2026 to explicitly define “Humanoid Ratios” per assembly line.

The Verdict

The Figure 02 pilot at BMW wasn’t a science project. It was a job interview. And for the first time in history, a bipedal robot walked in, did the job, and didn’t crash.

The “Useful Humanoid” is no longer a promise. It is a clocked-in employee.
