Generalist robot learning remains constrained by data: large-scale, diverse, and high-quality interaction data are expensive to collect in the real world. While simulation has emerged as a promising way to scale up data collection, the steps it entails, including simulation task design, task-aware scene generation, expert demonstration synthesis, and sim-to-real transfer, still demand substantial human effort.
We present AnyTask, an automated framework that pairs massively parallel GPU simulation with foundation models to design diverse manipulation tasks and synthesize robot data. We introduce three AnyTask agents that generate expert demonstrations, each aiming to solve as many tasks as possible:
- ViPR: A novel task and motion planning agent with VLM-in-the-loop Parallel Refinement (see the sketch after this list).
- ViPR-Eureka: A reinforcement learning agent with generated dense rewards and LLM-guided contact sampling.
- ViPR-RL: A hybrid planning and learning approach that jointly produces high-quality demonstrations with only sparse rewards.
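To make the refinement idea concrete, below is a minimal sketch of what a VLM-in-the-loop parallel refinement loop could look like. All names (`Plan`, `sample_plan_variants`, `rollout_batch`, `vlm_critique`) are illustrative placeholders rather than AnyTask's actual API, and the simulation and VLM calls are stubbed out.

```python
# Hypothetical sketch of VLM-in-the-loop parallel refinement; every name
# here is an illustrative placeholder, not AnyTask's actual interface.
import random
from dataclasses import dataclass

@dataclass
class Plan:
    grasp_pose: tuple  # candidate grasp position (illustrative)
    waypoints: list    # motion waypoints (illustrative)

def sample_plan_variants(base: Plan, n: int) -> list:
    """Perturb a seed plan into n candidates for parallel rollout."""
    return [
        Plan(tuple(x + random.gauss(0.0, 0.01) for x in base.grasp_pose),
             list(base.waypoints))
        for _ in range(n)
    ]

def rollout_batch(plans: list) -> list:
    """Execute all candidates in parallel simulation and return success
    flags. Stubbed with random outcomes here."""
    return [random.random() > 0.5 for _ in plans]

def vlm_critique(plan: Plan) -> Plan:
    """Show a VLM rendered frames of a failed rollout and ask it to
    propose a refined plan. Stubbed as the identity here."""
    return plan

def refine(seed: Plan, rounds: int = 3, width: int = 8) -> list:
    """Roll out candidates in parallel, keep successes as demonstrations,
    and feed failures back through the VLM for the next round."""
    demos, candidates = [], sample_plan_variants(seed, width)
    for _ in range(rounds):
        successes = rollout_batch(candidates)
        demos += [p for p, ok in zip(candidates, successes) if ok]
        candidates = [vlm_critique(p)
                      for p, ok in zip(candidates, successes) if not ok]
        if not candidates:
            break
    return demos

demos = refine(Plan(grasp_pose=(0.4, 0.0, 0.2), waypoints=[(0.4, 0.0, 0.4)]))
```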
We train behavior cloning policies on the generated data, validate them in simulation, and deploy them directly on real robot hardware. The policies generalize to novel object poses, achieving a 44% average success rate across a suite of real-world pick-and-place, drawer opening, contact-rich pushing, and long-horizon manipulation tasks.
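As a rough illustration of the behavior cloning step, the PyTorch sketch below regresses expert actions from observations. The MLP architecture, the observation/action dimensions, and the synthetic tensors are assumptions for exposition only, not the policy or data format used by AnyTask.

```python
# Minimal behavior-cloning sketch; architecture and dimensions are
# illustrative assumptions, not AnyTask's actual policy.
import torch
import torch.nn as nn

policy = nn.Sequential(          # small MLP stand-in for the real policy
    nn.Linear(64, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 8),
)

# Placeholder demonstrations: (observation, expert action) pairs.
obs = torch.randn(4096, 64)
act = torch.randn(4096, 8)

opt = torch.optim.Adam(policy.parameters(), lr=1e-4)
for epoch in range(10):
    loss = nn.functional.mse_loss(policy(obs), act)  # regress expert actions
    opt.zero_grad()
    loss.backward()
    opt.step()
```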
Figure 1: AnyTask System Overview. The pipeline first composes simulated manipulation tasks from an object database and a set of high-level task types. It automatically generates task descriptions and simulation code, then efficiently collects data with the ViPR, ViPR-RL, and ViPR-Eureka agents in massively parallel environments. Online domain randomization diversifies scenes and visual observations, enabling policies trained on this simulated data to transfer zero-shot to the real world.
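One way to picture the caption's online domain randomization: draw an independent parameter set for every parallel environment at reset, so each episode sees a different scene and appearance. The parameter names and ranges below are illustrative assumptions, not the framework's actual configuration.

```python
# Illustrative per-environment randomization; names and ranges are assumed.
import numpy as np

RANDOMIZATION = {
    "object_pose_xy":  (-0.10, 0.10),  # m, offset from nominal placement
    "object_yaw":      (-np.pi, np.pi),
    "table_friction":  (0.5, 1.2),
    "light_intensity": (0.6, 1.4),     # relative scale
    "camera_jitter":   (0.0, 0.02),    # m, perturbation of camera pose
}

def sample_randomization(num_envs: int, rng: np.random.Generator) -> dict:
    """Independent draw per parallel environment for every parameter."""
    return {
        key: rng.uniform(low, high, size=num_envs)
        for key, (low, high) in RANDOMIZATION.items()
    }

params = sample_randomization(num_envs=4096, rng=np.random.default_rng(0))
```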
Figure 2: Object Database. We generate diverse manipulation tasks using an object database.
Sim-to-Real Transfer: We deploy the policies trained in simulation directly on the real robot. All videos are shown at original speed (1x).
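At deployment time, zero-shot transfer reduces to a plain control loop: read an observation, run the policy, stream the action to the robot. Everything below (`Camera`, `Robot`, the stand-in policy) is a hypothetical stub, since the hardware drivers are not described here.

```python
# Hypothetical zero-shot deployment loop; Camera, Robot, and the policy
# are stand-in stubs, not real driver APIs.
import numpy as np
import torch

class Camera:
    """Placeholder sensor; a real driver would return camera observations."""
    def read(self) -> np.ndarray:
        return np.zeros(64, dtype=np.float32)

class Robot:
    """Placeholder arm interface; a real driver would send joint commands."""
    def apply_action(self, action: np.ndarray) -> None:
        pass

policy = torch.nn.Linear(64, 8)  # stand-in for the trained BC policy
camera, robot = Camera(), Robot()

for _ in range(100):  # fixed-length control loop for the sketch
    obs = torch.from_numpy(camera.read()).unsqueeze(0)
    with torch.no_grad():
        action = policy(obs).squeeze(0).numpy()
    robot.apply_action(action)
```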