Play Datasets for Inverse Dynamics Model Training

We train four inverse dynamics models that are used to label the human video demonstrations with actions. Each inverse model is trained on a robot play dataset and is shared for one environment generalization experiment and one task generalization experiment. When training the inverse models, we mask a fixed portion of every input image and train the models only on masked data. Sample clips from the four play datasets we collected are shown below, along with their masked versions. All videos are sped up by 3x.

Play Dataset #1
(Tasks: reaching & cube stacking)

Play Dataset #2
(Tasks: cube grasping & cube pick-and-place)

Play Dataset #3
(Tasks: plate clearing)

Play Dataset #4
(Tasks: toy packing)

Expert Video Demonstrations for Behavioral Cloning

For each environment generalization and task generalization task, we collect expert robot video demonstrations and expert human video demonstrations. The robot demonstrations are narrow, while the human demonstrations are visually and/or behaviorally diverse. We mask a fixed portion of every image (as done with the play datasets) and show both the original and masked versions of the videos below. The masked versions are used for training policies under our method. All videos are sped up by 3x.

Reaching Task (Environment Generalization)

Left: sample robot demonstrations with no distractors (only red cube)
Right: sample human demonstrations with red cube and two distractors (blue cube and green sponge)

robot demonstrations

human demonstrations

Cube Grasping Task (Environment Generalization)

Left: sample robot demonstrations where the cube rests on a plain white background
Right: sample human demonstrations with each of the following backgrounds: rainbow floral texture, green floral texture, blue floral texture, orange plate, green plate, and blue plate

robot demonstrations

human demonstrations

Plate Clearing Task (Environment Generalization)

Left: sample robot demonstrations where the only target object is a green sponge
Right: sample human demonstrations with each of the following target objects: yellow sponge, blue towel, and pink towel

robot demonstrations

human demonstrations

Toy Packing Task (Environment Generalization)

Left: sample robot demonstrations where the only target object is a black suit vampire toy
Right: sample human demonstrations with each of the following toys: white mummy toy, orange body jack-o'-lantern toy, red cape vampire toy, purple body green zombie toy, and crazy witch toy

robot demonstrations

human demonstrations

Cube Stacking Task (Task Generalization)

Left: sample robot demonstrations where the expert only grasps the red cube
Right: sample human demonstrations where the expert performs portions of the stacking task after the grasp

robot demonstrations

human demonstrations

Cube Pick-and-Place Task (Task Generalization)

Left: sample robot demonstrations where the expert only grasps the cube
Right: sample human demonstrations where the expert performs the full cube pick-and-place task

robot demonstrations

human demonstrations

Plate Clearing Task (Task Generalization)

Left: sample robot demonstrations where the expert only grasps the target object (green sponge)
Right: sample human demonstrations where expert performs portions of the plate clearing task after the grasp

robot demonstrations

human demonstrations

Toy Packing Task (Task Generalization)

Left: sample robot demonstrations where the expert only grasps the target object (black suit vampire toy)
Right: sample human demonstrations where expert performs portions of the toy packing task after the grasp

robot demonstrations

human demonstrations

Method Overview

Generalizing to New Environments

Toy Packing Task

Sample Policy Rollouts in Other Tasks

Generalizing to New Tasks

Toy Packing Task

Sample Policy Rollouts in Other Tasks

Eye-in-Hand Video Replay of Toy Packing Task

Play Datasets for Inverse Dynamics Model Training

Expert Video Demonstrations for Behavioral Cloning

Reaching Task (Environment Generalization)

Cube Grasping Task (Environment Generalization)

Plate Clearing Task (Environment Generalization)

Toy Packing Task (Environment Generalization)

Cube Stacking Task (Task Generalization)

Cube Pick-and-Place Task (Task Generalization)

Plate Clearing Task (Task Generalization)

Toy Packing Task (Task Generalization)