Bimanual Data Collection
The DK1 is designed from the ground up for bimanual teleoperation data collection. This guide covers wiring both arms and cameras, running the leader/follower recording procedure, understanding the bimanual dataset schema, and getting your data ready for ACT training.
Hardware Connections for Bimanual Recording
Bimanual recording requires more connections than a single-arm setup. Verify every connection before starting LeRobot, because a connection that drops mid-session corrupts the episode being recorded.
Leader Arm (Dynamixel XL330)
USB-C from leader arm to host PC. This arm is moved by the operator's hand. Use a short cable (1 m) to avoid accidental disconnections during teleop. Verify: ls /dev/ttyACM0
Follower Arm (DM4340 + power)
USB-C from follower arm to host PC plus DC power supply. The follower arm requires external power — never run on USB power alone. Verify: ls /dev/ttyACM1
Wrist Camera (follower arm)
Mount a USB webcam to the follower arm's end-effector. This is the primary manipulation camera. Connect via USB 3.0. Verify: ls /dev/video0
Overhead / Workspace Camera
Fixed camera above the bimanual workspace at ~70 cm height, angled 30° down. Captures both arms simultaneously. Second USB 3.0 port. Verify: ls /dev/video2
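The four "Verify" commands above can be combined into a single pre-flight check. This is a minimal sketch, assuming the device paths listed in this guide; the `EXPECTED_DEVICES` mapping and the `missing_devices` helper are illustrative names and not part of LeRobot. Device numbering can differ between hosts, so adjust the paths to match your machine.

```python
import os

# Device paths taken from the "Verify" commands in this guide.
# Numbering may differ on your host; treat these as assumptions.
EXPECTED_DEVICES = {
    "leader arm": "/dev/ttyACM0",
    "follower arm": "/dev/ttyACM1",
    "wrist camera": "/dev/video0",
    "overhead camera": "/dev/video2",
}


def missing_devices(expected, exists=os.path.exists):
    """Return the names of devices whose path is not present.

    `exists` is injectable so the check can be exercised without hardware.
    """
    return [name for name, path in expected.items() if not exists(path)]


missing = missing_devices(EXPECTED_DEVICES)
if missing:
    print("Missing: " + ", ".join(missing))
else:
    print("All four devices present")
```

A path being present only shows that the kernel enumerated the device. It does not confirm that the arm is powered or that the camera is delivering frames.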
Critical: bimanual synchronization. With two arms and two cameras, synchronization is the most important data quality factor for the DK1. LeRobot timestamps all streams from the host PC clock. To minimize timestamp skew: (1) use separate USB bus controllers for cameras and arms, (2) use USB 3.0 hubs with stable clocks, (3) set CPU governor to performance mode. Target: <5 ms skew between all four streams. A 10 ms desync between left and right arm states can cause ACT training failures on contact-rich tasks.
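To make the 5 ms target concrete, skew can be measured per frame as the spread between the earliest and latest timestamp across the four streams. The sketch below assumes you have already extracted per-frame host timestamps (in seconds) for each stream into equal-length lists; the stream names and the `max_skew_ms` helper are hypothetical and not part of LeRobot.

```python
def max_skew_ms(streams):
    """Worst-case per-frame timestamp spread across streams, in milliseconds.

    `streams` maps a stream name to a list of host-clock timestamps in
    seconds, one per frame. All lists must have the same length.
    """
    lengths = {len(ts) for ts in streams.values()}
    if len(lengths) != 1:
        raise ValueError("streams have different frame counts")
    worst = 0.0
    for frame_stamps in zip(*streams.values()):
        spread = max(frame_stamps) - min(frame_stamps)
        worst = max(worst, spread)
    return worst * 1000.0


# Illustrative values only, not measured data.
example = {
    "leader": [0.0000, 0.0333, 0.0667],
    "follower": [0.0010, 0.0340, 0.0672],
    "wrist_cam": [0.0020, 0.0352, 0.0690],
    "overhead_cam": [0.0015, 0.0361, 0.0701],
}
skew = max_skew_ms(example)
print(f"max skew: {skew:.1f} ms, within target: {skew < 5.0}")
```

Taking the maximum over all frames is deliberately strict: a single bad frame is enough to flag the recording for a closer look.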
Leader/Follower Teleoperation Recording Procedure
Follow these steps for every DK1 recording session. The bimanual procedure has a few extra steps compared to single-arm collection.
Pre-session safety check
Clear the shared workspace between both arms (1.5 m × 1 m). Verify that both arms can reach the shared workspace without colliding. Test the E-stop before recording. See the Safety page.
Connect and verify both arms
# Verify serial ports are available
ls /dev/ttyACM*
# Expected: /dev/ttyACM0 (leader) and /dev/ttyACM1 (follower)
# Quick connection test
python -m lerobot.scripts.control_robot \
--robot.type=bi_dk1_follower \
--robot.config=~/.lerobot/robots/dk1_bimanual.yaml \
--control.type=none
Verify camera feeds
Both cameras must be streaming before starting LeRobot. A missing camera will silently produce episodes with null image frames.
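python3 -c "
import cv2
# Indices 0 and 2 match /dev/video0 (wrist) and /dev/video2 (overhead)
for i in [0, 2]:
    cap = cv2.VideoCapture(i)
    # Guard against a device that opens but returns no frame
    ret, frame = cap.read() if cap.isOpened() else (False, None)
    if ret:
        print(f'Camera {i}: OK ({frame.shape[1]}x{frame.shape[0]})')
    else:
        print(f'Camera {i}: FAILED')
    cap.release()
"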
Move arms to starting position
Manually move the leader arm to the starting teleop position. The follower arm will mirror it. Hold the leader arm steady for 2–3 seconds to confirm synchronization before the warmup period starts.
Set up the task scene
Place objects in consistent starting positions for both arms. Photograph the starting configuration. For bimanual tasks, mark exact positions with tape — scene consistency is even more critical because both arm trajectories must be compatible.
Start bimanual LeRobot recording
source ~/.venvs/dk1/bin/activate
python -m lerobot.scripts.control_robot \
--robot.type=bi_dk1_follower \
--robot.config=~/.lerobot/robots/dk1_bimanual.yaml \
--control.type=record \
--control.fps=30 \
--control.repo_id=your-username/dk1-bimanual-pick-place-v1 \
--control.num_episodes=50 \
--control.single_task="Pick up block with left arm, place in bin with right arm" \
--control.warmup_time_s=5 \
--control.reset_time_s=15
Use a longer reset_time_s for bimanual tasks — resetting two arms and the scene takes more time than single-arm setups.
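The flags above also imply a lower bound on session length, which helps when planning operator time. The sketch below assumes one warmup at session start and a reset after every episode except the last; confirm this against your LeRobot version. The 20 s episode length is a placeholder, since the command above does not set one.

```python
def session_minutes(num_episodes, episode_time_s, reset_time_s, warmup_time_s):
    """Lower-bound session length in minutes.

    Assumes one warmup at session start and a reset after every episode
    except the last. Excludes video encoding and re-recorded episodes.
    """
    recording = num_episodes * episode_time_s
    resets = max(num_episodes - 1, 0) * reset_time_s
    return (warmup_time_s + recording + resets) / 60.0


# 50 episodes, 15 s reset, and 5 s warmup come from the command above;
# the 20 s episode length is a placeholder assumption.
print(f"{session_minutes(50, 20, 15, 5):.1f} min")
```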
Review and replay episodes
After each batch of 10 episodes, replay and review before continuing. Pay attention to arm coordination — lag between left and right arms will appear as jitter in the follower's movements.
python -m lerobot.scripts.visualize_dataset \
--repo_id=your-username/dk1-bimanual-pick-place-v1 \
--episode_index=0
Push to HuggingFace Hub
huggingface-cli login
python -m lerobot.scripts.push_dataset_to_hub \
--repo_id=your-username/dk1-bimanual-pick-place-v1
LeRobot Dataset Format for Bimanual (DK1)
The DK1 bimanual dataset schema doubles the joint state fields compared to a single-arm recording. Each episode contains synchronized observations from both leader and follower arms plus all cameras.
Directory structure
your-username/dk1-bimanual-pick-place-v1/
├── meta/
│ ├── info.json # Dataset metadata, fps, shapes, robot_type
│ ├── episodes.jsonl # Per-episode metadata (task, length, outcome)
│ └── stats.json # Min/max/mean/std for all fields
├── data/
│ └── chunk-000/
│ ├── episode_000000.parquet
│ └── ...
└── videos/
└── chunk-000/
├── observation.images.wrist_cam/
│ ├── episode_000000.mp4
│ └── ...
└── observation.images.overhead_cam/
└── ...
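The layout above follows a fixed naming pattern, so file paths can be derived from an episode index. This is a minimal sketch, assuming the three-digit chunk and six-digit episode padding shown in the tree. The tree does not say how many episodes go into each chunk, so `CHUNK_SIZE` is a placeholder, and `episode_paths` is a hypothetical helper rather than a LeRobot function.

```python
from pathlib import PurePosixPath

CHUNK_SIZE = 1000  # placeholder assumption: episodes per chunk
CAMERA_KEYS = (
    "observation.images.wrist_cam",
    "observation.images.overhead_cam",
)


def episode_paths(root, episode_index, chunk_size=CHUNK_SIZE):
    """Return the parquet path and per-camera video paths for one episode."""
    chunk = f"chunk-{episode_index // chunk_size:03d}"
    name = f"episode_{episode_index:06d}"
    root = PurePosixPath(root)
    parquet = root / "data" / chunk / f"{name}.parquet"
    videos = {
        key: root / "videos" / chunk / key / f"{name}.mp4"
        for key in CAMERA_KEYS
    }
    return parquet, videos


parquet, videos = episode_paths("dk1-bimanual-pick-place-v1", 0)
print(parquet)
for path in videos.values():
    print(path)
```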
Episode data schema (bimanual)
Quality Checklist for Bimanual Demos
Bimanual datasets have stricter quality requirements than single-arm data. Poor coordination between arms is the leading cause of DK1 policy training failure.
1. Arm synchronization delta is under 10 ms. Check the arm_sync_delta_ms field in each episode. Spikes above 10 ms indicate USB bus contention or a dropped serial packet. Delete episodes with sustained high deltas.
2. No follower arm oscillation during contact. Review follower arm trajectories at contact points (grasp, handoff, placement). Oscillation appears as high-frequency noise in observation.state. Reduce PD gains if present. See software troubleshooting.
3. Both arms complete the task in the same episode. For bimanual tasks, an episode is only valid if both arms complete their assigned subtasks. If the left arm succeeded but the right arm dropped the object, mark the episode as failed and delete or annotate it.
4. No missing camera frames. Both camera streams must have the expected number of frames. Missing frames from either camera corrupt the visuomotor policy's input. Check with lerobot.scripts.visualize_dataset.
5. Task scene was reset identically between episodes. Both arms' workspace must be reset for each episode. Object position, arm starting configuration, and camera angles must all match. Use the photographed starting configuration as reference.
6. Episode length is consistent. All successful episodes should be within ±25% of median length. Bimanual tasks often have higher variance than single-arm tasks, but extreme outliers (3× median) should be discarded.
7. Dataset stats are symmetric for both arms. In meta/stats.json, check that action_left and action_right stats are plausible for your task geometry. If one arm shows zero variance, that arm was not moving; check port assignments.
8. Teleop demonstration style is consistent. All demonstrations should use the same approach path, grasp strategy, and handoff technique. Mixed strategies produce multimodal action distributions that confuse ACT training. Use a single operator per task version.
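Several of the checklist items above reduce to simple numeric tests. The sketch below covers three of them: episode length, sync delta, and zero-variance stats. It is illustrative only. The helper names are hypothetical, `min_run` is a placeholder for what counts as "sustained", and the assumed shape of meta/stats.json (a `std` list per field) should be verified against your own dataset.

```python
from statistics import median


def length_outliers(lengths, tolerance=0.25):
    """Indices of episodes whose length is outside ±tolerance of the median."""
    mid = median(lengths)
    low, high = mid * (1 - tolerance), mid * (1 + tolerance)
    return [i for i, n in enumerate(lengths) if not low <= n <= high]


def has_sustained_delta(deltas_ms, limit_ms=10.0, min_run=5):
    """True if `min_run` or more consecutive samples exceed `limit_ms`."""
    run = 0
    for delta in deltas_ms:
        run = run + 1 if delta > limit_ms else 0
        if run >= min_run:
            return True
    return False


def zero_variance_fields(stats, fields=("action_left", "action_right")):
    """Fields whose per-dimension std is all zero (that arm never moved).

    Assumes stats[field]["std"] is a list of floats.
    """
    return [f for f in fields if f in stats and not any(stats[f]["std"])]


# Illustrative values only, not measured data.
print(length_outliers([300, 310, 290, 305, 900]))
print(has_sustained_delta([2, 3, 12, 14, 11, 13, 15, 4]))
print(zero_variance_fields({
    "action_left": {"std": [0.1, 0.2]},
    "action_right": {"std": [0.0, 0.0]},
}))
```

Oscillation at contact, task success, and demonstration style still need a human reviewing the replay; these checks only narrow down which episodes to look at first.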
Training ACT on Your Bimanual Dataset
Once your dataset passes the quality checklist, train ACT or Diffusion Policy directly with LeRobot. ACT is recommended for DK1 bimanual tasks — its chunked action prediction handles the coordination between arms better than single-step policies.
Train ACT (recommended for bimanual)
python -m lerobot.scripts.train \
--policy.type=act \
--dataset.repo_id=your-username/dk1-bimanual-pick-place-v1 \
--policy.chunk_size=100 \
--policy.n_action_steps=100 \
--training.num_epochs=5000 \
--training.batch_size=8 \
--output_dir=outputs/dk1-act-bimanual
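The chunk settings translate directly into wall-clock time at the recording frame rate. As a quick arithmetic sketch, assuming the policy runs at the same 30 fps used for recording (the `chunk_horizon_s` helper is illustrative):

```python
def chunk_horizon_s(chunk_size, fps):
    """Seconds of robot motion covered by a given number of action steps."""
    return chunk_size / fps


fps = 30              # --control.fps from the recording command
chunk_size = 100      # --policy.chunk_size
n_action_steps = 100  # --policy.n_action_steps

print(f"chunk horizon: {chunk_horizon_s(chunk_size, fps):.2f} s")
print(f"executed per prediction: {chunk_horizon_s(n_action_steps, fps):.2f} s")
```

With n_action_steps equal to chunk_size, each prediction covers a little over three seconds of motion for both arms. If your episodes are much shorter than that, a smaller chunk size may be worth trying.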
Train Diffusion Policy (for contact-rich tasks)
python -m lerobot.scripts.train \
--policy.type=diffusion \
--dataset.repo_id=your-username/dk1-bimanual-pick-place-v1 \
--training.num_epochs=8000 \
--output_dir=outputs/dk1-diffusion-bimanual
Go deeper: Read the full Data Collection Pipeline Overview in the Robotics Library for a thorough treatment of episode structure, dataset versioning, synchronization strategies, and multi-task bimanual dataset composition.