HARPER

Abstract

We introduce HARPER, a novel dataset for 3D body pose estimation and forecasting in dyadic interactions between users and Spot, the quadruped robot manufactured by Boston Dynamics. The key novelty is the focus on the robot’s perspective, i.e., on the data captured by the robot’s sensors. Because these sensors sit close to the ground, they capture humans only partially, which makes 3D body pose analysis challenging. The scenario underlying HARPER includes 15 actions, of which 10 involve physical contact between the robot and users. The corpus contains not only the recordings of Spot's built-in stereo cameras, but also those of a 6-camera OptiTrack system (all recordings are synchronized). This yields ground-truth skeletal representations with sub-millimeter precision. In addition, the corpus includes reproducible benchmarks on 3D Human Pose Estimation, Human Pose Forecasting, and Collision Prediction, all based on publicly available baseline approaches. This enables future HARPER users to rigorously compare their results with those we provide in this work.

The HARPER Dataset

We introduce HARPER, a novel dataset for 3D body pose estimation and forecasting in dyadic interactions between users and Spot, the quadruped robot manufactured by Boston Dynamics. The key novelty is the focus on the robot’s perspective, i.e., on the data captured by the robot’s sensors.

[Videos: OptiTrack data; Spot + OptiTrack data]

Data

15 different interactions
17 participants
10 actions involve physical contact between Spot and users
Recordings of the built-in stereo cameras of Spot
Recordings of a 6-camera OptiTrack system
Ground-truth skeletal representations with sub-millimeter precision

Sensors

Spot's stereo cameras (5 greyscale + depth and 1 RGB-D)
6-camera OptiTrack system
External RGB Camera

Annotations

Human 21-joint 3D skeletal model
Spot 21-joint 3D skeletal model
2D keypoints annotated on Spot's cameras
Per-keypoint visibility
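
To make the annotation types above concrete, one annotated frame could be held in a structure like the one below. This is only a minimal sketch: the class name `FrameAnnotation`, the field names, and the choice of one camera per record are assumptions for illustration, not the dataset's actual file layout.

```python
from dataclasses import dataclass
from typing import List, Tuple

NUM_JOINTS = 21  # both skeletal models listed above have 21 joints

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


@dataclass
class FrameAnnotation:
    """Hypothetical container for one annotated frame (not the official format)."""
    human_joints_3d: List[Point3D]  # 21 human joints in 3D
    spot_joints_3d: List[Point3D]   # 21 Spot joints in 3D
    keypoints_2d: List[Point2D]     # 2D keypoints in one Spot camera image (assumed)
    visible: List[bool]             # per-keypoint visibility flags

    def __post_init__(self) -> None:
        # Reject records whose fields do not all describe 21 joints.
        for name in ("human_joints_3d", "spot_joints_3d", "keypoints_2d", "visible"):
            if len(getattr(self, name)) != NUM_JOINTS:
                raise ValueError(f"{name} must have {NUM_JOINTS} entries")

    def visible_keypoints(self) -> List[Tuple[int, Point2D]]:
        """Return (joint index, 2D keypoint) pairs for visible joints only."""
        return [
            (i, kp)
            for i, (kp, is_visible) in enumerate(zip(self.keypoints_2d, self.visible))
            if is_visible
        ]
```

Keeping the visibility flags next to the 2D keypoints makes it easy to skip joints that fall outside the image, which matters here because the low camera viewpoint often captures the person only partially.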

Benchmarks (from Spot's Perspective)

3D Human Pose Estimation

Estimate the 3D human pose from the robot's perspective (camera + depth).
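
As a rough illustration of how camera and depth data can be combined, a 2D keypoint with a depth reading can be lifted to a 3D point using a pinhole camera model. The sketch below is not the benchmark's baseline method: the function name `backproject` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical, and lens distortion and depth-to-image registration are ignored.

```python
from typing import Tuple


def backproject(u: float, v: float, depth: float,
                fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float, float]:
    """Lift pixel (u, v) with metric depth to a 3D point in the camera frame.

    Assumes an ideal pinhole camera: focal lengths (fx, fy) and principal
    point (cx, cy) are given in pixels, depth is measured along the optical axis.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)
```

A keypoint at the principal point maps onto the optical axis, so `backproject(320, 240, 2.0, 500, 500, 320, 240)` gives a point two units straight ahead of the camera.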

Human Pose Forecasting

Forecast future human poses (all keypoints) from the robot's perspective.
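
To show what the forecasting task takes in and returns, the sketch below extrapolates every joint linearly from the last two observed poses. This constant-velocity rule is a naive reference point chosen here for illustration; it is not one of the baselines used in the benchmark, and the function name and pose representation are assumptions.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]
Pose = List[Point3D]  # one 3D point per joint


def forecast_constant_velocity(prev: Pose, curr: Pose, steps: int) -> List[Pose]:
    """Extrapolate each joint linearly from the last two observed poses."""
    if len(prev) != len(curr):
        raise ValueError("poses must have the same number of joints")
    # Per-joint displacement between the two observed frames.
    velocity = [tuple(c - p for p, c in zip(pj, cj)) for pj, cj in zip(prev, curr)]
    # Future pose k is the current pose shifted by k displacements.
    return [
        [tuple(c + k * v for c, v in zip(cj, vj)) for cj, vj in zip(curr, velocity)]
        for k in range(1, steps + 1)
    ]
```

The output has one pose per future step, each with the same number of joints as the input, which is the shape a learned forecaster would also need to produce.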

Collision Prediction

Predict collisions between the robot and the human from the forecast poses.
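
One simple way to turn forecast poses into a collision prediction is to flag the first future step at which any human joint comes within a distance threshold of any robot joint. The sketch below assumes exactly that criterion; the function names, the joint-to-joint distance test, and the `threshold` parameter are illustrative assumptions, and the benchmark's own collision definition may differ.

```python
import math
from typing import List, Optional, Tuple

Point3D = Tuple[float, float, float]
Pose = List[Point3D]  # one 3D point per joint


def min_joint_distance(human: Pose, robot: Pose) -> float:
    """Smallest Euclidean distance between any human joint and any robot joint."""
    return min(math.dist(h, r) for h in human for r in robot)


def first_collision_step(human_forecast: List[Pose], robot_forecast: List[Pose],
                         threshold: float) -> Optional[int]:
    """Index of the first forecast step with joints closer than threshold, else None."""
    for step, (human, robot) in enumerate(zip(human_forecast, robot_forecast)):
        if min_joint_distance(human, robot) < threshold:
            return step
    return None
```

Returning the step index rather than a plain yes/no also tells the caller how far ahead the predicted contact is.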

3D Human Pose Estimation and Forecasting from the Robot’s Perspective
