Dataset


The contest uses the DISCOMAN (Dataset of Indoor SCenes for Odometry, Mapping And Navigation) dataset, generated from realistic SUNCG home layouts. For each scene, a randomized trajectory is generated with a physically-based motion planner, and a sequence of frames is then rendered at video frame rate using ray tracing.

The data is split into train, validation and test parts. The train and validation parts contain ground truth annotation; the test part is used for evaluating methods on the test server.


DISCOMAN dataset contains sequences with the following modalities:

  • Emulated IMU sensor values at 150 Hz

  • 24 bpp RGB images at 30 Hz

  • 16 bpp depth images at 30 Hz

All images are widescreen (640 x 360 px) with a 70-degree horizontal FOV.
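
The resolution and FOV determine approximate pinhole intrinsics. Below is a minimal sketch of deriving them and reading a frame; it assumes square pixels, a principal point at the image center, and OpenCV for 16-bit PNG I/O, none of which are stated in the dataset description itself.

    import math

    import cv2  # opencv-python; used here to read the 16-bit depth PNGs

    WIDTH, HEIGHT = 640, 360
    H_FOV_DEG = 70.0

    # Pinhole focal length in pixels from the horizontal field of view:
    # fx = (W / 2) / tan(FOV_h / 2); square pixels are assumed, so fy = fx.
    fx = (WIDTH / 2.0) / math.tan(math.radians(H_FOV_DEG) / 2.0)
    fy = fx
    cx, cy = WIDTH / 2.0, HEIGHT / 2.0  # assumed: principal point at the image center

    print(f"fx = fy = {fx:.1f} px")  # roughly 457 px for 640 px width and 70 deg FOV

    # Reading one frame; the file names are placeholders following the layout below.
    rgb = cv2.imread("YYYYYYYY_raycast.png", cv2.IMREAD_COLOR)       # 24 bpp RGB
    depth = cv2.imread("YYYYYYYY_kinect.png", cv2.IMREAD_UNCHANGED)  # 16 bpp depth
    # Note: the depth scale (units per meter) is not stated here; check before converting.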


The following ground truth annotation is provided:

  • Ground truth positions and orientations of the camera

  • 16 bpp images with per-pixel object instance labelling and corresponding semantic annotation in the same format as the COCO panoptic segmentation dataset (see the reading sketch after this list)

  • Ground truth occupancy grid for each sequence
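
A hedged reading sketch for the per-pixel annotations, assuming the instance and category images are single-channel 16-bit PNGs and that the file names follow the layout listed below:

    import json

    import cv2
    import numpy as np

    # IMREAD_UNCHANGED preserves the 16-bit values of the annotation PNGs.
    instance = cv2.imread("YYYYYYYY_instance.png", cv2.IMREAD_UNCHANGED)
    category = cv2.imread("YYYYYYYY_category.png", cv2.IMREAD_UNCHANGED)

    # Instance ids present in this frame (0 is assumed to mean "no object").
    print("instance ids:", np.unique(instance))

    # panoptic.json follows the COCO panoptic format, so category metadata
    # can be taken from its "categories" list.
    with open("panoptic.json") as f:
        panoptic = json.load(f)
    id_to_name = {c["id"]: c["name"] for c in panoptic["categories"]}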


The dataset has the following structure:
/train
    /XXXXXX
        YYYYYYYY_raycast.png - RGB image
        YYYYYYYY_kinect.png - depth sensor output
        imu_noised.csv - data from IMU
        YYYYYYYY_depth.png - ground truth depth
        YYYYYYYY_instance.png - ground truth instance masks
        YYYYYYYY_category.png - ground truth semantic categories
        YYYYYYYY_panoptic.png - visualization of ground truth annotation for panoptic segmentation
        ...
        camera_gt.csv - ground truth camera poses
        panoptic.json - annotation for panoptic segmentation in the same format as in the COCO dataset
/test
    /XXXXXX
        YYYYYYYY_raycast.png - RGB image
        YYYYYYYY_kinect.png - depth sensor output
        imu_noised.csv - data from IMU

For the train part of the data, all modalities are provided along with ground truth annotation. For the test part, ground truth annotation is not provided.
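
The sketch below walks one sequence directory under this layout and pairs per-frame files by their index prefix; it only assumes the file-name patterns listed above.

    from pathlib import Path

    def load_sequence(seq_dir):
        """Collect per-frame file paths for a single sequence directory."""
        seq_dir = Path(seq_dir)
        frames = {}
        for rgb_path in sorted(seq_dir.glob("*_raycast.png")):
            index = rgb_path.name.split("_")[0]  # the YYYYYYYY frame index
            frame = {"rgb": rgb_path, "kinect": seq_dir / f"{index}_kinect.png"}
            # Ground truth files exist only in the train and validation parts.
            for kind in ("depth", "instance", "category", "panoptic"):
                candidate = seq_dir / f"{index}_{kind}.png"
                if candidate.exists():
                    frame[kind] = candidate
            frames[index] = frame
        return frames

    # Usage: frames = load_sequence("train/XXXXXX")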


IMU readings for each sequence are given in imu_noised.csv. It contains the following fields:

  • id - frame index

  • accelerometer.x, accelerometer.y, accelerometer.z - accelerometer outputs

  • gyroscope.x, gyroscope.y, gyroscope.z - gyroscope outputs

IMU outputs are sampled at 150 Hz.
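
A minimal parsing sketch, assuming the CSV has a header row with exactly these column names (pandas here is an arbitrary choice):

    import pandas as pd

    imu = pd.read_csv("imu_noised.csv")  # assumes a header row with the fields above

    accel = imu[["accelerometer.x", "accelerometer.y", "accelerometer.z"]].to_numpy()
    gyro = imu[["gyroscope.x", "gyroscope.y", "gyroscope.z"]].to_numpy()

    IMU_RATE_HZ = 150.0
    dt = 1.0 / IMU_RATE_HZ  # nominal spacing between consecutive samples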


Ground truth camera poses for each sequence are given in camera_gt.csv. It contains the following fields:

  • id - frame index

  • position.x, position.y, position.z - absolute coordinates of the camera

  • quaternion.w, quaternion.x, quaternion.y, quaternion.z - orientation of the camera in the form of a quaternion

Poses are provided in a scene coordinate frame, which is linked to the layout of the house. The coordinate frame is right-handed with the +Y axis up. Ground truth poses are sampled at 150 Hz, similarly to the IMU.
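
As an illustration of how these fields map to rigid-body transforms, here is a hedged sketch that turns each row into a 4x4 matrix; it assumes the pose describes the camera in the scene frame (camera-to-world), and note that SciPy expects the quaternion in (x, y, z, w) order while the CSV lists w first.

    import numpy as np
    import pandas as pd
    from scipy.spatial.transform import Rotation

    poses = pd.read_csv("camera_gt.csv")

    def row_to_matrix(row):
        """Build a 4x4 transform from one camera_gt.csv row (assumed camera-to-world)."""
        T = np.eye(4)
        T[:3, :3] = Rotation.from_quat([
            row["quaternion.x"], row["quaternion.y"],
            row["quaternion.z"], row["quaternion.w"],  # SciPy order: x, y, z, w
        ]).as_matrix()
        T[:3, 3] = [row["position.x"], row["position.y"], row["position.z"]]
        return T

    transforms = [row_to_matrix(row) for _, row in poses.iterrows()]

    # Relative motion between consecutive samples, as used in odometry evaluation.
    rel_01 = np.linalg.inv(transforms[0]) @ transforms[1]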


The dataset will be available after July 15, 2019 [LINK]


Related links

SUNCG – a collection of realistic 3D scenes of indoor environments that we used for data generation