Colonoscopy 3D Video Dataset with Paired Depth from 2D-3D Registration
Johns Hopkins University
Screening colonoscopy is an important clinical application for several 3D computer vision techniques, including depth estimation, surface reconstruction, and missing region detection. However, the development, evaluation, and comparison of these techniques in real colonoscopy videos remain largely qualitative due to the difficulty of acquiring ground truth data. In this work, we present a Colonoscopy 3D Video Dataset (C3VD) acquired with a high definition clinical colonoscope and high-fidelity colon models for benchmarking computer vision methods in colonoscopy. We introduce a novel multimodal 2D-3D registration technique to register optical video sequences with ground truth rendered views of a known 3D model. The different modalities are registered by transforming optical images to depth maps with a Generative Adversarial Network and aligning edge features with an evolutionary optimizer. This registration method achieves an average translation error of 0.321 millimeters and an average rotation error of 0.159 degrees in simulation experiments where error-free ground truth is available. The method also leverages video information, improving registration accuracy by 55.6% for translation and 60.4% for rotation compared to single frame registration. 22 short video sequences were registered to generate 10,015 total frames with paired ground truth depth, surface normals, optical flow, occlusion, six degree-of-freedom pose, coverage maps, and 3D models. The dataset also includes screening videos acquired by a gastroenterologist with paired ground truth pose and 3D surface models. The dataset and registration source code are available at durr.jhu.edu/C3VD.
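The translation and rotation errors quoted above compare an estimated camera pose against a ground truth pose. The page does not spell out the exact error definitions, so the following is only a minimal sketch of one common convention: Euclidean distance between the two camera positions, and the angle of the relative rotation between the two orientations. The function names and the example poses are hypothetical and are not taken from the C3VD code.

```python
import math


def pose_errors(T_est, T_gt):
    """Return (translation error, rotation error in degrees) for two 4x4 poses.

    Assumed convention (not confirmed by the source): poses are nested lists
    [[R | t], [0 0 0 1]] with the translation in the last column.
    """
    # Translation error: Euclidean distance between the two positions.
    t_est = [row[3] for row in T_est[:3]]
    t_gt = [row[3] for row in T_gt[:3]]
    t_err = math.dist(t_est, t_gt)

    # Rotation error: angle of R_gt^T @ R_est, using
    # trace(R_gt^T @ R_est) = sum_ij R_gt[i][j] * R_est[i][j].
    trace = sum(T_gt[i][j] * T_est[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # guard against rounding
    r_err = math.degrees(math.acos(cos_angle))
    return t_err, r_err


def pose_from_z_rotation(deg, t):
    """Hypothetical helper: pose rotated by `deg` about z, translated by `t`."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [
        [c, -s, 0.0, t[0]],
        [s, c, 0.0, t[1]],
        [0.0, 0.0, 1.0, t[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


# Made-up example poses, in the same length unit as the dataset poses.
T_gt = pose_from_z_rotation(0.0, (0.0, 0.0, 0.0))
T_est = pose_from_z_rotation(0.5, (0.3, 0.0, 0.4))
t_err, r_err = pose_errors(T_est, T_gt)
print(f"translation error: {t_err:.3f}, rotation error: {r_err:.3f} deg")
```

Averaging these two quantities over all frames of a sequence would give per-sequence figures comparable in form to the ones reported above, provided the same convention is used.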
Please cite our publication if you use code or data from this site.
@article{bobrow2023,
title={Colonoscopy 3D video dataset with paired depth from 2D-3D registration},
author={Bobrow, Taylor L and Golhar, Mayank and Vijayan, Rohan and Akshintala, Venkata S and Garcia, Juan R and Durr, Nicholas J},
journal={Medical Image Analysis},
pages={102956},
year={2023},
publisher={Elsevier},
}
Colonoscopy video frames (left) are registered with rendered views of a ground truth 3D model (right). Edge features (overlay) are aligned by optimizing a loss function (bottom).
Real colonoscope frames are paired with registered ground truth depth, surface normals, occlusion, and optical flow frames.
C3VD contains 22 registered videos with paired ground truth depth, surface normals, optical flow, occlusion, six degree-of-freedom pose, coverage maps, and 3D models. The dataset also includes 4 screening colonoscopy videos acquired by a gastroenterologist with paired ground truth pose and 3D surface models. 3D model files and molds are also available for download. Registration and rendering code is made available on GitHub.
For each registered video frame, the dataset includes:
For each video sequence, we also provide:
Model | Texture | Video | # Frames | Preview | Download |
---|---|---|---|---|---|
Cecum | 1 | a | 276 | Preview | cecum_t1_a.zip (2.86 GB) |
Cecum | 1 | b | 765 | Preview | cecum_t1_b.zip (8.36 GB) |
Cecum | 2 | a | 370 | Preview | cecum_t2_a.zip (3.71 GB) |
Cecum | 2 | b | 1,142 | Preview | cecum_t2_b.zip (11.06 GB) |
Cecum | 2 | c | 595 | Preview | cecum_t2_c.zip (6.13 GB) |
Cecum | 3 | a | 730 | Preview | cecum_t3_a.zip (6.80 GB) |
Cecum | 4 | a | 465 | Preview | cecum_t4_a.zip (5.04 GB) |
Cecum | 4 | b | 425 | Preview | cecum_t4_b.zip (4.41 GB) |
Descending Colon | 4 | a | 148 | Preview | desc_t4_a.zip (1.24 GB) |
Sigmoid Colon | 1 | a | 700 | Preview | sigmoid_t1_a.zip (5.20 GB) |
Sigmoid Colon | 2 | a | 514 | Preview | sigmoid_t2_a.zip (4.22 GB) |
Sigmoid Colon | 3 | a | 613 | Preview | sigmoid_t3_a.zip (4.58 GB) |
Sigmoid Colon | 3 | b | 536 | Preview | sigmoid_t3_b.zip (4.21 GB) |
Transverse Colon | 1 | a | 61 | Preview | trans_t1_a.zip (0.59 GB) |
Transverse Colon | 1 | b | 700 | Preview | trans_t1_b.zip (5.07 GB) |
Transverse Colon | 2 | a | 194 | Preview | trans_t2_a.zip (1.58 GB) |
Transverse Colon | 2 | b | 103 | Preview | trans_t2_b.zip (0.97 GB) |
Transverse Colon | 2 | c | 235 | Preview | trans_t2_c.zip (1.83 GB) |
Transverse Colon | 3 | a | 250 | Preview | trans_t3_a.zip (1.83 GB) |
Transverse Colon | 3 | b | 214 | Preview | trans_t3_b.zip (1.66 GB) |
Transverse Colon | 4 | a | 382 | Preview | trans_t4_a.zip (3.10 GB) |
Transverse Colon | 4 | b | 597 | Preview | trans_t4_b.zip (4.61 GB) |
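As a quick consistency check, the per-sequence frame counts in the table above can be summed and compared with the totals stated in the abstract (22 sequences, 10,015 frames). The sketch below simply copies the counts from the table; the dictionary keys reuse the archive base names, and the variable names are our own.

```python
# Frame counts copied from the table of registered video sequences.
frames_per_sequence = {
    "cecum_t1_a": 276, "cecum_t1_b": 765,
    "cecum_t2_a": 370, "cecum_t2_b": 1142, "cecum_t2_c": 595,
    "cecum_t3_a": 730,
    "cecum_t4_a": 465, "cecum_t4_b": 425,
    "desc_t4_a": 148,
    "sigmoid_t1_a": 700, "sigmoid_t2_a": 514,
    "sigmoid_t3_a": 613, "sigmoid_t3_b": 536,
    "trans_t1_a": 61, "trans_t1_b": 700,
    "trans_t2_a": 194, "trans_t2_b": 103, "trans_t2_c": 235,
    "trans_t3_a": 250, "trans_t3_b": 214,
    "trans_t4_a": 382, "trans_t4_b": 597,
}

num_sequences = len(frames_per_sequence)
total_frames = sum(frames_per_sequence.values())
print(f"{num_sequences} sequences, {total_frames} frames")  # 22 sequences, 10015 frames
```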
In addition to the video sequence, each download also contains camera pose information in a file named pose.txt. Each line holds the homogeneous pose matrix for one frame, flattened in row-major order.
Model | Texture | # Frames | Preview | Download |
---|---|---|---|---|
Full Colon | 1 | 5,458 | Preview | screening_t1.zip (8.13 GB) |
Full Colon | 2 | 5,100 | Preview | screening_t2.zip (7.09 GB) |
Full Colon | 3 | 4,726 | Preview | screening_t3.zip (7.07 GB) |
Full Colon | 4 | 4,774 | Preview | screening_t4.zip (7.36 GB) |
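The pose.txt layout described above (one flattened homogeneous pose per line, row-major order) can be read back into matrices with a few lines of Python. This is a minimal sketch under stated assumptions: that each line holds 16 numbers for a 4x4 matrix, and that values are separated by commas or whitespace (the delimiter is not specified on this page). The function names and the sample line are hypothetical.

```python
def parse_pose_line(line):
    """Turn one line of pose.txt into a 4x4 nested list (row-major order).

    Assumption: 16 numeric values separated by commas and/or whitespace.
    """
    values = [float(v) for v in line.replace(",", " ").split()]
    if len(values) != 16:
        raise ValueError(f"expected 16 values per pose, got {len(values)}")
    # Row-major: the first four values form row 0, the next four row 1, etc.
    return [values[r * 4:(r + 1) * 4] for r in range(4)]


def load_poses(path):
    """Read every non-empty line of a pose file into a list of 4x4 poses."""
    with open(path) as f:
        return [parse_pose_line(line) for line in f if line.strip()]


# Made-up sample line, not taken from the dataset.
sample = "1,0,0,10,0,1,0,20,0,0,1,30,0,0,0,1"
pose = parse_pose_line(sample)
translation = [row[3] for row in pose[:3]]
print(translation)  # [10.0, 20.0, 30.0]
```

With the real files, `load_poses("pose.txt")` would return one matrix per frame, in frame order. If the translation does not appear in the last column of the parsed matrices, the file may use a different flattening order than assumed here, and the matrices would need to be transposed.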
Model | Object Download | Mold Download |
---|---|---|
Ascending Colon | ascend_model.obj (25.4 MB) | ascend_mold.zip (18.7 MB) |
Cecum | cecum_model.obj (54.8 MB) | cecum_mold.zip (24.9 MB) |
Descending Colon | desc_model.obj (38.0 MB) | desc_mold.zip (26.6 MB) |
Sigmoid Colon | sigmoid_model.obj (20.8 MB) | sigmoid_mold.zip (42.2 MB) |
Transverse Colon | trans_model.obj (18.3 MB) | trans_mold.zip (24.1 MB) |
Full Colon | full_model.obj (194.8 MB) | |
10/14/2023 | Updated the dataset file names to reflect peer-review completion.
05/03/2023 | Revised ground truth surface normal frames and updated naming convention:
This work is licensed under CC BY-NC-SA 4.0