This website provides a list of frequently used computer vision datasets. Wait, there is more!
There is also a description containing common problems, pitfalls and characteristics and now a searchable TAG cloud.
Plus, this is open for crowd editing (if you pass the ultimate Turing test)! Questions? yacvid [at] hayko [dot] at
Content, Design and Idea © by Hayko Riemenschneider, 2011-2016. Texts and images are subject to copyright by their respective authors.
Hey! If you're reading this, why not help and update the description of the dataset you're working on?
«showing 524 tags of 524 total tags for 372 datasets (1.41) »
|353||COCO-Stuff||COCO-Stuff augments the COCO dataset with pixel-level stuff annotations for 10,000 images. These annotations can be used for scene understanding tasks like sema...||semantic segmentation stuff things COCO captioning annotation groundtruth benchmark||link||2017-02-16||211|
|351||CMLA Subpixel Stereo Dataset||A dataset of 66 stereo pairs with subpixel ground truth. The construction and improvement of algorithms for subpixel stereovision requires very precise t...||stereo stereovision subpixel groundtruth 3D pointcloud noise depth||link||2017-03-03||161|
|332||Multi-FoV - Large Field-of-View Cameras for Visual Odometry||The Multi-FoV synthetic datasets are two synthetic scenes (vehicle moving in a city, and flying robot hovering in a confined room). For each scene, three differ...||visual odometry camera fov synthetic groundtruth blender||link||2016-08-11||262|
|317||NYU Symmetry Database||The mirror symmetry database contains 176 single-symmetry and 63 multiple-symmetry images (.png files) with accompanying ground-truth annotations (.mat files). ...||symmetry detection mirror groundtruth||link||2016-04-15||291|
|308||TST Intake Monitoring dataBase||It is composed of food intake movements, recorded with Kinect V1 (320×240 depth frame resolution), simulated by 35 volunteers for a total of 48 tests. The device...||human food intake monitoring behavior kinect pointcloud tracking age groundtruth||link||2016-02-11||328|
|303||1DSfM Landmarks||The 1DSfM Landmarks is a collection of community-based image reconstruction by Kyle Wilson and is comprised of 14 datasets with comparison to bundler ground tru...||3d reconstruction landmark groundtruth benchmark urban city||link||2015-08-05||486|
|298||Freiburg-Berkeley Motion Segmentation||The Freiburg-Berkeley Motion Segmentation Dataset (FBMS-59) is an extension of the BMS dataset with 33 additional video sequences. A total of 720 frames is anno...||video segmentation benchmark object tracking pedestrian groundtruth motion||link||2017-03-21||696|
|296||Video Segmentation Benchmark||The Video Segmentation Benchmark (VSB100) provides ground truth annotations for the Berkeley Video Dataset, which consists of 100 HD quality videos divided into...||video segmentation benchmark object tracking pedestrian groundtruth motion||link||2017-03-21||779|
|287||INRIA Lafarge Benchmarks||Some datasets and evaluation tools are provided on this page for four different computer vision and computer graphics problems. Population counting Line-ne...||3d surface reconstruction groundtruth pointcloud object detection line road network urban crowd pedestrian counting||link||2015-06-18||664|
|245||ETHZ CVL Video SumMe||The Video Summarization (SumMe) dataset consists of 25 videos, each annotated with at least 15 human summaries (390 in total). The data consists of videos, anno...||video summary benchmark human groundtruth action event||link||2016-10-21||996|
|225||California-ND||An Annotated Dataset For Near-Duplicate Detection In Personal Photo Collections Managing photo collections involves a variety of image quality assessment tas...||retrieval duplicate copyright groundtruth detection||link||2014-03-19||581|
|222||Ford Car Dataset||The Ford Car dataset is a joint effort of Pandey et al. (for collecting images, Lidar points, calibration etc.) and us (for annotation of 2D and 3D objects). ...||car detection lidar 3d groundtruth sfm||link||2014-04-16||1459|
|204||UCF Person and Car VideoSeg||The UCF Person and Car VideoSeg dataset consists of six videos with groundtruth for video object segmentation. Surfing, jumping, skiing, sliding, big car, sm...||video segmentation object motion model camera groundtruth||link||2015-04-19||841|
|202||GaTech SegTrack||The SegTrack dataset consists of six videos (five are used) with ground truth pixelwise segmentation (6th penguin is not usable). The dataset is used for accura...||video segmentation object proposal flow optical motion model camera stationary groundtruth||link||2013-10-09||662|
|135||Quad 6K||The Quad 6K dataset is a Structure-from-Motion dataset taken at Arts Quad at Cornell University campus and consists of 6514 images with ground truth positions o...||reconstruction sfm urban groundtruth landmark 3d gps||link||2013-11-05||863|