This website provides a list of frequently used computer vision datasets. Wait, there is more!
There is also a description containing common problems, pitfalls and characteristics and now a searchable TAG cloud.
Plus, this is open for crowd editing (if you pass the ultimate Turing test)! Questions? yacvid [at] hayko [dot] at
Content, Design and Idea © by Hayko Riemenschneider, 2011-2016. Texts and images are subject to copyright by the respective authors.
Hey! If you're reading this, why not help and update the description of the dataset you're working on?
«showing 591 tags of 591 total tags for 425 datasets (1.39) »
|407||Inria Aerial Image Labeling||The Inria Aerial Image Labeling dataset addresses a core topic in remote sensing: the automatic pixelwise labeling of aerial imagery (link to paper). Dataset feature...||semantic segmentation aerial urban city groundtruth building footprint house||link||2017-10-20||144|
|403||Multispectral Imaging (MSI)||Multispectral Imaging (MSI) datasets were acquired using IRIS II, a lightweight portable system comprising a high-resolution camera, a novel filter w...||multi-spectral illumination wavelength groundtruth registration matching alignment||link||2017-12-01||111|
|353||COCO-Stuff||COCO-Stuff augments the COCO dataset with pixel-level stuff annotations for 10,000 images. These annotations can be used for scene understanding tasks like sema...||semantic segmentation stuff things COCO captioning annotation groundtruth benchmark||link||2017-02-16||588|
|351||CMLA Subpixel Stereo Dataset||A dataset of 66 stereo pairs with subpixel ground truth. The construction and improvement of algorithms for subpixel stereovision requires very precise t...||stereo stereovision subpixel groundtruth 3D pointcloud noise depth||link||2017-09-15||431|
|346||LASIESTA (Labeled and Annotated Sequences for Integral Evaluation of SegmenTation Algorithms)||LASIESTA is composed of many real indoor and outdoor sequences organized in different categories, each one covering a specific challenge in moving object det...||dataset groundtruth motion object detection foreground background subtraction challenge stationary camera||link||2017-09-12||410|
|332||Multi-FoV - Large Field-of-View Cameras for Visual Odometry||The Multi-FoV synthetic datasets are two synthetic scenes (vehicle moving in a city, and flying robot hovering in a confined room). For each scene, three differ...||visual odometry camera fov synthetic groundtruth blender||link||2016-08-11||488|
|317||NYU Symmetry Database||The mirror symmetry database contains 176 single-symmetry and 63 multiple-symmetry images (.png files) with accompanying ground-truth annotations (.mat files). ...||symmetry detection mirror groundtruth||link||2016-04-15||498|
|308||TST Intake Monitoring dataBase||It is composed of food intake movements, recorded with Kinect V1 (320×240 depth frame resolution), simulated by 35 volunteers for a total of 48 tests. The device ...||human food intake monitoring behavior kinect pointcloud tracking age groundtruth||link||2018-01-06||545|
|303||1DSfM Landmarks||1DSfM Landmarks is a collection of community-based image reconstructions by Kyle Wilson and comprises 14 datasets with comparison to Bundler ground tru...||3d reconstruction landmark groundtruth benchmark urban city||link||2015-08-05||758|
|298||Freiburg-Berkeley Motion Segmentation||The Freiburg-Berkeley Motion Segmentation Dataset (FBMS-59) is an extension of the BMS dataset with 33 additional video sequences. A total of 720 frames is anno...||video segmentation benchmark object tracking pedestrian groundtruth motion||link||2017-03-21||1007|
|296||Video Segmentation Benchmark||The Video Segmentation Benchmark (VSB100) provides ground truth annotations for the Berkeley Video Dataset, which consists of 100 HD quality videos divided into...||video segmentation benchmark object tracking pedestrian groundtruth motion||link||2017-03-21||1090|
|287||INRIA Lafarge Benchmarks||Some datasets and evaluation tools are provided on this page for four different computer vision and computer graphics problems. Population counting Line-ne...||3d surface reconstruction groundtruth pointcloud object detection line road network urban crowd pedestrian counting||link||2015-06-18||927|
|245||ETHZ CVL Video SumMe||The Video Summarization (SumMe) dataset consists of 25 videos, each annotated with at least 15 human summaries (390 in total). The data consists of videos, anno...||video summary benchmark human groundtruth action event||link||2016-10-21||1630|
|225||California-ND||An Annotated Dataset For Near-Duplicate Detection In Personal Photo Collections Managing photo collections involves a variety of image quality assessment tas...||retrieval duplicate copyright groundtruth detection||link||2014-03-19||750|
|222||Ford Car Dataset||The Ford Car dataset is a joint effort of Pandey et al. (for collecting images, Lidar points, calibration etc.) and us (for annotation of 2D and 3D objects). ...||car detection lidar 3d groundtruth sfm||link||2014-04-16||1928|
|204||UCF Person and Car VideoSeg||The UCF Person and Car VideoSeg dataset consists of six videos with groundtruth for video object segmentation. Surfing, jumping, skiing, sliding, big car, sm...||video segmentation object motion model camera groundtruth||link||2015-04-19||1084|
|202||GaTech SegTrack||The SegTrack dataset consists of six videos (five are used) with ground truth pixelwise segmentation (6th penguin is not usable). The dataset is used for accura...||video segmentation object proposal flow optical motion model camera stationary groundtruth||link||2013-10-09||902|
|135||Quad 6K||The Quad 6K dataset is a Structure-from-Motion dataset taken at Arts Quad at Cornell University campus and consists of 6514 images with ground truth positions o...||reconstruction sfm urban groundtruth landmark 3d gps||link||2013-11-05||1108|