This website provides a list of frequently used computer vision datasets. Wait, there is more!
There is also a description containing common problems, pitfalls and characteristics and now a searchable TAG cloud.
Plus, this is open for crowd editing (if you pass the ultimate Turing test)! - Questions? yacvid [at] hayko [dot] at
Content, Design and Idea © by Hayko Riemenschneider, 2011-2016. Texts and images are subject to copyright by their respective authors.
Hey! If you're reading this, why not help and update the description of the dataset you're working on?
«showing 668 tags of 668 total tags for 472 datasets (1.42) »
|462||Taskonomy||The Taskonomy dataset consists of 3.9 Mil. Scenes, 600 Buildings, 25 Tags per Image, 1024 Resolution for taxonomy and transfer learning tasks. We provide a larg...||transfer learning taxonomy task deep indoor 3d mesh pose camera high-resolution||link||2018-08-08||36|
|454||SBM-RGBD Dataset||The SBM-RGBD dataset [provides] all facilities (data, ground truths, and evaluation scripts) in order to evaluate and compare scene background modelling metho...||background modeling rgbd kinect video color depth benchmark indoor surveillance||link||2018-04-18||156|
|411||ISR-UoL 3D Social Activity Dataset||This is a social interaction dataset between two subjects. This dataset consists of RGB and depth images and tracked skeleton data (i.e. joints 3D coordinates a...||Social, Activity, Interaction, Human, Indoor, Skeleton, RGBD, ROS action||link||2017-11-28||284|
|394||Matterport 2D-3D-Semantics Data||The 2D-3D-S dataset provides a variety of mutually registered modalities from 2D, 2.5D and 3D domains, with instance-level semantic and geometric annotations. I...||3d panorama semantic segmentation depth normal indoor building reconstruction large-scale||link||2017-07-27||487|
|378||TVPR (Top View Person Re-identification)||The TVPR dataset includes 23 registration sessions. Each of the 23 folders contains the video of one registration session. Acquisitions have been performed duri...||person re-identification identification recognition people gender clothing video depth top-view indoor||link||2018-01-25||475|
|376||ScanNet||ScanNet is an RGB-D video dataset containing 2.5 million views in more than 1500 scans, annotated with 3D camera poses, surface reconstructions, and instance-le...||scene indoor synthetic cad room layout rendering realism 3d segmentation object recognition||link||2017-05-12||436|
|375||SUNCG: Indoor Scenes||The SUNCG dataset is a Large 3D Model Repository for Indoor Scenes. SUNCG is an ongoing effort to establish a richly-annotated, large-scale dataset of 3D s...||scene indoor synthetic room layout rendering realism 3d segmentation object recognition||link||2018-10-17||554|
|374||SceneNet RGB-D Synthetic Indoor||SceneNet RGB-D is a dataset comprised of 5 million Photorealistic Images of Synthetic Indoor Trajectories with Ground Truth. It expands the previous work of Scene...||scene indoor synthetic robot navigation rendering 3d reconstruction trajectory lighting segmentation slam||link||2017-05-02||467|
|366||Multi-Camera Action Dataset||An indoor action recognition dataset which consists of 18 classes performed by 20 individuals. Each action is individually performed 8 times (4 daytime and ...||indoor video Multi-Camera Action Recognition Cross-View Recognition Open-View Recognition||link||2017-09-12||457|
|355||IMPART multi-modal/multi-view||The multi-modal/multi-view datasets are created in a cooperation between University of Surrey and Double Negative within the EU FP7 IMPART project. The sourc...||multi-view multi-mode video rgbd lidar 3d model color indoor outdoor dynamic action face human emotion||link||2017-01-01||582|
|331||EuRoC MAV Dataset||This web page presents visual-inertial datasets collected on-board a Micro Aerial Vehicle (MAV). The datasets contain stereo images, synchronized IMU measuremen...||aerial vehicle, indoor, global shutter, slam||link||2017-11-28||1221|
|327||PIROPO Database: People in Indoor ROoms with Perspective and Omnidirectional cameras||The PIROPO database (People in Indoor ROoms with Perspective and Omnidirectional cameras) comprises multiple sequences recorded in two different indoor rooms, u...||people surveillance perspective omnidirectional fisheye indoor room detection human||link||2017-02-16||1205|
|295||Rent3D||The Rent3D dataset comprises floorplans and images. The goal of this work is to enable a 3D virtual-tour of an apartment given a small set of monocular images o...||indoor building reconstruction layout floorplan apartment urban||link||2015-07-13||846|
|286||HDA Person Dataset - ISR Lisbon||The High Definition Analytics (HDA) dataset is a multi-camera High-Resolution image sequence dataset for research on High-Definition surveillance: Pedestrian De...||Video Surveillance Pedestrian Detection Re-Identification Multiview Tracking Benchmark Indoor High-Definition Camera Network lisbon human||link||2017-10-02||2393|
|271||Labeling in 3D Scenes||This dataset package contains the software and data used for Detection-based Object Labeling on the RGB-D Scenes Dataset as implemented in the paper: Detecti...||3d kinect reconstruction indoor depth object recognition||link||2015-03-16||1017|
|270||B3DO: Berkeley 3D Object Dataset||For the first few decades of the field's existence, computer vision has been focused on algorithmic, logical approaches to perception. But it was only with the a...||3d kinect reconstruction indoor depth object recognition||link||2015-03-16||901|
|181||All I Have Seen (AIHS)||The All I Have Seen (AIHS) dataset is created to study the properties of total visual input in humans. For around two weeks, Nebojsa Jojic wore a camera capturin...||video summary user study clustering similarity outdoor indoor scene 3d||link||2018-09-19||1006|
|168||Mall Dataset||The Mall dataset was collected from a publicly accessible webcam for crowd counting and profiling research. Ground truth: Over 60,000 pedestrians were label...||detection tracking crowd counting pedestrian indoor video webcam||link||2016-12-06||2180|
|166||ICG Multi-Camera Datasets||The ICG Multi-Camera datasets consist of: Easy Data Set (just one person), Medium Data Set (3-5 persons, used for the experiments), Hard Data Set (crowded sc...||multiview pedestrian tracking detection object camera calibration graz indoor video multitarget||link||2015-06-19||1484|
|163||TUGRAZ ICG Longterm Pedestrian Dataset||The Longterm Pedestrian dataset consists of images from a stationary camera running 24 hours for 7 days at about 1 fps. It is used for adaptive detection and back...||pedestrian change detection background illumination robust indoor coffee graz multitarget||link||2015-06-19||1260|
|104||Make3D Depth||The Make3D Depth dataset is designed to learn features to estimate scene depth from a single image. This dataset contains aligned image and range data: Make3...||depth, learning, single view, outdoor, indoor||link||2018-03-16||1640|
|15||PETS 2006||The PETS 2006 dataset contains 7 parts showing multi-sensor sequences containing left-luggage scenarios with increasing scene complexity at a train station scen...||frontview, indoor, pedestrian, detection, tracking, multitarget||link||2015-08-12||1392|