Yet Another Computer Vision Index To Datasets (YACVID)

This website provides a list of frequently used computer vision datasets. Wait, there is more!
There is also a description of each dataset covering common problems, pitfalls and characteristics, and now a searchable tag cloud.
Plus, this is open for crowd editing (if you pass the ultimate Turing test)! - Questions? yacvid [at] hayko [dot] at

Content, Design and Idea © by Hayko Riemenschneider, 2011-2016. Texts and images are subject to the copyright of their respective authors.

Hey! If you're reading this, why not help and update the description of the dataset you're working on?



2d   3d   4d   aachen   abdomen   abrupt   accelerometer   accuracy   action   activity   address   adhead   adjustment   adult   aerial   aesthetics   affordance   age   aircraft   airplane   airport   alignment   amazon   ambiguous   analysis   anger   animal   animation   annotation   anomaly   apartment   api   appearance   applelogo   architecture   articulation   artificial   aspect   atmospheric   attention   attribute   attributes   authentication   automatic   autonomous   avoid   axis   babyface   background   balance   baseline   behavior   belgium   benchmark   benchmarking   berlin   bike   bilateral   bim   binary   biology   biometric   biometry   blender   blur   boat   body   bone   bottle   boundingbox   brain   brand   bremen   buffy   building   bullseye   bundle   bunny   byu   cad   calibration   california   caltech   camera   canada   caption   captioning   capture   car   cardinal   categorization   category   cats   cbir   celebrity   cell   centered   chair   challenge   change   chemistry   chest   chicaco   chromaticity   church   circle   city   cityscapes   classification   clothing   cloud   clustering   clutter   cnn   co-localization   co-saliency   co-segmentation   co-skeletonization   coco   code   codebook   coffee   collaborative   color   community   comparison   computer   condition   constancy   context   contour   cooking   copyright   counting   cover   cow   crepe   crf   crop   cross-view   crowd   ct   cutting   daily   dance   data   dataset   day   daylight   decomposition   deep   defocus   deformation   denoising   dense   depth   description   descriptor   detail   detection   dichromatic   disease   disgust   disparity   dogs   domain   dped   driving   drone   dubrovnik   duplicate   dynamic   ear   edge   egocentric   ellipse   emotion   endtoend   enhancement   estimation   evaluation   event   expertise   expression   eye   facade   face   facial   fake   fashion   fear   feature   field   fine-grained   
fingerprint   fingertip   first-person   fish   fisheye   fitting   flickr   flight   floorplan   flow   fly   flying   food   foot   footprint   foreground   fov   frames   frontview   fundus   gait   game   gan   gaze   gender   genetic   genome   geography   geometry   geoscience   geotag   geotagged   germany   gesture   getry   gif   giraffe   gis   global   google   gps   grammar   graphics   grayscale   graz   ground   groundtruth   group   growth   gsd   hand   handwritten   hd   head   heart   heat   hierarchy   high-definition   high-resolution   highlight   highway   holes   horse   house   howto   human   identification   illumination   illuminiation   illusion   image   imagenet   images   imdb   indoor   inertial   initialization   inserts   instance   intake   interaction   interactive   interest   internet   invariance   ir   isar   iso   joy   kernels   keyframe   kimia   kinect   kitchen   kitti   label   labeling   laboratory   land   landmark   lane   language   large   large-scale   laser   lattice   layout   leaf   learning   letter   leuven   lidar   lifespan   light   lightfield   lighting   limited   line   lip   lisbon   liver   local   localization   location   logo   lowlevel   machine   makeup   manhattan   map   maritime   mask   match   matching   material   medial   medical   medicine   memorability   mesh   metadata   milling   mirror   mobile   model   modeling   monitoring   mono   montage   motion   motorbike   mouse   mouth   movement   movie   mpeg   mser   mug   multi-camera   multi-class   multi-human   multi-mode   multi-sensor   multi-spectral   multi-view   multilabel   multimedia   multimodal   multiple   multitarget   multiview   naming   natural   nature   navigation   netherlands   network   neutral   newyork   night   nir   noise   normal   nude   number   object   occlusion   ocr   odometry   omnidirection   omnidirectional   online   open-view   operation   optical   optimization   organ   original   osnabrueck   
outdoor   overhead   overlap   oxford   pair   pairwise   pan   panorama   panoramio   parallel   paris   parsing   part   partial   pasadena   pascal   patch   path   pattern   pedestrian   pedestrians   people   person   perspective   phase   photo   photogrammetry   physics   pittsburgh   place   plane   planning   point   pointcloud   polygon   popularity   pornography   pose   potsdam   presentation   pressure   primitive   privacy   procedural   profile   project   proposal   pruning   ptz   quality   question   radar   random   rank   ranking   ransac   rate   ratio   re-identification   reading   real   real-world   realism   recipe   recognition   reconstruction   rectification   rectified   reflection   registration   regression   regular   remote   removal   rendering   repetition   resolution   restoration   retina   retinal   retrieval   rgb   rgbd   road   robot   robotic   robust   rome   room   ros   rotation   sad   saliency   sampling   sanfrancisco   satellite   scale   scan   scanner   scene   search   segmentation   selfdriving   semantic   sense   sensing   sequence   series   sfm   shadow   shape   sheffield   shoes   shots   shutter   sideview   sign   signs   similarity   simultaneous   single   singleview   size   skeleton   skeletonization   sketch   skin   sky   slam   smartphone   soccer   social   software   source   space   spain   spanish   speaker   speech   sphere   sport   stability   stabilization   static   stationary   stereo   stereovision   stochastic   street   structure   structured   study   stuff   style   stylization   subpixel   subtraction   summarization   summary   superpixel   superresolution   supervised   supervisely   surface   surgery   surprise   surveillance   swan   switzerland   sydney   symmetry   synthetic   table   target   taxonomy   temporal   text   textile   texture   texture-less   therapy   thermal   things   time   timelapse   tiny   tokyo   tool   tools   top-view   tracking   tracklet   traffic   
trajectory   transfer   transportation   trees   triangulation   truth   tuberculosis   turbulence   type   uas   uav   udacity   ultrasound   understanding   uneven   unmanned   unsupervised   urban   user   vanishing   variation   vehicle   vehicles   vessel   video   view   viewpoint   virtual   visible   vision   visual   voc   volleyball   vqa   vt   water   wavelength   weakly   wear   wearable   weather   webcam   white   wide   wiki   wikipedia   wild   workflow   world   worldwide   xray   year   youtube   zoom   zurich  
«showing 641 tags of 641 total tags for 454 datasets (1.41) »


human
DID Name Description Tags URL Date Views
441 Alzheimer's Disease Neuroimaging Initiative (ADNI) The Alzheimer's Disease Neuroimaging Initiative (ADNI) data are shared without embargo through the LONI Image & Data Archive (IDA), a secure research data repos... brain human medicine medical scan behavior disease link 2018-03-16 38
440 Human Connectome Project (HCP) The Human Connectome Project (HCP) has tackled one of the great scientific challenges of the 21st century: mapping the human brain, aiming to connect its struct... brain human medicine medical scan lifespan adult age behavior link 2018-03-16 28
439 Cornell Activity Datasets: CAD 60 & CAD 120 The CAD-60 and CAD-120 data sets comprise RGB-D video sequences of humans performing activities, recorded using the Microsoft Kinect sensor. CAD... activity action affordance rgbd video daily human kinect link 2018-03-15 31
438 CAD 120 affordance This is the CAD 120 Affordance Segmentation Dataset based on the Cornell Activity Dataset CAD 120 (see http://pr.cs.cornell.edu/humanactivities/data.php). Co... segmentation affordance action cad attribute human link 2018-03-15 29
434 Online RGBD Action Dataset (ORGBD) The Online RGBD Action dataset targets human action (human-object interaction) recognition based on RGBD video data. There are seven categories of human act... action rgbd online human recognition daily link 2018-03-15 34
433 20bn-Something-Something The 20BN-SOMETHING-SOMETHING dataset is a large collection of densely-labeled video clips that show humans performing pre-defined basic actions with everyday ob... action recognition human video daily link 2018-03-15 37
411 ISR-UoL 3D Social Activity Dataset This is a social interaction dataset between two subjects. This dataset consists of RGB and depth images and tracked skeleton data (i.e. joints 3D coordinates a... Social, Activity, Interaction, Human, Indoor, Skeleton, RGBD, ROS action link 2017-11-28 187
410 Charades Activity Dataset 10,000 30sec videos from 267 volunteers, each annotated with multiple activities, captions, objects, and temporal localizations. From "Hollywood in Homes: Cr... video activity recognition action object caption localization detection human daily link 2018-03-22 242
402 GeoFaces A large dataset of geotagged face images collected from Flickr. The zip file contains text files containing urls of the images. Face2GPS: Estimating Geograph... face localization geotagged classification gender age human link 2017-09-06 242
395 AWS Public Datasets AWS hosts a variety of public datasets that anyone can access for free. Previously, large datasets such as satellite imagery or genomic data have required hour... amazon classification deep learning segmentation recognition satellite human biology space image resolution link 2017-07-28 372
386 Utrecht University, ShakeFive2 ShakeFive2 A collection of 8 dyadic human interactions with accompanying skeleton metadata. The metadata is frame based xml data containing the skeleton join... human interaction Kinect video link 2017-06-26 204
355 IMPART multi-modal/multi-view The multi-modal/multi-view datasets are created in a cooperation between University of Surrey and Double Negative within the EU FP7 IMPART project. The sourc... multi-view multi-mode video rgbd lidar 3d model color indoor outdoor dynamic action face human emotion link 2017-01-01 498
354 Facial Expression Research Group Database (FERG-DB), University of Washington, Seattle FERG-DB is a database of stylized characters with annotated facial expressions. The database contains multiple face images of six stylized characters. The chara... Face, Facial expression, Animation, Stylization, annotation emotion, deep learning, anger, sad, joy, disgust, surprise, neutral, fear, cardinal classification, human transfer, image retrieval link 2017-02-27 750
340 Ljubljana CVL Face Database Database contains 798 images of 114 persons, with 7 images per person and is freely available for research purposes. All images were taken in supervised conditi... face pedestrian person recognition biometry human illumination lighting link 2017-02-22 630
339 Annotated Web Ears Dataset (AWE Dataset) Dataset contains 1000 images of 100 persons, with 10 images per person and is freely available. All images were acquired by cropping ears from images from the i... ear biometry person pedestrian recognition human lighting link 2017-02-16 547
337 WIDER Attribute Dataset WIDER ATTRIBUTE dataset is a human attribute recognition benchmark dataset, of which images are selected from the publicly available WIDER dataset. There are a ... Attribute recognition, Human attribute link 2016-09-22 836
327 PIROPO Database: People in Indoor ROoms with Perspective and Omnidirectional cameras The PIROPO database (People in Indoor ROoms with Perspective and Omnidirectional cameras) comprises multiple sequences recorded in two different indoor rooms, u... people surveillance perspective omnidirectional fisheye indoor room detection human link 2017-02-16 1020
308 TST Intake Monitoring dataBase It is composed of food intake movements, recorded with Kinect V1 (320×240 depth frame resolution), simulated by 35 volunteers for a total of 48 tests. The device ... human food intake monitoring behavior kinect pointcloud tracking age groundtruth link 2018-01-06 645
305 SPHERE human skeleton movements The SPHERE human skeleton movements dataset was created using a Kinect camera, that measures distances and provides a depth map of the scene instead of the clas... human action behavior motion movement video skeleton depth kinect link 2016-03-24 890
299 CAMP-TUM: Multiple Human Pose Estimation from Multiple Views We introduce the Shelf dataset for multiple human pose estimation from multiple views. In addition we annotate the body joints in the Campus dataset from CVLAB@... 3D human pose estimation multiple view motion capture link 2015-07-15 766
294 Happy People Images Database Group emotion recognition in images - Happiness Intensity labels for group of people in images. The images have been collected from Flickr using keyword search ... group, facial expression, emotion, wild, human, flickr, behavior link 2017-11-22 881
289 ETHZ CVL Clust MICCAI 2015 Challenge on Liver Ultrasound Tracking Munich, October 9, 2015 (Full Day) Outline Ultrasound (US) imaging is a widely used medical imaging techn... medical liver tracking ultrasound therapy human organ benchmark real link 2015-06-19 687
288 Berkeley Urban Street tracking The UrbanStreet dataset used in the paper can be downloaded here [188M] . It contains 18 stereo sequences of pedestrians taken from a stereo rig mounted on a ca... tracking detection segmentation multitarget recognition video pedestrian urban human link 2015-07-14 1372
286 HDA Person Dataset - ISR Lisbon The High Definition Analytics (HDA) dataset is a multi-camera High-Resolution image sequence dataset for research on High-Definition surveillance: Pedestrian De... Video Surveillance Pedestrian Detection Re-Identification Multiview Tracking Benchmark Indoor High-Definition Camera Network lisbon human link 2017-10-02 2075
276 TST TUG (Timed Up and Go) The TUG (Timed Up and Go test) dataset consists of actions performed three times by 20 volunteers. The people involved in the test are aged between 22 and 39, w... action recognition time kinect wearable accelerometer human video link 2015-05-02 719
275 TST fall detection It is composed of ADL (activities of daily living) and fall actions simulated by 11 volunteers. The people involved in the test are aged between 22 and 39, with diff... action recognition detection depth kinect wearable accelerometer human video link 2017-03-14 1077
272 Stanford 40 Actions The Stanford 40 Actions dataset contains images of humans performing 40 actions. In each image, we provide a bounding box of the person who is performing the ac... human action recognition detection boundingbox link 2015-06-19 1141
265 Salient Montages: Human-centric Video Summarization The Salient Montages is a human-centric video summarization dataset from the paper [1]. In [1], we present a novel method to generate salient montages from u... video summarization montage saliency wearable human link 2015-05-02 823
264 Domain-specific Personal Videos Highlight Dataset The domain-specific personal videos highlight dataset from the paper [1] describes a fully automatic method to train domain-specific highlight ranker for raw p... video summarization saliency wearable human action recognition domain link 2015-05-02 950
263 Crowd Dataset The crowd datasets are collected from a variety of sources, such as UCF and data-driven crowd datasets. The sequences are diverse, representing dense crowd in t... crowd video detection anomaly scene understanding human pedestrian link 2017-09-19 1791
261 MPI Multi-View Collection GVV datasets Welcome to the homepage of the gvvperfcapeva datasets. This site serves as a hub to access a wide range of datasets that have been created for projects of the G... video multiview tracking face mesh reconstruction depth human action pose link 2014-12-10 888
257 FaceScrub The FaceScrub dataset comprises a total of 107818 unconstrained face images of 530 celebrities crawled from the Internet, with about 200 images per person. M... face detection recognition celebrity people human link 2018-03-20 1088
255 Robotic 3D Scan Repository The Robotic 3D Scan Repository from Osnabrueck contains 23 different datasets showing a variety of 3D scans for objects, humans, cities, university campus, heat... 3d reconstruction scan laser heat urban city human aerial germany bremen lidar osnabrueck link 2015-04-10 934
254 ChokePoint Dataset We collected a video dataset, termed ChokePoint, designed for experiments in person identification/verification under real-world surveillance conditions using e... human pedestrian identification recognition multiview sequence face detection real world surveillance clustering link 2015-05-02 1389
247 PASCAL VOC Parts The PASCAL VOC is augmented with segmentation annotation for semantic parts of objects. For example, for the person category, we provide segmentation mask for 2... detection recognition pascal object part pedestrian human segmentation semantic link 2014-09-30 1414
245 ETHZ CVL Video SumMe The Video Summarization (SumMe) dataset consists of 25 videos, each annotated with at least 15 human summaries (390 in total). The data consists of videos, anno... video summary benchmark human groundtruth action event link 2016-10-21 1900
235 Kindergarten Video Surveillance The dataset consists of about 50 hours of kindergarten surveillance video, approximately 100 video sequences in total (1000GB, 50 hours)... human action behavior segmentation video background surveillance link 2015-10-08 1628
232 Pratheepan Human Skin Detection Dataset The images in this dataset are downloaded randomly from Google for human skin detection research. It has been used in the paper: W.R. Tan, C.S. Chan, Y. Prathee... skin detection, skin segmentation, human detection, skin dataset link 2017-09-14 3264
227 Omnidirectional and panoramic image dataset We share our omnidirectional and panoramic image dataset (with annotations) to be used for human and car detection. Please reach through: http://cvrg.iyte.edu.... panorama detection car omnidirection human recognition link 2017-01-13 1538
219 JPL First-Person Interaction JPL First-Person Interaction dataset (JPL-Interaction dataset) is composed of human activity videos taken from a first-person viewpoint. The dataset particularl... video action recognition interactive motion human link 2014-02-03 793
213 ChairGest Gestures ChairGest is an open challenge / benchmark. The task consists in spotting and recognizing gestures from multiple synchronized sensors: 1 Kinect and 4 Xsens Ine... benchmark recognition kinect gesture detection human link 2014-06-06 802
212 Polo Instance Segmentation The Polo instance segmentation dataset is a semantic segmentation task for Hough transform based segmentation masks. It consists of supervised segmentation for ... semantic segmentation horse human outdoor mask scene understanding n/a 2016-01-21 1103
207 CASIA Gait Recognition Dataset Dataset A (former NLPR Gait Database) was created on Dec. 10, 2001, including 20 persons. Each person has 12 image sequences, 4 sequences for each of the three ... gait recognition biometry action classification motion human foot pressure link 2017-03-10 2475
192 Our Database of Faces The Our Database of Faces (ORL) dataset contains ten different images of each of 40 distinct subjects. For some subjects, the images were taken at different tim... face recognition illumination human expression link 2013-09-23 1105
173 MuHAVi and MAS human action The Multicamera Human Action Video Data (MuHAVi) Manually Annotated Silhouette Data (MAS) are two datasets consisting of selected action sequences for the eval... human action behavior segmentation video background link 2017-07-25 2050
171 CHALEARN Multi-modal Gesture Challenge The CHALEARN Multi-modal Gesture Challenge is a dataset of 700+ sequences for gesture recognition using images, kinect depth, segmentation and skeleton data. ht... gesture, kinect, recognition, human, action, illumination, depth, segmentation, skeleton link 2013-08-09 966
170 Sheffield Kinect Gesture (SKIG) dataset The Sheffield Kinect Gesture (SKIG) dataset contains 2160 hand gesture sequences (1080 RGB sequences and 1080 depth sequences) collected from 6 subjects. ... gesture, kinect, recognition, human, action, illumination, depth link 2017-12-02 1289
153 MSRC Kinect Gesture Dataset The Microsoft Research Cambridge-12 Kinect gesture dataset consists of sequences of human movements, represented as body-part locations, and the associated gest... gesture, kinect, recognition, human, action link 2013-08-08 1088
138 Buffy The Buffy dataset contains images selected from the TV series, Buffy: the Vampire Slayer. We select a set of 452 images from the first two episodes for training... segmentation, detection, buffy, movie, human link 2015-02-07 836
16 PETS 2009 The PETS 2009 dataset contains 3 parts showing multi-view sequences containing pedestrians walking in an outdoor environment. The parts are used for person coun... frontview, outdoor, pedestrian, detection, tracking, overlap, occlusion multitarget, human link 2015-06-19 1554
14 INRIA People The INRIA People dataset from Navneet Dalal and Bill Triggs [DalalCVPR2005] consists of training and testing data. The training contains 1805 images and X peopl... detection, pedestrian, sideview, frontview, human, boundingbox link 2015-06-19 1508


total views: 49154