Yet Another Computer Vision Index To Datasets (YACVID)

This website provides a list of frequently used computer vision datasets. Wait, there is more!
Each dataset also comes with a description covering common problems, pitfalls and characteristics, and there is now a searchable tag cloud.
Plus, the index is open for crowd editing (if you pass the ultimate Turing test)! Questions? yacvid [at] hayko [dot] at

Content, Design and Idea © by Hayko Riemenschneider, 2011-2016. Texts and images are subject to copyright by their respective authors.

Hey! If you're reading this, why not help and update the description of the dataset you're working on?




2d   3d   4d   aachen   abdomen   abrupt   accelerometer   accuracy   action   activity   address   adhead   adjustment   adult   aerial   aesthetics   affordance   age   aircraft   airplane   airport   alignment   amazon   ambiguous   analysis   anger   animal   animation   annotation   anomaly   apartment   api   appearance   applelogo   architecture   articulation   artificial   aspect   atmospheric   attention   attribute   attributes   authentication   automatic   autonomous   avoid   axis   babyface   background   balance   baseline   behavior   belgium   benchmark   benchmarking   berlin   bike   bilateral   bim   binary   biology   biometric   biometry   blender   blur   boat   body   bone   bottle   boundingbox   brain   brand   bremen   buffy   building   bullseye   bundle   bunny   byu   cad   calibration   california   caltech   camera   canada   caption   captioning   capture   car   cardinal   categorization   category   cats   cbir   celebrity   cell   centered   chair   challenge   change   chemistry   chest   chicaco   chromaticity   church   circle   city   cityscapes   classification   clothing   cloud   clustering   clutter   cnn   co-localization   co-saliency   co-segmentation   co-skeletonization   coco   code   codebook   coffee   collaborative   color   community   comparison   computer   condition   constancy   context   contour   cooking   copyright   counting   cover   cow   crepe   crf   crop   cross-view   crowd   ct   cutting   daily   dance   dark   data   dataset   day   daylight   decomposition   deep   defocus   deformation   denoising   dense   depth   description   descriptor   detail   detection   dichromatic   disease   disgust   disparity   dogs   domain   dped   driving   drone   dubrovnik   duplicate   dynamic   ear   edge   egocentric   ellipse   emotion   empty   endtoend   enhancement   environment   estimation   evaluation   event   expertise   expression   eye   facade   face   facial   fake   fashion   fear   feature  
 field   fine-grained   fingerprint   fingertip   first-person   fish   fisheye   fitting   flickr   flight   floorplan   flow   fly   flying   fog   food   foot   footprint   foreground   fov   frames   frontview   fundus   gait   game   gan   gaze   gender   genetic   genome   geography   geometry   geoscience   geotag   geotagged   germany   gesture   getry   gif   giraffe   gis   global   google   gps   grammar   graphics   grayscale   graz   ground   groundtruth   group   growth   gsd   hand   handwritten   hd   head   heart   heat   hierarchy   high-definition   high-resolution   highlight   highway   holes   horse   house   howto   human   identification   illumination   illuminiation   illusion   image   imagenet   images   imdb   imu   indoor   inertial   initialization   inserts   instance   intake   intensity   interaction   interactive   interest   internet   invariance   ir   isar   iso   joy   kaggle   kernels   keyframe   kimia   kinect   kitchen   kitti   label   labeling   laboratory   land   landmark   lane   language   large   large-scale   laser   lattice   layout   leaf   learning   letter   leuven   lidar   lifespan   light   lightfield   lighting   limited   line   lip   lisbon   liver   local   localization   location   logo   low   lowlevel   machine   makeup   manhattan   map   maritime   mask   match   matching   material   medial   medical   medicine   memorability   mesh   metadata   milling   mirror   mobile   model   modeling   monitoring   mono   montage   motion   motorbike   mouse   mouth   movement   movie   mpeg   mser   mug   multi-camera   multi-class   multi-human   multi-mode   multi-sensor   multi-spectral   multi-view   multilabel   multimedia   multimodal   multiple   multispectral   multitarget   multiview   naming   natural   nature   navigation   netherlands   network   neutral   newyork   night   nir   noise   normal   nude   number   object   occlusion   ocr   odometry   omnidirection   omnidirectional   online   
open-view   operation   optical   optimization   organ   original   osnabrueck   outdoor   overhead   overlap   oxford   pair   pairwise   pan   panchromatic   panorama   panoramio   parallel   paris   parsing   part   partial   pasadena   pascal   patch   path   pattern   pedestrian   pedestrians   people   person   perspective   phase   photo   photogrammetry   physics   pittsburgh   place   plane   planning   point   pointcloud   polygon   popularity   pornography   pose   potsdam   presentation   pressure   primitive   privacy   procedural   profile   project   proposal   pruning   ptz   quality   question   radar   random   rank   ranking   ransac   rate   ratio   re-identification   reading   real   real-world   realism   recipe   recognition   reconstruction   rectification   rectified   reflection   registration   regression   regular   remote   removal   rendering   repetition   resolution   restoration   retina   retinal   retrieval   rgb   rgbd   road   robot   robotic   robust   rome   room   ros   rotation   sad   saliency   sampling   sanfrancisco   satellite   scale   scan   scanner   scene   search   segmentation   selfdriving   semantic   sense   sensing   sequence   series   sfm   shadow   shape   sheffield   shoes   shots   shutter   sideview   sign   signs   similarity   simultaneous   single   singleview   size   skeleton   skeletonization   sketch   skin   sky   slam   smartphone   soccer   social   software   source   space   spain   spanish   speaker   speech   speed   sphere   sport   stability   stabilization   static   stationary   stereo   stereovision   stochastic   street   structure   structured   study   stuff   style   stylization   subpixel   subtraction   summarization   summary   superpixel   superresolution   supervised   supervisely   surface   surgery   surprise   surveillance   swan   switzerland   sydney   symmetry   synthetic   table   target   taxonomy   temporal   text   textile   texture   texture-less   therapy   
thermal   things   time   timelapse   tiny   tokyo   tool   tools   top-view   topcoder   tracking   tracklet   traffic   trajectory   transfer   transportation   trees   triangulation   truth   tuberculosis   turbulence   type   uas   uav   udacity   ultrasound   understanding   uneven   unmanned   unsupervised   urban   user   vanishing   variation   vehicle   vehicles   vessel   video   view   viewpoint   virtual   visible   vision   visual   voc   volleyball   vqa   vt   water   wavelength   weakly   wear   wearable   weather   webcam   white   wide   wiki   wikipedia   wild   workflow   world   worldwide   xray   year   youtube   zoom   zurich  
«showing 653 tags of 653 total tags for 463 datasets (1.41) »


groundtruth
DID | Name | Description | Tags | URL | Date | Views
455 | Darmstadt Noise Dataset | The Darmstadt Noise dataset provides a benchmark for denoising performance. Lacking realistic ground truth data, image denoising techniques are traditionally e... | noise denoising benchmark high-resolution groundtruth iso natural real | link | 2018-04-18 | 95
436 | Aberystwyth Leaf Evaluation | We are releasing the Aberystwyth Leaf Evaluation dataset acquired to support the work of the EPSRC funded project Dynamic Modelling of Plant Growth with Comput... | segmentation leaf biology groundtruth timelapse nature growth | link | 2018-03-15 | 79
407 | Inria Aerial Image Labeling | The Inria Aerial Image Labeling dataset addresses a core topic in remote sensing: the automatic pixelwise labeling of aerial imagery (link to paper). Dataset feature... | semantic segmentation aerial urban city groundtruth building footprint house | link | 2018-03-22 | 445
403 | Multispectral Imaging (MSI) | Multispectral Imaging (MSI) datasets were acquired using IRIS II, which is a lightweight portable system comprising a high resolution camera, a novel filter w... | multi-spectral illumination wavelength groundtruth registration matching alignment | link | 2017-12-01 | 370
353 | COCO-Stuff | COCO-Stuff augments the COCO dataset with pixel-level stuff annotations for 10,000 images. These annotations can be used for scene understanding tasks like sema... | semantic segmentation stuff things COCO captioning annotation groundtruth benchmark | link | 2017-02-16 | 821
351 | CMLA Subpixel Stereo Dataset | A dataset of 66 stereo pairs with their subpixel ground truths. The construction and improvement of algorithms for subpixel stereovision requires very precise t... | stereo stereovision subpixel groundtruth 3D pointcloud noise depth | link | 2018-03-12 | 608
346 | LASIESTA (Labeled and Annotated Sequences for Integral Evaluation of SegmenTation Algorithms) | LASIESTA is composed of many real indoor and outdoor sequences organized in different categories, each one covering a specific challenge in moving object det... | dataset groundtruth motion object detection foreground background subtraction challenge stationary camera | link | 2017-09-12 | 562
332 | Multi-FoV - Large Field-of-View Cameras for Visual Odometry | The Multi-FoV synthetic datasets are two synthetic scenes (vehicle moving in a city, and flying robot hovering in a confined room). For each scene, three differ... | visual odometry camera fov synthetic groundtruth blender | link | 2016-08-11 | 712
317 | NYU Symmetry Database | The mirror symmetry database contains 176 single-symmetry and 63 multiple-symmetry images (.png files) with accompanying ground-truth annotations (.mat files). ... | symmetry detection mirror groundtruth | link | 2016-04-15 | 630
308 | TST Intake Monitoring dataBase | It is composed of food intake movements, recorded with Kinect V1 (320×240 depth frame resolution), simulated by 35 volunteers for a total of 48 tests. The device ... | human food intake monitoring behavior kinect pointcloud tracking age groundtruth | link | 2018-01-06 | 693
303 | 1DSfM Landmarks | The 1DSfM Landmarks is a collection of community-based image reconstruction by Kyle Wilson and is comprised of 14 datasets with comparison to bundler ground tru... | 3d reconstruction landmark groundtruth benchmark urban city | link | 2015-08-05 | 958
298 | Freiburg-Berkeley Motion Segmentation | The Freiburg-Berkeley Motion Segmentation Dataset (FBMS-59) is an extension of the BMS dataset with 33 additional video sequences. A total of 720 frames is anno... | video segmentation benchmark object tracking pedestrian groundtruth motion | link | 2017-03-21 | 1226
296 | Video Segmentation Benchmark | The Video Segmentation Benchmark (VSB100) provides ground truth annotations for the Berkeley Video Dataset, which consists of 100 HD quality videos divided into... | video segmentation benchmark object tracking pedestrian groundtruth motion | link | 2017-03-21 | 1362
287 | INRIA Lafarge Benchmarks | Some datasets and evaluation tools are provided on this page for four different computer vision and computer graphics problems. Population counting Line-ne... | 3d surface reconstruction groundtruth pointcloud object detection line road network urban crowd pedestrian counting | link | 2015-06-18 | 1124
245 | ETHZ CVL Video SumMe | The Video Summarization (SumMe) dataset consists of 25 videos, each annotated with at least 15 human summaries (390 in total). The data consists of videos, anno... | video summary benchmark human groundtruth action event | link | 2016-10-21 | 2126
225 | California-ND | An Annotated Dataset For Near-Duplicate Detection In Personal Photo Collections. Managing photo collections involves a variety of image quality assessment tas... | retrieval duplicate copyright groundtruth detection | link | 2014-03-19 | 905
222 | Ford Car Dataset | The Ford Car dataset is a joint effort of Pandey et al. (for collecting images, Lidar points, calibration etc.) and us (for annotation of 2D and 3D objects). ... | car detection lidar 3d groundtruth sfm | link | 2014-04-16 | 2224
204 | UCF Person and Car VideoSeg | The UCF Person and Car VideoSeg dataset consists of six videos with groundtruth for video object segmentation. Surfing, jumping, skiing, sliding, big car, sm... | video segmentation object motion model camera groundtruth | link | 2015-04-19 | 1239
202 | GaTech SegTrack | The SegTrack dataset consists of six videos (five are used) with ground truth pixelwise segmentation (the sixth, penguin, is not usable). The dataset is used for accura... | video segmentation object proposal flow optical motion model camera stationary groundtruth | link | 2013-10-09 | 1101
135 | Quad 6K | The Quad 6K dataset is a Structure-from-Motion dataset taken at the Arts Quad on the Cornell University campus and consists of 6514 images with ground truth positions o... | reconstruction sfm urban groundtruth landmark 3d gps | link | 2013-11-05 | 1262


total views: 18542