Yet Another Computer Vision Index To Datasets (YACVID) - Details

As of: 2017-04-28 21:46:16

Attribute Current Content New
Name (Institute + Shorttitle): Leuven Stereo Scene
Description (include details on usage, files and paper references): The Leuven Stereo Scene dataset is a scene and depth dataset. It exists in two variants: the original sequence used for detection in a CVPR 2007 paper [1] by Leibe et al., and a subset annotated for semantic segmentation in a BMVC 2010 paper [2] by Ladicky et al.

This data originally comes from the 3DPVT06 and IJCV08 papers by Nico Cornelis
and was later used in the CVPR07 paper. The best reference is probably
the IJCV08 journal version.

Working links as of Nov 2013

The Leuven Stereo Scene dataset is a test sequence consisting of 1175 image pairs recorded at 25 fps and a resolution of 360×288 pixels over a distance of about 500 m. It contains a total of 77 (sufficiently visible) static cars parked on both sides of the street and 4 moving cars, but almost no pedestrians at sufficiently high resolutions. The main difficulties for object detection here lie in the relatively low resolution, strong partial occlusion between parked cars, frequently encountered motion blur, and extreme contrast changes between brightly lit areas and dark shadows. Only the car detectors are used for this sequence.
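
As a quick sanity check on the figures above, the sequence length and frame rate imply a duration and an average vehicle speed. The sketch below uses only the numbers quoted in the description (1175 pairs, 25 fps, about 500 m); the variable names are illustrative, and the speed is approximate because the distance itself is approximate.

```python
# Figures quoted in the dataset description.
NUM_PAIRS = 1175        # stereo image pairs in the sequence
FPS = 25.0              # recording frame rate
DISTANCE_M = 500.0      # approximate distance travelled

duration_s = NUM_PAIRS / FPS                 # sequence length in seconds
avg_speed_ms = DISTANCE_M / duration_s       # average speed in m/s
avg_speed_kmh = avg_speed_ms * 3.6           # same speed in km/h
metres_per_frame = DISTANCE_M / NUM_PAIRS    # travel between consecutive frames

print(f"duration: {duration_s:.0f} s")
print(f"average speed: {avg_speed_ms:.1f} m/s ({avg_speed_kmh:.0f} km/h)")
print(f"travel per frame: {metres_per_frame:.2f} m")
```

This works out to a 47 s sequence at roughly 38 km/h, or about 0.43 m of travel between consecutive frames.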

The BMVC paper augments a subset of the Leuven stereo dataset of (Leibe et al., 2007) with object class segmentation and disparity annotations. The Leuven data set was chosen as it provides image pairs from two cameras, 150 cm apart from each other, mounted on top of a moving vehicle, in a public urban setting. In comparison with other data sets, the larger distance between the two cameras allows better depth resolution, while the real world nature of the data set allows us to confirm the validity of our statistical model. However, the data set does not contain the object class or disparity annotations that we require to learn and quantitatively evaluate the effectiveness of our approach.
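
The remark that a larger camera distance gives better depth resolution follows from the standard rectified-stereo relation Z = f * B / d. The sketch below illustrates it. The 1.5 m baseline comes from the description; the focal length of 400 pixels and the 0.5 m comparison baseline are hypothetical values chosen only for illustration, not calibration data of the Leuven cameras.

```python
def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Depth Z = f * B / d for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_step_per_pixel(depth_m: float, focal_px: float, baseline_m: float) -> float:
    """Approximate depth change for a one-pixel disparity change: Z^2 / (f * B)."""
    return depth_m ** 2 / (focal_px * baseline_m)


FOCAL_PX = 400.0          # hypothetical focal length, NOT from the dataset
LEUVEN_BASELINE_M = 1.5   # 150 cm, as stated in the description
NARROW_BASELINE_M = 0.5   # hypothetical narrower rig for comparison

z = depth_from_disparity(30.0, FOCAL_PX, LEUVEN_BASELINE_M)
wide = depth_step_per_pixel(20.0, FOCAL_PX, LEUVEN_BASELINE_M)
narrow = depth_step_per_pixel(20.0, FOCAL_PX, NARROW_BASELINE_M)
print(f"depth at 30 px disparity: {z:.1f} m")
print(f"depth step at 20 m: {wide:.2f} m (1.5 m baseline) vs {narrow:.2f} m (0.5 m baseline)")
```

Under these assumed values, a one-pixel disparity error at 20 m corresponds to about 0.67 m of depth with the 1.5 m baseline, against 2 m with the 0.5 m baseline.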

To augment the data set, all image pairs were rectified and cropped to 316×256, then a subset of 70 non-consecutive frames was selected for human annotation. The annotation procedure consisted of two parts. Firstly, we manually labeled each pixel in every image with one of 7 object classes: Building, Sky, Car, Road, Person, Bike and Sidewalk. An 8th label, Void, is given to pixels that do not obviously belong to one of these classes. Secondly, disparity maps were generated by manually matching the corresponding planar polygons.
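
The label set described above can be written down as a small enumeration. The sketch below is illustrative only: the integer IDs and the helper `class_frequencies` are assumptions, and the released annotation files may encode labels differently (for example as colours). It computes per-class pixel frequencies while leaving out Void pixels.

```python
from collections import Counter
from enum import IntEnum


class LeuvenLabel(IntEnum):
    # Integer IDs are hypothetical; only the class names come from the description.
    VOID = 0
    BUILDING = 1
    SKY = 2
    CAR = 3
    ROAD = 4
    PERSON = 5
    BIKE = 6
    SIDEWALK = 7


def class_frequencies(label_map):
    """Return the fraction of non-Void pixels per class for a 2D list of label IDs."""
    counts = Counter(LeuvenLabel(v) for row in label_map for v in row)
    counts.pop(LeuvenLabel.VOID, None)       # Void pixels are excluded
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


# Tiny 2x4 toy label map (real annotated frames are 316x256).
toy = [
    [2, 2, 1, 0],
    [4, 4, 3, 0],
]
freqs = class_frequencies(toy)
print({label.name: round(f, 3) for label, f in freqs.items()})
```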

We believe our augmented subset of the Leuven stereo data set to be the first publicly available data set that contains both object class segmentation and dense stereo reconstruction ground truth for real world data.

[1] Dynamic 3D Scene Analysis from a Moving Vehicle
Bastian Leibe, Nico Cornelis, Kurt Cornelis, Luc Van Gool
CVPR 2007

[2] Joint Optimisation for Object Class Segmentation and Dense Stereo Reconstruction
Lubor Ladicky, Paul Sturgess, Chris Russell, Sunando Sengupta, Yalin Bastanlar, William Clocksin, Philip H.S. Torr
International Journal of Computer Vision, 2012 (first published at BMVC 2010)
URL Link 
Files (#): 1175
References (SKIPPED)
Category (SKIPPED) 
Tags (single words, spaced): segmentation, semantic, reconstruction, urban, sfm, 3d, leuven, depth, stereo
Last Changed: 2017-04-28