Released Datasets


ICText Dataset

ICDAR 2021 Competition

ICText is an Integrated Circuit Text Spotting and Aesthetic Assessment dataset of 20,000 images collected in real-world environments.


Total Text Dataset

IJDAR 2020

The Total-Text dataset is a collection of 1,555 images covering a variety of text orientations, including horizontal, multi-oriented, and curved text, a combination that makes it one of a kind.


Exclusively Dark Dataset

arXiv 2017, CVIU

The Exclusively Dark (ExDARK) dataset is a collection of 7,363 natural low-light images covering 12 object classes (similar to PASCAL VOC), annotated at both the image-class level and with local object bounding boxes.


WikiArt Dataset

ICIP 2016

To allow replication of, and fair comparison with, our ICIP 2016 paper, we created a "new" WikiArt dataset. All the images were obtained from


MalayaKew Plant Dataset

ICIP 2015

The MalayaKew (MK) Leaf dataset was collected at the Royal Botanic Gardens, Kew, England. It consists of scan-like images of leaves from 44 species classes. The dataset is very challenging because leaves from different species classes look very similar.


CUTE80 Dataset

ESWA 2014

We introduce CUTE80, the first curved text dataset to be made public. It consists of 80 curved text line images (circular, S-shaped, and Z-shaped text lines) with complex backgrounds, perspective distortion, and poor resolution.


Pratheepan Human Skin Dataset

T-II 2012

The images in this dataset were downloaded randomly from Google for human skin detection research. They were captured with a range of cameras, using different colour enhancements and under different illumination conditions.