Description
Visual detection and classification are of the utmost importance in many applications. Is there a human face in this image and, if so, whose is it? What is the person in this video doing? Has this photograph been taken indoors or outdoors? Is there a defect in the textile in this image, or is it of acceptable quality? Does this microscope sample show cancerous or healthy tissue?
To automate detection and classification in questions of this kind, both good-quality descriptors and strong classifiers are likely to be needed. In the appearance-based description of images, a long way has been traveled since the pioneering work of Bela Julesz [13], and good results have been reported in difficult visual classification tasks such as texture classification, face recognition, and object categorization.
What makes visual detection and classification challenging is the great variability of real-life images. Sources of this variability include viewpoint or lighting changes, background clutter, occlusion, non-rigid deformations, change of appearance over time, and so on. Furthermore, image acquisition itself may introduce perturbations, such as blur caused by an out-of-focus camera, or noise.
Over the last few years, progress in machine learning has produced learning-based methods for coping with this variability. In practice, the system tries to learn the intra- and inter-class variability from a typically very large set of training examples. Despite the advances in machine learning, the maxim "garbage in, garbage out" still applies: if the features provided to the learning algorithm do not convey the information essential to the application at hand, good final results cannot be expected. In other words, good descriptors of image appearance are called for.
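To make the role of an appearance descriptor concrete, the following is a purely illustrative sketch (not a method from the text): a basic 8-neighbour local binary pattern histogram, a classic texture descriptor, computed on two synthetic images and compared with a chi-square distance. The image contents, bit ordering, and distance choice are all assumptions made for the example.

```python
import numpy as np

def lbp_histogram(image):
    """Basic 8-neighbour local binary pattern histogram.

    Each interior pixel is compared with its 8 neighbours; a neighbour
    that is >= the centre contributes one bit to an 8-bit code. The
    descriptor is the normalised 256-bin histogram of these codes.
    """
    centre = image[1:-1, 1:-1]
    codes = np.zeros(centre.shape, dtype=np.uint8)
    # Neighbour offsets (dy, dx), clockwise from top-left; the bit
    # ordering is an arbitrary convention chosen for this sketch.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = image[1 + dy: image.shape[0] - 1 + dy,
                          1 + dx: image.shape[1] - 1 + dx]
        codes |= (neighbour >= centre).astype(np.uint8) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(0)
stripes = np.tile([0.0, 255.0], (64, 32))     # regular vertical stripes
noise = rng.uniform(0.0, 255.0, (64, 64))     # unstructured texture
h_stripes = lbp_histogram(stripes)
h_noise = lbp_histogram(noise)
# Chi-square distance between the two descriptors: large here, because
# the striped image concentrates its mass on very few LBP codes.
distance = 0.5 * np.sum((h_stripes - h_noise) ** 2
                        / (h_stripes + h_noise + 1e-10))
```

A classifier trained on such histograms can only separate textures that the descriptor itself keeps apart, which is the point of the "garbage in, garbage out" remark above.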