{tab=Description}
Introduction
IMAGINA: Synthesizing features of medium/high semantic level and applying distance-based index structures in the process of content-based image retrieval.
(TIN2005-05939) Ministry of Education and Science. 31/12/2005 - 31/12/2008
This project belongs to a research area of great importance in many application domains: Content-Based Image Retrieval (CBIR). In a typical current CBIR system, images undergo a feature extraction and selection process whose output, properly filtered, forms a vector of perceptual features that attempts to compensate for the lack of explicit semantics in the images. The database consists of these feature vectors, and nearest-neighbor queries are run against it with appropriate indexing methods to locate the feature vectors closest to a query. The distance used should capture the notion of semantic similarity between images, based on appropriate similarity models. Given these considerations, the objectives of this project are to improve the feature extraction and selection processes as well as the indexing and similarity search mechanisms. Specifically, the aim is to synthesize features of medium/high semantic level from physical features, and to design algorithms that support efficient content-based image retrieval using multidimensional indexes.
Description
One of the determining factors in the interest in CBIR techniques has been the rapid growth in the size of digital image collections. On the one hand, the number, size and availability of images have increased in all spheres of life; on the other, images play a crucial role in application fields such as medicine, journalism, food technology, advertising, design, education, film and television. There is therefore a need to manage digital image data, since digitization by itself does not make management easier, although it does allow information to be derived automatically from the images themselves. This calls for collaborative research in areas such as data representation, feature extraction and selection techniques, indexing methods, search engines for executing content queries, and user interfaces.
The first image retrieval methods date from the late 1970s and were based on textual annotation of images and the use of DBMSs for retrieval. This approach faced two serious problems: first, manually annotating a large collection of images was very expensive; second, the subjectivity inherent in perceiving image content exposed what remains the main problem of current systems, known as the semantic gap. The next step in this evolution, which is where we are today, dates from the early 1990s. Its key element is the automatic extraction of the visual information embedded in the image in the form of features (color, texture, shape, ...) and, more recently, the development of relevance feedback techniques that train the system to return more relevant answers.
Content-based image retrieval uses the visual content of an image, such as color, shape, texture and spatial layout, to represent and characterize it. In the typical approach of current CBIR systems, each image in the database undergoes a visual content extraction process. This content, properly filtered (feature selection), is represented by a set of values that make up the feature vector of the image. The feature vectors of the images in the database form the feature database. To retrieve images, the user gives the system an example image or a pattern. The system then transforms that example image or pattern into its internal representation as a feature vector, which serves as the reference for searching the feature database. This search usually takes the form of a nearest-neighbor query, which locates and retrieves the objects in the database most similar to the given one (i.e. the image or pattern), with the help of an indexing scheme that speeds up the search. More recent CBIR systems also include relevance feedback mechanisms, whose purpose is to use a posteriori information from the user to refine the search process and produce increasingly meaningful results.
From this description we can identify three key areas (depicted in the figure) on which research in content-based image retrieval systems converges: (1) extraction and selection of visual features, (2) similarity search methods and indexing schemes, and (3) relevance feedback mechanisms.
Current approach of CBIR systems
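The query-by-example flow described above (feature extraction, feature database, nearest-neighbor search) can be sketched as follows. This is only a minimal illustration under stated assumptions: the gray-level histogram feature, the toy images and the function names are hypothetical and are not taken from the project, and the search is a plain linear scan with no index.

```python
import math

def gray_histogram(pixels, bins=4):
    """Normalized gray-level histogram of a flat list of 0-255 pixel values.

    Stands in for the feature extraction step (an assumed, very simple feature).
    """
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    return [c / len(pixels) for c in counts]

def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def query_by_example(feature_db, example_pixels, k=2):
    """Return the ids of the k images whose feature vectors are closest to the example."""
    q = gray_histogram(example_pixels)
    ranked = sorted(feature_db, key=lambda image_id: euclidean(feature_db[image_id], q))
    return ranked[:k]

# Toy "images" given as flat lists of gray levels (hypothetical data).
images = {
    "dark": [10, 20, 30, 40],
    "bright": [220, 230, 240, 250],
    "mixed": [10, 100, 150, 250],
}

# The feature database holds one feature vector per stored image.
feature_db = {name: gray_histogram(px) for name, px in images.items()}

# The example image is converted to a feature vector and compared to the database.
result = query_by_example(feature_db, [15, 25, 35, 60], k=2)
print(result)
```

In a real system the linear scan would be replaced by an indexing scheme, and the ranking could be refined with relevance feedback from the user.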
{tab-segundo=Feature Extraction and Selection}
Feature Extraction and Selection
The feature extraction process, rooted essentially in computer vision techniques, aims to describe the pictorial content of the image in order to determine whether two images are similar in a given context. The most widely used basic extraction techniques in digital image processing focus on color, texture and shape (or structure) features, as well as combinations of them. Extraction can be global (over the entire image) or local (over an object or region within it), so effort is sometimes devoted to accurately determining or extracting the structures or regions of interest of the image (ROIs, Regions of Interest).
Within the wide range of features that can be extracted from an image, the ones of interest are those that are invariant to rotation, translation and scale, so that they are independent of the viewpoint from which the image or object was acquired and keep their values as close as possible, at both the global and the local level.
One important aspect of the problem in this phase 1 of the CBIR process is properly handling the massive number of features extracted from the images. This requires computationally expensive processes based on known methodologies, which can be classified as scalar (analyzing the discriminatory capacity of individual features), vector (testing the discriminatory capacity of all subsets) and global (as in the most widely used method, Principal Component Analysis), in order to select a reduced, manageable subset of features that describes the image properly. A system that requires a large number of features is not cost-effective; it is preferable to have a few features that can differentiate a large number of patterns, thereby reducing the dimensionality of the representation space while maintaining a high discriminatory capacity. All these processes lead to the feature vector that characterizes the whole image globally or an object of particular interest locally.
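As an illustration of the global approach, the sketch below reduces a few feature vectors to a single coordinate along their first principal component. It is a simplified stand-in, not the project's method: the component is estimated with plain power iteration on the sample covariance matrix, and the data and function names are hypothetical.

```python
import math

def center(vectors):
    """Subtract the per-feature mean from every vector."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    return [[v[j] - mean[j] for j in range(d)] for v in vectors]

def covariance(centered):
    """Sample covariance matrix of already centered vectors."""
    n, d = len(centered), len(centered[0])
    return [[sum(v[i] * v[j] for v in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_principal_component(cov, iterations=200):
    """Estimate the dominant eigenvector of cov by power iteration."""
    d = len(cov)
    w = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * w[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
    return w

def project(centered, w):
    """Coordinate of each centered vector along direction w."""
    return [sum(a * b for a, b in zip(v, w)) for v in centered]

# Hypothetical 3-dimensional feature vectors: the first two features are
# strongly correlated, the third barely varies.
features = [
    [1.0, 2.0, 0.5],
    [2.0, 4.1, 0.4],
    [3.0, 5.9, 0.6],
    [4.0, 8.2, 0.5],
]

centered = center(features)
w = first_principal_component(covariance(centered))
projected = project(centered, w)
print(w)
print(projected)
```

Here one coordinate per image keeps most of the variation in the data, which is the sense in which dimensionality is reduced while discriminatory capacity is largely preserved. Production code would normally rely on a numerical library rather than hand-written power iteration.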
The approach described above treats image processing at a basic level, from which we obtain primitive features for syntactic-level search by means of Pattern Recognition and Image Analysis. However, narrowing the semantic gap defined in the preceding paragraphs also has implications for this part of the CBIR process, and must therefore be addressed to some extent during feature extraction and selection. Supporting search types that carry semantic load requires studying and synthesizing new higher-level features (for example, as groupings of syntactic-level features) that reflect, as faithfully as possible, the similarity between images as we perceive it when looking at them. It is necessary to evaluate different similarity models: first, metric models that satisfy certain mathematical properties, but also others that try to fit perceptual characteristics such as texture more closely, or that compare shapes by attempting to align and scale them and then evaluating their differences (transformational distances) by computing the residuals.
{tab-segundo=Methods for similarity search and indexing}
Methods for similarity search and indexing
Phase 2 of the CBIR process shown in the figure starts from a collection of feature vectors. These vectors can be identified with points in the multidimensional data space (R^d), turning the database into a set of n d-dimensional points. This d-dimensional data space (R^d), equipped with an appropriate distance dist, is a metric space Md = (R^d, dist) that lets us assess the proximity between any two points in the space. If the distance defined on the metric space captures the notion of semantic similarity between the images the points represent, then deciding on the semantic similarity between images becomes a task that can be solved by a computational algorithm. This phase 2 of the CBIR process concerns building efficient algorithms for similarity search in the multidimensional space, rather than addressing the problem of designing measures that capture semantic similarity between images and thereby narrow the semantic gap.
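For reference, calling dist a distance on a metric space means that it satisfies the standard metric axioms for all points x, y, z of the space. These are the usual textbook conditions, written out here for convenience rather than quoted from the project:

```latex
\begin{align*}
\mathrm{dist}(x, y) &\ge 0, \qquad \mathrm{dist}(x, y) = 0 \iff x = y \\
\mathrm{dist}(x, y) &= \mathrm{dist}(y, x) \\
\mathrm{dist}(x, z) &\le \mathrm{dist}(x, y) + \mathrm{dist}(y, z)
\end{align*}
```

The last condition, the triangle inequality, is the property that distance-based indexes exploit to rule out candidates without computing their distance to the query.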
When the multidimensional space is large (as is usually the case), the cost of computing the proximity between a given point and a collection of stored items becomes one of the main obstacles to system performance. Indexing mechanisms are therefore required that can organize the multidimensional space effectively and answer nearest-neighbor queries efficiently. Finding ways to relieve this bottleneck is one of the goals of this project. Two approaches dominate current trends for indexing feature databases: feature-based indexes and distance-based indexes. The former partition the multidimensional space R^d and access data based on the location of the points. The latter index the distances between the points of the space and a set of selected points (pivots), and use the triangle inequality to discard parts of the space that contain no items relevant to the query.
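The pivot idea can be sketched as follows. This is a minimal illustration, assuming Euclidean distance, a flat table of precomputed point-to-pivot distances and a range query; the function names and the toy data are hypothetical and do not describe the index structures developed in the project.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def build_pivot_table(points, pivots, dist=euclidean):
    """Precompute the distance from every stored point to every pivot."""
    return [[dist(p, pv) for pv in pivots] for p in points]

def range_query(points, pivots, table, query, radius, dist=euclidean):
    """Return (matches, number of distances computed against stored points)."""
    q_to_pivots = [dist(query, pv) for pv in pivots]
    matches, computed = [], 0
    for p, row in zip(points, table):
        # Triangle inequality: |dist(q, pivot) - dist(p, pivot)| <= dist(q, p),
        # so the largest such difference is a lower bound on dist(q, p).
        lower_bound = max(abs(dq - dp) for dq, dp in zip(q_to_pivots, row))
        if lower_bound > radius:
            continue  # p cannot lie within the radius; skip the real distance
        computed += 1
        if dist(query, p) <= radius:
            matches.append(p)
    return matches, computed

# Two well-separated clusters of 2-dimensional points (hypothetical data).
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
pivots = [(0, 0), (10, 10)]
table = build_pivot_table(points, pivots)

query, radius = (0.5, 0.5), 1.0
matches, computed = range_query(points, pivots, table, query, radius)
print(matches, computed)
```

In this toy case the three points of the far cluster are discarded using only the precomputed table, so half of the distance computations against stored points are avoided. The saving matters most when the distance function itself is expensive to evaluate.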
Achieving efficient mechanisms in phase 2 of the process depends not only on the design of the indexing method but also on the design of the nearest-neighbor search algorithm, which must integrate the similarity function as a critical element. The synthesis of similarity functions that assign meaning to a set of features follows two alternative approaches. One expresses the similarity between two vectors as the composition of a distance defined on the feature space and a positive, monotonically non-increasing function. Sometimes the distance is a simple Euclidean distance, but in many cases it is more complex. The other approach takes a probabilistic perspective and expresses similarity as a function of the probability that two images belong to the same semantic group.
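The first approach can be illustrated with a small sketch. It assumes Euclidean distance and an exponential decay as the positive, non-increasing function; both choices, as well as the `scale` parameter, are illustrative assumptions rather than the similarity model used in the project.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def similarity(u, v, scale=1.0):
    """Similarity as g(dist(u, v)) with g(d) = exp(-d / scale).

    g is positive and decreases as the distance grows, so identical
    vectors score 1.0 and distant vectors score close to 0.
    """
    return math.exp(-euclidean(u, v) / scale)

# Hypothetical feature vectors.
a, b, c = [0.0, 0.0], [1.0, 0.0], [3.0, 4.0]
print(similarity(a, a), similarity(a, b), similarity(a, c))
```

Because the function is monotone in the distance, ranking by decreasing similarity gives the same order as ranking by increasing distance, so an index built on the distance can still serve the query.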
This brief discussion of similarity measures shows how difficult it can be to integrate the similarity function properly into the search algorithm built on top of the indexing method. This task is an essential part of the research we intend to carry out in the context of phase 2 of the CBIR process.
In summary, the purpose of this project is to investigate the techniques and mechanisms applied in phases 1 and 2 of the CBIR process, with the aim of designing original methods and algorithms that contribute to effective and efficient content-based image retrieval.