The Fruit&Vegetable Image Collection


What is The Fruit&Vegetable Image Collection?

The rise of new technologies and the widespread use of the Internet have led to documents with rich visual content on the web. To index these documents properly (images, videos, audio files, etc.) and build useful search engines capable of meeting users' information needs, methods are needed that accurately describe visual and audio content.

CBIR systems ("Content-Based Image Retrieval systems") and image search engines have evolved greatly in the last decade thanks to increasingly accurate descriptors for representing image content. In fact, the design of new visual descriptors is now an important field of research, involving institutions and companies in the search for effective solutions in image interpretation.

To measure the accuracy and quality of these descriptors, and of CBIR systems overall, specific collections of images of different types are required as a testing ground for the necessary experiments. This is the main reason we created The Fruit&Vegetable Image Collection.

Although over the years we have used a large number of diverse image collections to support multiple experiments, few collections have an acceptable level of quality in terms of factors such as resolution, lighting, contrast, sharpness, absence of noise, etc. In particular, among object-oriented collections, where the foreground object dominates and the effect of noise and image background is minimized, we are not currently aware of any open-license collection with characteristics and quality similar to those of The Fruit&Vegetable Image Collection.

As the name suggests, the subjects of the images are fruits and vegetables of different sizes, shapes and colors, but the collection's main strength is its homogeneity in the technical features and parameters used (resolution, exposure time, aperture, zoom, ...) across the whole set. This greatly facilitates evaluating a descriptor in its task of characterizing the visual content of the image, free from the influence of aspects incidental to that content.

These parameters were selected, after several tests, to minimize noise in the images, whether in the background of the image due to differences in lighting or on the object due to overexposure with loss of detail. Both cases can be seen in the following image:

As can be seen, the areas to the left and right of the image are less bright, while some areas of the object are overexposed.

Although the characteristics are identical, each photograph is taken from a different distance or angle, depending on the size of the object.

Size     Distances
Small    48 cm   50 cm   55 cm   -
Normal   48 cm   50 cm   65 cm   70 cm
Big      -       60 cm   65 cm   70 cm

All these features make the collection excellent for the tests for which it was designed.

The quality of the images was achieved thanks to the equipment used for this project: a high-quality SLR camera and a small studio with spotlights, which made it easier to take the photographs. As an indication, the resolution of the images is 5616 x 3744. Quality was also ensured by checking the photographs after they were taken, controlling the noise generated and repeating shots as necessary.

All images carry, in the Copyright field of their Exif header, the license considered most appropriate, which in this case is the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

Another header field (ImageDescription) has also been filled with XML code containing important information about each photograph: texture, color, shape, and so on. A generic example of the code is as follows:


As seen in the image, the text in blue corresponds to the XML tags, the text in black represents the possible values of these tags, and the text in green is explanatory comments on their meaning.

This code gathers information about:

  • Image ID
  • Object ID
  • Object colour
  • Object texture
  • Spatial attributes
  • View
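As a rough illustration of how such a description could be read back, here is a minimal Python sketch. The tag names and values in the sample are hypothetical placeholders chosen only to mirror the six items listed above; they are not the tags actually stored in the collection.

```python
import xml.etree.ElementTree as ET

# Hypothetical description: tag names and values are illustrative
# placeholders, not the tags actually used in the collection.
SAMPLE_DESCRIPTION = """
<image>
  <imageId>1</imageId>
  <objectId>apricot</objectId>
  <colour>orange</colour>
  <texture>smooth</texture>
  <spatial>
    <size>small</size>
    <shape>sphere</shape>
  </spatial>
  <view>frontal</view>
</image>
"""


def parse_description(xml_text):
    """Flatten the leaf elements of a description into a {tag: text} dict."""
    root = ET.fromstring(xml_text.strip())
    fields = {}
    for element in root.iter():
        # Keep only leaf elements, which carry the actual values.
        if len(element) == 0 and element.text is not None:
            fields[element.tag] = element.text.strip()
    return fields


description = parse_description(SAMPLE_DESCRIPTION)
for tag, value in description.items():
    print(f"{tag}: {value}")
```

With the real tag names substituted, the same function could be applied to the text read from the ImageDescription field.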

The labels for the image identifier, the object identifier, the size and the view of the object were defined arbitrarily because, being specific to this project, no existing ontology was found to define them.

In contrast, the labels for the color, texture and shape of the object, and their values, are taken from different ontologies:

  • Texture ontology: modified. It has been adapted to the characteristics of the textures of the subjects of the images.
  • Color ontology: HSV model.
  • Shape ontology: modified. The volume part of the ontology was selected and some shapes specific to the subjects were added.
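To give an idea of what labelling colour through the HSV model can look like, the following Python sketch converts an RGB value to HSV and maps it to a coarse colour name. The hue boundaries, the saturation and brightness cut-offs and the label names are all assumptions made for illustration; they are not taken from the ontology used in the collection.

```python
import colorsys

# Hypothetical hue bins (upper bound in degrees, label); boundaries are
# illustrative only and do not come from the collection's ontology.
HUE_BINS = [
    (15, "red"), (45, "orange"), (75, "yellow"), (165, "green"),
    (255, "blue"), (345, "purple"), (360, "red"),
]


def colour_label(r, g, b):
    """Map an 8-bit RGB colour to a coarse label via its HSV components."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    if v < 0.2:            # very dark: hue is not meaningful
        return "black"
    if s < 0.15:           # almost no saturation: a shade of grey
        return "white" if v > 0.8 else "grey"
    hue_degrees = h * 360.0
    for upper, name in HUE_BINS:
        if hue_degrees < upper:
            return name
    return "red"


print(colour_label(255, 165, 0))
```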


In conclusion, the process of generating the images can be summarized by a chart that distinguishes the steps carried out for each image from those carried out for all images of an object.



The photographic collection covers a total of 100 different varieties of fruits and vegetables which, because of their wide variety of sizes, shapes, colors and textures, are suitable for testing descriptors, while also giving the images a certain appeal.


The collection contains 1098 images in total. It can be downloaded in compressed JPG format and, for more precise work, it is also available in RAW format. A file with all the thumbnails is provided as well.

As explained in the introduction, the photographed objects have a controlled size: we selected 19 small objects (e.g. cherry), 67 medium-sized objects (e.g. orange) and 14 large objects (e.g. watermelon).



All objects have been photographed from a frontal position and rotated 45 degrees to either the right or left, and for each position four shots (medium-sized objects) or three shots (small and large objects) were taken at different distances.
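The capture scheme can be turned into a quick back-of-the-envelope count. The sketch below assumes three views per object (frontal, rotated right, rotated left), which is one reading of the description and not something the text states outright. Under that assumption the nominal count is 1101, close to but not equal to the 1098 images stated for the collection, so the scheme may not apply identically to every object.

```python
# Object counts per size class, as stated in the text.
OBJECTS = {"small": 19, "medium": 67, "large": 14}
# Shots per view: four distances for medium objects, three for small and large.
DISTANCES = {"small": 3, "medium": 4, "large": 3}
# Assumption: three views per object (frontal, rotated right, rotated left).
VIEWS = 3


def nominal_image_count(objects, distances, views):
    """Images expected if every object had every view at every distance."""
    return sum(count * distances[size] * views for size, count in objects.items())


total_objects = sum(OBJECTS.values())
nominal = nominal_image_count(OBJECTS, DISTANCES, VIEWS)
print(total_objects, nominal)
```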



Here you can download The Fruit&Vegetable Image Collection.

For user convenience, the download is divided into JPEG and RAW. The JPG download includes a package of thumbnails of all the images for quickly browsing the collection.



  • Each image has a resolution of 5616 x 3744.
  • 100 different varieties of fruit and vegetables.
  • 1098 images in total.
  • All objects have been photographed from a frontal position, rotated 45 degrees to either the right or left, and for each position four or three shots at different distances.


To facilitate downloading, the images (JPG format) have been divided into 4 parts, and there is also a file with all the images at a much lower resolution (300 x 200) for faster download and viewing.

File Link
The Fruit&Vegetable Image Collection part 1
The Fruit&Vegetable Image Collection part 2
The Fruit&Vegetable Image Collection part 3
The Fruit&Vegetable Image Collection part 4
The Fruit&Vegetable Image Collection thumbnails  Download


Below is a text file containing the links to download the images of The Fruit&Vegetable Image Collection in RAW format.

We recommend using a download manager for this.

File Link
The Fruit&Vegetable Image Collection RAW

The Fruit&Vegetable Image Collection is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.


As an additional part of the project, we include an application with which we have measured some simple descriptors using pictures from The Fruit&Vegetable Image Collection. Besides obtaining relevant information about the images in the collection and studying some geometric properties of the objects shown in them, the application extracts the collection's metadata in a simple way.

The application is written in Java and uses Matlab routines (see 7.9), so Matlab must be installed locally to run it. To run the application, place its source files in the working directory and, from that same directory, call the application "interface" (using the console).

Following these steps brings up the following interface:

1.- The first thing to do to get started with the application is to go to the Open menu, marked in red. In this menu you add the directory with the images you want to work with. When the directory is added, the table marked in brown is populated with the XML information of all the images in the directory. The first picture is also shown in the ImagenRGB area together with, if available, its corresponding thresholded binary image.

2.- If you want to save any of these photographs or its histogram, enlarge it, or see information about it, the Open menu marked in red has an option for each.

3.- From this point, clicking the Histogram option, marked in orange, shows a graph of the histogram of the image being displayed.

4.- At this point the user can browse the entire directory of images by clicking the next and previous image buttons, marked in blue.

5.- Any of the images can be binarized with different thresholds using "Threshold Test" and have its vignetting removed using "Remove Noise". Both buttons are highlighted in dark blue.

6.- Once you have selected the desired image, with the required threshold and with or without noise, you can use any of the functions in the Image tab of the menu, marked in grey:

  • Dimension: Displays, in the interface itself, the area, width and height, in both pixels and inches, of the object that appears in the current image.
  • Color: Displays, in the interface itself, the color, brightness and saturation of the object in the current image.
  • XML: Extracts the XML information of the photo currently displayed and stores it in a file in the image directory.

7.- Finally, the options in the Features tab, marked in green, are the same as those discussed for the Image tab, but they apply to all images in a directory instead. The resulting information, rather than appearing in the interface, is stored in a spreadsheet saved in the same directory as the images.

8.- All actions on the interface generate messages that are displayed in the text areas highlighted in black in the screenshot. These messages help the user understand what is being processed at any given time.
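The operations behind the Histogram, "Threshold Test" and Dimension options can be sketched in a few lines of Python. This is not the application's Java/Matlab code, only a minimal illustration on a tiny synthetic image. It assumes that the object is darker than the background and that the pixels-per-inch value (dpi) is known; both are assumptions made for the example.

```python
def histogram(gray, levels=256):
    """Count how many pixels take each grey level (0..levels-1)."""
    counts = [0] * levels
    for row in gray:
        for value in row:
            counts[value] += 1
    return counts


def binarize(gray, threshold):
    """Return a mask with 1 where the pixel is darker than the threshold."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def object_dimensions(mask, dpi):
    """Area plus bounding-box width and height, in pixels and inches."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, v in enumerate(row) if v]
    if not rows:
        return None  # no object pixels found
    width_px = max(cols) - min(cols) + 1
    height_px = max(rows) - min(rows) + 1
    area_px = sum(sum(row) for row in mask)
    return {
        "area_px": area_px, "width_px": width_px, "height_px": height_px,
        "width_in": width_px / dpi, "height_in": height_px / dpi,
    }


# Tiny synthetic image: a dark 2x3 object (value 40) on a bright background (220).
image = [
    [220, 220, 220, 220, 220],
    [220,  40,  40,  40, 220],
    [220,  40,  40,  40, 220],
    [220, 220, 220, 220, 220],
]
counts = histogram(image)
mask = binarize(image, threshold=128)
dims = object_dimensions(mask, dpi=2)  # dpi chosen arbitrarily for the demo
print(dims)
```

Trying several threshold values on the same image, as the "Threshold Test" button does, only requires calling binarize again with a different threshold.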


If the only functionality of the application that matters to you is getting the XML data of the images, there is another option that may be faster and easier: all you have to do is download the ExifTool application.

ExifTool is a free, platform-independent program, run from the command line, that is used to read, write and edit metadata in a wide variety of files.

Using the application is very simple:

If we simply want to display the Exif information of an image, the command is exiftool followed by the path of the image (exiftool PathToImage).

Running it shows that the amount of information obtained is huge, which makes it hard to find what you are looking for, but it is the only option if you do not know the name of the field you want to query. The site where the application is downloaded also has a section listing all the header fields.

If we know the name of the field to read, querying the information becomes much easier: the name of the field, preceded by a minus sign, is added before the path. Several of these minus-and-name expressions can be concatenated to see more fields.
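The two forms of the read command can be sketched as follows. The helper read_command is a hypothetical name introduced here; it only builds the argument list, so the sketch runs even where ExifTool is not installed. The file name is the one mentioned later in this page.

```python
import shlex


def read_command(image_path, tags=()):
    """Build the ExifTool argument list: one '-Tag' per field, then the path."""
    return ["exiftool", *[f"-{tag}" for tag in tags], image_path]


# All metadata of one image.
all_fields = read_command("Apricot_F_48.jpg")
# Only the two fields used by the collection.
two_fields = read_command("Apricot_F_48.jpg", ["ImageDescription", "Copyright"])

print(shlex.join(all_fields))
print(shlex.join(two_fields))
```

With ExifTool installed, either list could be passed to subprocess.run(..., capture_output=True, text=True) to obtain the output as text.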

Another widely used feature of this application is writing header fields, for example storing the image description in the ImageDescription field and the license in the Copyright field. The syntax is similar to reading, except that the desired value is added in double quotes and assigned to the field in question.

ExifTool reports that one file has been updated, and querying the fields again shows that the result is as expected.
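A write command of this kind can be sketched in the same style. The helper write_command and the values written are placeholders for illustration, not the actual description or license text stored in the collection.

```python
import shlex


def write_command(image_path, values):
    """Build the ExifTool argument list that assigns each tag its value."""
    assignments = [f"-{tag}={value}" for tag, value in values.items()]
    return ["exiftool", *assignments, image_path]


cmd = write_command(
    "Apricot_F_48.jpg",
    {
        # Placeholder values, for illustration only.
        "ImageDescription": "<placeholder description>",
        "Copyright": "CC BY-NC-SA 3.0",
    },
)
# shlex.join adds the shell quoting needed for values that contain spaces.
print(shlex.join(cmd))
```

By default ExifTool keeps a backup of the file it modifies, with "_original" appended to the name, so the original image is not lost if the written value turns out to be wrong.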

Another feature can save time when viewing or entering information for all images in a directory: if, in any of the above examples, a directory is given instead of the path of an image, the fields are read or written for all images in that directory.

It is also possible to filter the images to be read or modified, for example by giving *.extension instead of the name of the image.
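Both variants can be sketched as follows. When a command is launched from Python without a shell, the *.extension pattern is not expanded automatically, so this sketch expands it with pathlib first. The helper names and the second file name (Apricot_F_50.jpg) are hypothetical, and the demo works on empty files in a temporary directory.

```python
import tempfile
from pathlib import Path


def images_with_extension(directory, extension):
    """Return the files in `directory` matching *.<extension>, sorted by name."""
    return sorted(Path(directory).glob(f"*.{extension}"))


def read_many_command(paths, tags=()):
    """Build one ExifTool call that reads `tags` from every path given."""
    return ["exiftool", *[f"-{tag}" for tag in tags], *[str(p) for p in paths]]


# Demo on a throwaway directory with hypothetical, empty files.
demo_dir = Path(tempfile.mkdtemp())
for name in ("Apricot_F_48.jpg", "Apricot_F_50.jpg", "notes.txt"):
    (demo_dir / name).touch()

jpgs = images_with_extension(demo_dir, "jpg")
whole_directory = read_many_command([demo_dir], ["Copyright"])  # directory as path
only_jpgs = read_many_command(jpgs, ["Copyright"])              # filtered by extension
print([p.name for p in jpgs])
```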

Finally, to redirect the information to a file instead of showing it in the console, all you have to add to the command lines explained above is the redirection symbol (>>) and the path of the file to write to.

The result is a file in the specified directory that contains the requested information for the file "Apricot_F_48.jpg".
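The effect of >> (appending to the file rather than overwriting it) can be reproduced from Python as sketched below. So that the example runs without ExifTool installed, a stand-in command that prints one line is used; the helper name and the output file name are hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def append_output(command, output_path):
    """Run `command` and append its standard output to `output_path` (like >>)."""
    with open(output_path, "a", encoding="utf-8") as out:
        return subprocess.run(command, stdout=out, check=True).returncode


# Stand-in command so the sketch runs without ExifTool; in practice it would
# be something like ["exiftool", "-Copyright", "Apricot_F_48.jpg"].
stand_in = [sys.executable, "-c", "print('Copyright : example')"]

log_path = Path(tempfile.mkdtemp()) / "metadata.txt"
append_output(stand_in, log_path)
append_output(stand_in, log_path)  # second run adds to the file, as >> does

lines = log_path.read_text(encoding="utf-8").splitlines()
print(len(lines))
```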



