Challenge XVI



The challenge organized by the Groupe Français de Chimiométrie takes the form of a puzzle. The task is to predict the responses for a "test" file from a so-called calibration file. The calibration data may present particular difficulties (missing data, outliers, etc.).

The solutions are analyzed during the conference. The attendees who submit the best solutions will be invited to briefly present their methodology in a special CHALLENGE session. The best solutions will receive prizes. To encourage participation by young chemometricians (under 30), the scientific committee distinguishes two categories: "Junior" and "Senior".


The samples are oils from petroleum reservoirs around the world. The spectra were collected in the laboratory, in transmittance, under various pressure and temperature conditions, using a high-pressure flow cell. Note that as pressure and temperature change, so do the effective path length and the density of the fluid. The nominal path length was 1 mm. The temperature and pressure at which each spectrum was acquired are provided, and the path length can be calculated using the equation: d (mm) = 0.8801 + 0.000402065 × T(Kelvin) + 0.00060493 × P(MPa)

Note that in the dataset, temperatures are given in degrees Celsius and pressures in psi, so both must be converted (to kelvin and MPa) before applying the path-length equation.
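As an illustration of the path-length equation combined with the unit conversions, here is a minimal sketch in Python. The function name `path_length_mm` and the constant names are hypothetical, not part of the challenge material. The sketch assumes the usual conversions (K = °C + 273.15 and 1 psi = 6894.757 Pa) and that the recorded pressure can be used directly in the equation; the announcement does not say whether pressures are gauge or absolute.

```python
# Hypothetical helper: path length of the flow cell from the dataset's units.
# Coefficients are taken from the equation in the challenge description.

PSI_TO_MPA = 0.006894757  # 1 psi = 6894.757 Pa (standard conversion)
CELSIUS_TO_KELVIN = 273.15


def path_length_mm(temp_c, pressure_psi):
    """Return the effective path length d (mm).

    temp_c       -- temperature in degrees Celsius, as stored in the dataset
    pressure_psi -- pressure in psi, as stored in the dataset
    """
    temp_k = temp_c + CELSIUS_TO_KELVIN        # convert to kelvin
    pressure_mpa = pressure_psi * PSI_TO_MPA   # convert to MPa
    return 0.8801 + 0.000402065 * temp_k + 0.00060493 * pressure_mpa


if __name__ == "__main__":
    # Example with made-up conditions: 25 degrees C and 0 psi.
    print(round(path_length_mm(25.0, 0.0), 4))
```

At 25 °C and zero pressure the equation gives a value very close to 1 mm, which is consistent with the stated nominal path length.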

The data are NIR and IR spectra. The spectral axis, while not provided, is in wavenumbers (cm-1) and linearly spaced. Finally, while most spectra represent mixtures, some pure components are included but not identified as such. No pure components are present in the validation set.

There are 1556 spectra in calibration and 208 in validation. Each spectrum has 5678 variables.
Reference values for the parameter of interest are provided for the calibration set only.

The data can be downloaded at:


The goal is of course to get the smallest RMSEP on the validation set.
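For reference, here is a minimal sketch of how RMSEP could be computed, assuming the usual definition (root mean squared error of prediction: the square root of the mean squared difference between reference and predicted values). The function name `rmsep` is hypothetical, and the organizers' exact scoring procedure is not specified in this announcement.

```python
import math


def rmsep(y_ref, y_pred):
    """Root mean squared error of prediction (usual definition, assumed here).

    y_ref  -- sequence of reference values
    y_pred -- sequence of predicted values, same length as y_ref
    """
    if len(y_ref) != len(y_pred):
        raise ValueError("y_ref and y_pred must have the same length")
    if len(y_ref) == 0:
        raise ValueError("at least one sample is required")
    # Sum of squared prediction errors over all samples.
    squared_error = sum((r - p) ** 2 for r, p in zip(y_ref, y_pred))
    return math.sqrt(squared_error / len(y_ref))
```

Since reference values are provided only for the calibration set, participants can estimate this quantity only on held-out calibration samples, for example through cross-validation.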


Participants who wish to compete for prizes must submit their predictions for the validation set (Excel) and a description of their approach (doc or ppt) by 10 January 2015 to  .

The participants with the best results will be asked to present their approach during the conference. Depending on the number of responses, two juniors and two seniors will be selected.

