Chemometric Modelling
Spectra contain large amounts of information, often with subtle differences that are very difficult to interpret by visual inspection alone. Chemometrics uses mathematical and statistical methods to extract this information from multivariate chemical data such as spectra.
What is Chemometrics?
Chemometrics is generally applied to explore patterns of association in data, to track properties of materials on a continuous basis, or to prepare and use multivariate classification models.
There are multiple forms of chemometric models including regression-based and classification-based algorithms. Chemometric methods enable useful information to be separated from noise, reveal hidden correlations, and provide a visual approach for multivariate data analysis.
Regression-based models can be used to quantify substance concentrations, whilst classification-based models can be used to assess quality parameters. Classification models can be built using a range of classes (multiclass) or simply two classes, e.g. authentic vs adulterated.
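To make the regression case concrete, the following is a minimal sketch of a calibration model in Python. It is deliberately the simplest possible case, assuming a single wavelength and a linear response; real chemometric regression typically uses many wavelengths at once. The calibration values and the function names fit_line and predict_concentration are hypothetical and chosen only for illustration.

```python
# Hypothetical calibration standards: known concentrations (mg/L) and the
# absorbance measured for each standard at one wavelength (assumed values).
concentrations = [0.0, 1.0, 2.0, 3.0, 4.0]
absorbances = [0.02, 0.21, 0.40, 0.61, 0.80]


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_concentration(absorbance, slope, intercept):
    """Invert the calibration line to estimate an unknown concentration."""
    return (absorbance - intercept) / slope


slope, intercept = fit_line(concentrations, absorbances)
estimate = predict_concentration(0.50, slope, intercept)
print(f"slope={slope:.4f}, intercept={intercept:.4f}, estimate={estimate:.2f} mg/L")
```

The model is fitted once on standards of known concentration and then inverted to estimate the concentration of a new sample from its measured signal.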
Principal Component Analysis (PCA) is a commonly used exploratory chemometric tool. It compresses complex multivariate datasets and helps to reveal the hidden structure and relationships within them. It is often used as a first step to describe the patterns in the data.
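As an illustration of how PCA compresses data, the sketch below extracts the first principal component of a small table of spectra using only the Python standard library. The spectra are invented toy values, and power iteration on the covariance matrix is just one simple way to find the dominant component; dedicated numerical libraries would normally be used in practice. All function names are hypothetical.

```python
import math

# Hypothetical toy "spectra": 6 samples x 4 wavelength channels.
spectra = [
    [1.0, 2.1, 3.0, 4.2],
    [1.1, 2.0, 3.1, 4.0],
    [0.9, 1.9, 2.9, 4.1],
    [2.0, 4.1, 6.2, 8.0],
    [2.1, 4.0, 6.0, 8.3],
    [1.9, 3.9, 6.1, 8.1],
]


def mean_centre(rows):
    """Subtract each column mean so PCA describes variation, not offset."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(centred):
    """Sample covariance matrix of mean-centred data (channels x channels)."""
    n = len(centred)
    p = len(centred[0])
    return [
        [sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_component(cov, iterations=200):
    """Power iteration: converges to the dominant eigenvector of cov."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance captured along v.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    return v, eigenvalue


centred = mean_centre(spectra)
cov = covariance(centred)
loadings, variance_pc1 = first_component(cov)

# Scores: each sample's coordinate along the first principal component.
scores = [sum(x * l for x, l in zip(row, loadings)) for row in centred]
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained = variance_pc1 / total_variance

print("PC1 loadings:", [round(l, 3) for l in loadings])
print("PC1 scores:", [round(s, 2) for s in scores])
print(f"variance explained by PC1: {explained:.1%}")
```

Each four-channel spectrum is reduced to a single score, and in this toy dataset the two groups of samples fall on opposite sides of zero along that one component.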
Classification analysis is used to discriminate between classes. It assesses which of several predefined classes an object belongs to, or identifies that a material does not belong to a given class.
Partial least squares-discriminant analysis (PLS-DA), k-nearest neighbour (k-NN), support vector machine (SVM) and [data-driven]-Soft Independent Modelling of Class Analogy ([DD]-SIMCA) are examples of classification algorithms used for discriminant analysis. The choice of approach depends on the question being asked and on the type of experimental data.
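Of the algorithms listed, k-NN is the simplest to show in a few lines. The sketch below assigns a sample to a class by majority vote among its k nearest training samples. The two-feature training data, the class labels and the function name knn_predict are hypothetical; in a real application the features would usually be derived from measured spectra.

```python
import math
from collections import Counter

# Hypothetical training set: (feature vector, class label).
# The two features stand in for, say, two band intensities or two scores.
training = [
    ([0.10, 0.80], "authentic"),
    ([0.15, 0.75], "authentic"),
    ([0.12, 0.82], "authentic"),
    ([0.60, 0.30], "adulterated"),
    ([0.65, 0.35], "adulterated"),
    ([0.58, 0.28], "adulterated"),
]


def knn_predict(sample, training, k=3):
    """Label a sample by majority vote among its k nearest training samples."""
    by_distance = sorted(training, key=lambda item: math.dist(sample, item[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


print(knn_predict([0.14, 0.78], training))
print(knn_predict([0.62, 0.31], training))
```

Note that k-NN always assigns one of the known classes. Recognising that a material belongs to none of them calls for a class-modelling approach, which this sketch does not cover.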