in Environmental and Engineering NIR Spectroscopy and Aquaphotomics Approach to Identify Soil Characteristics as a Function of the Sampling Depth

Soil is a very complex medium made of minerals, organic matter, microorganisms, air, and water. Vibrational spectroscopy techniques are exceptionally well-suited to be used with portable and hand-held devices. In this study, NIR spectroscopy of 2 mm) to remove skeletal particles, large roots, and organic debris. Data on soil spectra were collected in duplicate using the microNIR OnSite-W spectrometer (VIAVI Srl, Italy) in reflectance mode from 900 – 1,600 nm (50 scans; 125 reading points). Then, the data on the absorbance from 1,300 to 1,550 nm were statistically processed to construct the aquagrams and perform a PCA (95% confidence level). The suitability of using a portable instrument to identify the kind of soil in different areas, for example, in areas that have undergone desertification, can help in soil classification and the rapid and non-destructive analysis of its characteristics. The Aquaphotomics approach could detect considerable variation (particularly in the soil from Arborea) associated with the sampling depth.

Conversely, Jiang et al. [10] found that the predictive models constructed with the Vis-NIR spectra of soil samples from different depths performed better than those constructed with only superficial or deep samples. The authors attributed these findings to the high variability in the values of the chemical properties of the samples at different depths. Such a wide range of values allowed them to construct robust global predictive models. Fajardo et al. [13] used Vis-NIR along with discriminative clustering techniques to determine the physicochemical characteristics of soil profiles. The outcome of their analysis proved that the A horizon greatly differed from the underlying layers, mainly in the clay-related absorption both in the visible region, between 390 and 700 nm, and at 2,200 nm.
NIR spectroscopy (NIRS) has several advantages: minimal or no sample preparation is required; the sample is not destroyed or altered by the analysis; measurements are fast (a few seconds); a single spectrum, recorded in the lab or under in situ conditions, allows the estimation of several soil properties; NIR optics can be miniaturized for portable or hand-held use; no chemicals are required [14].
The last aspect is crucial for sustainable soil analysis and management. By excluding chemicals, this technique reduces harmful and hazardous waste and residues. Thus, NIRS has a lower negative impact on the environment than traditional analytical techniques [15]. According to the Sustainable Development Goals (SDGs) [16] scheduled in the 2030 Agenda, this technology is considered to protect the environment, including the soil. Goal 12 of the SDGs, which deals with "Sustainable consumption and production", encourages the adoption of an environmentally friendly approach to chemicals and waste [17]. The objectives are: to conduct eco-friendly management of chemicals and waste to minimize their negative impact on human health and the environment; to reduce waste production by prevention and reduction; to stimulate companies to embrace sustainable practices [17]. NIRS fits well within this framework.
NIRS studies the interaction of matter with light in the NIR region of the electromagnetic spectrum, from 750 nm to 2,500 nm. NIR spectra represent overtones of vibrational modes of functional groups containing X-H bonds, where X can be carbon, oxygen, or nitrogen. However, some specific absorption bands overlap, concealing some information. Thus, chemometric techniques are essential for constructing reliable and accurate predictive models from NIRS data [18,19].
Aquaphotomics is a novel scientific discipline that uses NIR measurements and multivariate analysis to determine the relationship between water absorption patterns and bio-functionalities. Tsenkova [20,21] stated, "Aquaphotomics investigates the water-light interactions in biological systems, exploiting the fact that changes in the water matrix reflect, like a mirror, the rest of the molecules the water surrounds". According to this approach, 12 water absorption ranges (each 6-20 nm wide), called Water Matrix Coordinates (WAMACs) and labeled Ci (i = 1-12), characterize the NIR spectra in the area of the first overtone of water (1,300-1,550 nm). Within the WAMACs, specific water absorbance bands (WABs) are associated with specific water molecular conformations (water species, water molecular structures) [20]. When a perturbation produces changes at specific WABs, such bands are considered to be "activated" by the perturbation. The selected WABs are then plotted in star charts, called "aquagrams", which depict the Water Absorption Spectral Patterns (WASPs) [21].
Several factors, such as the temperature of the sample and other soil constituents (e.g., salts), affect the position and the intensity of the water absorption peaks in the NIR spectrum. Thus, the differences in the composition of various soil profiles can be evaluated and monitored by studying the modifications of water absorption peaks [20].
Previous studies have mostly focused on Vis-NIR spectroscopy applied to soils. In this study, NIR spectroscopy was conducted between 900 and 1,700 nm.
NIRS and the holistic approach of Aquaphotomics were applied to: i) confirm that indirect measurement of NIR spectral variations of water could distinguish between different soil samples; ii) verify their potential in identifying soil samples from different depths.

Study Areas and Chemical Analyses
Soil samples (n = 95) were collected from three study areas of Sardinia (Italy): Arborea (A), Berchidda (B), and Ottava (O). The characteristics of the three sites are shown in Table 1. Site O (n = 28) hosted an artichoke organic cropping system; the soil was a calcareous clay-loam with pH 7.7.
Soil samples were collected from different depths: B and O soil samples were collected between 0 and 20 cm, while soil A samples were collected between 0 and 100 cm.
The soil samples were dried and sieved, following which the carbon and nitrogen content of the samples was determined using an elemental analyzer (LECO CHN 628), according to the method described by Mura et al. [22].
Data on Soil Organic Carbon (SOC) and Soil Nitrogen (SN) were analyzed by one-way analysis of variance (ANOVA) using the Statgraphics ver. 5.1 (Manugistics Inc., Rockville, MD, USA) software package to compare the means. ANOVA was performed to compare the chemical data of the three types of soil and those of the soil A samples at different depths. The differences among the sample means were determined by performing Tukey's HSD test at a significance level of 5% (95% confidence level).
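The F statistic underlying a one-way ANOVA can be sketched in a few lines. This is a minimal illustration only, not the Statgraphics procedure used in the study: the function name `one_way_anova_f` is hypothetical, the three small groups are placeholder values rather than measured SOC or SN data, and the Tukey HSD post-hoc step is not reproduced.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists of numeric observations, one list per group.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)

    ss_between = 0.0  # variation of the group means around the grand mean
    ss_within = 0.0   # variation of the observations around their group mean
    for g in groups:
        mean_g = sum(g) / len(g)
        ss_between += len(g) * (mean_g - grand_mean) ** 2
        ss_within += sum((v - mean_g) ** 2 for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical placeholder values (NOT data from the study)
example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(example_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F relative to the critical value for the given degrees of freedom indicates that at least one group mean differs; a post-hoc test such as Tukey's HSD is then needed to say which ones.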

NIR Spectra Acquisition and Data Analyses
NIR spectra were recorded in the reflectance mode with the MicroNIR OnSite-W (VIAVI Solutions Italia S.r.l., Monza, Italy) portable spectrometer between 908 and 1,676 nm (50 scans; 125 reading points). Two measurements were made for each soil sample placed in Petri dishes (5 cm diameter) at room temperature. Before the analysis, the instrument was calibrated in the dark, in the air, and against a white standard.
Aquaphotomics was applied following the method by Tsenkova et al. [23] and, as described by Cattaneo et al. [24], the NIR absorbance spectra were pretreated by applying the Savitzky-Golay second derivative (second-order polynomial fit and 21 points). Multiplicative scatter correction (MSC) was applied to remove any possible scatter effects [23].
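The two pretreatments named above can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the authors' processing chain: the function names are hypothetical, the spectra are synthetic, the mean spectrum is assumed as the MSC reference, and the order in which the two steps are chained in the demo is an assumption, since the text does not state it unambiguously.

```python
import numpy as np


def savgol_second_derivative(spectra, window=21, polyorder=2, delta=1.0):
    """Savitzky-Golay second derivative along the wavelength axis.

    Rows are spectra. Edge points (half a window on each side) are dropped
    rather than extrapolated. Requires polyorder >= 2.
    """
    half = window // 2
    x = np.arange(-half, half + 1) * delta
    # Least-squares polynomial fit; columns are x^0, x^1, ..., x^polyorder
    design = np.vander(x, polyorder + 1, increasing=True)
    # Row 2 of the pseudo-inverse gives the x^2 coefficient; d2/dx2 = 2 * a2
    weights = 2.0 * np.linalg.pinv(design)[2]
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    return np.array([np.correlate(row, weights, mode="valid") for row in spectra])


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Each spectrum is regressed on the reference (row = slope * ref + intercept)
    and corrected as (row - intercept) / slope.
    """
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        slope, intercept = np.polyfit(ref, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected


# Synthetic demo: 125 reading points between 908 and 1,676 nm, as in the text
wavelengths = np.linspace(908, 1676, 125)
band = np.exp(-((wavelengths - 1430.0) / 40.0) ** 2)  # invented water-like band
raw = np.vstack([1.2 * band + 0.05, 0.9 * band + 0.10, 1.0 * band + 0.00])

pretreated = savgol_second_derivative(msc(raw), window=21, polyorder=2)
print(raw.shape, pretreated.shape)
```

With a 21-point window, each spectrum loses 10 points at either end in this sketch, which is one reason the usable range after pretreatment is narrower than the acquired range.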
The spectra were then normalized using the formula:

$$A'_{\lambda} = \frac{A_{\lambda} - \mu_{\lambda}}{\sigma_{\lambda}}$$

Here, A'λ is the normalized absorbance, Aλ is the absorbance at a specific wavelength λ, µλ is the mean value of all spectra at wavelength λ, and σλ is the standard deviation of all spectra at wavelength λ. The mean of the resulting normalized spectra was determined using MS Excel® (Office 365, Microsoft Italia, Milan, Italy). Aquagrams were constructed from the normalized spectral data. An aquagram is a star-shaped diagram that shows, on axes with the same origin, the mean absorbances of the samples at 12 wavelengths selected from the normalized spectra. The chart represents the water absorbance pattern under certain physiological conditions of biological systems, and it is a useful tool for easily identifying abnormalities in the system described by its water absorbance pattern [21].
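The normalization and the aquagram values can be sketched as below: each absorbance has the per-wavelength mean subtracted and is divided by the per-wavelength standard deviation, and the group means at the selected bands become the aquagram axes. The function names, the use of the sample standard deviation (`ddof=1`), the nearest-wavelength lookup, and the random spectra are all assumptions made for illustration. The band list holds the eleven values given in the text; no twelfth value is assumed here.

```python
import numpy as np


def standardize_per_wavelength(spectra):
    """Subtract the mean and divide by the standard deviation at each wavelength."""
    spectra = np.asarray(spectra, dtype=float)
    mu = spectra.mean(axis=0)
    sigma = spectra.std(axis=0, ddof=1)  # sample standard deviation (assumption)
    return (spectra - mu) / sigma


def aquagram_values(normalized, wavelengths, selected, labels):
    """Mean normalized absorbance per group at the bands nearest to `selected`."""
    normalized = np.asarray(normalized, dtype=float)
    wavelengths = np.asarray(wavelengths, dtype=float)
    labels = np.asarray(labels)
    # Index of the instrument wavelength closest to each selected band
    idx = [int(np.argmin(np.abs(wavelengths - w))) for w in selected]
    return {
        str(g): normalized[labels == g][:, idx].mean(axis=0)
        for g in np.unique(labels)
    }


# Bands as listed in the text (nm)
WABS = [1342, 1364, 1374, 1412, 1426, 1440, 1452, 1460, 1476, 1488, 1512]

# Synthetic demo data (NOT spectra from the study)
rng = np.random.default_rng(0)
wavelengths = np.linspace(908, 1676, 125)
spectra = rng.normal(loc=0.5, scale=0.05, size=(9, 125))
labels = ["A"] * 3 + ["B"] * 3 + ["O"] * 3

normalized = standardize_per_wavelength(spectra)
axes = aquagram_values(normalized, wavelengths, WABS, labels)
for soil, values in axes.items():
    print(soil, np.round(values, 2))
```

Plotting each group's values on a radar chart with one axis per band then gives the star-shaped diagram described above.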
Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) were performed on spectral data using the Minitab 17 software (Minitab, Inc., State College, PA, USA). PCA is a linear, unsupervised pattern-recognition technique for analyzing, classifying, and reducing the dimensionality of numerical datasets in multivariate problems [25]. PCA allows the extraction of useful information from the dataset, the exploration of its structure, and the global correlation of the variables. The analysis was performed using the correlation matrix to have the same weight for all variables.
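A PCA on the correlation matrix can be sketched as an eigendecomposition, which also shows why this choice gives every variable the same weight: rescaling a variable leaves the result unchanged. The code below is a NumPy illustration with a hypothetical function name and synthetic data; it is not the Minitab routine used in the study.

```python
import numpy as np


def pca_correlation(X):
    """PCA via eigendecomposition of the correlation matrix.

    Returns (scores, eigenvectors, explained_variance_ratio), with components
    sorted by decreasing eigenvalue.
    """
    X = np.asarray(X, dtype=float)
    # Standardizing the columns is equivalent to working on the correlation matrix
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Z @ eigvecs
    explained = eigvals / eigvals.sum()
    return scores, eigvecs, explained


# Synthetic demo data (NOT data from the study)
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
X[:, 1] = 2.0 * X[:, 0] + 0.1 * rng.normal(size=30)  # two correlated variables

scores, eigvecs, explained = pca_correlation(X)
print(np.round(explained, 3))

# Rescaling one variable (e.g. changing its units) does not change the result
_, _, explained_rescaled = pca_correlation(X * np.array([1.0, 1000.0, 1.0, 1.0]))
print(np.allclose(explained, explained_rescaled))
```

The cumulative sum of `explained` over the retained components corresponds to the "explained variance" figures quoted for the score plots.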
Linear Discriminant Analysis [25,26] is a classification procedure that maximizes the variance between categories and minimizes the variance within categories. It was performed to: i) compare the effectiveness of the variables recorded by the NIR sensor for determining differences among soil samples and their depths; ii) investigate how variables contribute to group separation. PCA was initially performed using the whole filtered and normalized NIR spectrum (1,298-1,676 nm) and then recalculated using only the 12 water absorption bands (WABs) plotted in the aquagrams. Finally, another PCA model was constructed by adding the SOC and SN contents as additional variables.
The NIR classification ability was evaluated by performing LDA on five groups of soil: Arborea samples at 0-20 cm depth, Arborea samples at 20-50 cm depth, Arborea samples at more than 50 cm depth, Berchidda samples, and Ottava samples. The LDA was performed using the variables that showed the highest loadings on the first three Principal Components (PCs) of the PCA.
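The classification step can be sketched with linear discriminant functions built on a pooled within-group covariance matrix. The sketch assumes equal prior probabilities for all groups, uses hypothetical function names and synthetic two-variable data, and lays out the classification matrix with predicted groups in rows and actual groups in columns; none of these choices is confirmed by the text, and the code is not the Minitab procedure itself.

```python
import numpy as np


def fit_lda(X, y):
    """Fit linear discriminant functions score_c(x) = const_c + coef_c . x."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    n, p = X.shape
    pooled = np.zeros((p, p))
    means = {}
    for c in classes:
        Xc = X[y == c]
        means[c] = Xc.mean(axis=0)
        centered = Xc - means[c]
        pooled += centered.T @ centered
    pooled /= n - len(classes)  # pooled within-group covariance
    inv = np.linalg.inv(pooled)
    coefs = {c: inv @ means[c] for c in classes}
    consts = {c: -0.5 * means[c] @ inv @ means[c] for c in classes}  # equal priors
    return classes, coefs, consts


def predict_lda(model, X):
    """Assign each row of X to the group with the highest discriminant score."""
    classes, coefs, consts = model
    X = np.asarray(X, dtype=float)
    scores = np.column_stack([consts[c] + X @ coefs[c] for c in classes])
    return classes[scores.argmax(axis=1)]


def classification_matrix(actual, predicted, classes):
    """Counts with predicted groups in rows and actual groups in columns."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = np.zeros((len(classes), len(classes)), dtype=int)
    for a, p in zip(actual, predicted):
        matrix[index[p], index[a]] += 1
    return matrix


# Synthetic demo data (NOT data from the study): two well-separated groups
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (20, 2)), rng.normal(5.0, 0.5, (20, 2))])
y = np.array(["group_1"] * 20 + ["group_2"] * 20)

model = fit_lda(X, y)
predicted = predict_lda(model, X)
matrix = classification_matrix(y, predicted, model[0])
print(matrix)
print("proportion correct:", np.trace(matrix) / matrix.sum())
```

The entries of `coefs` play the role of the linear discriminant function coefficients: a larger coefficient for a variable in one group's function means that variable pushes samples more strongly toward that group.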

Results and Discussion
The SOC and SN contents of the different soil samples are presented in Table 2. The average SOC content in the 0-20 cm soil layer at the three sites was 11.91 g/kg, 29.90 g/kg, and 15.58 g/kg for the intensive forage system (A), the agro-silvopastoral area (B), and the artichoke organic cropping system (O), respectively. The average SN contents were 1.08 g/kg at Arborea (A), 2.05 g/kg at Berchidda (B), and 1.47 g/kg at Ottava (O) (Table 2). The composition of the three soils was significantly different (95% confidence level). The characteristics of the analyzed soil samples showed high variability, indicating significant differences in the characteristics and morphology of the different soil samples. Such differences might be related to variations in land use, vegetation cover, and specific climatic conditions [27]. Soil A samples from 20-50 cm depth showed SOC and SN contents similar to those of the superficial samples (Table 2), i.e., the two groups were not significantly different in SOC and SN content (ANOVA, 95% confidence level). However, the samples from the deeper layers showed significant differences in the SOC and SN content (ANOVA, 95% confidence level).
The raw spectra of the soil samples are shown in Figure 1a. The absorbance spectra were characterized by a flat profile throughout the spectral range with a strong absorbance peak around 1,430 nm. This feature was also detected by Oliveira et al. [28] and represented the first overtone of the -OH stretching modes associated with water [23]. The spectral differences among the samples were more evident after applying chemometric pretreatments, highlighting substantial variations in the region between 1,300 and 1,600 nm (Figure 1b).

Figure 1
The NIR spectra of soil samples: (a) raw and (b) second derivative (in the 1,300-1,600 nm range) (see Table 1 for details).
The spectra of samples at different depths differed due to soil moisture, organic matter, and soil texture. Zhang et al. [12] reported higher absorbances for the more superficial samples that were rich in organic matter and water compared to the samples below in the intermediate layer that were characterized by more sand and less water. The deepest samples, which were more clayey and moist, showed higher absorbances than the intermediate ones. These results suggested that Aquaphotomics can be used to determine the differences among various types of soil from different depths.
The variability of each type of soil was monitored while constructing the aquagrams. Based on the absorption bands detected in the normalized spectra following the method of Tsenkova et al. [23], 12 wavelengths were selected (WABs) as 1,342, 1,364, 1,374, 1,412, 1,426, 1,440, 1,452, 1,460, 1,476, 1,488, and 1,512 nm. Each wavelength was ascribed to specific species of water molecules. The aquagrams of the three types of soil (Figure 2) showed similar profiles characterized by high absorption at 1,342, 1,364, and 1,374 nm, corresponding to free water molecules. The samples also showed lower absorbance values at 1,412, 1,426, and 1,440 nm related to trapped water, free water, and water with free -OH [23,29]. Soil A showed a more heterogeneous profile for the layers sampled at different depths from 0 to 100 cm.

Figure 2
The aquagrams of the soil from the three study areas.
The samples from Arborea were investigated in greater detail. A new star chart was constructed with the average absorbances of the three subgroups of samples according to their sampling depth: i. 0-20 cm, ii. 20-50 cm, and iii. 50-100 cm (Figure 3).

Figure 3
The aquagram of the subgroups of soil A at different sampling depths.
The chart showed that the profiles of the samples collected up to a depth of 50 cm overlapped. Samples from the deeper layers (>50 cm) showed higher absorbance values at 1,342 nm and between 1,460 and 1,488 nm. They were also characterized by lower absorbance values at 1,412 and 1,426 nm. As reported in a study [1], the composition of the soil differed depending on the depth. The uppermost layers, i.e., the O and A horizons, had a high content of organic matter and minerals from the parent material. The E horizon demonstrated a loss of silicate clay, iron, and aluminum due to flowing water, while the deepest layer, i.e., the B horizon, was generally denser and lower in organic matter. This is the layer where the leached materials from the A and E horizons accumulate. The organic matter in soil A decreased with an increase in depth, as found in a previous study [1]. This gradient affected the solvation properties of water, which changed the profile of the spectra that the aquagrams highlighted.
The score plots and the loading plots obtained after preliminary PCA processing of the standardized data of all the absorption bands are shown in Figure 4. Overall, the explained variance was 80.4%, which increased to 84.6% after reprocessing using the wavelengths that affected the principal components the most. In both cases, the PCA did not represent the soil samples well, indicating that a different approach was required to select the variables.

Figure 4
The score plots (left) and loading plots (right) of the PCA of the soil samples for the entire NIR range (above) and the absorption bands influencing the PCs the most (below).
The score plot and the loading plot of the PCA performed using the WABs resulting from the construction of the aquagrams, which showed the most characteristic wavelengths acting on the distribution of the samples, are presented in Figure 5. The score plot showed a separation along PC1 between soil samples B and O. The loadings indicated that this separation was based on the wavelengths from 1,412 to 1,440 nm, characterizing soil B. One group of samples belonging to soil A was superimposed on group B, while some samples of group A were separated from all the others, showing higher values for PC2. The separation was mainly based on the wavelengths at 1,374 nm on PC1 and 1,460 nm on PC2.
The separation of the samples using NIR based on the sampling area confirmed the preliminary results of another study [22]. The authors found that NIR could effectively determine the differences among three different soil samples collected at the same depth (total explained variance was over 85%) using the whole spectral range between 908 and 1,676 nm.
Another PCA, performed after including the SOC and SN contents as additional variables, allowed further investigation of the subgroup of soil A samples (Figure 6).

Figure 6
A PCA performed with soil samples using the 12 WABs along with the contents of SOC and SN; the score plot is on the left and the loading plot is on the right.
This processing showed improved global separation among soils on PC1 compared to Figure 5. However, it did not provide further information regarding the subgroup of samples from Arborea that had variable and negative PC2 scores. The PCA scores and the loading plot showed that this sample subgroup was characterized by lower SOC and SN values. The samples from the deepest layers showed significantly lower SOC and SN content than those from the superficial layers (Table 2). According to their increasing sampling depths, these samples were distributed along PC1 from right to left, i.e., more negative values of PC1 were associated with samples from deeper layers. These results allowed us to determine the differences between the top layer samples (0-20/25 cm) and the deeper ones (45-90 cm), based on the wavelengths at 1,488 and 1,476 nm (Figure 3). These samples had water molecules with strong hydrogen bonds. Soil A was sandy (over 60% sand). It had low organic matter but the best drainage. As the soil was light and porous, the water passed quickly from the top layers to the deeper layers [30].
An LDA was performed using the variables of the PCA with the highest loadings on the first three PCs (1,374, 1,342, 1,426, and 1,460 nm). The discriminant function coefficients and the classification matrices obtained through the analysis are reported in Tables 3 and 4.

Table 3
The linear discriminant function coefficients from the variables of the Aquaphotomics approach.

Table 4
The classification matrix of the LDA performed at four wavelengths for "Arborea 0-20" (n = 27), "Arborea 20-50" (n = 4), "Arborea >50" (n = 6), "Berchidda 0-20" (n = 30), and "Ottava 0-20" (n = 28).
The coefficients of the linear discriminant functions (Table 3) showed how the predictor variables differentiated among the groups. For example, "Ottava 0-20" had the largest linear discriminant function coefficients (7,498 and 2,999) for absorbances at 1,374 and 1,460 nm, which indicated that the scores for that group contributed more than those of Arborea and Berchidda to group membership classification. Similarly, the absorbance at 1,426 nm helped to classify the "Arborea 20-50" samples, and that at 1,342 nm helped to classify the "Arborea >50" samples.

The LDA achieved a moderately good classification rate, with 77 out of 95 items classified correctly (81.0%) (Table 4). The best classification performance was for the "Ottava 0-20" and "Arborea 20-50" samples, followed by that of the "Arborea >50" samples. Most of the misclassifications occurred for the topsoil samples of Arborea and Berchidda, suggesting that their current land use (Table 1) and chemical contents (Table 2) might give rise to similar NIR characteristics. This was also evident from the squared distances between groups, which showed the similarity of the topsoil of Arborea and Berchidda (Table 4).
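The reported classification rates can be checked with simple arithmetic. The group sizes below are those given in the caption of Table 4, and the counts of correctly classified items (77 for the four wavelengths alone, 88 for the model that also includes SOC) are the figures stated in the text; the function name is a hypothetical helper.

```python
# Group sizes as given in the caption of Table 4
group_sizes = {
    "Arborea 0-20": 27,
    "Arborea 20-50": 4,
    "Arborea >50": 6,
    "Berchidda 0-20": 30,
    "Ottava 0-20": 28,
}


def classification_rate(correct, total):
    """Percentage of correctly classified items."""
    return 100.0 * correct / total


total = sum(group_sizes.values())
rate_wavelengths_only = classification_rate(77, total)  # four wavelengths
rate_with_soc = classification_rate(88, total)          # four wavelengths + SOC

print(total, total - 77, total - 88)  # 95 18 7
print(f"{rate_wavelengths_only:.2f}% {rate_with_soc:.2f}%")  # 81.05% 92.63%
```

The group sizes sum to the 95 samples of the study, and the two rates correspond to 18 and 7 misclassified items, respectively. The first rate is 81.05% before rounding, which the text reports as 81.0%.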
Including the SOC for classification significantly improved the classification rate (Table 5) from 81.0% to 92.6% (i.e., 88 out of 95 items were classified correctly).

Table 5
The classification matrix of the LDA constructed including the SOC with the four wavelengths (the size of the groups did not change).
Comparing the results presented in Tables 4 and 5 showed that including the SOC improved the accuracy of classification for the Berchidda 0-20 samples, with improved squared distances from all other samples. However, there were still some issues regarding the accuracy of the classification of the Arborea samples according to their sampling depth. Most studies have performed Vis-NIR spectroscopy [13,28,31-33]. In this method, the visible portion of the spectrum greatly contributes to the classification models as it is linked to the colorimetric aspects. We showed that even NIR, limited to the range of the first water overtone, could determine differences among soil samples collected at different depths. Indeed, our results agree with the work of Chen et al. [31] regarding the use of Vis-NIR spectroscopy combined with the 'multiple objectives mixed support vector classification' (MOM-SVC) method to classify soil profiles. They found that the method had good classification power, which improved with the addition of data on the physicochemical characteristics of the soil, such as SOC, to the model.

Conclusions
NIR spectroscopy coupled with Aquaphotomics could successfully determine the soil structure and highlight the significant role of water molecules in assessing 'natural' systems.
The structure of water explains intermolecular and intramolecular forces, most of which arise due to the dipole moment of water. Such forces influence many properties that regulate the flow of matter and energy [30].
A portable instrument to identify the kind of soil in different areas, for example, in areas that have undergone desertification, could be useful for soil classification as well as for the rapid and non-destructive analysis of soil characteristics. Using such a technique without chemical reagents and waste disposal is important for developing sustainable and eco-friendly methods. In this context, NIRS can be considered 'green' as it is in line with the SDGs regarding sustainability.
The Aquaphotomics approach also detected considerable variability in soil properties associated with the sampling depth. The availability of Aquaphotomics-dedicated software could help researchers to monitor soil conditions. Aquaphotomics can save energy and additional costs together with other methods of evaluation, such as precision maps, prescription maps, and other precision agriculture tools.
The information could be used to develop new flowsheets of precision agriculture, such as irrigation procedures and soil nutrient distributions related to the frequency and amount of fertilizer applied.
This approach allows the collection of information that is currently not easy to explain. Thus, additional studies are required to comprehensively understand how the vibration of water molecules influences soil composition.