Abstract
Introduction Mapping the spatial distribution of soil taxonomic classes is important for informing soil use and management decisions. Digital soil mapping (DSM), the computer-assisted production of digital maps of soil types and soil properties, can predict this distribution quantitatively. It typically relies on mathematical and statistical models that combine information from soil observations with information contained in correlated environmental variables and remote sensing images. Machine learning is a general term for a broad set of models used to discover patterns in data and to make predictions. Although machine learning is most often applied to large databases, it is an attractive tool for making spatial predictions of soil classes because the relationships between soil classes and environmental covariates are often poorly understood. Our objective was to compare multiple machine learning models (multinomial logistic regression, boosted regression trees and decision tree) for predicting soil great groups in the Bam district of Kerman province.
Materials and Methods The study area, Bam district, is located between 58°4΄17˝ and 58°28΄8˝ E longitude and 28°52΄51˝ and 29°9΄29˝ N latitude (Fig. 1) in Kerman province, southeastern Iran. The area is surrounded by mountains (dominantly limestone and volcanic) running from northwest to southeast, and its major landforms include young alluvial fans and pediments, clay flats and hills. The mean annual precipitation, temperature and potential evapotranspiration are 64 mm, 23.8 °C and 3000 mm, respectively, with aridic soil moisture and hyperthermic soil temperature regimes. A stratified sampling scheme was defined over 100,000 hectares, and 126 soil profiles were excavated and described according to the Keys to Soil Taxonomy. Three machine learning models were used to predict soil taxonomic classes at the great group level: multinomial logistic regression (MLR), boosted regression trees (BRT) and decision tree (DT). We used an 80/20 training/testing split (80% of the pedon observations for model training and 20% for model testing). Kappa index (KI), overall accuracy (OC), Brier score (BS), user accuracy (UA) and producer accuracy (PA) were used to compare model accuracy.
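The accuracy measures named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the observed and predicted labels are toy values invented for illustration (not results from this study), and the Brier score is assumed to take the common multi-class form (mean over samples of the summed squared differences between predicted probabilities and the 0/1 class indicator).

```python
def confusion_matrix(observed, predicted, classes):
    """Rows are observed classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for obs, pred in zip(observed, predicted):
        matrix[index[obs]][index[pred]] += 1
    return matrix


def overall_accuracy(matrix):
    """Share of test pedons whose predicted class matches the observed one."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total


def kappa_index(matrix):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    p_observed = overall_accuracy(matrix)
    p_expected = sum(row_totals[k] * col_totals[k] for k in range(n)) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)


def producer_accuracy(matrix, k):
    """Correct predictions of class k divided by all observations of class k."""
    row_total = sum(matrix[k])
    return matrix[k][k] / row_total if row_total else float("nan")


def user_accuracy(matrix, k):
    """Correct predictions of class k divided by all predictions of class k."""
    col_total = sum(matrix[i][k] for i in range(len(matrix)))
    return matrix[k][k] / col_total if col_total else float("nan")


def brier_score(observed, probabilities, classes):
    """Multi-class Brier score (assumed form); 0 is a perfect forecast."""
    total = 0.0
    for obs, probs in zip(observed, probabilities):
        total += sum(
            (probs.get(c, 0.0) - (1.0 if c == obs else 0.0)) ** 2 for c in classes
        )
    return total / len(observed)


# Toy test set (illustrative values only, not data from the study).
classes = ["Haplosalids", "Haplocalcids", "Torriorthents"]
observed = ["Haplosalids", "Haplosalids", "Haplocalcids",
            "Haplocalcids", "Torriorthents", "Torriorthents"]
predicted = ["Haplosalids", "Haplosalids", "Haplocalcids",
             "Torriorthents", "Torriorthents", "Torriorthents"]

matrix = confusion_matrix(observed, predicted, classes)
print("OC:", overall_accuracy(matrix))
print("KI:", kappa_index(matrix))
for k, name in enumerate(classes):
    print(name, "PA:", producer_accuracy(matrix, k), "UA:", user_accuracy(matrix, k))
```

In this sketch, producer accuracy reads along the rows of the confusion matrix (how often an observed great group is found by the model) and user accuracy reads down the columns (how often a mapped great group is correct), which is why the two can differ for the same class.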
Results and Discussion The profile descriptions revealed two soil orders, Entisols and Aridisols, subdivided into six suborders and eight great groups: Haplosalids, Haplocambids, Haplocalcids, Haplogypsids, Calcigypsids, Calciargids, Petrocalcids and Torriorthents. This testifies to the wide pedodiversity of the study area. Results showed that the geomorphology map contributed importantly to the prediction accuracy. This can be explained by the fact that the geomorphological surfaces formed recently, or during a geological period when soil formation took place under conditions close to those of current processes in arid regions. After the geomorphic surfaces, terrain attributes and finally remote sensing indices were the most important predictors. The best prediction result was obtained when characteristics derived from terrain, remote sensing and geomorphological processes were used together and when the geomorphological processes were differentiated and the overall heterogeneity of the study area was identified and stratified. In areas where the distribution of predictors was more homogeneous, the models could better relate predictors to the response. The spatial distribution of soils in the study area followed the distribution pattern of most geomorphological and terrain attributes. The model comparison indicated that the decision tree was consistently the most accurate. Among the soil great groups, prediction accuracy was highest for Haplosalids, Calcigypsids and Petrocalcids and lowest for Haplocalcids in all three approaches. As a reliable and flexible approach, the decision tree could be used successfully to prepare continuous digital soil maps.
Conclusion The application of decision trees to the prediction of soil types could be a promising alternative. In digital soil mapping, the best prediction result was obtained when parameters derived from terrain, remote sensing and geomorphological processes were used together and when the geomorphological processes were differentiated and the overall heterogeneity of the study area was identified and stratified. In areas where the distribution of predictors was more homogeneous, the models could better relate predictors to the response. Altogether, an extended digital terrain analysis approach combined with a clear description of geomorphological, geological and pedological processes could be a promising key technology in future soil mapping.