Article type: Applied

Authors

1 PhD graduate, Department of Soil Science and Engineering, Faculty of Agriculture, Shahid Chamran University of Ahvaz, Ahvaz, Iran

2 Professor, Department of Soil Science and Engineering, Faculty of Agriculture, Shahid Chamran University of Ahvaz, Ahvaz, Iran

3 Professor, Department of Remote Sensing and GIS, Faculty of Earth Sciences, Shahid Chamran University of Ahvaz, Ahvaz, Iran

Abstract

Knowledge of the spatial distribution of soil organic carbon is an effective step toward sustainable land use and toward defining the related management strategies. This study therefore aimed to model and digitally map organic carbon in the surface soil (0-10 cm) of Semirom County using random forest regression and multiple linear regression. To this end, 200 surface soil samples were collected on a regular grid with a sampling spacing of 5 km × 5 km across the area, and the organic carbon content of the samples was measured by the Walkley-Black method. Finally, the digital map of surface soil organic carbon was produced with these two methods in the RStudio software environment, using auxiliary variables derived from the digital elevation model and Landsat 8 satellite images. The findings indicate that the random forest algorithm, with RMSE and R2 values of 0.12 and 0.79, respectively, predicted soil organic carbon better than multiple linear regression, which gave RMSE and R2 values of 0.192 and 0.57. The results showed that the most important environmental variables affecting the distribution of soil organic carbon in the study area are not the same in the two models: indices derived from vegetation cover played the larger role in the random forest model, whereas topographic indices played the larger role in multiple linear regression. Examination of the final map of soil organic carbon distribution in the study area showed that, although the random forest estimates were better than those of multiple linear regression, the method was not successful in estimating the minimum and maximum values of surface soil organic carbon.

Article title [English]

Modelling and Digitally Mapping of Surface Soil Organic Carbon in Semirom County Employing Several Machine Learning Algorithms

Authors [English]

  • Fatemeh Rahmati 1
  • Saeid Hojati 2
  • Kazem Rangzan 3
  • Ahmad Landi 2

1 Former PhD Student, Department of Soil Science, Faculty of Agriculture, Shahid Chamran University of Ahvaz, Ahvaz, Iran

2 Professor, Department of Soil Science, Faculty of Agriculture, Shahid Chamran University of Ahvaz, Ahvaz, Iran

3 Professor, Department of Remote Sensing and GIS, Faculty of Earth Sciences, Shahid Chamran University of Ahvaz, Ahvaz, Iran

Abstract [English]

Introduction: Knowledge of the spatial distribution of soil organic carbon (SOC) is a practical tool for determining sustainable land management strategies. Estimating carbon contents and stocks is important for carbon sequestration, greenhouse gas emissions, and national carbon balance inventories. Accurate mapping of the spatial distribution of SOC is a key prerequisite for soil resource management and land use planning. During the last two decades, data mining approaches based on machine learning algorithms have been widely used in the spatial modeling of SOC. Digital applications require continuous soil maps at local and regional scales; however, such information is not always available at the required scale. The digital soil mapping (DSM) approach is therefore a key solution for quantifying and assessing the variation of soil properties such as SOC, using remotely sensed indices and the digital elevation model (DEM) as the most commonly used ancillary data for SOC prediction. In this approach, data mining techniques are the pathway to creating digital soil maps. This study was therefore carried out to compare two common machine learning algorithms, random forest and multiple linear regression, in the digital mapping of surface SOC in Semirom County, Isfahan Province. Digital maps of SOC were created with both algorithms, and the most important variables affecting the distribution of SOC in the study area were identified.



Materials and Methods: A total of 200 surface soil samples (0-10 cm) were collected from the Semirom area (51° 17' - 52° 3' E; 30° 42' - 31° 51' N), Isfahan, Iran. Based on reports from the synoptic meteorological station, the mean annual temperature ranges from 7.5 to 12.5 °C and the annual precipitation from 350 to 450 mm. The soil moisture and temperature regimes are Xeric and Mesic, respectively. Sampling points were located with the Global Positioning System (GPS), and samples were taken from the surface soil layer (0-10 cm). The samples were air-dried, crushed, and passed through a 2 mm sieve. The organic carbon content of the samples was then determined by the Walkley-Black method. In addition, to evaluate the effect of other soil properties on the organic carbon content of the soils, the saturated soil moisture content, soil texture, pH of the saturated paste, electrical conductivity of the saturation extract, and calcium carbonate equivalent were measured using standard laboratory protocols.

In this research, auxiliary variables including terrain parameters and vegetation indices were derived from the digital elevation model (DEM) and Landsat 8 OLI satellite images using ArcMap version 10.4.10 and SAGA GIS version 6.0.4. All auxiliary layers were then converted to raster format using the “raster” package and stacked using the “Covstack” function. Afterwards, the values of all environmental covariates at each sampling point were extracted into a single file using the “extract” function of the “sp” package in the RStudio environment. Principal component analysis (PCA) in SPSS v.19 was then used to select, from the 29 auxiliary variables considered in this research, the most important covariates for the modeling process. The dataset was then split into calibration (80%) and validation (20%) subsets. Finally, the SOC contents of the soils were predicted and mapped using the multiple linear regression (MLR) and random forest (RF) algorithms in the RStudio environment. The MLR and RF algorithms were run using the “lm” function and the “randomForest” package, respectively. Five statistics were used to evaluate the performance of each model: the coefficient of determination (R2), bias, root mean square error (RMSE), normalized RMSE (nRMSE), and mean bias error (MBE).
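
The 80/20 calibration/validation split and the evaluation statistics named above can be sketched as follows. This is a minimal illustration in standard-library Python, not the R workflow used in the study. The function names are hypothetical, and the formulas are assumptions because the abstract does not define them: R2 is taken as 1 - SS_res/SS_tot, nRMSE as RMSE divided by the observed mean, and MBE as the mean of predicted minus observed values.

```python
import math
import random


def split_calibration_validation(records, validation_fraction=0.2, seed=42):
    """Shuffle records and split them into calibration and validation subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    # The first n_validation shuffled records form the validation subset.
    return shuffled[n_validation:], shuffled[:n_validation]


def evaluation_statistics(observed, predicted):
    """Return R2, RMSE, nRMSE and MBE for paired observed/predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    n = len(observed)
    mean_obs = sum(observed) / n
    errors = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0 or mean_obs == 0:
        raise ValueError("R2 and nRMSE are undefined for these observations")
    rmse = math.sqrt(ss_res / n)
    return {
        "R2": 1.0 - ss_res / ss_tot,  # assumed definition: 1 - SS_res / SS_tot
        "RMSE": rmse,
        "nRMSE": rmse / mean_obs,     # assumed normalization: observed mean
        "MBE": sum(errors) / n,       # assumed sign: predicted - observed
    }


if __name__ == "__main__":
    calibration, validation = split_calibration_validation(range(200))
    print(len(calibration), len(validation))
    print(evaluation_statistics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))
```

With 200 samples, the split yields 160 calibration and 40 validation records. In practice the statistics would be computed on the validation subset only, once for each model.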



Results and Discussion

Based on the descriptive analysis of the soil samples, the soils of the study area were characterized as non-saline, alkaline, and calcareous. The SOC contents of the soils ranged from 0.3% to 2.2%, with a mean value of 0.89%. The coefficient of variation of the SOC contents was 21.7%, which places the soils of the study area in the moderate variability class according to the values proposed by Wilding (1985). The PCA results showed that the most important auxiliary variables for the modeling process were slope aspect, channels network base level, catchment slope, total curvature, height, longitudinal curvature, mass balance index, modified catchment area, slope degree, slope length, topographic position index, vertical distance to channel network, soil adjusted vegetation index, transformed vegetation index, difference vegetation index, ratio vegetation index, and general curvature. These variables explained 80% of the total variance over the study area. The comparison of the two SOC prediction models demonstrated that the RF model (ntree = 1000 and mtry = 10), with R2, RMSE, nRMSE, and bias values of 0.79, 0.12, 0.13, and 0.002, respectively, performed better than the MLR model in this study. Among the five most important variables identified by the RF algorithm for predicting SOC contents over the study area were the transformed vegetation index, ratio vegetation index, soil adjusted vegetation index, and slope degree. The final map of the surface SOC distribution over the study area shows that, although the RF algorithm provided better estimates than the MLR model, it overestimated the minimum and underestimated the maximum values of the surface SOC contents.
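
The variability classification mentioned above can be illustrated with a short sketch. The cut-offs used here (below 15% low, 15% to 35% moderate, above 35% high) are an assumption: they are the values commonly attributed to Wilding (1985) in the soil science literature, and the abstract itself does not state them. The function names are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Return the coefficient of variation (sample SD / mean) in percent."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("mean is zero; the coefficient of variation is undefined")
    return 100.0 * statistics.stdev(values) / mean


def variability_class(cv_percent, low=15.0, high=35.0):
    """Classify a CV (%) as low, moderate, or high using assumed cut-offs."""
    if cv_percent < low:
        return "low"
    if cv_percent <= high:
        return "moderate"
    return "high"


if __name__ == "__main__":
    # The CV of 21.7% reported for SOC falls in the moderate class.
    print(variability_class(21.7))
```

Under these assumed thresholds, the reported CV of 21.7% is classified as moderate, which agrees with the classification stated in the text.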



Conclusion

The results of this study showed that the RF regression algorithm performed better than the MLR method, owing to its ability to account for the nonlinear and complex relationships between SOC contents and the environmental covariates.

Keywords [English]

  • Environmental covariates
  • machine learning
  • performance
  • spatial distribution

References

1. Breiman, L. 2001. Random forests. Machine Learning, 45(1): 5-32. DOI: 10.1023/A:1010950718922.
2. Bouyoucos, G.J. 1951. A recalibration of hydrometer method for making mechanical analysis of soil. Agronomy Journal, 43: 434-438. DOI: 10.2134/agronj1951.00021962004300090005x
3. Cutler, D.R., Edwards, T.C., Jr., Beard, K.H., Cutler, A., and Hess, K.T. 2007. Random forests for classification in ecology. Ecology, 88(11): 2783-2792. DOI: 10.1890/07-0539.1
4. Hengl, T., Rossiter, D.G., and Stein, A. 2003. Soil sampling strategies for spatial prediction by correlation with auxiliary maps. Australian Journal of Soil Research, 41(8): 1403–1422. DOI: 10.1071/SR03005.
5. Hengl, T., Heuvelink, G.B.M., and Stein, A. 2004. A generic framework for spatial prediction of soil variables based on regression kriging. Geoderma, 120: 75–93. DOI: 10.1016/j.geoderma.2003.08.018.
6. Hengl, T., Heuvelink, G.B.M., Kempen, B., Leenaars, J.G.B., Walsh, M.G., Shepherd, K.D., Sila, A., MacMillan, R.A., Jesus, J.M., Tamene, L., and Tondoh, J.E. 2015. Mapping soil properties of Africa at 250 m resolution: random forests significantly improve current predictions. PLOS ONE, 10(6): e0125814. DOI: 10.1371/journal.pone.0125814.
7. Heung, B., Ho, H.C., Zhang, J., Knudby, A., Bulmer, C.E., and Schmidt, M.G. 2016. An overview and comparison of machine-learning techniques for classification purposes in digital soil mapping. Geoderma, 265: 62-77. DOI: 10.1016/j.geoderma.2015.11.014.
8. Fathololoumi, S., Vaezi, A.R., Alavipanah, S.K., Ghorbani, A., Saurette, D., and Biswas, A. 2021. Effect of multi-temporal satellite images on soil moisture prediction using a digital soil mapping approach. Geoderma, 385: 114901. DOI: 10.1016/j.geoderma.2020.114901.
9. IBM Corp (2010). IBM SPSS Statistics for Windows, version 19, SPSS Inc., Chicago, Ill., USA.
10. Jalalian A. 1997. The studies of land resources and capability determination in Semirom area. The Ministry of Jahad Sazandegi, Isfahan Province. (in Persian)
11. Jenny, H. 1941. Factors of soil formation: A system of quantitative pedology. McGraw-Hill, New York.
12. Jeong, G., Oeverdieck, H., Park, S.J., Huwe, B., and Ließ, M. 2017. Spatial soil nutrients prediction using three supervised learning methods for assessment of land potentials in complex terrain. Catena, 154: 73-84. DOI: 10.1016/j.catena.2017.02.006.
13. Lanyon, L.E. and Heald, W.R. 1982. Magnesium, calcium, strontium and barium. In: Page, A.L., et al. (Eds.), Methods of Soil Analysis. Part II, Agronomy Monograph, American Society of Agronomy and Soil Science Society of America, Madison, Wisconsin, pp. 247-260.
14. Mahmoudzadeh, H., Matinfar, H. R., Taghizadeh-Mehrjardi, R., and Kerry, R. (2020). Spatial prediction of soil organic carbon using machine learning techniques in western Iran. Geoderma Regional, 21, e00260. DOI: 10.1016/j.geodrs.2020.e00260.
15. McBratney, A.B., Mendonça Santos, M.L., and Minasny, B. 2003. On digital soil mapping. Geoderma, 117: 3-52. DOI: 10.1016/S0016-7061(03)00223-4.
16. McBratney, A.B., Stockmann, U., Angers, D., Minasny, B., and Field, D. 2014. Challenges for soil organic carbon research. In: Hartemink, A.E., McSweeney, K., (Eds.), Soil Carbon, Cham: Springer, pp. 3-16. DOI: 10.1007/978-3-319-04084-4-1.
17. Mondal, A., Khare, D., Kundu, S., Mondal, S., Mukherjee, S., and Mukhopadhya, A. 2016. Spatial soil organic carbon (SOC) prediction by regression kriging using remote sensing data. The Egyptian Journal of Remote Sensing and Space Science, 20(1): 61-70. DOI: 10.1016/j.ejrs.2016.06.004.
18. Mosleh, Z., Salehi, M.H., Jafari, A., Borujeni, I.E., and Mehnatkesh, A. 2016. The effectiveness of digital soil mapping to predict soil properties over low-relief areas. Environmental Monitoring and Assessment, 188 (3): 195. DOI: 10.1007/s10661-016-5204-8.
19. Minasny, B., and McBratney, A.B. (2016). Digital soil mapping: A brief history and some lessons. Geoderma, 264: 301-311. DOI: 10.1016/j.geoderma.2015.07.017
20. Nelson, R. E. 1982. Carbonate and gypsum. In: Page, A.L., et al. (Eds.), Methods of Soil Analysis. Part 2: Chemical Methods, 2nd Ed., Agronomy Monograph, No. 9, American Society of Agronomy and Soil Science Society of America, Madison, WI. pp. 181-196.
21. Olaya, V. 2004. A gentle introduction to SAGA GIS. The SAGA User Group eV, Gottingen, Germany. 208 pp.
22. Osat, M., Heidari, A., Karimian Eghbal, M., and Mahmoodi, Sh. (2016). Spatial variability of soil development indices and their compatibility with soil taxonomic classes in a hilly landscape: a case study at Bandar village, Northern Iran. Journal of Mountain Science, 13(10): 1746-1759. DOI: 10.1007/s11629-016-3952-0.
23. R Development Core Team. 2015. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.Rproject.org.
24. Rossel, R.A.V., and McBratney, A.B. 2009. Diffuse reflectance spectroscopy as a tool for digital soil mapping. In: Hartemink, A.E., et al., (Eds.), Digital Soil Mapping with Limited Data. Springer, Dordrecht, pp. 165-172. DOI: 10.1007/978-1-4020-8592-5.
25. Rudiyanto, R., Minasny, B., Setiawan, B.I., Arif, C., Saptomo, S.K., and Chadirin, Y. (2016). Digital mapping for cost-effective and accurate prediction of the depth and carbon stocks in Indonesian peatlands. Geoderma, 272: 20–31. DOI: 10.1016/j.geoderma.2017.10.018.
26. Richards, L.A. 1954. Diagnosis and Improvement of Saline-Alkali Soils. USDA Handbook, No. 60. Washington, D.C., U.S.A.
27. Rhoades, J.D. 1996. Salinity: electrical conductivity and total dissolved solids. In: Sparks, D.L. (Ed.), Methods of Soil Analysis, Part 3: Chemical Methods. Soil Science Society of America Book Series Number 5, Soil Science Society of America, Madison, Wisconsin, pp. 417-435.
28. Tarkalson, D.D., Brown, B., Kok, H., and Bjorneberg, D.L. 2009. Irrigated small-grain residue management effects on soil chemical and physical properties and nutrient cycling. Soil Science, 174:303-311. DOI: 10.1097/SS.0b013e3181a82a5f
29. Skullberg, U. 1991. Seasonal Variation of pH H2O and pH CaCl2 in centimeter- layers of mor Humus in a Picea Abies (L.) Karst stand. Sweden University of Agri Science, Department of Forest Site Research.
30. Sreenivas, K., Dadhwal, V.K., Kumar, S., Harsha, G.S., Mitran, T., Sujatha, G., Janaki Rama, S., Fyzee, M.A., and Ravisankar, T. 2016. Digital mapping of soil organic and inorganic carbon status in India. Geoderma, 269: 160-173. DOI: 10.1016/j.geoderma.2016.02.002
31. Sys, C., Van Ranst, E. and Debaveye, J. 1991. Land Evaluation. Agricultural Publication No. 7, General Administration for Development Cooperation, Brussels.
32. Taghizadeh-Mehrjardi, R., Nabiollahi, K. and Kerry, R. 2016. Digital mapping of soil organic carbon at multiple depths using different data mining techniques in Baneh region, Iran. Geoderma, 266: 98–110. DOI: 10.1016/j.geoderma.2015.12.003
33. Vaysse, K. and Lagacherie, P. 2015. Evaluating digital soil mapping approaches for mapping Global Soil Map soil properties from legacy data in Languedoc-Roussillon (France). Geoderma Regional, 4: 20-30. DOI: 10.1016/j.geodrs.2014.11.003
34. Venables, W.N., and Ripley, B.D. 2013. Modern applied statistics with S-PLUS. Springer, Dordrecht, 498 p.
35. Walkley A. and Black, I.A. 1934. An examination of the Degtjareff method for determining organic carbon in soils: effect of variations in digestion conditions and of inorganic soil constituents. Soil Science, 63: 251-263. DOI: 10.1097/00010694-194704000-00001
36. Wallach, D., Makowski, D., Jones, J.W., and Brun, F. 2006. Working with dynamic crop models: Evaluation, analysis, parameterization, and applications. Elsevier.
37. Wang Sh., Jin X., Adhikari, K., Li, W., Yu, M., Bian, Zh. and Wang, Q. 2018. Mapping total soil nitrogen from a site in northeastern China. Catena, 166: 134-146. DOI: 10.1016/j.catena.2018.03.023.
38. Wang, S., Wang, Q., Adhikari, K., Jia, S., Jin, X. and Liu, H. 2016. Spatial-temporal changes of soil organic carbon content in Wafangdian, China. Sustainability, 8: 1154. DOI: 10.3390/su8111154.
39. Wilding, L. 1985. Soil spatial variability: Its documentation, accommodation, and implication to soil surveys. In: Nielsen, D.R., Bouma, J. (Eds.), Soil Spatial Variability. Pudoc, Wageningen, Netherlands, pp. 166-194.
40. Yarali, J., Esmaili, A. and Esmaili, G.H. 2013. Statistical Analyze with SPSS 20. Kankash Publication, pp. 220-234.
41. Zhang, H., Wu, P., Yin, A., Yang, X., Zhang, M. and Gao, C. 2017. Prediction of soil organic carbon in an intensively managed reclamation zone of eastern China: A comparison of multiple linear regressions and the random forest model. Science of the Total Environment, 592: 704-713. DOI: 10.1016/j.scitotenv.2017.02.146
42. Zhao, Z., Yang, Q., Benoy, G., Chow, T.L., Xing, Z., Rees, H.W., and Meng, F.R. 2010. Using artificial neural network models to produce soil organic carbon content distribution maps across landscapes. Canadian Journal of Soil Science, 90(1): 75–87. DOI: 10.4141/CJSS08057
43. Zhou, T., Geng, Y., Chen, J., Pan, J., Haase, D. and Lausch, A. 2020. High-resolution digital mapping of soil organic carbon and soil total nitrogen using DEM derivatives, Sentinel-1 and Sentinel-2 data based on machine learning algorithms. Science of the Total Environment, 729: 138244. DOI: 10.1016/j.scitotenv.2020.138244
44. Zhou, Y., Hartemink, A.E., Shi, Z., Liang, Z., and Lu, Y. 2019. Land use and climate change effects on soil organic carbon in North and Northeast China. Science of the Total Environment, 647: 1230-1238. DOI: 10.1016/j.scitotenv.2018.08.016.