Application of multivariate transformation methods in geochemical data analysis of Hemych exploration area, South Khorasan Province

Document Type: Research Article

Author

Assistant Professor, Dept. of Mining, Birjand University of Technology

Abstract

Summary
Normalizing the data distribution, correcting outliers, and converting datasets from a closed system to an open system are the initial preprocessing steps before applying multivariate statistical methods. In addition to the log-ratio transformation, this paper introduces the PPMT, MAF, and SCT transformation methods for this purpose. Analytical data from 396 lithogeochemical samples from the Hemych exploration area were used to evaluate the performance of the transformation methods. The presence of hydrothermal alteration and Cu, Zn, Pb, and Fe mineralization at the surface of the study area indicates that the area is prospective for porphyry and hydrothermal mineralization. Factor analysis of the data transformed by the four methods shows that the closer the distribution of the transformed data is to a multivariate normal distribution, the lower the variance and factor-loading values are. This is due to the decreased correlation between the variables. Factor analysis also shows that contour maps of the factor scores obtained from the SCT-transformed data identify the internal and external zones of the porphyry mineralization system of the study area well.
Introduction
Statistical methods, especially multivariate ones, are among the basic tools of geochemical data analysis. However, non-normal data distributions, outliers, and the closed (constant-sum) nature of geochemical data make these methods difficult to apply. These problems can bias the results of multivariate statistical analyses such as regression, discriminant analysis, principal component analysis, and factor analysis. One solution is to apply nonlinear transformations that map the data from one coordinate system to another.
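As a hedged illustration of how outliers can bias classical estimates, the minimal Python sketch below compares a classical z-score (mean and standard deviation) with a robust one (median and scaled median absolute deviation). The sample values are hypothetical and are not taken from the Hemych dataset; the function names are chosen only for this example.

```python
import statistics

def classical_z(values, x):
    """z-score of x based on the sample mean and standard deviation."""
    return (x - statistics.mean(values)) / statistics.stdev(values)

def robust_z(values, x):
    """z-score of x based on the median and the scaled MAD."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    # 1.4826 makes the MAD comparable to the standard deviation for normal data.
    return (x - med) / (1.4826 * mad)

# Hypothetical concentrations (ppm): nine background values and two outliers.
values = [10, 11, 9, 10, 12, 10, 11, 9, 10, 100, 100]

z_classical = classical_z(values, 100)
z_robust = robust_z(values, 100)

# The two outliers inflate the standard deviation so much that their own
# classical z-score stays below the common threshold of 3.
print(f"classical z-score of 100: {z_classical:.2f}")
print(f"robust z-score of 100:    {z_robust:.2f}")
```

In this toy case the classical score fails to flag the two high values, while the robust score flags them clearly.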
 
Methodology and Approaches
In addition to the conventional log-ratio transformation method (ilr-clr), the projection pursuit multivariate transform (PPMT), min/max autocorrelation factors (MAF), and stepwise conditional transformation (SCT) were used to convert the data distribution to a multivariate normal distribution. Factor analysis was then applied to the transformed data to identify the factors related to mineralization and the internal and external zones of the porphyry and hydrothermal systems of the study area.
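The log-ratio transformations named above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the ilr coordinates use one common sequential balance basis (an assumption, since the abstract does not say which basis was used), and the sample composition is hypothetical.

```python
import math

def clr(parts):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def ilr(parts):
    """Isometric log-ratio with a sequential balance basis (D parts -> D-1 coordinates)."""
    logs = [math.log(p) for p in parts]
    coords = []
    for i in range(1, len(logs)):
        # Balance between the geometric mean of the first i parts and part i+1.
        mean_first = sum(logs[:i]) / i
        coords.append(math.sqrt(i / (i + 1)) * (mean_first - logs[i]))
    return coords

# Hypothetical four-part composition (for example, element concentrations in ppm).
sample = [120.0, 45.0, 30.0, 5.0]
closed = [v / sum(sample) for v in sample]  # same composition rescaled to sum to 1

print("clr:", [round(v, 4) for v in clr(closed)])
print("ilr:", [round(v, 4) for v in ilr(closed)])
```

Both transforms depend only on ratios between parts, so rescaling the sample does not change the result. The clr coordinates always sum to zero, which is why the ilr form, with one coordinate fewer, is often preferred when a non-singular covariance matrix is needed.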
Results and Conclusions
Factor analysis of the data transformed by the four methods shows that the closer the distribution of the transformed data is to a multivariate normal distribution, the lower the variance and factor-loading values are. This is due to the decreased correlation between the variables. Factor analysis also showed that contour maps of the factor scores obtained from the SCT-transformed data identify the internal and external zones of the porphyry mineralization system of the study area well. Contour maps of the factor scores obtained from the data transformed by the ilr-clr method show the external zone of mineralization well and the internal zone to some extent. This paper therefore suggests using the introduced transformation methods alongside the log-ratio transformation when preprocessing exploration data for statistical analysis and data mining.
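The link between weaker correlation and smaller loadings can be illustrated with the simplest possible case: two standardized variables with correlation r. The sketch below assumes principal-component extraction of the factor (an assumption; the abstract does not state the extraction method) and uses the closed-form eigenvalue of a 2x2 correlation matrix. The function name is hypothetical.

```python
import math

def first_factor(r):
    """First principal factor of the correlation matrix [[1, r], [r, 1]].

    Returns (share of total variance, absolute loading of each variable).
    """
    eigenvalue = 1.0 + abs(r)              # largest eigenvalue of the matrix
    share = eigenvalue / 2.0               # total variance of two standardized variables is 2
    loading = math.sqrt(eigenvalue / 2.0)  # eigenvector entry 1/sqrt(2) times sqrt(eigenvalue)
    return share, loading

for r in (0.9, 0.5, 0.1):
    share, loading = first_factor(r)
    print(f"r={r:.1f}  variance share={share:.3f}  loading={loading:.3f}")
```

As r moves toward zero, the first factor's share of the variance falls toward one half and the loadings fall toward about 0.71, which matches the qualitative pattern reported above for the more strongly decorrelated transforms.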


Statistical methods, especially multivariate ones, are among the basic tools of exploration geochemical data analysis. However, data that do not follow a normal distribution, the presence of outliers, and the closed number system of geochemical data pose a major challenge to the use of these methods. Normality of the data distribution is a prerequisite for multivariate statistical methods such as regression, discriminant analysis, principal component analysis, and factor analysis, because non-normal data bias the results of statistical analyses [1]. Outliers also bias the estimate of the mean vector and inflate the variance-covariance matrix, which produces masking and swamping effects in multivariate statistical analysis [2, 3]. The closed number system of the data, in turn, creates spurious correlation between variables: a change in one variable affects the others, which restricts the range of variation of each variable. This also biases the results of statistical analyses and makes them problematic [4, 5].
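The spurious correlation caused by closure can be reproduced with a small simulation. The following minimal Python sketch draws three independent positive variables and then rescales each sample to a constant sum; the lognormal parameters, sample size, and seed are arbitrary choices for illustration and are not related to the study data.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# "Open" data: three independent lognormal variables per sample.
raw = [[rng.lognormvariate(0.0, 0.5) for _ in range(3)] for _ in range(n)]

# "Closed" data: each sample rescaled so that its parts sum to 1.
closed = [[v / sum(row) for v in row] for row in raw]

r_open = pearson([row[0] for row in raw], [row[1] for row in raw])
r_closed = pearson([row[0] for row in closed], [row[1] for row in closed])

print(f"correlation before closure: {r_open:.3f}")
print(f"correlation after closure:  {r_closed:.3f}")
```

The variables are independent by construction, yet after closure the first two parts show a clearly negative correlation, because an increase in one part must be offset by a decrease in the others.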

References
[1] Johnson, R., and D. Wichern, (2018). Applied Multivariate Statistical Analysis, 6th Edition. Pearson Publisher, 808 p.
[2] Geranian, H., and Z., Khajeh Miry, (2018). Application of robust estimators in determining the outlier data; A Case study: Geochemical data of Shah Soliman Ali, South Khorasan Province. Journal of Analytical and Numerical Methods in Mining Engineering 7(14), 73–85 (In Persian).
[3] Maronna, R.A. (2006). Robust Statistics: Theory and Methods (with R), 2nd Edition. John Wiley & Sons, 460 p.
[4] Pawlowsky-Glahn, V., Egozcue, J.J., and R., Tolosana-Delgado, (2015). Modeling and Analysis of Compositional Data. John Wiley & Sons, 275 p.
[5] Roshani Rodsari, P., Mokhtari, A.R., and S.H., Tabatabaei, (2012). Investigation on geochemical association of elements in open and closed data system (Case study: Kuh-e-Panj copper deposit, Kerman). Journal of Analytical and Numerical Methods in Mining Engineering 2(4), 46–58 (In Persian).
[6] Filzmoser, P., Hron, K., and C., Reimann, (2009). Principal component analysis for compositional data with outliers. Environmetrics 20, 621–632.
[7] Filzmoser, P., Hron, K., Reimann, C., and R., Garrett, (2009). Robust factor analysis for compositional data. Computers & Geosciences 35, 1854–1861.
[8] Filzmoser, P., Hron, K., and M., Templ, (2012). Discriminant analysis for compositional data and robust parameter estimation. Computational Statistics 27, 585–604.
[9] Alenazi, A. (2019). Regression for compositional data with compositional data as predictor variables with or without zero values. Journal of Data Science 17(1), 219–238.
[10] Wackernagel, H. (2013). Multivariate Geostatistics: An Introduction with Applications, 3rd Edition. Springer, 403 p.
[11] Battalgazy, N., and N., Madani, (2019). Stochastic Modeling of Chemical Compounds in a Limestone Deposit by Unlocking the Complexity in Bivariate Relationships. Minerals 9, 683.
[12] Madani, N. (2019). Application of projection pursuit multivariate transform to alleviate the smoothing effect in cokriging approach for spatial estimation of cross-correlated variables. Bollettino di Geofisica Teorica ed Applicata 60(4), 583–598.
[13] Rajabi Nasab, B. (2019). Three-dimensional geometallurgical modeling in Sangan iron ore mine using multivariate geostatistical simulation. PhD Thesis, Tehran University.
[14] Zacché da Silva, C., and J.F., Coimbra Leite Costa, (2019). Minimum/Maximum autocorrelation factors applied to grade estimation. R. Esc. Minas, Ouro Preto, 67(2), 209–214.
[15] Taheri, Z. (2016). Multivariate Geostatistics Grade Estimation by Minimum/Maximum Autocorrelation Factors Dimensionality Reduction Method (Case Study: 12A Anomaly Central Iron Ore). MSc Thesis, University of Kashan.
[16] Mahlooji, R., Asghari, O., and B., Ghane, (2019). Multivariate simulation of a multi-element deposit based on the different transformations. Case study: Mehdiabad deposit, Iran. Bollettino di Geofisica Teorica ed Applicata 60(4), 599–620.
[17] Hosseini, S.A., and O., Asghari, (2015). Simulation of geometallurgical variables through stepwise conditional transformation in Sungun copper deposit, Iran. Arabian Journal of Geosciences 8, 3821–3831.
[18] Akhondi, N. (2012). Analysis of the mineral deposit area Hamych Birjand Using Statistical Multivariate Methods. MSc Thesis, University of Birjand (In Persian).
[19] Dohuee, M., Aryafar, A., Hosseinzade, H., Khosravi, V., and S., Yousefi, (2018). Investigating Mineralization Related Alterations Using Remote Sensing Techniques and Fractal Geometry in Hemich Area, South Khorasan. 10th Symposium of Iranian Society of Economic Geology, Isfahan University (In Persian).
[20] Mostafaei, K., Norouzi, G., Askari, M.S., and M., Shiva, (2010). Statistical analysis and modeling of geophysical data of IP and RS in Hemich mineral deposit. The First Symposium of Iranian Society of Economic Geology, Ferdowsi University of Mashhad (In Persian).
[21] Barnett, R.M., Manchuk, J.G., and C.V., Deutsch, (2014). Projection Pursuit Multivariate Transform. Math Geosci 46, 337–359.
[22] Barnett, R.M., Manchuk, J.G., and C.V., Deutsch, (2012). Projection Pursuit Multivariate Transform. Paper 103, CCG Annual Report 14, Alberta University.
[23] Barnett, R.M. (2017). Projection Pursuit Multivariate Transform. In J. L. Deutsch (Ed.), Geostatistics Lessons. Retrieved from http://www.geostatisticslessons.com/lessons/lineardecorrelation.html.
[24] Barnett, R.M., Manchuk, J.G., and C.V., Deutsch, (2016). The projection pursuit multivariate transform for improved continuous variable modeling. SPE J. 21(6), 2010–2026.
[25] Arcari Bassani, M.A., Leite Costa, J.F.C., and C.V., Deutsch, (2018). Multivariate geostatistical simulation with sum and fraction constraints. Applied Earth Science 127(3), 83–93.
[26] Hwang, J., Lay, S., and A., Lippman, (1994). Nonparametric multivariate density estimation: a comparative study. IEEE Transactions on Signal Processing 42, 2795–2810.
[27] Rondon, O. (2012). Teaching Aid: Minimum/Maximum autocorrelation factors for joint simulation of attributes. Math Geosci 44, 469–504.
[28] Elogne, S., and O., Leuangthong, (2008). Implementation of the Min/Max Autocorrelation Factors and Application to a Real Data Example. Geostatistics Lessons, Alberta University.
[29] Barnett, R.M. (2017). Sphering and Min/Max Autocorrelation Factors. In J. L. Deutsch (Ed.), Geostatistics Lessons. Retrieved from http://www.geostatisticslessons.com/lessons/sphereingmaf.html.
[30] Leuangthong, O., and C.V., Deutsch, (2003). Stepwise conditional transformation for simulation of multiple variables. Mathematical Geology 35(2), 155–173.
[31] Deutsch, C.V. (2006). Stepwise Conditional transformation in estimation mode. Geostatistics Lessons 122, Alberta University.
[32] Davis, R.A., Lii, K.S., and D.N., Politis, (2011). Selected works of Murray Rosenblatt. Springer, 486 p.
[33] Filzmoser, P., Hron, K., and C., Reimann, (2009). Univariate statistical analysis of environmental (compositional) data: Problems and possibilities. Science of the Total Environment 407, 6100–6108.
[34] Filzmoser, P., Hron, K., and C., Reimann, (2010). The bivariate statistical analysis of environmental (compositional) data. Science of the Total Environment 408, 4230–4238.
[35] Buccianti, A., and E., Grunsky, (2014). Compositional data analysis in geochemistry: Are we sure to see what really occurs during natural processes? Journal of Geochemical Exploration 141, 1–5.
[36] Malekzadeh Shafaroudi, A., and M.H., Karimpour, (2015). Mineralogic, fluid inclusion, and sulfur isotope evidence for the genesis of Sechangi lead–zinc (–copper) deposit, Eastern Iran. Journal of African Earth Sciences 107, 1–14.
[37] Kan Azin company, (2009). Detailed exploration of minerals in Birjand county (Hamych area). Industry, Mine & Trade Organization of South Khorasan Province (In Persian).
[38] Ramzan, S., Maqbool Zahid, F., and S., Ramzan, (2013). Evaluating Multivariate Normality: A Graphical Approach. Middle-East Journal of Scientific Research 13(2), 254–263.
[39] Reimann, C., Filzmoser, P., and R.G., Garrett, (2002). Factor analysis applied to regional geochemical data: problems and possibilities. Applied Geochemistry 17, 185–206.
[40] Karimpour, M.H., Malekzadeh, A., and M.R., Haidarian, (2012). Ore deposit exploration: Geology, geochemistry, satellite and geophysics methods. Ferdowsi University Press, 632 p. (In Persian).
[41] Aliyari, F., Afzal, P., Lotfi, M., Shokri, S., and H., Feizi, (2020). Delineation of geochemical haloes using the developed zonality index model by multivariate and fractal analysis in the Cu–Mo porphyry deposits. Applied Geochemistry 21, 104694.