Article type: Research article
Author
Department of Mining Engineering, Birjand University of Technology, Birjand, Iran
Summary
Normalizing the data distribution, treating outliers, and converting datasets from a closed system to an open system are the initial preprocessing steps for multivariate statistical methods. In this paper, the PPMT, MAF, and SCT transformation methods are introduced for this purpose alongside the log-ratio transformation method. Analytical data from 396 lithogeochemical samples from the Hemych exploration area were used to evaluate the performance of the transformation methods. The presence of hydrothermal alteration and of Cu, Zn, Pb, and Fe mineralization at the surface of the study area indicates that the area is prospective for porphyry and hydrothermal mineralization. Factor analysis of the data transformed by the four methods shows that the closer the distribution of the transformed data is to a multivariate normal distribution, the lower the variance values and factor loadings are. This is due to the decreased correlation among the variables. The factor analysis results also show that contour maps of the factor scores obtained with the SCT transformation method identify the internal and external zones of the porphyry mineralization system of the study area well.
Introduction
The use of statistical methods, especially multivariate statistical methods, is among the basic principles of geochemical data analysis. However, non-normal data distributions, the presence of outliers, and the closed system of geochemical data are obstacles to using these methods. These problems can bias the results of multivariate statistical analyses such as regression, discriminant analysis, principal component analysis, and factor analysis. The solution is to apply nonlinear transformations that map the data from one coordinate system to another.
Methodology and Approaches
In addition to the conventional log-ratio transformation method (ilr-clr), the projection pursuit multivariate transform (PPMT), min/max autocorrelation factors (MAF), and stepwise conditional transformation (SCT) were used to convert the data distribution to a multivariate normal distribution. Factor analysis was then applied to the transformed data to identify the factors related to mineralization and the internal and external zones of the porphyry and hydrothermal systems of the study area.
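As an illustration of the log-ratio step only, the following is a minimal Python sketch of the centered log-ratio (clr) and isometric log-ratio (ilr) transforms. It is not the paper's implementation: the sample concentrations are made up, the function names are hypothetical, and the Helmert-type basis used for ilr is just one of many valid orthonormal bases. PPMT, MAF, and SCT are not shown.

```python
import numpy as np


def closure(x):
    """Rescale each row so that its parts sum to 1 (a closed composition)."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=1, keepdims=True)


def clr(x):
    """Centered log-ratio: log of each part minus the row mean of the logs."""
    log_x = np.log(closure(x))
    return log_x - log_x.mean(axis=1, keepdims=True)


def helmert_basis(d):
    """Orthonormal d x (d-1) basis of the subspace orthogonal to the ones vector."""
    v = np.zeros((d, d - 1))
    for j in range(1, d):
        v[:j, j - 1] = 1.0 / j          # first j entries share the same weight
        v[j, j - 1] = -1.0              # balanced against the (j+1)-th part
        v[:, j - 1] *= np.sqrt(j / (j + 1.0))  # scale the column to unit length
    return v


def ilr(x):
    """Isometric log-ratio: clr coordinates projected onto the basis above."""
    x = np.asarray(x, dtype=float)
    return clr(x) @ helmert_basis(x.shape[1])


# Made-up concentrations for illustration (columns: Cu, Zn, Pb, Fe);
# these are NOT values from the Hemych dataset.
samples = np.array([
    [120.0, 85.0, 40.0, 35000.0],
    [300.0, 60.0, 25.0, 41000.0],
    [45.0, 150.0, 90.0, 28000.0],
])

clr_values = clr(samples)   # same number of columns; each row sums to zero
ilr_values = ilr(samples)   # one column fewer; no sum constraint
```

The clr coordinates still sum to zero in every row, so their covariance matrix is singular. The ilr coordinates remove that constraint at the cost of one dimension, which is why methods that need a full-rank covariance matrix are usually run on ilr coordinates.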
Results and Conclusions
Factor analysis of the data transformed by the four methods shows that the closer the distribution of the transformed data is to a multivariate normal distribution, the lower the variance values and factor loadings are. This is due to the decreased correlation among the variables. The factor analysis results also show that contour maps of the factor scores obtained with the SCT transformation method identify the internal and external zones of the porphyry mineralization system of the study area well. Contour maps of the factor scores obtained from data transformed by the ilr-clr method show the external zone of mineralization well and the internal zone to some extent. Therefore, this paper suggests using the introduced transformation methods along with the log-ratio transformation method in all preprocessing for statistical analysis and data mining of exploration data.
Keywords [English]
The use of statistical methods, particularly multivariate statistical methods, is among the fundamental principles of exploration geochemical data analysis. However, the failure of the data to follow a normal distribution, the presence of outliers, and the closed numerical system of geochemical data pose a major challenge to using these methods. A normal data distribution is a prerequisite for multivariate statistical methods such as regression, discriminant analysis, principal component analysis, and factor analysis, because non-normality biases the results of statistical analyses [1]. Outliers likewise bias the estimate of the mean vector and inflate the variance-covariance matrix, which produces masking and swamping effects in multivariate statistical analysis [2,3]. The closed numerical system of the data also induces spurious correlation between variables: changes in one variable affect the others, which restricts the range of variation of each variable. This too biases the results of statistical analyses and makes them problematic [4,5].
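The closure effect described above can be reproduced with synthetic data. The sketch below assumes three independent log-normal variables, which are simulated and not taken from the study. It forces each row to sum to 1 and compares the correlations before and after; all variable names are illustrative.

```python
import numpy as np

# Three independent positive variables (synthetic, for illustration only).
rng = np.random.default_rng(0)
raw = rng.lognormal(mean=0.0, sigma=0.3, size=(5000, 3))

# Closing the data: every row is rescaled to sum to 1, as with
# concentrations reported as proportions of a whole.
closed = raw / raw.sum(axis=1, keepdims=True)

raw_corr = np.corrcoef(raw, rowvar=False)        # close to zero off the diagonal
closed_corr = np.corrcoef(closed, rowvar=False)  # clearly negative off the diagonal

# Because each row sums to a constant, every row of the covariance matrix
# of the closed data sums to zero, so negative covariances are forced.
closed_cov = np.cov(closed, rowvar=False)
row_sums = closed_cov.sum(axis=1)
```

Although the original variables are independent, the closed variables are negatively correlated solely because of the constant-sum constraint. This is the spurious correlation that log-ratio transformations are meant to remove.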