Research Highlights
Accounting for the compositional nature of lithogeochemical data is essential in statistical studies.
If the compositional nature of these data is ignored, the results of statistical analyses are prone to error and inconsistency.
In one gold deposit, ignoring the compositional nature of the geochemical data led to a negative estimated grade for gold, which is an obvious inconsistency.
Statistical study of geochemical data with the compositional approach yields more acceptable and more accurate results than the traditional approach.
In the regression analysis of gold, the isometric logratio (ilr) transformation is preferable to other methods of compositional data analysis.
The compositional approach performs better than traditional methods in estimating the first and third quartiles. These quartiles are highly important for exploration.
Article Title [English]
When a geochemical sample is analyzed, grades are reported as strictly positive, constrained values, which are a form of compositional data (CoDa). It has been shown that spurious correlations in closed data can affect conventional statistical analyses such as regression modeling. The problem is that one cannot say how much of a model's uncertainty is due to spurious covariances and correlations. It is therefore safer to account for the compositional nature of the data by using a proper logratio approach. In this study, we assessed the regression analysis of gold grade in a gold occurrence located in NW Iran. Both compositional and noncompositional approaches were followed, and their results were compared to understand how neglecting the compositional nature of the data affects gold grade regression analysis. Isometric logratio (ilr) balances were calculated and used in the compositional approach.
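The ilr balances mentioned above can be sketched as follows. This is a minimal illustration that assumes one common choice of orthonormal basis (pivot coordinates); the abstract does not state which partition of the parts the authors used, so the function names `closure`, `ilr`, and `ilr_inv` and the example composition are hypothetical.

```python
import math

def closure(parts, total=1.0):
    """Rescale positive parts so that they sum to `total`."""
    s = sum(parts)
    return [total * p / s for p in parts]

def ilr(parts):
    """ilr pivot coordinates: balance i contrasts part i with the parts after it."""
    logs = [math.log(p) for p in parts]
    n_parts = len(logs)
    balances = []
    for i in range(n_parts - 1):
        rest = logs[i + 1:]
        r = len(rest)
        balances.append(math.sqrt(r / (r + 1)) * (logs[i] - sum(rest) / r))
    return balances

def ilr_inv(balances, total=1.0):
    """Map D-1 unconstrained balances back to a closed D-part composition."""
    n_parts = len(balances) + 1
    clr = [0.0] * n_parts
    for i, z in enumerate(balances):
        r = n_parts - 1 - i
        coef = math.sqrt(r / (r + 1))
        clr[i] += coef * z
        for j in range(i + 1, n_parts):
            clr[j] -= coef * z / r
    return closure([math.exp(c) for c in clr], total)

# Hypothetical three-part composition (fractions summing to 1).
composition = [0.1, 0.2, 0.7]
balances = ilr(composition)      # two unconstrained real numbers
recovered = ilr_inv(balances)    # back in the simplex, all parts positive
```

Because the balances are unconstrained real numbers, standard regression can be applied to them, and any value mapped back through `ilr_inv` is strictly positive by construction.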
The two approaches were compared on the basis of the Correct Classification Rate (CCR) of the estimated values and the correlation coefficient of the estimated and real gold grades (R2). In addition, the resemblance between the distributions of the estimated and real data was compared. The R2 values for the compositional and noncompositional approaches are 0.84 and 0.74, respectively, and the corresponding CCR values at a 40 ppb cut-off are 0.875 and 0.688. Moreover, the distribution of the grades estimated by the compositional approach is closer to that of the real gold grades. Notably, the noncompositional approach estimated a negative grade, which is an evident inconsistency. Although the noncompositional approach reproduces the average exactly, the compositional approach is more accurate at the first and third quartiles, which are more critical.
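The two comparison metrics can be sketched as below. The abstract does not give formulas, so this assumes that CCR is the fraction of samples whose estimated and real grades fall on the same side of the cut-off, and that R2 is the squared Pearson correlation between estimated and real grades. The grade values are invented for illustration and are not the study's data.

```python
def correct_classification_rate(real, estimated, cutoff):
    """Fraction of samples classified on the same side of the cut-off (assumed definition)."""
    hits = sum((r >= cutoff) == (e >= cutoff) for r, e in zip(real, estimated))
    return hits / len(real)

def r_squared(real, estimated):
    """Squared Pearson correlation between real and estimated values (assumed definition)."""
    n = len(real)
    mean_r = sum(real) / n
    mean_e = sum(estimated) / n
    cov = sum((r - mean_r) * (e - mean_e) for r, e in zip(real, estimated))
    var_r = sum((r - mean_r) ** 2 for r in real)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov * cov / (var_r * var_e)

# Hypothetical gold grades in ppb; the negative estimate mimics the kind of
# inconsistency a raw-data regression can produce.
real_ppb = [10, 25, 38, 45, 60, 120, 15, 80]
estimated_ppb = [12, 30, 42, 39, 55, 100, -3, 70]

ccr = correct_classification_rate(real_ppb, estimated_ppb, cutoff=40)
r2 = r_squared(real_ppb, estimated_ppb)
```

In this toy example, six of the eight samples fall on the same side of the 40 ppb cut-off in both lists, so `ccr` is 0.75.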
All of the above results confirm that CoDa analysis of lithogeochemical data is essential. It is concluded that neglecting the compositional nature of the data compromises the reliability of regression models. Thus, to avoid misleading results, it is highly recommended to apply a proper logratio approach in multivariate statistical studies of geochemical data.
Almost all geochemical data are reported as constrained and strictly positive grades and concentrations, and therefore count as a form of compositional data. Statistical analysis of such data in raw form is prone to inconsistency and can result in unrealistic models. The logratio approach is an adequate way to address this problem.
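The inconsistency introduced by closure can be illustrated with a small simulation. This is a sketch on synthetic data, not the study's data set: three independent positive variables are generated, then rescaled so that each row sums to one, which is the constraint that reported grades obey.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three independent, strictly positive variables (synthetic, hypothetical).
raw = rng.lognormal(mean=0.0, sigma=0.5, size=(5000, 3))

# Closure: express each row as fractions of its total.
closed = raw / raw.sum(axis=1, keepdims=True)

# Correlation between the first two parts before and after closure.
r_raw = np.corrcoef(raw[:, 0], raw[:, 1])[0, 1]
r_closed = np.corrcoef(closed[:, 0], closed[:, 1])[0, 1]

print(f"correlation of raw parts:    {r_raw:+.3f}")
print(f"correlation of closed parts: {r_closed:+.3f}")
```

The raw parts are uncorrelated by construction, yet the closed parts show a clearly negative correlation that comes only from the constant-sum constraint. For three exchangeable parts, the population value is -1/2.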
Methodology and Approaches
Multivariate regression analysis was performed on both the raw and the logratio-transformed data, and the results were compared. The isometric logratio (ilr) transform was applied to calculate unconstrained balances for the compositional approach.
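A minimal sketch of such a comparison is given below. It runs on synthetic data and rests on several assumptions that the abstract does not confirm: the parts (Au, As, Sb, and a remainder), the pivot-coordinate basis, the use of the Au balance as the response, and ordinary least squares as the regression method. All names are hypothetical.

```python
import numpy as np

def pivot_ilr(X):
    """Row-wise ilr pivot coordinates of an (n, D) array of positive parts."""
    logs = np.log(X)
    n_parts = logs.shape[1]
    cols = []
    for i in range(n_parts - 1):
        r = n_parts - 1 - i
        cols.append(np.sqrt(r / (r + 1)) * (logs[:, i] - logs[:, i + 1:].mean(axis=1)))
    return np.column_stack(cols)

def fit_ols(A, y):
    """Least-squares coefficients for y ~ intercept + A."""
    design = np.column_stack([np.ones(len(A)), A])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_ols(coef, A):
    return np.column_stack([np.ones(len(A)), A]) @ coef

# Synthetic composition with columns Au, As, Sb, Rest (hypothetical parts).
rng = np.random.default_rng(1)
n = 200
log_as = rng.normal(3.0, 1.0, n)
log_sb = rng.normal(1.0, 0.8, n)
log_au = -2.0 + 0.9 * log_as + 0.4 * log_sb + rng.normal(0.0, 0.3, n)
amounts = np.exp(np.column_stack([log_au, log_as, log_sb, np.full(n, 10.0)]))
X = amounts / amounts.sum(axis=1, keepdims=True) * 1e6  # closed to 1e6 (ppm-like)
au, others = X[:, 0], X[:, 1:]

# Noncompositional approach: regress raw Au on raw As and Sb.
au_hat_raw = predict_ols(fit_ols(others[:, :2], au), others[:, :2])

# Compositional approach: regress the Au balance on the balances of the other parts.
Z = pivot_ilr(X)
z1_hat = predict_ols(fit_ols(Z[:, 1:], Z[:, 0]), Z[:, 1:])

# Back-transform the predicted balance to a grade; exp() keeps it positive.
scale = np.sqrt(3 / 4)  # sqrt(r / (r + 1)) with r = 3 remaining parts
au_hat_ilr = np.exp(z1_hat / scale + np.log(others).mean(axis=1))

print("minimum raw-approach estimate: ", au_hat_raw.min())
print("minimum ilr-approach estimate:", au_hat_ilr.min())
```

The raw-data model places no constraint on the sign of its estimates, whereas the back-transformed balance is an exponential and therefore cannot yield a negative grade.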
Results and Conclusions
The compositional approach outperforms the noncompositional approach in terms of Correct Classification Rate (CCR) and R2 (correlation coefficient of estimated and real data). In addition, the noncompositional approach estimated a negative grade, which is inconsistent, and it is less accurate at the first and third quartiles of the population. It is concluded that compositional data analysis is essential when working with multivariate geochemical data.