How does dimensionality influence outlier detection effectiveness in multivariate geochemical data? Insights from LOF and IF methods
Journal Publication
ResearchOnline@JCU
Abstract
This paper examines the impact of the curse of dimensionality on the performance of isolation forest (IF) and local outlier factor (LOF) in detecting mineralization-related geochemical anomalies from a high-dimensional geochemical dataset. Using subsets selected through random and supervised methods with varying dimensions, IF and LOF were tested against known mineral deposit locations to assess their effectiveness. This study evaluates the percentage of mineral occurrences classified as anomalies and the area under the ROC curve across different dimensionalities. Furthermore, the influence of dimension reduction techniques such as PCA and ISOMAP on IF and LOF performance is explored. IF demonstrates consistent performance, proving robust across various dimensions and particularly suited to high-dimensional datasets. In contrast, LOF displays sensitivity to dimensionality, with optimal performance in lower dimensions (5 to 10 variables) but diminishing effectiveness beyond this range. This sensitivity highlights the importance of judicious input variable selection for LOF to achieve effective anomaly detection in geochemical datasets. Additionally, this study reveals that the performance of IF remains stable with both PCA and ISOMAP, whereas LOF benefits more from PCA, where its variance-maximizing feature may retain sufficient structural integrity for effective anomaly detection. Conversely, the performance of LOF declines with ISOMAP due to its more significant impact on local density changes. This variation underscores the need for a careful selection of dimension reduction methods and the number of components used as input for outlier detection methods.
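The two evaluation measures named in the abstract, the percentage of known mineral occurrences classified as anomalies and the area under the ROC curve, can be illustrated with a minimal sketch. It is not the authors' code: the function names, the `contamination` parameter, and the toy scores are hypothetical, and it assumes each sample has an anomaly score (higher means more anomalous, as IF or LOF output could be arranged) and a label of 1 for a known occurrence or 0 for background.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    Assumes higher score = more anomalous and label 1 = known occurrence.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one occurrence and one background sample")
    # Count occurrence/background pairs where the occurrence scores higher;
    # tied pairs count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def occurrence_hit_rate(scores, labels, contamination=0.1):
    """Percentage of known occurrences among the top-scoring samples.

    `contamination` is a hypothetical parameter: the fraction of samples
    treated as anomalous. Ties at the threshold are all flagged, so slightly
    more than that fraction may be flagged.
    """
    n_flagged = max(1, round(contamination * len(scores)))
    threshold = sorted(scores, reverse=True)[n_flagged - 1]
    hits = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    total = sum(1 for y in labels if y == 1)
    if total == 0:
        raise ValueError("no known occurrences in labels")
    return 100.0 * hits / total


# Toy data (invented for illustration only): ten samples, three known occurrences.
scores = [0.9, 0.8, 0.3, 0.2, 0.7, 0.1, 0.4, 0.6, 0.5, 0.05]
labels = [1, 1, 0, 0, 0, 0, 1, 0, 0, 0]

auc = roc_auc(scores, labels)
hit_rate = occurrence_hit_rate(scores, labels, contamination=0.2)
print(f"AUC = {auc:.3f}, occurrences flagged = {hit_rate:.1f}%")
```

Repeating such a calculation for variable subsets of different sizes would give one curve per detector, which is the kind of comparison across dimensionalities that the abstract describes.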
Journal
Earth Science Informatics
Volume
18
ISBN/ISSN
1865-0473
Pages Count
15
Publisher
Springer
DOI
10.1007/s12145-024-01611-0