Nyah Bay, PhD

Impact of Geologically Confirmed Negative Training Labels on Prospectivity Modelling of Canadian Magmatic Ni (±Cu ±Co ±PGE) Sulphide Mineral Systems

N. Bay1,2, M. Parsa2, A. Swidinsky1
1Department of Earth Sciences, University of Toronto, Toronto, Ontario, Canada
2Geological Survey of Canada, Ottawa, Ontario, Canada

Mineral Prospectivity Mapping (MPM) integrates geological, geochemical, and geophysical data to assess the likelihood that an area contains target mineral deposits. Advances in Machine Learning (ML) have enhanced MPM's capability to process and extract features from extensive datasets. This study investigates the impact of geologically confirmed negative training labels on the MPM of Canadian magmatic Ni (±Cu ±Co ±PGE) sulphide mineral systems. Two types of negative labels were compared: randomly selected negative labels and geologically confirmed negative labels derived from other mineral deposit types, such as sediment-hosted Zn-Pb, carbonatite-hosted REE±Nb, and Li-Cs-Ta pegmatites. Gradient Boosting (GB) and Random Forest (RF) algorithms were employed to generate prospectivity models, followed by a risk-return analysis. The models created with geologically confirmed negative labels yielded higher Area Under the Receiver Operating Characteristic Curve (AUC-ROC) values, with the GB and RF models achieving 0.930 and 0.923, respectively, compared to 0.909 and 0.891 for the models created with random negative labels. Despite this, the models created with random negative labels performed better on success-rate curves, with Area Under the Success-Rate Curve (AUC-SRC) values of 0.911 and 0.884 for the GB and RF models, respectively, compared to 0.854 and 0.836 for the models created with confirmed labels. In the risk-return analysis, the models created with random labels classified only 1.8% and 0.98% of the cells as high-return and low-risk for the GB and RF models, respectively, whereas the models created with confirmed labels assigned higher proportions of cells to this class. The normalized density index (NDI) further emphasized the advantages of random labels, showing higher values in the low-risk, high-return class, particularly for the RF model created with random labels, which achieved an NDI of 20.1. In contrast, the model created with confirmed labels achieved an NDI of only 8.38.
The findings suggest that while geologically confirmed negative labels offer greater discriminatory power, as reflected in the AUC-ROC values, random negative labels may provide a more efficient exploration focus by reducing the search space and concentrating effort on the most prospective, lowest-risk areas.
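The abstract does not define how AUC-SRC and NDI were computed, so the following is only a minimal sketch under common assumptions, not the authors' implementation. It assumes the success-rate curve plots the cumulative fraction of known deposits captured against the cumulative fraction of cells flagged as prospective (highest scores first), and that NDI is the share of known deposits in a class divided by the share of the study area in that class. The function names, the toy scores, and the 0.8 threshold are hypothetical and chosen for illustration only.

```python
def success_rate_curve(scores, is_deposit):
    """Return (area_fraction, deposit_fraction) points, highest scores first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_cells = len(scores)
    total_deposits = sum(is_deposit)
    xs, ys = [0.0], [0.0]
    captured = 0
    for rank, i in enumerate(order, start=1):
        captured += is_deposit[i]
        xs.append(rank / total_cells)        # fraction of cells flagged so far
        ys.append(captured / total_deposits)  # fraction of deposits captured
    return xs, ys


def trapezoid_area(xs, ys):
    """Area under a piecewise-linear curve (trapezoidal rule)."""
    return sum(
        (xs[k + 1] - xs[k]) * (ys[k] + ys[k + 1]) / 2
        for k in range(len(xs) - 1)
    )


def normalized_density(class_mask, is_deposit):
    """Assumed NDI: deposit share in the class divided by area share of the class."""
    area_share = sum(class_mask) / len(class_mask)
    deposits_in_class = sum(d for d, m in zip(is_deposit, class_mask) if m)
    deposit_share = deposits_in_class / sum(is_deposit)
    return deposit_share / area_share


# Toy example: 10 cells with hypothetical prospectivity scores, 3 known deposits.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
is_deposit = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

xs, ys = success_rate_curve(scores, is_deposit)
auc_src = trapezoid_area(xs, ys)

# Hypothetical "high-return" class: cells scoring at least 0.8.
high_class = [s >= 0.8 for s in scores]
ndi_high = normalized_density(high_class, is_deposit)

print(round(auc_src, 4), round(ndi_high, 4))
```

Under these assumptions, a class that covers a small share of the area but holds a large share of the known deposits receives a high NDI, which is why a model that flags few cells as low-risk and high-return can score well on this index even when its AUC-ROC is lower.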