Enhancements in the Interpretation of Geochemical Data using Multivariate Methods and Digital Topography

Title: Enhancements in the Interpretation of Geochemical Data using Multivariate Methods and Digital Topography

Year: 2003

Publication Type:

Source: Explore - Association of Exploration Geochemists Newsletter (2003)


Authors: ,

The development of low-cost, rapid multi-element analytical techniques has generated large geochemical databases in many exploration programs. When a sampling program consists of several thousand samples, the resulting data matrix is enormous and effective interpretation using all of the elements individually becomes burdensome. However, the application of multivariate statistical techniques can extract geochemical patterns related to the underlying geology, weathering, alteration and mineralization. Imaging the results over topography enhances the interpretation of these patterns. Examples of this approach are shown from mineral exploration programs in Canada, Mexico and Indonesia.


There has been a dramatic change in the effectiveness of using geochemical survey data in the past 10 years. This is mostly due to:

1) Lower cost of geochemical analysis,
2) Lower detection limits,
3) New methods of multi-element analysis,
4) A marked decrease in the turnaround time,
5) The intelligent use of computers in presenting geochemical data.

Geochemical surveys commonly have two objectives: locating abnormal concentrations of ore-forming or pathfinder elements and characterizing the underlying host lithologies. Geochemical surveys often employ a wide range of sample media that may typically include whole rock lithogeochemistry, various soil horizons, till and basal till sampling, stream sediments, lake sediments, and various forms of weathered regolith. Each of these sample media will exhibit a unique element distribution that must be factored in the interpretation phase of a geochemical survey study. This can become a cumbersome process if univariate data handling methods are employed.

During the past 30 years both government and mineral exploration companies have documented and developed techniques for sampling various sampling media for the purposes of locating mineral deposits. These organizations have accumulated large geochemical databases that are commonly available as digital open file reports. More recently many government agencies have been re-assembling these geochemical survey data and making them available via the Internet. Many government surveys are re-analyzing archived sample material using modern analytical methods and a larger number of useful elements (up to 60). Geochemical surveys that have been carried out over time reflect the methods available at the time of analysis. Thus when assembling diverse geochemical data there are differences in methods of analysis (INAA, ICP-ES, ICP-MS, AA, etc.) that can result in difficulties with interpretation. (e.g., Grunsky, 1997). Despite the complexities, these datasets are valuable to exploration when properly evaluated and clearly presented using tools such as geographical information systems (GIS). Harris et al (1997) and Wilkinson et al (1999) document the process of evaluating geochemical data using GIS technology.

The use of multivariate statistical methods coupled with modern imaging systems and GIS is especially powerful with geochemical data that has been collected at a spatial resolution scale that is close to the scale of the geology that is being evaluated.

The Evaluation and Interpretation of Geochemical Data
Geochemical responses can reflect a number of processes that have occurred within the survey area depending on the nature of the material sampled and the method of analysis used to make geochemical determinations. In some cases, individual elements may reflect specific geochemical processes independently of other elements but typically the responses of most elements reflect multiple processes. The geochemical research literature has documented many methodologies that apply various numerical and statistical methods from which these processes can be inferred (e.g. Grunsky, 1997). Various forms of multivariate statistics including cluster analysis, discriminant analysis, the analysis of single element and multi-element distributions, principal component and factor analysis and variants thereof have been applied with significant success. Rebagliati, 1999 and Garret and Grunsky (2001) discuss a simplified form of multivariate analysis based on empirical indices for a selected number of elements whereby the geochemists selects weights for each element of interest. Important elements receive greater weights than less important elements.

Many of these methods have been proposed since the 1950’s yet are still not commonly used in geochemical assessment programs. These techniques are commonly termed as “black box” because they may be mathematically and computationally challenging. However, when these methods are properly applied they can provide significant insight into the discovery of the geochemical processes which directly and indirectly infer geological processes associated with background lithological variation, regional metamorphism, alteration associated with mineralization and mineralization itself. Proper application implies that the correct media has been sampled and the methods of analysis are appropriate for the intended purpose. This requires the knowledge and experience of a geochemist/geologist who understands something about the geology and the sample material of the area. In addition a thorough analysis of the geochemical data should be undertaken to be sure that proper quality control measures have been undertaken and that analytical artifacts have been removed. Without this knowledge any statistical methodology applied to the data will have little meaning.

Geochemical data are reported as proportions (weight %, parts per million, etc.) For a given observation compositional proportions (i.e. weight%) always sum to a constant (100%). Closed data, by definition, are points of the simplex, which are a part of n-dimensional real space. The elements of points on the simplex are vectors of real numbers that satisfy the constant sum constraint. Statistical measures, such as measures of correlation and covariance can be misleading and can result in an incorrect assessment of correlation or other measures of association. Aitchison (1986), Aitchison et al. (2000) advocates the use of log-ratio transformations to overcome the effect of closure. When analyzing trace element geochemistry the problem of closure is seldom noticed because trace elements represent a small part of an entire composition. In this case, conventional statistical measures are less likely to be affected by closure. Compositional data should be adjusted by the use of log-ratios, particularly when using major element oxide data.

A compositional vector x defined by D component variables (elements)). By definition, this vector will sum to a constant (100%) and as a result, the composition can be described by D-1 of the variables. A composition x can be transformed by yi = log(xi/xD) (i = 1,…, D-1). There is no loss of information by choosing one of the variables as a divisor. An alternative way of transforming a compositional vector is by applying the log-centered ratio, namely: zi = log(xi/g(xD)) (i = 1, …, D), where g(xD) is the geometric mean of the composition. The log-centered ratio is useful because it preserves all of the variables in the composition. However, the inverse of covariance matrix for this transform is singular. For statistical procedures that require the inverse of the covariance matrix, a generalized inverse procedure may be applied; or the log-ratio transform may be used instead. This log-centered transformation has been applied to the Ben Nevis metavolcanic geochemical data used below.

The Advantage of Multivariate Methods

This paper will emphasize the use of principal components analysis (PCA), which has been available to the geochemical community for the past 30 years. The objective of principal components analysis is to reduce the number of variables necessary to describe the observed variation within a set of data. This is done by forming linear combinations of the variables (elements), which describes the distributions and relationships of the data. Ideally, each component might be interpreted as describing a geological process such as differentiation (partial melting, crystal fractionation, etc.), alteration/mineralization (carbonatization, silicification, alkali depletion, metal associations and enrichments, etc.), and weathering processes (bedrock-saprolite-laterite).

Components analysis methods are based on the extraction of linear combinations of samples and elements from some measure of association. Most commonly the measures of association used are the inter-element correlations or covariances. Davis (1986, Chapter 6) gives a very readable account on the mathematics of principal components analysis. A more thorough treatment of the subject is given in (Joreskog et al, 1976).

Covariance relationships between the elements reflect the magnitude of the elements and thus elements with large values tend to dominate the variance-covariance matrix. This has the effect of increasing the significance of these elements in the results of the principal components analysis. The correlation matrix represents the inter-element correlations, which are the standardized equivalent of the variance-covariance matrix. When the correlation matrix is used, then all elements have equal representation and the linear combinations of the elements are based on their correlations only and associations are not influenced by the relative magnitudes of the elements. Other metrics of association can be used and are discussed by and Reyment and Joreskog (1993), Joreskog et al (1976) and Davis (1986).

A method of principal components analysis, known as simultaneous RQ-mode Principal Components Analysis (Zhou et al., 1983), was applied in the examples presented here. Robust estimation of either covariance or correlation gives a better estimate of the means of the variables by down-weighting the influence of anomalous samples (Zhou, 1985, 1989). The application of RQ-mode PCA has the advantage of presenting the component loadings of the samples and the elements in the same (scaled) component space. Thus, scatter plots of component loadings show the relationships of the samples and the elements with respect to each other. The interpretation of the results of principal components is usually oriented on placing a geological/geochemical interpretation on the linear combinations of elements (loadings) that comprise the components. The dominant components generally reflect obvious geochemical processes whereas the less significant components may be related to processes such as mineralization. Examples of this are shown in Grunsky (1986a, 1997).

Integrating Geochemical Data with Digital Topography

Given a large number of samples and elements, the assessment and interpretation of multi-element geochemical datasets can be an overwhelming task. Large regional government datasets can contain tens of thousands of samples. Modern methods of data management, analysis and presentation include the use of desktop database management systems (DBMS), desktop statistical processing, geographical information systems (GIS) and image processing systems. Data can be statistically analyzed, spatially selected and aggregated to view geochemical data, using GIS technology. Geochemical data can also be interpolated and rendered in image processing systems which can be integrated with geophysical and other raster based data for enhanced interpretation.

In many areas of the world digital base maps can be acquired from local governments that typically include lakes, rivers, streams, road networks and other topographic information that is useful in the orientation and interpretation of geochemical data. As well, digital topography, also known as digital elevation models (DEM), is often available from local governments that provide a topographic backdrop for the interpretation of geochemical data. Digital geological maps are now routinely provided by many geological surveys around the world in addition to mineral occurrence inventory databases that have been accumulated by many geological surveys and private companies. In areas where little geographic information is available global datasets can be useful for regional scale studies. Products such as the Digital Chart of the World (DCW) (Defense Mapping Agency, 1992), regional digital elevation data (Globe Task Team, 1999) and even ocean bathymetry (NOAA, 1988) can be obtained easily and integrated with digital geological, geophysical and geochemical data in GIS and Image Processing systems.

The use of DEM with geological observations offers a unique view of data in that it provides a “real world view” of the data over the terrain. Surficial processes that are reflected in the topographic variation commonly modify soils and regolith materials. The interpretation of the geochemistry of these materials is often enhanced by draping the geochemical signatures over the DEM and viewed in a geographical information or image processing system.

Case Studies

We present two case studies in which principal components analysis coupled with digital topography have assisted in the interpretation of geochemical processes.

Ben Nevis Township, Ontario, Canada

The Ben Nevis township area of Ontario is comprised of a sequence of calc-alkaline volcanics intruded by late granitoid rocks that are part of the Blake River Group of the Abitibi greenstone (Jensen, 1975). Two significant gold occurrences have been investigated in this area: the Canagau Mines and the Croxall properties (Grunsky, 1986a). A detailed lithogeochemical sampling program was carried out from 1975-1982 that outlined a zone of extensive carbonatization associated with the Canagau Mines Property. Details on the geology, sampling methodology and mineral occurrence descriptions can be found in Grunsky (1986a,b). A total of 822 field samples were collected and analyzed for SiO2, Al2O3, Fe2O3, MgO, CaO, Na2O, K2O, TiO2, P2O5, MnO, CO2, S, H2O+, H2O-, Ag, As, Au, Ba ,Be, Bi, Cl, Co, Cr, Cu, F, Ga, Li, Ni, Pb, Zn, B, Mo, Sr, V, Y, U, Zr, Sc, and Sn. A subset of 26 elements were chosen for the statistical analysis which included Si, Al, Fe+3, Fe+2, Mg, Ca, Na, K, Ti, Mn, P, CO2, S, H2O+, Ba, Co, Cu, Cr, Li, Ni, Pb, Sr, V, Y, Zn, and Zr. These data have evaluated using the RQ-mode principal components analysis method discussed previously. Because the geochemical data are composed of both major element oxides and trace element compositions, the effect of closure may affect the results.


Figure 1. Scatter plot of loadings for principal components 1 and 2. Loadings of the samples are shown as crosses. Loadings of the oxides and elements are shown by the positions of their names.

The geochemical data were transformed using a log-centred transformation as outlined above. Principal components analysis was carried out on the log-centred transformed data and Figure 1 is a plot of the loadings of the elements and the samples for the first two principal components. The relationships seen in this figure account for more than 33% of the variation of the data and can be interpreted to represent meaningful relationships. The element loadings are represented by the placement of the element labels in the figure. The small crosses represent the sample loadings. The actual numerical values of the loadings of the elements and samples are less important than the relative placements of the samples and elements with respect to each other. Notice that the left side (negative side) of the figure (the negative C1, x-axis) has elements that are commonly associated with mafic volcanic rocks whereas elements towards the positive x-axis are commonly associated with felsic volcanic rocks. Note that along the y-axis, the second component, C2 (11.7% of the variation of the data) shows an inverse relationship between TiO2-Fe2O3-CaO-Sr-Na2O and Li-CO2-S-K2O Samples that plot along the negative side of the C2 axis have relative enrichment of Li and CO2 and represent the samples that have been carbonatized. Sulphur, Cu, and Pb also show relative enrichment in the samples that are carbonatized. Note also, that there appears to be an inverse association between Na2O and K2O. This shows that samples that are relatively enriched in K2O are relatively depleted in Na2O. This also shows that samples are relatively enriched with Li and CO2 with corresponding depletion in Na2O. The third principal component (not shown), accounts for 8.5% of the variation of the data and is characterized by a relative increase of S and Cu corresponding with a relative decrease in CO2 and Li. Samples with these characteristics occur in silicified mineralized zones.

Figure 2a is an interpolated image of the first principal component draped over the digital elevation model for the area. The first component outlines lithological variation in the map-area. Areas in red highlight felsic volcanics, areas in green are intermediate to mafic calc-alkaline flows and areas in blue are mafic intrusive rocks. These areas closely match the mapped geology as shown in the scanned map of the geology in Figure 2d.

Figure 2b is an image of the second principal component that highlights the pervasive carbonatization signature associated with the Canagau and the Croxall properties. These properties are shown as black circles and labelled in Figure 2a. This component outlines a hydrothermal system that hosts gold at the Canagau Mines property.

Figure 2c is an image of the third principal component that highlights Fe, K and S enrichment. This image identifies both real sulphide occurrences associated with Au and Cu mineralization as well as a number of pyrite occurrences that can be viewed as false anomalies.

Figure 2d is a scanned image of OGS Map 2283 (Jensen, 1975), draped over a 25 metre DEM provided by the Ontario Ministry of Natural Resources. Examination of this map together with the DEM highlights some linear features that are not present on the published map. The advantage of using a DEM in an Archean terrain is that structural information can often be obtained and related to lithogeochemical patterns. Major faults within the Ben Nevis map area truncate several lithologies and are evident when examining the pattern of Figure 2a in conjunction with the linear features in the underlying DEM. Similarly, alteration and mineralization in Archean terrains are commonly structurally controlled which may coincide with linear features in the DEM. Both the Canagau and Croxall properties show association with linear features in Figures 2b and 2c.

The use of principal components analysis reduces the dimensionality of the data from 26 elements to three components that, when interpolated and imaged as shown in Figure 2a-c, describe the primary lithological variation, alteration patterns and sulphur bearing occurrences. When these images are draped over the digital topography, which provides structural information, a more effective interpretation results.


Figure 2. Figure 2a is an interpolated image of the first principal component that outlines lithological variation in the map-area. Figure 2b is an image of the second principal component that highlights the pervasive carbonatization signature associated with the Canagau and Croxall properties. Figure 2c is an image of the third principal component that highlights Fe, K and S enrichment.. Figure 2d is a scanned image of OGS Map 2283. All images are draped over a 25 metre digital elevation model of the area. The vertical exaggeration is 3 times the actual relief.

Campo Morado, Mexico

The Campo Morado mining camp in the Guerrero state of Mexico hosts seven precious metal bearing volcanogenic massive sulphide deposits in the complexly folded and faulted Guerrero terrain (Oliver, Payne and Rebagliati, 1996, Rebagliati, 1999). Approximately 29, 221 samples were collected over a soil grid comprised of 25 meter sample spacing along lines and each line was 100m apart. The field samples were analyzed for: Al, Fe, Ca, K, Mg, Na, Ti, Au, Ag, As, Ba, Cd, Co, Cr, Cu, Hg, Mn, Mo, Ni, P, Pb, Sc, Sr, V, W and Zn using aqua regia digestion and ICP-ES finish. A DEM was created at 25 metre resolution. Principal components analysis was carried out on the data and revealed several significant patterns related to lithological variation and mineralization. Because of the high topographic relief in the area, the problem of transported material from weathering has the potential to result in “false anomalies” that are often the result of hydromorphic dispersion and down-slope creep. When the results of the principal components analysis are draped over the topography, there is an increased ability to distinguish anomalies associated with hydromorphic dispersion from those associated near the bedrock source.

Figure 3a shows a planimetric image of the second principal component over a shaded relief image of the DEM. Felsic volcanic rocks (red and yellow) are distinguished from mafic volcanic rocks (blue). Felsic rocks show relative enrichment in K, and Na while the mafic rocks show relative enrichment in Fe, Co, Ti, Mg, Cr, Al, Sc, and V. The areas, highlighted in green, represent lithologies of intermediate compositions and are mostly mudstones, argillites and sandstones, which are host rocks for several of the mineral deposits in the Campo Morado area. The first principal component highlights areas of relative enrichment of Ag, Zn, Au, As, Pb, Hg, Sb and Cu. These areas, shown in red and yellow are potential sites of mineralization in Figure 3b. This image is a three dimensional rendering over the DEM. Examination of these areas in conjunction with the DEM assists in setting priorities for follow-up. Anomalies that lie along riverbeds or show significant dispersion must be treated with caution due to the effects of hydromorphic and downslope creep dispersion effects.

enhanceInterpretFig3aFigure 3a represents an index of lithology. Areas in red are rocks that have geochemical characteristics that are typical of felsic volcanics. Areas of blue are areas that have geochemical characteristics typical of mafic volcanic rocks. The areas of green represent intermediate volcanics, argillites and mudstones. Grid lines are 2500 metres apart.


Figure 3b The index of mineralization represented by relative enrichment of Ag, Zn, Zu, As, Pb, Hg, Sb and Cu. These multi-element anomalies can be viewed and examined in the context of topographic location and helps distinguish anomalies associated with source material from anomalies that occur as hydromorphic dispersion and down-slope creep. Vertical exaggeration is approximately 1.5 times.

Concluding Remarks

Complex geochemical datasets can be interpreted more effectively when the dimensionality of the initial data is reduced to a few variables that describe geological processes related to lithological variation, alteration and mineralization. The use of multivariate statistical procedures such as principal components analysis can be very effective in simplifying large datasets. While not all geochemical data can be interpolated and imaged (because of spatial resolution and irregular sampling strategies), data that can be interpolated and imaged may reveal spatial trends that assist with interpretation. When these geochemical patterns are combined with digital topography the interpretation of these patterns is significantly enhanced.


The authors wish to acknowledge thanks to the following for permission to use their data:

Ontario Geological Survey and the Ontario Ministry of Natural Resources for the provision of the digital elevation data for the Ben Nevis area of Ontario.

Mark Rebagliati and Bob Cluff of the Hunter Dickinson Inc., Vancouver, for permission to present the results of the Campo Morado geochemical study.

by: Eric C. Grunsky
Alberta Geological Survey,
4999 98th Ave. Edmonton, AB, Canada T6B 2X3

and Barry W. Smee
Smee and Associates Consulting Ltd.
4658 Capilano Road, North Vancouver B.C. V7R 4K3
Phone 1 (604) 929-0667 Fax 1(604) 929-0662


AITCHISON, J., 1986: The Statistical Analysis of Compositional Data, Methuen Inc., New York, 416 p.

AITCHISON, J., BARCELO-VIDAL, C., MARTIN-FERNANDEZ, J.A. and PAWLOWSKY-GLAHN, V., 2000: Log-ratio analysis and compositional distance, Mathematical Geology, Vol. 32, p. 271-275.

DAVIS, J.C., 1986: Statistics and Data Analysis in Geology, John Wiley & Sons Inc., second edition, 646 pp.

DEFENSE MAPPING AGENCY (DMA), 1992: Digital Chart of the World, Defense Mapping Agency, Fairfax, Virginia. (Four CD-ROMs.) NOAA, 1988: Data Announcement 88-MGG-02, Digital relief of the Surface of the Earth,

NOAA, National Geophysical Data Center, Boulder, Colorado, 1988. (URL http://www.ngdc.noaa.gov/mgg/global/etopo5.HTML)

GLOBE Task Team and others (Hastings, David A., Paula K. Dunbar, Gerald M. Elphingstone, Mark Bootz, Hiroshi Murakami, Hiroshi Maruyama, Hiroshi Masaharu, Peter Holland, John Payne, Nevin A. Bryant, Thomas L. Logan, J.-P. Muller, Gunter Schreier, and John S. MacDonald), eds., 1999: The Global Land One-kilometer Base Elevation (GLOBE) Digital Elevation Model, Version 1.0. National Oceanic and Atmospheric Administration, National Geophysical Data Center, 325 Broadway, Boulder, Colorado 80303, U.S.A. Digital database on the World Wide Web (URL: http://www.ngdc.noaa.gov/seg/topo/globe.shtml) and CD-ROMs.

GARRETT, R.G., and GRUNSKY, E.C., 2001: Weighted Sums – Knowledge Based Empirical Indices for Use in Exploration Geochemistry, Geochemistry: Exploration Environment, Analysis, Vol. 1, p. 135-141.

GRUNSKY, E.C.,and SMEE, B.W., 1999: The differentiation of soil types and mineralization from multi-element geochemistry using multivariate methods and digital topography, Journal of Geochemical Exploration, Vol. 67, p.287-299.

GRUNSKY, E.C., 1997. Strategies and Methods for the Interpretation of Geochemical Data, in Current Topics in GIS and Integration of Exploration Datasets, Short Course, Exploration’ 97 Workshop, September, 1997, 145 p.

GRUNSKY, E.C., 1986a: Recognition of Alteration in Volcanic Rocks Using Statistical Analysis of Lithogeochemical Data, Journal of Geochemical Exploration, Vol. 25, p.157-183.

GRUNSKY, E.C., 1986b: Recognition of Alteration and Compositional Variation Patterns in Volcanic Rocks Using Statistical Analysis of Lithogeochemical Data, Ben Nevis Township Area, District of Cochrane, Ontario; Ontario Geological Survey, Open File Report 5628, 187 p., 50 figures, 10 tables, and 6 photos.

HARRIS, J.R. GRUNSKY, E.C., WILKINSON, L., 1997. Developments in the Effective Use of Lithogeochemistry in Regional Exploration Programs: Application of GIS Technology, in Proceedings of Exploration’97, Fourth Decennial International Conference on Mineral Exploration, edited by A.G. Gubins, pp. 285-292.

JENSEN L.S., 1975: Geology of Clifford and Ben Nevis Townships, District of Cochrane. Ontario Div. Mines, GR 132, 55 pp. Accompanied by Map 2283, scale 1 inch to 1/2 mile.

JORESKOG, K.G., KLOVAN, J.E. and REYMENT, R.A., 1976. Geological Factor Analysis, Elsevier Scientific Publishing Company, New York, 178 pp.

OLIVER, J., PAYNE, J., REBAGLIATI, M., 1996: Precious-metal-bearing Volcanogenic Massive Sulfide Deposits, Campo Morado, Guerrero, Mexico, Exploration Mining Geology, Volume 6, Number 2, pp. 119-128.

REBAGLIATI, M., 1999: Applied Exploration Geochemistry: Campo Morado Precious-Metal-Bearing Volcanogenic Massive Sulphide District, Guerrero, Mexico, 19th International Geochemical Exploration Symposium, Vancouver, British Columbia, Canada, April 10-16, 1999, Abstract, p. 45.

REYMENT, R. and JORESKOG, K.. 1993: Applied Factor Analysis in the Natural Sciences, Cambridge University Press, 371p.

WILKINSON, L., HARRIS, J.R. and GRUNSKY, E.C., 1999: Building a Lithogeochemical Database for GIS Analysis; Methodology, Problems and Solutions, Geological Survey of Canada Open File 3788.

ZHOU, D., CHANG, T. and DAVIS, J.C., 1983: Dual Extraction of R-Mode and Q-Mode Factor Solutions, Mathematical Geology, Vol. 15 No. 5 p. 581-606.

ZHOU, D., 1985: Adjustment of geochemical background by robust multivariate methods, Journal of Geochemical Exploration, Vol. 24, p. 207-222.

ZHOU, D., ROPCA, 1989: A Fortran Program for Robust Principal Components Analysis, Computers and Geosciences, Vol. 15, No.1, pp. 59-78.

Link: Google Scholar