Indicator Kriging based on Principal Component Analysis
INDICATOR KRIGING BASED ON PRINCIPAL COMPONENT
ANALYSIS
A
THESIS
SUBMITTED TO THE DEPARTMENT OF APPLIED EARTH SCIENCES
AND THE COMMITTEE ON GRADUATE STUDIES
OF STANFORD UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF
MASTER OF SCIENCE
By
Vinicio Suro Pérez
December, 1988
ACKNOWLEDGEMENTS

It is a pleasure to thank Prof. Andre Journel for his guidance, support and helpful discussions. He patiently reviewed each word of this text and his influence can be found on each page. However, I reserve the mistakes for myself.

I am grateful to Consejo Nacional de Ciencia y Tecnología (Mexico) for financial support through a scholarship. Additional funding was provided by the Geostatistics Program at Stanford University and the Environmental Protection Agency through Project No. CR-814899-01-0.

I am indebted to my fellows of the Geostatistics Program for their friendship and all those wonderful discussions about everything, and especially to my friend of many years, Luis Ramos, for his constant support and his tennis lessons.

Last but not least, I want to acknowledge my wife Maria T. for her love, patience and tolerance during these two years. Without her help I would not be writing this acknowledgement.
ABSTRACT

An alternative to multiple indicator kriging is proposed which approximates the full coindicator kriging system by kriging the principal component variables of the original indicator variables. This transformation is studied in detail for the bigaussian model. It is shown that the crosscorrelations between principal components are either not significant or exactly zero. This result permits inference of the conditional cdf by kriging the principal components then applying a back transformation. A performance comparison based on a simulated data set with results derived from multiple indicator kriging and multigaussian kriging is presented.
Contents

1 Introduction:
  1.1 Use of Principal Component Analysis:
  1.2 Our Goal:
2 The Indicator Kriging Approach:
  2.1 A Deterministic Interpretation:
    2.1.1 Bivariate Distribution Model:
  2.2 The Probabilistic Model:
    2.2.1 Optimality Criteria:
  2.3 Development of the Indicator Kriging Estimator:
  2.4 An Alternative Estimator to IK:
  2.5 Are the Indicator Crosscorrelations Important?
    2.5.1 The One Sample Example:
    2.5.2 The Two Sample Case:
  2.6 Alternatives to CoIK:
    2.6.1 The Probability Kriging Estimator:
    2.6.2 A Simplified Version of the CoIK Estimator:
  2.7 Other Perspectives:
3 Principal Components and the Indicator Approach:
  3.1 Transforming the Indicators:
  3.2 The Bigaussian Model:
    3.2.1 Covariance Matrix Σ_I(h)
    3.2.2 Eigenvectors of Σ_I(h)
  3.3 Computation of the Principal Component Crosscovariances:
  3.4 Numerical Computation of the Indicator Crosscovariances:
    3.4.1 Cautious Romberg Extrapolation
    3.4.2 The Composite Trapezoidal Rule
    3.4.3 End Point Singularity
    3.4.4 Implementation
  3.5 Examples:
    3.5.1 The Three Cutoffs Case:
    3.5.2 The Five Cutoffs Case:
4 IK based on Principal Component Analysis
  4.1 An Estimator Based on PCA:
  4.2 Unbiasedness:
  4.3 Estimation Variance
  4.4 Practice of IKPCA:
    4.4.1 Declustering the Univariate CDF:
    4.4.2 Selection of Cutoffs:
    4.4.3 Computation of Σ_I(h):
    4.4.4 Checking for Bigaussianity:
    4.4.5 Order Relations Problems:
    4.4.6 Probability Intervals:
    4.4.7 Optimal Estimates:
5 A Case Study
  5.1 Structural Analysis:
    5.1.1 Indicator Correlograms and Crosscorrelograms:
    5.1.2 Principal Component Correlograms:
  5.2 Estimation of Points:
    5.2.1 Modeling Correlograms:
    5.2.2 Performance of the Conditional cdf F*(x; z | (n')):
  5.3 Estimation of Panels:
    5.3.1 Simple IKPCA and Ordinary IKPCA:
6 Conclusions and Recommendations:
  6.1 Bivariate Distribution:
  6.2 Number of Cutoffs:
  6.3 Inference of Principal Component Correlograms:
  6.4 Indicator Kriging:
  6.5 Indicator Kriging based on Principal Component Analysis:
  6.6 Nonlinear Combinations of Indicators:
A Spectral Decomposition of Σ_I(h)
  A.1 Householder Transformations:
  A.2 The QR Algorithm:
  A.3 The Singular Value Decomposition:
B Computation of Indicator Crosscovariances
C Computation of Principal Component Crosscovariances
List of Tables

5.1 The selected nine cutoffs.
5.2 Predicted proportions for expression (5.4).
5.3 Predicted proportions for expression (5.7).
5.4 Predicted quantity of metal recovery factor.
5.5 Predicted tonnage recovery factor.
5.6 Comparison between MG, IK, IKPCA and the approximated IKPCA.
5.7 Panel Estimation: Comparison between IK, IKPCA and the approximated IKPCA.
5.8 Panel Estimation: Comparison between ordinary IKPCA and simple IKPCA.
List of Figures

2.1 The estimator Φ*(V; z_k) is the best bivariate-type estimator of E[Φ(V; z_k) | (n')].
3.1 The principal component transformation.
3.2 Expression (3.37).
3.3 Indicator covariance for the cutoff z = -1.0.
3.4 Indicator covariance for the median cutoff z = 0.0.
3.5 Indicator covariance for the cutoff z = 1.0.
3.6 Indicator crosscovariance for the cutoffs z = -1.0 and z' = 0.0.
3.7 Indicator crosscovariance between the extreme cutoffs z = 1.0 and z' = -1.0.
3.8 Indicator crosscovariance for the cutoffs z = 0.0 and z' = 1.0.
3.9 Indicator correlogram for the cutoff z = -1.0.
3.10 Indicator correlogram for the median cutoff z = 0.0.
3.11 Indicator correlogram for the cutoff z = 1.0.
3.12 Indicator crosscorrelogram for z = -1.0 and z' = 0.0.
3.13 Indicator crosscorrelogram for z = 0.0 and z' = 1.0.
3.14 Indicator crosscorrelogram for the extreme cutoffs z = -1.0 and z' = 1.0.
3.15 First principal component correlogram.
3.16 Second principal component correlogram.
3.17 Third principal component correlogram.
3.18 First and third principal component crosscorrelogram.
3.19 Indicator correlogram for the cutoff z = -2.0.
3.20 Indicator crosscorrelogram for the cutoffs z = -2.0 and z' = -1.0.
3.21 Indicator crosscorrelogram for the cutoffs z = -2.0 and z' = 0.0.
3.22 Indicator crosscorrelogram for the cutoffs z = -2.0 and z' = 1.0.
3.23 Indicator crosscorrelogram for the extreme cutoffs z = -2.0 and z' = 2.0.
3.24 Principal component correlograms for the five cutoffs case.
3.25 Principal component crosscorrelograms different from zero.
4.1 Choice of a symmetric F(z) does not entail symmetric cutoffs.
4.2 The elements of matrix Σ_I(0) can be read from F(z).
5.1 Exhaustive data set.
5.2 Z-correlogram derived from the exhaustive information.
5.3 Indicator correlogram for the cutoff z = -1.28.
5.4 Indicator crosscorrelogram for the cutoffs z = -1.28 and z' = -0.84.
5.5 Indicator crosscorrelogram for the cutoffs z = -1.28 and z' = -0.52.
5.6 Indicator crosscorrelogram for the cutoffs z = -1.28 and z' = -0.25.
5.7 Indicator crosscorrelogram for the cutoffs z = -1.28 and z' = 0.0.
5.8 Indicator crosscorrelogram for the cutoffs z = -1.28 and z' = 0.25.
5.9 Indicator crosscorrelogram for the cutoffs z = -1.28 and z' = 0.52.
5.10 Indicator crosscorrelogram for the cutoffs z = -1.28 and z' = 0.84.
5.11 Indicator crosscorrelogram for the cutoffs z = -1.28 and z' = 1.28.
5.12 Indicator correlogram for the cutoff z = -0.84.
5.13 Indicator crosscorrelogram for the cutoffs z = -0.84 and z' = -0.52.
5.14 Indicator crosscorrelogram for the cutoffs z = -0.52 and z' = -0.25.
5.15 Indicator crosscorrelogram for the cutoffs z = -0.25 and z' = 0.0.
5.16 Indicator crosscorrelogram for the cutoffs z = 0.0 and z' = 0.25.
5.17 Indicator crosscorrelogram for the cutoffs z = 0.25 and z' = 0.52.
5.18 Indicator crosscorrelogram for the cutoffs z = 0.52 and z' = 0.84.
5.19 Indicator crosscorrelogram for the cutoffs z = 0.84 and z' = 1.28.
5.20 Indicator correlogram for the cutoff z = -0.52.
5.21 Indicator correlogram for the cutoff z = -0.25.
5.22 Indicator correlogram for the cutoff z = 0.0.
5.23 Indicator correlogram for the cutoff z = 0.25.
5.24 Indicator correlogram for the cutoff z = 0.52.
5.25 Indicator correlogram for the cutoff z = 0.84.
5.26 Indicator correlogram for the cutoff z = 1.28.
5.27 Greyscale map of the indicator crosscorrelogram for the cutoffs z = -1.28 and z' = -0.84.
5.28 Greyscale map of the indicator crosscorrelogram for the cutoffs z = 1.28 and z' = 0.84.
5.29 First principal component correlogram.
5.30 Greyscale map of the first principal component correlogram.
5.31 First and second principal component crosscorrelogram.
5.32 First and third principal component crosscorrelogram.
5.33 First and fourth principal component crosscorrelogram.
5.34 First and fifth principal component crosscorrelogram.
5.35 First and sixth principal component crosscorrelogram.
5.36 First and seventh principal component crosscorrelogram.
5.37 First and eighth principal component crosscorrelogram.
5.38 First and ninth principal component crosscorrelogram.
5.39 Second principal component correlogram.
5.40 Third principal component correlogram.
5.41 Fourth principal component correlogram.
5.42 Fifth principal component correlogram.
5.43 Sixth principal component correlogram.
5.44 Seventh principal component correlogram.
5.45 Eighth principal component correlogram.
5.46 Ninth principal component correlogram.
5.47 Third and fifth principal component crosscorrelogram.
5.48 Fourth and sixth principal component crosscorrelogram greyscale map.
5.49 Data configuration used by IKPCA.
5.50 Scatterplot of the predicted and actual proportions.
5.51 Scatterplot of the predicted and actual proportions.
5.52 Scatterplot of the predicted quantity of metal recovery factor and the actual values.
5.53 Scatterplot of the predicted tonnage recovery factor and the actual values.
5.54 Scatterplot of the predicted Φ(V; z) factors and the MG factors.
5.55 Scatterplot of the IK and MG estimates of F*(x_0; -1.28).
5.56 Scatterplot of the IKPCA and MG estimates of F*(x_0; -1.28).
5.57 Scatterplot of the approximated IKPCA and MG estimates of F*(x_0; -1.28).
5.58 Scatterplot of the IK and MG estimates of F*(x_0; 0.0).
5.59 Scatterplot of the IKPCA and MG estimates of F*(x_0; 0.0).
5.60 Scatterplot of the approximated IKPCA and MG estimates of F*(x_0; 0.0).
5.61 Scatterplot of the IK and MG estimates of F*(x_0; 1.28).
5.62 Scatterplot of the IKPCA and MG estimates of F*(x_0; 1.28).
5.63 Scatterplot of the approximated IKPCA and MG estimates of F*(x_0; 1.28).
5.64 Data configuration for the estimation of panels.
5.65 Scatterplot of the composite distribution for z = -1.28 and the actual value.
5.66 Scatterplot of the composite distribution for z = -0.84 and the actual value.
5.67 Scatterplot of the composite distribution for z = -0.52 and the actual value.
5.68 Scatterplot of the composite distribution for z = -0.25 and the actual value.
5.69 Scatterplot of the composite distribution for z = 0.0 and the actual value.
5.70 Scatterplot of the composite distribution for z = 0.25 and the actual value.
5.71 Scatterplot of the composite distribution for z = 0.52 and the actual value.
5.72 Scatterplot of the composite distribution for z = 0.84 and the actual value.
5.73 Scatterplot of the composite distribution for z = 1.28 and the actual value.
5.74 Composite distribution for z = -1.28 obtained by IK, OIKPCA and SIKPCA.
5.75 Composite distribution for z = 0.0 obtained by IK, OIKPCA and SIKPCA.
5.76 Composite distribution for z = 1.28 obtained by IK, OIKPCA and SIKPCA.
Chapter 1

Introduction:
In Earth Sciences we are dealing most often with problems involving patterns of spatial correlation. Sometimes only one attribute is sampled and at other times a set of attributes is sampled, but whether using one or several attributes the information is usually spatially distributed; therefore the potential for spatial correlation/crosscorrelation must be investigated. This distinctive feature is the key and justification for a geostatistical approach.

Geostatistics typically addresses two classes of problems: estimation (prediction) and simulation. When at each location there are several attributes and the goal is the joint estimation of these attributes, cokriging is the proper method to account for correlation and crosscorrelation. The price to pay is crosscovariance modelling; for instance the case of K attributes implies the modelling of O(K²) covariances and crosscovariances. Likewise, for joint simulation of several attributes, the number of covariances and crosscovariances to model is the main obstacle to practical implementation. In addition, the numerical solution of large cokriging systems is another limiting factor.

These two reasons, modelling effort and computational effort, are a strong motivation to find alternatives for the solution of joint estimation or joint simulation. The proposed alternative should be as simple as ordinary kriging and, at the same time, as powerful as cokriging. The first condition rules out the use of heavy mathematics and the second condition offers an interesting challenge possibly difficult to achieve. These two conditions define guidelines for building a new estimator capable of accurately inferring conditional cumulative distribution functions (cdf).
1.1 Use of Principal Component Analysis:

Principal Component Analysis (PCA) is one way to avoid the cokriging system and the required crosscorrelation modelling. This traditional multivariate analysis technique consists in transforming a vector of K correlated attributes Z^T = [Z_1 ... Z_K] into a transformed vector Y = A^T Z of uncorrelated attributes Y^T = [Y_1 ... Y_K]. Unfortunately, in the case of spatially distributed data this transformation only ensures that crosscorrelations are zero for attributes located at the same location, i.e.:

    \mathrm{Cov}(Y_i(0), Y_j(0)) = 0, \quad i \neq j

These crosscorrelations are in general different from zero for attributes located at different locations, i.e.:

    \mathrm{Cov}(Y_i(x), Y_j(x + h)) \neq 0, \quad \forall i \text{ and } j, \; \forall h > 0

Borgman and Frahme (1976) used PCA on 11 bentonite properties to perform their joint estimation. An additional assumption that all crosscorrelations are zero, i.e.:

    \mathrm{Cov}(Y_i(x), Y_j(x + h)) = 0, \quad \forall i \neq j \text{ and } \forall h

allows modelling only 11 covariances and reduces the cokriging system (11 × 11) into only 11 systems of normal equations (kriging systems). To our knowledge this was the first attempt to use principal components in a practical geostatistical application.

Bryan and Roghani (1982) and Davis and Greenes (1983), following the Borgman and Frahme line, applied PCA to reduce a cokriging system to a series of normal equations. Similarly, Dagbert (1981) and Luster (1985) performed simulations on principal components under the assumption that all crosscovariances are zero.
Matheron (1982), Sandjivy (1983; 1984), Wackernagel (1985; 1988) and more recently Ma and Royer (1988) proposed using PCA to describe and analyze multivariate information. In their approach, the original attributes are expressed as weighted sums of factors, with weights derived from application of PCA to the variance-covariance matrix. Additionally, each factor is decomposed as a linear combination of new factors obtained by solving full cokriging systems. Therefore, application of such methodology is highly demanding in terms of computational effort and crosscovariance modelling.

However, these previous references do not address the problem of estimation of the conditional cdf. They all share the assumption that after the principal component transformation the crosscorrelation is zero, which is not necessarily true; nevertheless the PC transformation idea is valuable and simple.
1.2 Our Goal:

The goal of this research is to infer the conditional cdf using indicator data. The solution consists in solving an indicator cokriging system or a simpler system. This application is based on the indicator formalism (Switzer, 1977; Journel, 1982; 1983) where any attribute is coded into a vector of 0's and 1's associated to the location x.

This thesis takes a different point of view and uses simple PCA to transform the original indicators. The impact of PCA on crosscorrelations is analyzed, and better approximations to infer the conditional cdf are proposed. Important results are derived for the bigaussian distribution which suggest that using PCA is a reliable and efficient technique to infer the conditional cdf. Computational advantages over Indicator Kriging and Colndicator Kriging are demonstrated.
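As a concrete illustration of this coding, the following minimal Python sketch codes a value z(x) into its vector of 0's and 1's for K cutoffs (the function name and the example cutoffs are illustrative choices, not taken from the thesis):

    def indicator_vector(z, cutoffs):
        """Code a value z into K indicator bits: i(x; z_k) = 1 if z <= z_k."""
        return [1 if z <= zk else 0 for zk in cutoffs]

    # A value between the first and second of K = 3 cutoffs:
    print(indicator_vector(-0.3, [-1.0, 0.0, 1.0]))   # -> [0, 1, 1]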
Chapter 2 presents in detail the theory of Indicator Kriging and discusses how the conditional cdf is estimated. Different examples are elaborated to emphasize the role of crosscovariances. Probability Kriging and Colndicator Kriging are reviewed, and their practical implementations compared with that of multiple Indicator Kriging.

Principal Component Analysis is introduced in Chapter 3 to orthogonalize the indicator vectors. A general analytical expression for indicator crosscovariances is derived for the binormal case. This bivariate distribution is taken as an example to compare the relative levels of indicator crosscorrelation versus direct correlation. Application of PCA under bigaussianity is discussed and the properties of the corresponding principal component crosscorrelograms are derived. The binormal case suggests that working in the space of principal components does solve the problem of reducing indicator cokriging into multiple kriging of principal components, with minimum assumptions about the crosscorrelograms.

This new estimator based on principal components, denoted IKPCA, is developed and analyzed in Chapter 4. Unbiasedness and minimum variance are discussed, and its performance in terms of estimation variance is compared with that of Indicator Kriging and Colndicator Kriging. Its practical implementation is reviewed and the checking of its constitutive hypothesis is strongly recommended.

Finally, IKPCA is applied on a simulated data set and its performance compared with multiple Indicator Kriging and Multigaussian Kriging. Proportions, quantity of metal recovery factor, and tonnage recovery factor are estimated by all three methods and their estimation scores are compared. It is found that, for a certain family of bivariate distributions, indicator transformation via PCA provides a direct and fast technique to estimate conditional distributions. Furthermore, it is shown that, starting with K principal component correlograms to model, application of IKPCA significantly reduces that number.

Appendices describe the numerical procedure for obtaining the principal components. Commented source codes for the calculation of indicator and principal component covariances and crosscovariances are given. Both programs assume a bigaussian model and, given the z-correlogram, provide the resulting covariances or crosscovariances. The indicator and principal component covariances or crosscovariances are obtained directly from the z-correlogram input.
Chapter 2

The Indicator Kriging Approach:
The purpose of the Indicator Kriging (IK) approach is to infer a model for the conditional cumulative distribution function (cdf), F(x; z | (n)). In mining, for example, it is used to forecast the recovered tonnage and quantity of metal above a given cutoff. In Environmental Sciences, knowledge of F(x; z | (n)) allows computing the probability of exceedance of a specific threshold, which could be the maximum acceptable concentration of a certain pollutant.

IK is but one method to infer such a conditional cdf. Other techniques are Disjunctive Kriging (DK) (Matheron, 1976), Lognormal Kriging (Switzer and Parker, 1976; Journel, 1980), Multigaussian Kriging (Verly, 1983; 1984), Uniform Conditioning (Guibal and Remacre, 1984) and Bigaussian Kriging (Marcotte and David, 1985), which are based on a parametric approach as opposed to the nonparametric approach of IK. This single difference reveals a major philosophical difference: IK relies more on the available data, while the parametric techniques capitalize on an implicit or explicit multivariate distribution model.

IK from its introduction (Journel, 1982; 1983; 1984a) was devised as a nonparametric technique and an alternative method to DK. The main hypothesis of the latter method is that the bivariate marginal cdf of any pair Z(x), Z(x + h) is isofactorial and expressed as a finite expansion of orthogonal polynomials; in particular it depends on the number of terms used in the expansion and the corresponding goodness of fit. IK expresses the conditional
cdf as a simple linear combination of indicators associated to each cutoff.

Notice that any method to infer the conditional cdf relies either on enough data or on assumptions about the multivariate data distribution. If the data do not honor the hypothesis made about the multivariate distribution, there will be a departure from the theoretical formulation whose consequences on the estimated cdf are not yet fully known. How to measure the departure from the models is an open problem in most geostatistical developments.
2.1 A Deterministic Interpretation:

Given a physical domain A exhaustively sampled, the problem considered is to infer the proportion of point values within A which are below or above a certain threshold z_k. This proportion has an exact expression if exhaustive sampling is assumed:

    \phi(A; z_k) = \frac{1}{|A|} \int_A i(x; z_k)\, dx   (2.1)

where |A| is the measure of the domain A, and

    i(x; z_k) = \begin{cases} 1 & \text{if } z(x) \le z_k \\ 0 & \text{otherwise} \end{cases}, \quad k = 1, \ldots, K

is the indicator function of the threshold z_k.

The evaluation of the integral (2.1) can be accomplished by numerical integration, since all indicator values are known on the domain A:

    \phi(A; z_k) = \frac{1}{N} \sum_{j=1}^{N} i(x_j; z_k)   (2.2)

with the sum running over the N locations discretizing the domain A.

In the case of non-exhaustive sampling of A, expression (2.2) may no longer be relevant. Clusters and the number of samples must be accounted for to obtain an approximation to (2.1). The problem is complex, because one has to devise some algorithm to solve numerically the integral when the integrand is known only at a finite number of nonrandomly sampled locations. The common techniques for numerical integration can not be
applied directly to our problem, due to lack of knowledge of the integrand functional properties and the fact that these techniques assume knowledge of the integrand on a regular or specified grid. However, (2.1) can be approximated by the weighted linear combination:

    \phi^*(A; z_k) = \sum_{\alpha=1}^{n} w(x_\alpha)\, i(x_\alpha; z_k)   (2.3)

if n samples are available over A. The symbol * indicates a particular approximation, and the weights w(x_α) are defined according to some criterion. For example the polygonal method or the cell declustering method (Journel, 1982) can be used to determine these weights. In the first case the weights are proportional to the polygonal area of influence of each sample location x_α. In the second case the weights are inversely proportional to the number of samples in each predetermined cell.
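As an illustration of the cell declustering option, the sketch below assigns each sample a weight inversely proportional to the number of samples falling in its cell, normalizes the weights to sum to one, and forms the estimate (2.3); the grid size and the data are hypothetical placeholders, not values from the thesis:

    from collections import Counter

    def cell_declustering_weights(xy, cell):
        """Weights inversely proportional to the number of samples per cell."""
        cells = [(int(x // cell), int(y // cell)) for x, y in xy]
        counts = Counter(cells)
        w = [1.0 / counts[c] for c in cells]
        s = sum(w)
        return [wi / s for wi in w]            # sum to 1, each in [0, 1]

    def phi_star(z_data, weights, z_cut):
        """Declustered proportion estimate (2.3) for the threshold z_cut."""
        return sum(w for z, w in zip(z_data, weights) if z <= z_cut)

    xy = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.4), (5.0, 5.0)]   # a cluster + a lone sample
    z  = [1.2, 0.8, 1.1, 3.5]
    w  = cell_declustering_weights(xy, cell=1.0)
    print([round(wi, 3) for wi in w])          # [0.167, 0.167, 0.167, 0.5]
    print(phi_star(z, w, z_cut=2.0))           # 0.5

Since these weights sum to one and lie in [0, 1], the resulting estimates automatically satisfy the two cdf-like conditions discussed next.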
Both approaches attempt to approximate a deterministic integral with no probabilistic component. Indeed, the domain A is unique and we can not assume any kind of repetitivity of it. If the weights w(x_α) are chosen such that:

    \sum_{\alpha=1}^{n} w(x_\alpha) = 1 \quad \text{and} \quad w(x_\alpha) \in [0, 1]

this ensures that the estimate φ*(A; z_k) satisfies:

    \phi^*(A; z_k) \le \phi^*(A; z_{k'}) \quad \forall k' > k

    \phi^*(A; z_k) \in [0, 1]

Both conditions are similar to the conditions required for a function of z_k to be a cdf. Therefore, they allow interpreting the estimate (2.3) as a cdf. This cdf can be interpreted as being the cdf of a random variable Z(x) defined on A:
    \phi^*(A; z_k) \equiv \mathrm{Prob}(Z(x) \le z_k) = F(z_k)   (2.4)

In expression (2.4), Z(x) is assumed to be stationary over A, thus the cdf (2.4) can be written as independent of the location x.

The interpretation (2.4) of the deterministic integral (2.1) defines the stationary univariate distribution of Z(x) over the domain A.
2.1.1 Bivariate Distribution Model:

The question now is to define a bivariate distribution on A following a similar procedure. The problem is to evaluate the bivariate cdf of any pair of random variables Z(x), Z(x + h). That bivariate distribution can be expressed as an indicator noncentered covariance K_I(h; z_k, z_{k'}) defined as:

    K_I(h; z_k, z_{k'}) = E[I(x; z_k)\, I(x + h; z_{k'})], \quad k, k' = 1, \ldots, K   (2.5)

    x \in (A \cap A_{-h}) \subset A, \quad x, x + h \in A

    E[I(x; z_k)\, I(x + h; z_{k'})] = \mathrm{Prob}(Z(x) \le z_k, Z(x + h) \le z_{k'}), \quad k, k' = 1, \ldots, K

K being the number of cutoffs considered; thus the bivariate distribution is discretized by K² values. The domain A_{-h} corresponds to the domain A translated by the vector h. The domain A ∩ A_{-h} is unique in the sense that for each h we can form one and only one such domain. Different vectors h would entail different intersections A ∩ A_{-h}. Consider the bivariate proportion of values z(x) ≤ z, z(x + h) ≤ z' within A, which can be expressed as the spatial integral:

    \phi(A \cap A_{-h}; z_k; z_{k'}) = \frac{1}{|A \cap A_{-h}|} \int_{A \cap A_{-h}} i(x; z_k)\, i(x + h; z_{k'})\, dx   (2.6)
assuming that the domain has been sampled exhaustively. The noncentered indicator covariance (2.5) can be identified to the bivariate proportion (2.6):

    \phi(A \cap A_{-h}; z_k; z_{k'}) = K_I(h; z_k, z_{k'})   (2.7)

Since exhaustive sampling is rarely available, a numerical approximation to the spatial integral (2.6) is required. Again the integrand functional properties are usually not known, impeding usage of traditional numerical techniques. An estimate of (2.6) can be the weighted proportion of corresponding indicator data values:

    \phi^*(A \cap A_{-h}; z_k; z_{k'}) = \sum_{\alpha=1}^{n} w(x_\alpha)\, i(x_\alpha; z_k)\, i(x_\alpha + h; z_{k'})   (2.8)

Unfortunately there is not yet any generally accepted procedure to compute these weights directly. Omre (1985) describes an approach where the weights are obtained through knowledge of the univariate cdf: a declustering procedure applied at the univariate level is transferred to the bivariate level. Whatever its convenience, this approach is debatable, because the correct way is to decluster first at the bivariate level and then proceed to the univariate level, not the reverse. For example, the marginal cdf derived from the bivariate distribution:

    F(z_k) = \phi^*(A; z_k) = \phi^*(A; z_k, +\infty)   (2.9)

is partially reproduced following Omre's method, since the formulation consists in minimizing the difference between the resulting declustered marginal distribution (2.9) and the proposed univariate distribution. Notice that in this expression h = 0 and therefore A ∩ A_{-h} = A.
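A minimal sketch of the weighted bivariate proportion (2.8) is given below; the index pairing at lag h, the indicator values and the equal weights are illustrative assumptions:

    def phi_star_h(i_k, i_kp, pairs, w):
        """Weighted bivariate proportion (2.8).
        i_k, i_kp : indicator data at cutoffs z_k and z_k'
        pairs     : couples (a, b) of sample indices with x_b = x_a + h
        w         : declustering weights attached to the pairs"""
        return sum(w[j] * i_k[a] * i_kp[b] for j, (a, b) in enumerate(pairs))

    i_k   = [1, 0, 1, 1]                   # i(x_alpha; z_k)
    i_kp  = [1, 1, 0, 1]                   # i(x_alpha; z_k')
    pairs = [(0, 1), (1, 2), (2, 3)]       # index couples separated by h
    w = [1.0 / len(pairs)] * len(pairs)    # equal weights for the example
    print(phi_star_h(i_k, i_kp, pairs, w)) # -> 0.666...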
Consistency:

In any practical situation the domain A is finite and the subset A ∩ A_{-h} is different for each h:

    A \cap A_{-h} \neq A \cap A_{-h'}, \quad h \neq h'
This remark has important consequences: the definition domain for each bivariate proportion φ(A ∩ A_{-h}; z_k; z_{k'}), being dependent on h, can not ensure consistency between different values of (2.6) for different h. The immediate solution is to force the definition domain to be equal whatever h. Such a domain can be defined as:

    A' = A \cap A_{-h_1} \cap \ldots \cap A_{-h_L}

with h_i being the vector associated with the i-th lag. However, such a domain would be very small and the corresponding shortage of data would be a problem. In practice, consistency of the experimental indicator covariances of type (2.7) is achieved through a time-consuming modeling. This modeling must also ensure the satisfaction of all order relations:

    \phi^*(A \cap A_{-h}; z_k; z_{k''}) \le \phi^*(A \cap A_{-h}; z_{k'}; z_{k''}) \quad \forall z_k \le z_{k'}, \; \forall z_{k''}

    \phi^*(A \cap A_{-h}; z_k; z_{k'}) \in [0, 1] \quad \forall z_k, z_{k'}   (2.10)

In practice the modeling is performed assuming a linear model of coregionalization, which ensures the order relations (2.10). This model has limitations, since all the direct and crosscovariances must be proportional to a set of predefined covariances.

It may appear that using indicators and identifying spatial integrals to probabilistic cdf's creates more problems than solutions, but in fact the inference of a spatial continuity measure (i.e. a semivariogram) is a problem inherent to any geostatistical approach and is not particular to the indicator variable. The conditions (2.10) arise from the decision to model the bivariate distribution from data.
2.2 The Probabilistic Model:

Once the noncentered indicator covariances have been identified to specific spatial integrals over A ∩ A_{-h}, one can address the problem of evaluating local proportions such as the following spatial integral:

    \phi(V; z_k) = \frac{1}{|V|} \int_V i(x; z_k)\, dx   (2.11)

with V ⊂ A and |V| << |A|.

The approximation of (2.11) is likely to be difficult, since there may not be any sample within V. However, now, several panels V can be defined over the domain A, allowing repetitivity. This characteristic allows randomizing the previous spatial integral into a stochastic integral:

    \Phi(V; z_k) = \frac{1}{|V|} \int_V I(x; z_k)\, dx   (2.12)

where Φ(V; z_k) is a random variable and I(x; z_k) is an indicator (binary) random variable defined as:

    I(x; z_k) = \begin{cases} 1 & \text{if } Z(x) \le z_k \\ 0 & \text{otherwise} \end{cases}, \quad k = 1, \ldots, K   (2.13)

A linear estimator of (2.12) using the indicator data is written:

    \Phi^*(V; z_k) = \sum_{k'=1}^{K} \sum_{\alpha=1}^{n'} \lambda_{k'\alpha}\, I(x_\alpha; z_{k'})   (2.14)

with the n' data locations x_α being a subset of the whole data set n. These n' locations need not be within the panel V. Observe that the estimator (2.14) does not assume that E[I(x; z_k)] is known, thus it is an ordinary kriging-type estimator.
This particular estimator requires only knowledge of the bivariate distribution (2.7). If trivariate information were available, a generalized estimator would be:

    \Phi^{**}(V; z_k) = \sum_{k'=1}^{K} \sum_{\alpha=1}^{n'} \lambda_{k'\alpha}\, I(x_\alpha; z_{k'}) + \sum_{k'=1}^{K} \sum_{k''=1}^{K} \sum_{\alpha=1}^{n'} \sum_{\alpha'=1}^{n'} \lambda_{k'k''\alpha\alpha'}\, I(x_\alpha; z_{k'})\, I(x_{\alpha'}; z_{k''})   (2.15)

The problem with this latter estimator is the call for trivariate information, which is extremely difficult to infer from sparse and clustered data. The estimator (2.14) faces the same inference problem, but to a much lesser extent.
2.2.1 Optimality Criteria:

Unbiasedness and minimum variance are the traditional criteria used to build statistical estimators. In the case of the estimator (2.14), minimum variance would entail (Luenberger, 1969):

    E[\{\Phi(V; z_k) - \Phi^*(V; z_k)\}\, I(x_\alpha; z_k)] = 0, \quad \forall \alpha = 1, \ldots, n', \; \forall k = 1, \ldots, K   (2.16)

The geometrical interpretation of relation (2.16) in terms of the projection theorem is that the difference vector Φ(V; z_k) − Φ*(V; z_k) is orthogonal to each of the data vectors I(x_α; z_k). In terms of equations, expressions (2.16) are none other than the classical normal equations.

It can be shown that the conditional expectation:

    E[\Phi(V; z_k) \mid (n')]   (2.17)

is the best estimator of Φ(V; z_k) (Rozanov, 1987) in the minimum error variance sense, i.e.:

    \| \Phi(V; z_k) - E[\Phi(V; z_k) \mid (n')] \| = \min \| \Phi(V; z_k) - \Phi^*(V; z_k) \|, \quad \forall \Phi^*(V; z_k) \in E_n   (2.18)

E_n being the space of all measurable functions of Z(α), α ∈ (n'). Notice that the estimator (2.14) is defined in a subspace (L_n) of E_n.

Expression (2.16) entails that Φ(V; z_k) − E[Φ(V; z_k) | (n')] is orthogonal to any element of the space E_n. By construction through the minimum variance criterion, the difference E[Φ(V; z_k) | (n')] − Φ*(V; z_k) is orthogonal to any element of the subspace L_n, and therefore by perpendicularity the estimator (2.14) is also the best estimator of (2.17) to be found in the subspace L_n. Figure 2.1 shows these relationships graphically.

Figure 2.1: The estimator Φ*(V; z_k) is the best bivariate-type estimator of E[Φ(V; z_k) | (n')].

Minimum error variance is thus a mandatory condition for the definition of an estimator, such as (2.14), of the conditional expectation. There is no other possible criterion to build in L_n a better estimator than (2.14).
Minimum error variance is one criterion to define the estimator (2.14); unbiasedness is another, for example:

    E[\Phi(V; z_k)] = E[\Phi^*(V; z_k)]   (2.19)

The left hand expected value can be seen as a deconditioning, with respect to the n' data, of the conditional expectation (2.17), thus:

    E[E[\Phi(V; z_k) \mid (n')]] = E[\Phi(V; z_k)]   (2.20)

The unbiasedness relation (2.19) entails that the estimator (2.14) is also an unbiased estimator of the conditional expectation (2.17).

Minimum variance and unbiasedness therefore are optimal criteria for the estimation of the conditional expectation (2.17). Any other estimator in L_n based on criteria different from minimum error variance would yield a suboptimal estimator in L_n.
2.3 Development of the Indicator Kriging Estimator:

The indicator formalism amounts to coding any random variable Z(x) as a series of random bits (2.13). The following stochastic integrals are then seen as linear operators applied on these random bits:

    \Phi(V; z_k) = \frac{1}{|V|} \int_V I(x; z_k)\, dx, \quad k = 1, \ldots, K   (2.21)

with V being the panel over which the proportion (2.21) is to be estimated. The expected value of (2.21) is written:

    E[\Phi(V; z_k)] = \frac{1}{|V|} \int_V E[I(x; z_k)]\, dx, \quad k = 1, \ldots, K   (2.22)

Since:

    E[I(x; z_k)] = \mathrm{Prob}[Z(x) \le z_k] = F(z_k), \quad k = 1, \ldots, K   (2.23)

the expected value (2.22) is also equal to the univariate cdf F(z_k):

    E[\Phi(V; z_k)] = \frac{1}{|V|} \int_V F(z_k)\, dx = F(z_k), \quad \forall k = 1, \ldots, K   (2.24)

A naive estimator of the stochastic integral (2.21) is written:

    \Phi^*(V; z_k) = \frac{1}{n'} \sum_{\alpha=1}^{n'} I(x_\alpha; z_k), \quad k = 1, \ldots, K   (2.25)
Note that the estimator (2.25) is different from (2.14), because it uses less information than the proposed estimator (2.14). Another shortcoming of the estimator (2.25) is that the weight (1/n') is the same for all n' samples whatever their location x_α with respect to V. Samples closer to or inside the panel V should receive greater weight than the samples located further away. There is therefore a need to improve the previous estimator by taking into account some measure of spatial dependence between the samples and the panel V, which amounts to considering the position/redundancy of the samples with respect to the panel to estimate. An improved estimator can then be written as:

    \Phi^*(V; z_k) = \sum_{\alpha=1}^{n'} \lambda_{k\alpha}\, I(x_\alpha; z_k), \quad k = 1, \ldots, K   (2.26)
The problem now is to determine the weights λ_{kα}. Assuming that the indicator random variable is second order stationary, the unbiasedness condition is developed as follows:

    E[\Phi^*(V; z_k)] = \sum_{\alpha=1}^{n'} \lambda_{k\alpha}\, E[I(x_\alpha; z_k)], \quad k = 1, \ldots, K

Accounting for relation (2.23), it comes:

    E[\Phi^*(V; z_k)] = \sum_{\alpha=1}^{n'} \lambda_{k\alpha} F(z_k) = F(z_k) \sum_{\alpha=1}^{n'} \lambda_{k\alpha}, \quad k = 1, \ldots, K

The estimator (2.26) is unbiased if:

    \sum_{\alpha=1}^{n'} \lambda_{k\alpha} = 1   (2.27)

This condition is necessary to ensure a linear unbiased estimator and constitutes a constraint on the weights. The minimum error variance criterion ensures that (2.26) is also an estimator of the conditional expectation E[Φ(V; z_k) | (n')]. The functional to minimize is:

    E[\Phi(V; z_k) - \Phi^*(V; z_k)]^2   (2.28)

under the constraint (2.27). This minimization problem can be solved classically by using the Lagrange multiplier technique, yielding a system of constrained normal equations:

    \sum_{\beta=1}^{n'} C_I(x_\beta - x_\alpha; z_k)\, \lambda_{k\beta} + \mu(z_k) = \bar{C}_I(x_\alpha, V; z_k), \quad \alpha = 1, \ldots, n', \; k = 1, \ldots, K

    \sum_{\beta=1}^{n'} \lambda_{k\beta} = 1   (2.29)

where:

    C_I(x_\beta - x_\alpha; z_k) = E[(I(x_\beta; z_k) - F(z_k))(I(x_\alpha; z_k) - F(z_k))]

    \bar{C}_I(x_\alpha, V; z_k) = \frac{1}{|V|} \int_V C_I(x - x_\alpha; z_k)\, dx

and μ(z_k) is the Lagrange multiplier.

The information required by the system (2.29) consists of the K covariances corresponding to the K cutoffs z_k. Notice that the indicator estimator (2.26) uses only the indicator data at the same cutoff z_k, as opposed to the more complete estimator (2.14). Correspondingly, in the normal system (2.29) only the direct indicator covariances C_I(h; z_k) are used. Structural information contained in the indicator covariances C_I(h; z_k) at other cutoffs and in the indicator crosscovariances is not used; consequently, additional spatial correlation derived from indicator data at other cutoffs or from the original data Z(x) itself is being ignored in expression (2.26).
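For a point support (V shrunk to a location x_0, so that the right hand side of (2.29) reduces to C_I(x_0 − x_α; z_k)), the constrained normal system can be assembled and solved directly. The numpy sketch below is a minimal illustration; the exponential covariance model and the sample coordinates are hypothetical placeholders, not taken from the thesis:

    import numpy as np

    def c_i(h, sill=0.25, a=10.0):
        """Hypothetical indicator covariance model C_I(h; z_k): exponential,
        with sill 0.25, the maximum indicator variance F(1 - F)."""
        return sill * np.exp(-np.linalg.norm(h) / a)

    def ordinary_ik_weights(x_data, x0):
        """Assemble and solve the ordinary IK system (2.29), point support."""
        n = len(x_data)
        A = np.ones((n + 1, n + 1))        # last row/column: unbiasedness constraint
        A[n, n] = 0.0
        for i in range(n):
            for j in range(n):
                A[i, j] = c_i(x_data[i] - x_data[j])
        rhs = np.append([c_i(x0 - xa) for xa in x_data], 1.0)
        sol = np.linalg.solve(A, rhs)
        return sol[:n], sol[n]             # weights lambda_k_alpha, multiplier mu

    x_data = np.array([[0.0, 0.0], [4.0, 1.0], [1.0, 6.0]])
    lam, mu = ordinary_ik_weights(x_data, np.array([2.0, 2.0]))
    print(lam, lam.sum())                  # the weights sum to 1 by construction
    # The IK estimate is then sum(lam * i(x_alpha; z_k)), one system per cutoff.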
2.4 An Alternative Estimator to IK:

The previous shortcoming of the Indicator Kriging estimator (2.26) has been mentioned by Journel (1983) and Marechal (1984). As stated above, that estimator ignores a lot of information. Such an approximation would be fully justified if the indicators originated from a random function such that all direct indicator correlations and crosscorrelations are proportional to each other, i.e. proportional to some basic model of correlation K_0(h):

    C_I(h; z_k, z_{k'}) = D_{kk'}\, K_0(h)   (2.30)

where K_0(h) is a covariance model and D_{kk'} is a constant, and:

    C_I(h; z_k, z_{k'}) = E[(I(x; z_k) - F(z_k))(I(x + h; z_{k'}) - F(z_{k'}))]

is the indicator crosscovariance. If the model (2.30) holds true, then it can be shown that a minimum error variance estimator of type (2.26) is identical to the estimator (2.14). However, condition (2.30) is very stringent, and thus an estimator more appropriate to practical situations should involve indicator crosscovariances and covariances at other cutoffs, which is precisely the estimator (2.14), called the Colndicator Kriging estimator (CoIK):

    \Phi^*(V; z_{k_0}) = \sum_{k'=1}^{K} \sum_{\alpha=1}^{n'} \lambda_{k'\alpha}\, I(x_\alpha; z_{k'})   (2.31)

The appropriate criterion to obtain the weights λ_{k'α} is the minimization of the error variance:

    E[\Phi(V; z_{k_0}) - \Phi^*(V; z_{k_0})]^2   (2.32)

subject to the unbiasedness condition:

    E[\Phi^*(V; z_{k_0})] = \sum_{k'=1}^{K} \sum_{\alpha=1}^{n'} \lambda_{k'\alpha}\, E[I(x_\alpha; z_{k'})] = F(z_{k_0})

The following constraint on the weights λ_{k'β} provides sufficient conditions for unbiasedness:

    \sum_{\beta=1}^{n'} \lambda_{k'\beta} = \delta_{k'k_0}, \quad k' = 1, \ldots, K   (2.33)

with δ_{k'k_0} being a Kronecker delta. The minimization of (2.32) under the constraints (2.33) yields the following system of equations:

    \sum_{k'=1}^{K} \sum_{\beta=1}^{n'} \lambda_{k'\beta}\, C_I(x_\alpha - x_\beta; z_k, z_{k'}) + \mu_k = \bar{C}_I(x_\alpha, V; z_k, z_{k_0}), \quad k = 1, \ldots, K, \; \alpha = 1, \ldots, n'   (2.34)
with the μ_k being the Lagrange multipliers and:

    C_I(x_\alpha - x_\beta; z_k, z_{k'}) = E[(I(x_\alpha; z_k) - F(z_k))(I(x_\beta; z_{k'}) - F(z_{k'}))]

The constraints (2.33) are sufficient conditions to ensure unbiasedness and, together with (2.34), constitute the set of equations determining the CoIK estimator. Thus the total number of equations is K(n' + 1), a number that represents a formidable task in terms of computational work and of modelling of O(K²) covariances. This high price to pay was the very reason to prefer the more practical estimator (2.26), which ignores all indicator crosscorrelations.
2.5 Are the Indicator Crosscorrelations Important?

The last section showed that practicality was the reason to use the reduced IK estimator (2.26) rather than the CoIK estimator (2.31). In the IK approach crosscorrelations are ignored, thus the impact of such a decision in any particular application must be appreciated. No general statement is possible, because the level and impact of crosscorrelations depend on each particular data set and study goal.

A simple approach to this question is proposed: considering a basic example and evaluating the impact of ignoring the crosscorrelations on the estimates.
2.5.1 The One Sample Example:

This case represents the simplest estimation problem. Only one sample is used to estimate the conditional cdf, and the assumptions are that the variable Z(x) is a second order stationary random variable and that at the sample location x_1 the corresponding indicator sample value for a given threshold z_k is 0 and for z_{k+1} is 1. The following probability at the location x is to be evaluated:

    P(z_k) = \mathrm{Prob}\{Z(x) \le z_k \mid I(x_1; z_k) = 0, I(x_1; z_{k+1}) = 1\}
           = \mathrm{Prob}\{I(x; z_k) = 1 \mid I(x_1; z_k) = 0, I(x_1; z_{k+1}) = 1\}   (2.35)
According to Bayes' rule, relation (2.35) is expressed as:

    P(z_k) = \frac{\mathrm{Prob}\{I(x; z_k) = 1, I(x_1; z_k) = 0, I(x_1; z_{k+1}) = 1\}}{\mathrm{Prob}\{I(x_1; z_k) = 0, I(x_1; z_{k+1}) = 1\}}

Since I(x; z_k) is a binary random variable:

    \mathrm{Prob}\{I(x; z_k) = 1, I(x_1; z_k) = 0, I(x_1; z_{k+1}) = 1\} = E[I(x; z_k)\, I(x_1; z_{k+1})(1 - I(x_1; z_k))]

    E[I(x; z_k)\, I(x_1; z_{k+1})(1 - I(x_1; z_k))] = K_I(x - x_1; z_k, z_{k+1}) - K_I(x - x_1; z_k, z_k)

with:

    K_I(x - x_\alpha; z_{k'}, z_{k''}) = \mathrm{Prob}\{Z(x) \le z_{k'}, Z(x_\alpha) \le z_{k''}\}

    \mathrm{Prob}\{I(x_1; z_k) = 0, I(x_1; z_{k+1}) = 1\} = F(z_{k+1}) - F(z_k)

Thus, expression (2.35) is written:

    P(z_k) = \frac{K_I(x - x_1; z_k, z_{k+1}) - K_I(x - x_1; z_k, z_k)}{F(z_{k+1}) - F(z_k)}   (2.36)

Now, the estimation of this conditional probability using the IK estimator (2.26) is:

    P^*(z_k) = 0   (2.37)

since λ_{k1} = 1 by virtue of the unbiasedness condition (2.27) and I(x_1; z_k) being equal to 0. Similarly, the CoIK estimator (2.31) yields:

    P^*(z_k) = 0   (2.38)

since λ_{k1} = 1 and λ_{(k+1)1} = 0. Note that the unbiasedness conditions (2.33) cause the cross-information I(x_α; z_{k'}), k' ≠ k, to be ignored.
Both estimates are identical and do not reproduce the exact value (2.36). The unbiasedness conditions dominate the estimation process, and both the IK and CoIK estimators yield the same result regardless of the indicator crosscorrelation.
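Under the standard bigaussian model developed in Chapter 3, the exact value (2.36) can be computed numerically and contrasted with the common IK/CoIK answer P*(z_k) = 0. The sketch below evaluates the noncentered covariance K_I through its arcsine integral form (see Section 3.2); the correlogram value ρ = 0.7 and the cutoffs are illustrative assumptions:

    import math

    def Phi(z):
        """Standard normal cdf."""
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    def k_i(rho, z, zp, m=2000):
        """K_I = Prob(Z(x) <= z, Z(x+h) <= zp) for a standard bigaussian pair
        with correlation rho, via the arcsine integral (trapezoidal rule)."""
        top = math.asin(rho)
        s = 0.0
        for j in range(m + 1):
            t = top * j / m
            f = math.exp(-(z * z - 2.0 * z * zp * math.sin(t) + zp * zp)
                         / (2.0 * math.cos(t) ** 2))
            s += f * (0.5 if j in (0, m) else 1.0)
        return Phi(z) * Phi(zp) + s * top / m / (2.0 * math.pi)

    rho, zk, zk1 = 0.7, 0.0, 1.0           # assumed correlation and cutoffs
    exact = (k_i(rho, zk, zk1) - k_i(rho, zk, zk)) / (Phi(zk1) - Phi(zk))
    print(exact)                           # about 0.33, whereas IK and CoIK give 0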
2.5.2 The Two Sample Case:

In this second example two samples (x_1, x_2), arbitrarily located, are considered. The following conditional probability at the location x is to be estimated:

    P(z_k) = \mathrm{Prob}(Z(x) \le z_k \mid I(x_1; z_k) = 0, I(x_1; z_{k+1}) = 1, I(x_2; z_{k+1}) = 0, I(x_2; z_{k+2}) = 1)   (2.39)

Expression (2.39) is equivalent to:

    P(z_k) = E[I(x; z_k) \mid I(x_1; z_k) = 0, I(x_1; z_{k+1}) = 1, I(x_2; z_{k+1}) = 0, I(x_2; z_{k+2}) = 1]

where the event {I(x_1; z_k) = 0, I(x_1; z_{k+1}) = 1} is equivalent to {Z(x_1) ∈ ]z_k, z_{k+1}]} and {I(x_2; z_{k+1}) = 0, I(x_2; z_{k+2}) = 1} is equivalent to {Z(x_2) ∈ ]z_{k+1}, z_{k+2}]}. Applying Bayes' rule, expression (2.39) is written:

    P(z_k) = \frac{\mathrm{Prob}(Z(x) \le z_k,\; Z(x_1) \in\, ]z_k, z_{k+1}],\; Z(x_2) \in\, ]z_{k+1}, z_{k+2}])}{\mathrm{Prob}(Z(x_1) \in\, ]z_k, z_{k+1}],\; Z(x_2) \in\, ]z_{k+1}, z_{k+2}])}

or, in terms of indicators, the numerator is:

    E[I(x; z_k)\, I(x_1; z_{k+1})(1 - I(x_1; z_k))\, I(x_2; z_{k+2})(1 - I(x_2; z_{k+1}))]   (2.40)

The evaluation of this conditional probability involves using trivariate information which, expressed in terms of the noncentered covariances, is:

    K_I(h_{01}, h_{02}; z_k, z_{k+1}, z_{k+2}) - K_I(h_{01}, h_{02}; z_k, z_{k+1}, z_{k+1}) - K_I(h_{01}, h_{02}; z_k, z_k, z_{k+2}) + K_I(h_{01}, h_{02}; z_k, z_k, z_{k+1})   (2.41)
with:

    K_I(h_{01}, h_{02}; z_{k'}, z_{k''}, z_{k'''}) = E[I(x; z_{k'})\, I(x_1; z_{k''})\, I(x_2; z_{k'''})]

which requires knowledge of the trivariate distribution, and:

    K_I(h_{12}; z_{k'}, z_{k''}) = E[I(x_1; z_{k'})\, I(x_2; z_{k''})]

which requires the bivariate distribution. Therefore the exact expression (2.40) calls for trivariate and bivariate information.

If the IK estimator (2.26) is used to estimate (2.39) the result is:

    P^*(z_k) = \sum_{\alpha=1}^{2} \lambda_{k\alpha}\, i(x_\alpha; z_k) = 0   (2.42)

given that both indicators at x_1, x_2 for z_k are null. This result shows that, whatever the location of the samples x_1, x_2 with respect to x, the estimate is zero. Therefore, by ignoring information at cutoffs other than z_k, the IK estimate can not take a value varying with the position of the informing samples.

As opposed to the IK estimator, the CoIK estimator does use the indicator information fully. For example the cokriging estimator (2.31), using only the two cutoffs z_k and z_{k+1}, is:

    P^*(z_k) = \sum_{\alpha=1}^{2} \lambda_{k\alpha}\, I(x_\alpha; z_k) + \sum_{\alpha=1}^{2} \lambda_{(k+1)\alpha}\, I(x_\alpha; z_{k+1})   (2.43)
where the weights are determined by solving the following system of normal equations:

    \begin{bmatrix}
    C_I(h; z_k, z_k) & C_I(h; z_k, z_{k+1}) & \mathbf{1} & \mathbf{0} \\
    C_I(h; z_k, z_{k+1}) & C_I(h; z_{k+1}, z_{k+1}) & \mathbf{0} & \mathbf{1} \\
    \mathbf{1}^T & \mathbf{0}^T & 0 & 0 \\
    \mathbf{0}^T & \mathbf{1}^T & 0 & 0
    \end{bmatrix}
    \begin{bmatrix}
    \lambda_{k1} \\ \lambda_{k2} \\ \lambda_{(k+1)1} \\ \lambda_{(k+1)2} \\ \mu(z_k) \\ \mu(z_{k+1})
    \end{bmatrix}
    =
    \begin{bmatrix}
    C_I(h_{01}; z_k, z_k) \\ C_I(h_{02}; z_k, z_k) \\ C_I(h_{01}; z_k, z_{k+1}) \\ C_I(h_{02}; z_k, z_{k+1}) \\ 1 \\ 0
    \end{bmatrix}
C_I(h; z_i, z_j) being the usual crosscovariance matrix between the cutoffs z_i and z_j, of dimension n' × n' (i.e. 2 × 2); the vector 1 is [1 1]^T, and similarly for the vector 0.

All conditioning indicator data at cutoff z_k are equal to 0, thus the CoIK estimate is:

    P^*(z_k) = 0 + \sum_{\alpha=1}^{2} \lambda_{(k+1)\alpha}\, i(x_\alpha; z_{k+1})   (2.44)

The indicator data I(x_α; z_{k+1}) take different values, therefore the resulting estimate varies with the sample locations. The CoIK estimate does take into account the spatial correlation and the location of the informing samples.
Another problem arises for both the IK and CoIK estimators: any probability estimate must verify the order relations:

    P^*(z_k) \in [0, 1] \quad \text{and} \quad P^*(z_k) \le P^*(z_{k'}) \quad \forall z_{k'} \ge z_k

These conditions are not ensured by either estimator, and therefore there will be potential order relations problems to correct when conditional cdf's are estimated by either IK or CoIK.

The important point to remember is that crosscorrelations can play a crucial role in the estimation of conditional probabilities. If such crosscorrelations are ignored, possibly precious information is ignored. The two sample exercise has shown that IK yields an estimate not dependent on data locations, whereas CoIK yields an estimate that depends on the data locations. However, the modelling of a whole set of O(K²) crosscovariances is not practical, thus alternatives to CoIK using more bivariate information than IK must be proposed in order to improve the estimation of the conditional cdf.
2.6 Alternatives to CoIK:

The importance of using crosscorrelations was pointed out in the last section, and yet the possibility of using the full cokriging system (2.34) was rejected based on practical considerations. Therefore there appears a need to find a better estimator than IK still involving less work than CoIK. The CoIK estimator is the most complete bivariate estimator in the sense that it uses all bivariate structural information; thus any other bivariate-type estimator using less structural information is necessarily inferior or equal to CoIK.

Observe also that all crosscorrelations must satisfy Schwarz's inequality; consequently their magnitude is less than that of the direct correlations, and thus their influence on the final estimate is less important. In any practical situation the crosscorrelations should be computed to check their relative influence. Unfortunately, in the recent literature on applications of IK there is little if any mention of such checks. If crosscorrelations are found to be insignificant then one could justify trading the full CoIK for the simpler IK; otherwise a more complete bivariate-type estimator than IK must be used.

2.6.1 The Probability Kriging Estimator:
The Probability Kriging (PK) estimator (Sullivan, 1984) recognizes that by ignoring the crosscorrelations bivariate information is neglected, and proceeds by incorporating more bivariate information through a uniform transform of the original data Z(x_α); therefore the PK estimator appears as a simplified version of the general CoIK estimator:

    \Phi^*(V; z_k) = \sum_{\alpha=1}^{n'} \lambda_{k\alpha}\, I(x_\alpha; z_k) + \sum_{\alpha=1}^{n'} \nu_{k\alpha}\, U(x_\alpha), \quad k = 1, \ldots, K   (2.45)

where the random variable U(x_α) is defined as the uniform transform of the original data Z(x_α) using the experimental cdf, therefore:

    U(Z(x_\alpha)) = U(x_\alpha) = F(Z(x_\alpha)), \quad \alpha = 1, \ldots, n'   (2.46)
The weights are determined using unbiasedness and minimum error variance criteria. As for IK or CoIK, the error variance:

    E[\Phi(V; z_k) - \Phi^*(V; z_k)]^2

is minimized subject to the unbiasedness condition:

    E[\Phi^*(V; z_k)] = \sum_{\alpha=1}^{n'} \lambda_{k\alpha}\, E[I(x_\alpha; z_k)] + \sum_{\alpha=1}^{n'} \nu_{k\alpha}\, E[U(x_\alpha)] = F(z_k)

with:

    E[\Phi^*(V; z_k)] = \sum_{\alpha=1}^{n'} \lambda_{k\alpha} F(z_k) + \sum_{\alpha=1}^{n'} \nu_{k\alpha} \frac{1}{2}

Sufficient conditions for unbiasedness are:

    \sum_{\alpha=1}^{n'} \lambda_{k\alpha} = 1, \qquad \sum_{\alpha=1}^{n'} \nu_{k\alpha} = 0
The minimization of the error variance subject to the unbiasedness conditions yields a cokriging system written in matricial form as:

    \begin{bmatrix}
    C_I(h; z_k, z_k) & C_{IU}(h; z_k) & \mathbf{1} & \mathbf{0} \\
    C_{IU}(h; z_k) & C_U(h) & \mathbf{0} & \mathbf{1} \\
    \mathbf{1}^T & \mathbf{0}^T & 0 & 0 \\
    \mathbf{0}^T & \mathbf{1}^T & 0 & 0
    \end{bmatrix}
    \begin{bmatrix}
    \boldsymbol{\lambda}_k \\ \boldsymbol{\nu}_k \\ \mu_1 \\ \mu_2
    \end{bmatrix}
    =
    \begin{bmatrix}
    C_I(h; z_k) \\ C_{IU}(h; z_k) \\ 1 \\ 0
    \end{bmatrix}   (2.47)

with C_I(h; z_k, z_k) being the indicator covariance matrix, and C_IU(h; z_k) the indicator-uniform crosscovariance matrix whose elements are defined as:

    C_{IU}(h; z_k) = E[(I(x + h; z_k) - F(z_k))(U(x) - \tfrac{1}{2})]

and C_U(h) is the uniform covariance matrix. Each of these matrices has dimension n' × n', thus the dimension of the left hand side matrix is (2n' + 2) × (2n' + 2). As for the right hand side, C_I(h; z_k) and C_IU(h; z_k) are respectively the indicator covariance vector and the indicator-uniform crosscovariance vector between the informing samples and the location being estimated. Note that formulation (2.47) corresponds to V being a point at location x.
Application of PK to the two sample exercise yields an estimate that depends on the data locations. The uniform transform (2.46) by construction defines for each location x_α a unique uniform variable u(x_α), different from one location to another; thus the PK estimate is:

    P^*(z_k) = \sum_{\alpha=1}^{2} \lambda_{k\alpha}\, i(x_\alpha; z_k) + \sum_{\alpha=1}^{2} \nu_{k\alpha}\, u(x_\alpha) = 0 + \sum_{\alpha=1}^{2} \nu_{k\alpha}\, u(x_\alpha)

The second sum is different from zero in all cases. If the result of this sum is negative, the PK estimator, just like IK and CoIK, will present order relations problems. However, the use of the uniform transform data improves the estimate resolution, and therefore PK has a significant advantage over IK in this particular example. In general, the PK or CoIK estimators, because they use more bivariate information, can not be inferior to the IK estimator. Sullivan (1984) and Kim (1987) have documented actual studies where the performance of PK was definitely superior to IK.

The full CoIK estimator requires the modeling of O(K²) covariances, the PK estimator (2K + 1) models, and the IK estimator only K models; therefore any improvement represents additional work over IK. In this sense PK can be seen as an intermediate estimator between the full CoIK and the simpler IK.
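In practice the uniform transform (2.46) is obtained by ranking the data. A minimal sketch follows; the rank/(n + 1) convention used to map ranks into (0, 1) is one common choice, an assumption rather than the thesis prescription:

    def uniform_transform(z_data):
        """Uniform scores u(x_alpha) = F*(z(x_alpha)) from the experimental cdf."""
        n = len(z_data)
        order = sorted(range(n), key=lambda a: z_data[a])
        u = [0.0] * n
        for rank, a in enumerate(order, start=1):
            u[a] = rank / (n + 1.0)        # ranks mapped into (0, 1)
        return u

    print(uniform_transform([3.1, 0.4, 7.9, 2.2]))   # -> [0.6, 0.2, 0.8, 0.4]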
2.6.2 A Simplified Version of the CoIK Estimator:

Inspection of the bivariate information used by the PK estimator reveals that the improvement comes from the introduction of one additional variable (U(x)), i.e. the introduction of an additional overall joint moment of order 2 (C_U(h)) plus the corresponding indicator-uniform crosscovariance (C_IU(h; z_k)), interpreted as a conditional moment of order 1 (Journel, 1984a):

    C_{IU}(h; z_k) = E[U(x)\, I(x; z_k)] - \frac{F(z_k)}{2} + E\{[U(x + h) - U(x)]\,[1 - I(x + h; z_k)]\, I(x; z_k)\}   (2.48)
Another alternative consists in using a simplified version of the full CoIK estimator. For example, one such estimator could use two cutoffs (K = 2) rather than only one as in the case of IK. The two cutoffs should be z_k, the cutoff being considered for the conditional probability, and z_{k'}, the cutoff maximizing the crosscorrelation between I(x; z_k) and I(x; z_{k'}). Such an alternative estimator would be written:

    \Phi^*(V; z_k) = \sum_{\alpha=1}^{n'} \lambda_{k\alpha}\, I(x_\alpha; z_k) + \sum_{\alpha=1}^{n'} \nu_{k\alpha}\, I(x_\alpha; z_{k'})   (2.49)

This estimator uses information in the space of thresholds through the additional z_{k'} cutoff; thus it requires the computation of all crosscovariances in order to select the value z_{k'} maximizing the crosscovariance C_I(h; z_k, z_{k'}). The idea behind this simplification is to use the indicator crosscovariance at z_{k'} to screen the influence of the other crosscovariances.

The form of this SCoIK estimator is similar to the PK estimator. Two indicator variables are considered, at cutoffs z_k, z_{k'}, and the number of covariances to model is 2K. In matricial form the corresponding cokriging system is written:
    \begin{bmatrix}
    C_I(h; z_k, z_k) & C_I(h; z_k, z_{k'}) & \mathbf{1} & \mathbf{0} \\
    C_I(h; z_k, z_{k'}) & C_I(h; z_{k'}, z_{k'}) & \mathbf{0} & \mathbf{1} \\
    \mathbf{1}^T & \mathbf{0}^T & 0 & 0 \\
    \mathbf{0}^T & \mathbf{1}^T & 0 & 0
    \end{bmatrix}
    \begin{bmatrix}
    \boldsymbol{\lambda} \\ \boldsymbol{\nu} \\ \mu_1 \\ \mu_2
    \end{bmatrix}
    =
    \begin{bmatrix}
    C_I(h; z_k) \\ C_I(h; z_k, z_{k'}) \\ 1 \\ 0
    \end{bmatrix}

with the unbiasedness conditions:

    \sum_{\alpha=1}^{n'} \lambda_{k\alpha} = 1, \qquad \sum_{\alpha=1}^{n'} \nu_{k\alpha} = 0   (2.50)
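The selection step of SCoIK, picking the secondary cutoff z_k' that maximizes the crosscovariance with z_k at a reference lag, is a simple scan once the crosscovariances have been computed. A minimal sketch, with made-up crosscovariance values:

    def best_secondary_cutoff(cross_cov, k):
        """Return k' != k maximizing |C_I(h; z_k, z_k')| at a reference lag h.
        cross_cov[kp] holds the crosscovariance between cutoffs k and kp."""
        candidates = [kp for kp in range(len(cross_cov)) if kp != k]
        return max(candidates, key=lambda kp: abs(cross_cov[kp]))

    # Made-up crosscovariances between cutoff k = 0 and cutoffs 0..3:
    print(best_secondary_cutoff([0.21, 0.12, 0.07, 0.03], k=0))   # -> 1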
2.7 Other Perspectives:

IK, PK and SCoIK are bivariate-type estimators trying to approximate the CoIK estimator through a reduction of the cokriging system. The approximation consists either in some screening of the full crosscorrelation pattern (IK, SCoIK) or in the introduction of more, but not all, of the bivariate information (PK, SCoIK).

Introduction of more bivariate information requires a growing number of covariances to model and a larger computational effort. Consequently, any new future estimator will be along the same lines, assuming some screening of the crosscovariances or increasing the number of covariances being used. There is no other option as long as one is restricted to a bivariate-type estimator based on indicators to estimate the conditional cdf.

The very nature of the indicator coding of information calls for the use of multivariate statistical techniques: at each location there is a vector of K indicator variables. The problem is that most traditional multivariate statistical techniques were developed for applications where spatial correlation is absent or ignored. However, some of these techniques could possibly be adapted to the geostatistical context.

Multivariate statistical techniques such as Discriminant Analysis, Cluster Analysis, Principal Component Analysis, and Correspondence Analysis, among others, consist in either reducing the dimension of the original problem or creating new variables summarizing information. If the indicator formalism is seen along these lines, it is not different from other multivariate statistical applications, except for the spatial correlation.

Another major point is that all the previous estimators are based on bivariate information which is inferred and modeled from the data, and therefore the corresponding estimates are linear combinations of bivariate information. However, it can be shown that if the conditioning data are n', the exact conditional distribution calls for an (n' + 1)-variate distribution. Thus the solution provided by any bivariate-type estimator attempts to approximate an (n' + 1)-variate distribution by an estimator that depends exclusively on bivariate information. If trivariate information, or a term summarizing the whole multivariate information, is added to the present bivariate-type estimators, there is no doubt that major improvement will be achieved. Synthesis of such information could be obtained from non-Gaussian simulations or, better, from actual and exhaustive data sets.

The challenge to adapt traditional multivariate statistical techniques and incorporate trivariate information is still open and must be taken up to improve the present estimators.
Chapter 3

Principal Components and the Indicator Approach:
The last chapter described the main features and the major shortcomings of IK. This chapter will present an improved approach based on the application of Principal Component Analysis (PCA) to indicator data.

PCA is an algebraic technique for transforming one vector into another. In 2D, PCA can be seen as a rotation of axes where the rotation angle is chosen such that the spread of the first transformed variable along the first axis is maximum and the spread of the second transformed variable along the second axis is minimum. Figure 3.1 shows such a transformation. Note that PCA is an orthogonalization procedure which does not require any statistical hypothesis about the data. A statistical interpretation of such orthogonalization shows that the crosscovariance between the transformed variables is zero and that the first transformed variable, or first principal component, has the maximum variance, the second principal component the second largest variance, and so on.

Those properties popularized PCA as a way to reduce the dimension of the data by selecting a reduced number of principal components. Such a decision is supported by the assumption that the variance is the most important aspect of variability, and that retaining those variables with the largest contribution to that variance would provide a concise yet satisfactory explanation of the source of that variance. The choice of the variance as sole criterion to interpret and rank sources of variability is debatable. In geochemistry, where the number of variables is large, PCA has been extensively used to reduce the dimension of the problem. The price of such a transform is that the components selected must be interpreted, a possibly non trivial task.

Figure 3.1: The principal component transformation can be seen as a rotation of the original axes X_1 and X_2 into the new axes Y_1 and Y_2.

This chapter describes the application of PCA to indicator vectors and points out the advantage of such a transform in the context of Indicator Kriging. To allow for full analytical development, any two random variables Z(x) and Z(x + h) are assumed to have a joint bigaussian distribution.
3.1 Transforming the Indicators:

Denote by I(x; z) the indicator vector associated to the location x, whose K elements are defined as:

    i(x; z_k) = \begin{cases} 1 & \text{if } z(x) \le z_k \\ 0 & \text{otherwise} \end{cases}, \quad k = 1, \ldots, K   (3.1)

The indicator covariance matrix at h = 0 is written as:

    \Sigma_I(0) = \begin{bmatrix}
    C_I(0; z_1, z_1) & C_I(0; z_1, z_2) & \cdots & C_I(0; z_1, z_K) \\
    C_I(0; z_1, z_2) & C_I(0; z_2, z_2) & \cdots & C_I(0; z_2, z_K) \\
    \vdots & \vdots & \ddots & \vdots \\
    C_I(0; z_1, z_K) & C_I(0; z_2, z_K) & \cdots & C_I(0; z_K, z_K)
    \end{bmatrix}   (3.2)

where:

    C_I(0; z_k, z_{k'}) = \mathrm{Cov}(I(x; z_k), I(x; z_{k'})) = \mathrm{Prob}\{Z(x) \le z_k, Z(x) \le z_{k'}\} - F(z_k) F(z_{k'})   (3.3)

with:

    F(z_k) = \mathrm{Prob}\{Z(x) \le z_k\}
One way to obtain the orthogonalization or principal components of the matrix $\Sigma_I(0)$ is to consider its spectral decomposition (Strang, 1980; Anderson, 1984) defined by:

$$\Sigma_I(0) = A \Lambda A^T \quad (3.4)$$

where $A$ is an orthogonal matrix such that:

$$I = A^T A = A A^T \quad (3.5)$$

$I$ is the identity matrix and $\Lambda$ is a diagonal matrix. Furthermore, by virtue of this decomposition the columns of matrix $A$ are the eigenvectors of $\Sigma_I(0)$ and the elements of the diagonal matrix $\Lambda$ are the eigenvalues of $\Sigma_I(0)$, ordered from the largest to the smallest ($\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_K$) (Wilkinson, 1965).
Once the matrix $A$ has been calculated, it is possible to compute the indicator principal components by a simple matrix multiplication:

$$\mathbf{Y}(\mathbf{x}) = A^T \mathbf{I}(\mathbf{x}; \mathbf{z}) \quad (3.6)$$

Each element of $\mathbf{Y}(\mathbf{x})$ is written:

$$Y_k(\mathbf{x}) = \sum_{k'=1}^{K} a_{k',k}\, i(\mathbf{x}; z_{k'}) \quad (3.7)$$
where $a_{k',k}$ and $i(\mathbf{x}; z_{k'})$ are elements of matrix $A$ and vector $\mathbf{I}(\mathbf{x}; \mathbf{z})$ respectively. The new variables $Y_k$ are linear combinations of the indicators with the property that their crosscovariances for $\mathbf{h} = 0$ are zero. Defining:

$$C_Y(\mathbf{h}; k, k') = \mathrm{Cov}(Y_k(\mathbf{x}),\, Y_{k'}(\mathbf{x} + \mathbf{h}))$$

we have, due to the orthogonalization:

$$C_Y(0; k, k') = 0, \quad \forall\, k \ne k' \quad (3.8)$$

and the variance of $Y_k$ is equal to the $k$th eigenvalue of $\Sigma_I(0)$:

$$C_Y(0; k, k) = \lambda_k, \quad \forall\, k = 1, \dots, K \quad (3.9)$$
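As an illustration of the transformation (3.6)-(3.9), the following minimal Python sketch builds indicator data at hypothetical cutoffs, estimates $\Sigma_I(0)$, and checks properties (3.8)-(3.9); all data and cutoff values are placeholders rather than the thesis examples.

```python
import numpy as np

# Hypothetical example: K = 3 cutoffs on standard normal samples.
rng = np.random.default_rng(0)
z = rng.standard_normal(5000)
cutoffs = np.array([-1.0, 0.0, 1.0])

# Indicator data (3.1): i(x; z_k) = 1 if Z(x) <= z_k, else 0
ind = (z[:, None] <= cutoffs[None, :]).astype(float)

# Indicator covariance matrix at h = 0 (3.2)
sigma_i0 = np.cov(ind, rowvar=False)

# Spectral decomposition Sigma_I(0) = A Lambda A^T (3.4); eigh returns
# ascending eigenvalues, so reorder from largest to smallest.
lam, A = np.linalg.eigh(sigma_i0)
order = np.argsort(lam)[::-1]
lam, A = lam[order], A[:, order]

# Principal components Y(x) = A^T I(x; z) (3.6)
Y = ind @ A

# At h = 0 the crosscovariances vanish and Var(Y_k) = lambda_k (3.8)-(3.9)
print(np.round(np.cov(Y, rowvar=False), 4))   # ~ diag(lambda_1..lambda_K)
```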
The orthogonalization (3.6) does not ensure that the crosscovariances $C_Y(\mathbf{h}; k, k')$ are zero for $\mathbf{h} \ne 0$. If that were the case, the cokriging of $\mathbf{Y}(\mathbf{x})$ would be reduced to the kriging of each of its elements $Y_k(\mathbf{x})$, since the crosscovariance between $Y_k(\mathbf{x})$ and $Y_{k'}(\mathbf{x}+\mathbf{h})$ would be zero. However, since $Y_k(\mathbf{x})$ and $Y_{k'}(\mathbf{x}+\mathbf{h})$ are uncorrelated at $\|\mathbf{h}\| = 0$, their degree of correlation for all other $\|\mathbf{h}\| \ne 0$ is expected to be weak.

The case of a Gaussian random function $Z(\mathbf{x})$ with a known binormal distribution will be considered in order to investigate the behavior of the $Y$-crosscovariances for $\|\mathbf{h}\| \ne 0$.
3.2 The Bigaussian Model:
Consider a stationary random function $Z(\mathbf{x})$, such that any pair $Z(\mathbf{x}), Z(\mathbf{x}+\mathbf{h})$ has a standard bivariate normal distribution with mean:

$$\mu = 0 \quad (3.10)$$

and covariance matrix:

$$\Sigma_Z(\mathbf{h}) = \begin{bmatrix} 1 & \rho(\mathbf{h}) \\ \rho(\mathbf{h}) & 1 \end{bmatrix}, \quad \text{with } |\rho(\mathbf{h})| < 1 \quad (3.11)$$

With these parameters the bivariate normal distribution is expressed as (Anderson, 1984):

$$f(z, z'; \rho(\mathbf{h})) = \frac{1}{2\pi\sqrt{1-\rho^2(\mathbf{h})}} \exp\left[-\frac{z^2 - 2\rho(\mathbf{h})\,zz' + z'^2}{2(1-\rho^2(\mathbf{h}))}\right] \quad (3.12)$$
where $\rho(\mathbf{h})$ is the $z$-correlogram defined as:

$$\rho(\mathbf{h}) = \frac{\mathrm{Cov}(Z(\mathbf{x}), Z(\mathbf{x}+\mathbf{h}))}{\mathrm{Var}(Z(\mathbf{x}))} = E[Z(\mathbf{x})Z(\mathbf{x}+\mathbf{h})] \quad (3.13)$$

Using this standard bigaussian model, Xiao (1985) gave the expression of the corresponding indicator covariances:

$$C_I(\mathbf{h}; z, z) = \frac{1}{2\pi} \int_0^{\arcsin \rho(\mathbf{h})} \exp\left[-\frac{z^2}{1 + \sin\theta}\right] d\theta \quad (3.14)$$
Generalizing this approach, an expression for the indicator crosscovariances can be given. We proceed first by expressing the non-centered crosscovariance:

$$K_I(\mathbf{h}; z, z') = \int_{-\infty}^{z} \int_{-\infty}^{z'} f(x, y; \rho(\mathbf{h}))\, dx\, dy \quad (3.15)$$

then the derivative:

$$\frac{\partial K_I(\mathbf{h}; z, z')}{\partial \rho} = \int_{-\infty}^{z} \int_{-\infty}^{z'} \frac{\partial f(x, y; \rho(\mathbf{h}))}{\partial \rho}\, dx\, dy \quad (3.16)$$

Using Xiao's Lemma (Xiao, 1985, p. 6), which states that:

$$\frac{\partial f(x, y; \rho)}{\partial \rho} = \frac{\partial^2 f(x, y; \rho)}{\partial x\, \partial y} \quad (3.17)$$

it comes from the integral (3.16):

$$\frac{\partial K_I(\mathbf{h}; z, z')}{\partial \rho} = f(z, z'; \rho(\mathbf{h})) \quad (3.18)$$

Now the indicator crosscovariance is expressed in terms of the non-centered covariance:

$$C_I(\mathbf{h}; z, z') = K_I(\mathbf{h}; z, z') - F(z)F(z') \quad (3.19)$$

Therefore, with relations (3.18) and (3.19), it can be written:

$$\frac{\partial C_I(\mathbf{h}; z, z')}{\partial \rho} = f(z, z'; \rho(\mathbf{h})) \quad (3.20)$$

Integrating (3.20):
$$C_I(\mathbf{h}; z, z') = \frac{1}{2\pi} \int_0^{\rho(\mathbf{h})} \frac{1}{\sqrt{1-u^2}} \exp\left[-\frac{z^2 - 2uzz' + z'^2}{2(1-u^2)}\right] du + C_1 \quad (3.21)$$
where $C_1$ is an integration constant. This expression can be modified considering the change of variable $u = \sin\theta$:

$$C_I(\mathbf{h}; z, z') = \frac{1}{2\pi} \int_0^{\arcsin \rho(\mathbf{h})} \exp\left[-\frac{z^2 - 2zz'\sin\theta + z'^2}{2\cos^2\theta}\right] d\theta + C_1 \quad (3.22)$$

The constant $C_1$ can be derived considering $\|\mathbf{h}\| \to \infty$, which entails $\rho(\mathbf{h}) \to 0$ and $C_I(\mathbf{h}; z, z') \to 0$, therefore:

$$C_1 = 0 \quad (3.23)$$

Finally the indicator crosscovariance for the bigaussian random function $Z(\mathbf{x})$ is:

$$C_I(\mathbf{h}; z, z') = \frac{1}{2\pi} \int_0^{\arcsin \rho(\mathbf{h})} \exp\left[-\frac{z^2 - 2zz'\sin\theta + z'^2}{2\cos^2\theta}\right] d\theta \quad (3.24)$$
When $\rho(\mathbf{h}) = 1$, $\theta$ can take the value $\frac{\pi}{2}$, therefore the exponent of the exponential presents a singularity at this point. The definition of the standard binormal distribution (3.12) is not valid for such a value (Anderson, 1984), but such singularity is not in the definition domain of the integrand.

From a practical point of view, expression (3.24) can be solved by numerical integration taking into account the referred singularity. Note that the advantage of this latter expression over expressions (3.15) and (3.19) is the reduction of a double integral ($K_I(\mathbf{h}; z, z')$) to a single integral.
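For illustration, expression (3.24) can be evaluated with an off-the-shelf adaptive quadrature; this is only a sketch and not the cautious Romberg scheme developed in section 3.4. The limiting check is done with $\rho$ slightly below 1, since (3.24) is derived for $|\rho(\mathbf{h})| < 1$; there the crosscovariance must approach the exact bivariate probability $F(\min(z, z')) - F(z)F(z')$.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def indicator_crosscov(rho, z, zp):
    """Indicator crosscovariance C_I(h; z, z') of expression (3.24),
    for a standard bigaussian pair with correlation rho = rho(h)."""
    if abs(rho) >= 1.0:
        raise ValueError("(3.24) requires |rho| < 1")
    def g(theta):
        return np.exp(-(z*z - 2.0*z*zp*np.sin(theta) + zp*zp)
                      / (2.0*np.cos(theta)**2))
    val, _ = quad(g, 0.0, np.arcsin(rho))
    return val / (2.0*np.pi)

# As rho -> 1 the value tends to F(min(z,z')) - F(z)F(z'):
z, zp = -1.0, 0.0
exact = norm.cdf(min(z, zp)) - norm.cdf(z)*norm.cdf(zp)
print(indicator_crosscov(0.9999, z, zp), exact)   # both ~ 0.0793
```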
3.2.1 Covariance Matrix $\Sigma_I(\mathbf{h})$:
From expression (3.24), two symmetry properties of the indicator crosscovariances can be derived assuming that $Z(\mathbf{x})$ is a random function with standard bigaussian distribution:

$$C_I(\mathbf{h}; z, z') = C_I(\mathbf{h}; z', z) \quad (3.25)$$

and

$$C_I(\mathbf{h}; z, z') = C_I(\mathbf{h}; -z, -z') \quad (3.26)$$
Expression (3.25) entails the symmetry of $\Sigma_I(\mathbf{h})$ with respect to the main diagonal, and the combination of expressions (3.25) and (3.26) entails symmetry with regard to the other diagonal. This property is called persymmetry (Golub and Van Loan, 1983) and is only verified if the $K$ cutoffs are chosen such that:

$$z_k = -z_{K-k+1}, \quad \forall\, k = 1, \dots, K \quad (3.27)$$

This particular choice of cutoffs entails that the indicator crosscovariances verify the relation:

$$C_I(\mathbf{h}; z_k, z_{k'}) = C_I(\mathbf{h}; z_{K-k'+1}, z_{K-k+1}), \quad \forall\, k, k' \quad (3.28)$$

Therefore the variance-covariance matrix $\Sigma_I(\mathbf{h})$ presents the particular form:

$$\Sigma_I(\mathbf{h}) = \begin{bmatrix} c_{1,1} & c_{1,2} & \cdots & c_{1,K-1} & c_{1,K} \\ c_{1,2} & c_{2,2} & \cdots & c_{2,K-1} & c_{1,K-1} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ c_{1,K-1} & c_{2,K-1} & \cdots & c_{2,2} & c_{1,2} \\ c_{1,K} & c_{1,K-1} & \cdots & c_{1,2} & c_{1,1} \end{bmatrix} \quad (3.29)$$

where $c_{i,j} = C_I(\mathbf{h}; z_i, z_j)$.
Thus the matrix $\Sigma_I(\mathbf{h})$ is not only symmetric:

$$\Sigma_I(\mathbf{h}) = \Sigma_I^T(\mathbf{h}) \quad (3.30)$$

but it also verifies the relation of persymmetry:

$$\Sigma_I(\mathbf{h}) = E\, \Sigma_I^T(\mathbf{h})\, E \quad (3.31)$$

with:

$$E = [\mathbf{e}_K\ \mathbf{e}_{K-1}\ \cdots\ \mathbf{e}_1]$$

where the vector $\mathbf{e}_i$ is defined as the $i$th column of a $K \times K$ identity matrix. This specific form of $E$ entails that:

$$E = E^T = E^{-1} \quad (3.32)$$
3.2.2 Eigenvectors of $\Sigma_I(\mathbf{h})$
The symmetry and persymmetry of $\Sigma_I(\mathbf{h})$ yield specific properties for the corresponding eigenvectors.

Lemma 1: With $K$ being an odd number and $m = \frac{K+1}{2}$, the $i$th ($\mathbf{a}_i$) and $j$th ($\mathbf{a}_j$) eigenvectors associated with a $K \times K$ symmetric and persymmetric matrix $\Sigma_I(\mathbf{h})$ have the following two forms:

$$\mathbf{a}_i = [a_{1,i}\ \cdots\ a_{m-1,i}\ a_{m,i}\ a_{m-1,i}\ \cdots\ a_{1,i}]^T, \quad i = 2k-1,\ k = 1, \dots, \frac{K+1}{2} \quad (3.33)$$

and

$$\mathbf{a}_j = [a_{1,j}\ \cdots\ a_{m-1,j}\ 0\ -a_{m-1,j}\ \cdots\ -a_{1,j}]^T, \quad j = 2k,\ k = 1, \dots, \frac{K-1}{2} \quad (3.34)$$

where $\mathbf{a}_i$ and $\mathbf{a}_j$ are the eigenvectors corresponding to the eigenvalues $\lambda_i$ and $\lambda_j$ of the matrix $\Sigma_I(\mathbf{h})$.
Proof: Consider the spectral decomposition of $\Sigma_I(\mathbf{h})$:

$$\Sigma_I(\mathbf{h}) = A \Lambda A^T \quad (3.35)$$

With the columns of $A$ taking the forms (3.33) and (3.34), expression (3.35) can be written as:

$$\Sigma_I(\mathbf{h}) = \underbrace{\begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,K} \\ \vdots & \vdots & & \vdots \\ a_{m-1,1} & a_{m-1,2} & \cdots & a_{m-1,K} \\ a_{m,1} & 0 & \cdots & a_{m,K} \\ a_{m-1,1} & -a_{m-1,2} & \cdots & a_{m-1,K} \\ \vdots & \vdots & & \vdots \\ a_{1,1} & -a_{1,2} & \cdots & a_{1,K} \end{bmatrix}}_{A}\ \Lambda\ A^T \quad (3.36)$$

After the matrix multiplication, the result, expressed in terms of the $a$'s and $\lambda$'s, can be observed in figure 3.2.
Therefore the eigenvector expressions (3.33-3.34) and the spectral decomposition (3.35) are sufficient conditions for symmetry and persymmetry of $\Sigma_I(\mathbf{h})$.

The necessary condition requires showing that $\lambda$'s and $a$'s can be found such that $\Sigma_I(\mathbf{h})$ can be factorized as in (3.35)-(3.36). Accounting for relations (3.33-3.34), the number of different $a$'s in $A$ is $\frac{K^2+1}{2}$ and the number of $\lambda$'s is $K$, so the total number of unknowns is:

$$\frac{K^2+1}{2} + K = \frac{(K+1)^2}{2}, \quad \forall\, K\ \text{odd} \quad (3.38)$$

The matrix $\Sigma_I(\mathbf{h})$ has only $\frac{(K+1)^2}{4}$ different elements considering its double symmetry. The product $AA^T$, with the columns of $A$ defined by (3.33-3.34), shares the same symmetries as well. Therefore, it is possible to formulate a system of nonlinear equations as follows: $\frac{(K+1)^2}{4}$ equations derived from the matrix equation $AA^T = I$ and $\frac{(K+1)^2}{4}$ equations obtained by equating terms from the expansion (3.36) with the elements of $\Sigma_I(\mathbf{h})$. The number of equations corresponds exactly to the number of unknowns.

The conditions to get a solution are given by the Kantorovich-Newton Theorem (Dennis and Schnabel, 1983, p. 92; Ortega and Rheinboldt, 1970, p. 421): the Jacobian ($J$) of the equation system has to be Lipschitz continuous (Atkinson, 1978) and $J^{-1}$ must exist. Both conditions are satisfied. The first one is satisfied because the elements of $J$ are polynomial-type expressions and in consequence are Lipschitz continuous; the second condition is verified because the set of nonlinear equations is linearly independent since $\Sigma_I(\mathbf{h})$ is positive definite, therefore $J^{-1}$ exists.
Figure 3.2: The product $A \Lambda A^T$ expanded in terms of the $a$'s and $\lambda$'s, displaying both symmetry and persymmetry.
Thus the proposed forms for the eigenvectors of a symmetric and persymmetric matrix are completely defined by expressions (3.33-3.34). &#9633;
3.3 Computation of the Principal Component Crosscovariances:
Expression (3.7) shows that the principal components are linear combinations of the original indicators, with the weights corresponding to the $\Sigma_I(0)$-eigenvectors. Hence if $Y_l(\mathbf{x})$ is the $l$th principal component and $Y_k(\mathbf{x}+\mathbf{h})$ is the $k$th principal component, the crosscovariance between them can be written as:

$$C_Y(\mathbf{h}; l, k) = \mathrm{Cov}(\mathbf{a}_l^T \mathbf{I}(\mathbf{x}; \mathbf{z}),\, \mathbf{a}_k^T \mathbf{I}(\mathbf{x}+\mathbf{h}; \mathbf{z})) \quad (3.39)$$

where $\mathbf{a}_l$ and $\mathbf{a}_k$ are eigenvectors associated with $\Sigma_I(0)$. By matrix manipulation such crosscovariance is written as:

$$C_Y(\mathbf{h}; l, k) = \mathbf{a}_l^T\, \Sigma_I(\mathbf{h})\, \mathbf{a}_k \quad (3.40)$$

Recall that $\Sigma_I(\mathbf{h})$ is doubly symmetric and the eigenvectors satisfy relations (3.33-3.34), hence the product (3.40) is exactly zero for $\|\mathbf{h}\| > 0$ under the conditions of the following theorem:

Theorem 3.1: The crosscovariance of the principal components $Y_l$ and $Y_k$ derived from the indicator variable $\mathbf{I}(\mathbf{x}; \mathbf{z})$ satisfies:

$$C_Y(\mathbf{h}; l, k) = 0, \quad \forall\, \|\mathbf{h}\| > 0 \quad (3.41)$$

if $\Sigma_I(\mathbf{h})$ is a symmetric and persymmetric matrix, and the indices $l$ and $k$ are an odd number and an even number respectively, or vice versa.
Proof: The covariance matrix $\Sigma_I(\mathbf{h})$ satisfies the double symmetry and the eigenvectors the relations (3.33-3.34). According to the previous lemma, the eigenvector $\mathbf{a}_l$, for $l$ odd, is written:

$$\mathbf{a}_l = [a_{1,l}\ \cdots\ a_{m-1,l}\ a_{m,l}\ a_{m-1,l}\ \cdots\ a_{1,l}]^T \quad (3.42)$$

Similarly $\mathbf{a}_k$, for $k$ even, is written:

$$\mathbf{a}_k = [a_{1,k}\ \cdots\ a_{m-1,k}\ 0\ -a_{m-1,k}\ \cdots\ -a_{1,k}]^T \quad (3.43)$$

Therefore the product $\mathbf{a}_l^T \Sigma_I(\mathbf{h})$ has the following palindromic form:

$$\mathbf{a}_l^T \Sigma_I(\mathbf{h}) = [d_1\ \cdots\ d_{m-1}\ d_m\ d_{m-1}\ \cdots\ d_1] \quad (3.44)$$

and the final product is written:

$$\mathbf{a}_l^T\, \Sigma_I(\mathbf{h})\, \mathbf{a}_k = 0 \quad (3.45)$$

whatever $\|\mathbf{h}\| > 0$. &#9633;
This result implies that some of the principal component crosscovariances are zero for all $\mathbf{h}$, and that the joint estimation of the elements of $\mathbf{Y}(\mathbf{x})$ can be achieved by the solution of a sparse cokriging system, or that the cokriging system can be further approximated by kriging each $Y_k$, disregarding the non-zero crosscovariances. The last approximation assumes that all the crosscovariances are zero when in fact some of them are not. In contrast, Indicator Kriging (IK) makes similar assumptions but without any knowledge about which crosscovariances are actually zero. The transformation (3.7) on the indicators ensures that some crosscovariances are exactly zero for all $\mathbf{h}$ when bigaussianity is assumed for the random function $Z(\mathbf{x})$.
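The statement of theorem 3.1 can be checked numerically. The sketch below builds random symmetric and persymmetric matrices standing in for $\Sigma_I(0)$ and $\Sigma_I(\mathbf{h})$, classifies each eigenvector as being of the symmetric form (3.33) or the antisymmetric form (3.34), and verifies that the crossproducts (3.40) between the two classes vanish; in this chapter's examples the two forms alternate with the eigenvalue ordering, which is the odd/even index statement of the theorem.

```python
import numpy as np

def doubly_symmetric(K, rng):
    """Random symmetric and persymmetric K x K matrix (3.30)-(3.31)."""
    E = np.eye(K)[::-1]                  # exchange matrix of (3.32)
    M = rng.random((K, K))
    M = 0.5*(M + M.T)                    # enforce symmetry
    return 0.5*(M + E @ M.T @ E)         # enforce persymmetry

K = 5
rng = np.random.default_rng(1)
sigma0 = doubly_symmetric(K, rng)        # stands in for Sigma_I(0)
sigmah = doubly_symmetric(K, rng)        # stands in for Sigma_I(h)
E = np.eye(K)[::-1]

lam, A = np.linalg.eigh(sigma0)
A = A[:, np.argsort(lam)[::-1]]          # columns ordered by eigenvalue

# +1: symmetric form (3.33); -1: antisymmetric form (3.34)
form = np.sign([v @ E @ v for v in A.T])

CY = A.T @ sigmah @ A                    # crosscovariances (3.40)
for l in range(K):
    for k in range(K):
        if form[l] != form[k]:
            assert abs(CY[l, k]) < 1e-10   # theorem 3.1
print("cross-form products vanish for any doubly symmetric Sigma_I(h)")
```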
3.4 Numerical Computation of the Indicator Crosscovariances
This section is devoted to the numerical integration of (3.24), the expression defining the indicator covariances and crosscovariances. The major problem with this integral is the point at $\theta = \frac{\pi}{2}$, which is a singularity point for the exponent of the integrand. Although that point is not contained between the integration limits, it is close to them and therefore can cause convergence problems (Eisner, 1967). This potential source of error is the reason to choose a numerical integration technique capable of checking convergence.
3.4.1 Cautious Romberg Extrapolation
The basic idea is to approximate the integral $C_I(\mathbf{h}; z, z')$ as follows:

$$C_I(\mathbf{h}; z, z') = \frac{1}{2\pi}\int_0^{\arcsin\rho(\mathbf{h})} \exp\left[-\frac{z^2 - 2zz'\sin\theta + z'^2}{2\cos^2\theta}\right] d\theta = S(t) + A_1 t^{\gamma_1} + \dots + A_K t^{\gamma_K} + O(t^{\gamma_{K+1}}) \quad (3.46)$$

where the $\gamma_i$'s are integers, the constants $A_i$ are independent of $t$, $t$ is a value linked to the discretization interval of the integrand, and $S(t)$ is any numerical rule to approximate the integral $C_I(\mathbf{h}; z, z')$ such that:

$$\lim_{t \to 0} S(t) = C_I(\mathbf{h}; z, z') \quad (3.47)$$

The numerical rule $S(t)$ to approximate the integral could be, e.g., the Simpson rule, the trapezoidal rule, the composite trapezoidal rule, etc.
The simplest idea to evaluate $C_I(\mathbf{h}; z, z')$ is to compute $S(t_i)$ for several $t_i$ values and, from these $S(t_i)$, extrapolate to $t = 0$. This amounts to combining different values of $S(t_i)$ to get $S(0)$. For example, $C_I(\mathbf{h}; z, z')$ can be approximated by:

$$C_I(\mathbf{h}; z, z') = S(2t) + A_1 (2t)^{\gamma_1} + \dots + A_K (2t)^{\gamma_K} + O(t^{\gamma_{K+1}})$$

therefore, subtracting the expression above from (3.46):

$$0 = S(t) - S(2t) + \sum_{i=1}^{K} A_i\, t^{\gamma_i} (1 - 2^{\gamma_i}) + O(t^{\gamma_{K+1}}) \quad (3.48)$$

and adding (3.48), scaled by $\frac{1}{r-1}$, to (3.46):

$$C_I(\mathbf{h}; z, z') = S(t) + \frac{S(t) - S(2t)}{r - 1} + \sum_{i=1}^{K} A_i\, t^{\gamma_i}\, \frac{r - 2^{\gamma_i}}{r - 1} + O(t^{\gamma_{K+1}}) \quad (3.49)$$

Observe that the first term of the sum can be made zero if $r = 2^{\gamma_1}$. Thus:

$$C_I(\mathbf{h}; z, z') = S(t) + \frac{S(t) - S(2t)}{r - 1} + O(t^{\gamma_2}) \quad (3.50)$$

and the integral $C_I(\mathbf{h}; z, z')$ is approximated by the first two terms of expression (3.49):

$$C_I(\mathbf{h}; z, z') \approx S(t; r) = S(t) + \frac{S(t) - S(2t)}{r - 1} \quad (3.51)$$

The estimate $S(t; r)$ with $r = 2^{\gamma_1}$ is a better approximation to $C_I(\mathbf{h}; z, z')$ than $S(t)$ itself. The error order is $O(t^{\gamma_2})$.
The previous procedure can be generalized into:

$$S(t; r_1, \dots, r_j) = S(t; r_1, \dots, r_{j-1}) + \frac{S(t; r_1, \dots, r_{j-1}) - S(2t; r_1, \dots, r_{j-1})}{r_j - 1} \quad (3.52)$$

with, at each step:

$$r_i = 2^{\gamma_i}, \quad i = 1, \dots, K$$

and a much better approximation to $C_I(\mathbf{h}; z, z')$ is:

$$C_I(\mathbf{h}; z, z') \approx S(t; r_1, \dots, r_j) \quad (3.53)$$

whose error order is $O(t^{\gamma_{j+1}})$. Thus the cautious Romberg extrapolation increases the precision of ordinary integration rules (de Boor, 1971).

However, the approximation (3.52) is based on the assumption that the terms $A_1 t^{\gamma_1}, \dots, A_j t^{\gamma_j}$ account for most of the error $[S(t) - C_I(\mathbf{h}; z, z')]$. If such assumption is not true, there is no warranty that, by doing extrapolation, the new approximation $S(t; r_1, \dots, r_j)$ to $C_I(\mathbf{h}; z, z')$ is better than $S(t)$.
In order to test convergence, Lynch (1967) proved that:

$$R_j(t) = \frac{S(2t; r_1, \dots, r_{j-1}) - S(4t; r_1, \dots, r_{j-1})}{S(t; r_1, \dots, r_{j-1}) - S(2t; r_1, \dots, r_{j-1})} \approx 2^{\gamma_j} \quad (3.54)$$

Thus the satisfaction of (3.54) is the requirement to ensure that:

$$|C_I(\mathbf{h}; z, z') - S(t; r_1, \dots, r_j)| < |C_I(\mathbf{h}; z, z') - S(t)|$$

and therefore expression (3.54) is an indication of convergence.
3.4.2 The Composite Trapezoidal Rule
This particular numerical rule is used in the current implementation to solve integral (3.24). Define the integrand as follows:

$$g(\theta) = \exp\left[-\frac{z^2 - 2zz'\sin\theta + z'^2}{2\cos^2\theta}\right]$$

The composite trapezoidal rule can then be defined as:

$$S(t) = t\left\{\frac{1}{2}g(\alpha_1) + \frac{1}{2}g(\alpha_2) + \sum_{i=1}^{L-1} g(\alpha_1 + it)\right\} \quad (3.55)$$

with $\alpha_1$, $\alpha_2$ being the limits of integration and:

$$t = \frac{\alpha_2 - \alpha_1}{L}$$

with $L$ being a predefined integer number.

As $g(\theta)$ is continuously differentiable, the integral (3.24) can be expressed as (Davis and Rabinowitz, 1975):

$$C_I(\mathbf{h}; z, z') = S(t) + \sum_{i=1}^{K} A_i\, t^{2i} + O(t^{2K+1}) \quad (3.56)$$

Therefore the application of cautious Romberg extrapolation entails that:

$$\gamma_i = 2i, \quad i = 1, \dots, K$$

and the error order after $j$ extrapolation steps is $O(t^{2(j+1)})$.
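A minimal Python sketch of the composite trapezoidal rule (3.55) combined with one extrapolation step (3.51), assuming $\gamma_1 = 2$ so that $r = 4$; the convergence tests (3.54) and (3.58) and the subinterval stack of the full cautious algorithm are not reproduced here.

```python
import numpy as np

def g(theta, z, zp):
    # Integrand of (3.24)
    return np.exp(-(z*z - 2.0*z*zp*np.sin(theta) + zp*zp)
                  / (2.0*np.cos(theta)**2))

def S(t, a1, a2, z, zp):
    """Composite trapezoidal rule (3.55) with step t on [a1, a2]."""
    L = int(round((a2 - a1) / t))
    theta = a1 + t*np.arange(L + 1)
    w = np.full(L + 1, t)
    w[0] = w[-1] = 0.5*t                 # half weights at the end points
    return np.sum(w * g(theta, z, zp))

def crosscov(rho, z, zp, L=64):
    a1, a2 = 0.0, np.arcsin(rho)
    t = (a2 - a1) / L
    s1, s2 = S(t, a1, a2, z, zp), S(2*t, a1, a2, z, zp)
    # One extrapolation step (3.51) with r = 2**gamma_1 = 4
    return (s1 + (s1 - s2)/3.0) / (2.0*np.pi)

print(crosscov(0.7, -1.0, 0.0))
```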
3.4.3 End Point Singularity

In the presence of end point singularities in the integrand, Lyness and Ninham (1967) have shown that, using for the numerical rule $S(t)$ the composite trapezoidal rule, the integral $C_I(\mathbf{h}; z, z')$ can be approximated by:

$$C_I(\mathbf{h}; z, z') \approx S(t) + \sum_{i=1}^{2K+1} A_i\, t^{i+\alpha} + \sum_{i=1}^{K} B_i\, t^{2i} + O(t^{2K+1}), \quad \alpha \in [-1, 1],\ \alpha \ne 0 \quad (3.57)$$

where the $A_i$ and $B_i$ are independent of $t$. This particular form of $C_I(\mathbf{h}; z, z')$ entails that:

$$R_0(t) \to 2^{1+\alpha} \in [1, 4] \quad (3.58)$$

Therefore, if $R_0(t)$ converges to some number in $[1, 4]$, the integrand is suspected of having an end point singularity or a similar behavior. For the case of (3.24), if cautious extrapolation is applied, the term $A_1 t^{1+\alpha}$ will be the dominant term if $t$ is small.
3.4.4 Implementation

The programming implementation of cautious Romberg extrapolation starts by considering the whole integration interval. If $R_0(t)$ is not satisfied, then the program automatically splits the original interval into two subintervals which are stored in a stack; then the parameter $L$, corresponding to the number of times that a subinterval is split, is updated and stored in the stack. From the stack each particular subinterval is analyzed and convergence is tested: if relations (3.54) or (3.58) are not satisfied, a flag is set for that particular subinterval and it is reported as a subinterval without convergence. The next section presents two cases where this technique has been used and compares numerical results with exact results. In all the cases studied, the convergence relations (3.54) or (3.58) were satisfied.
3.5 Examples:
The integral (3.24) will be used to compute the indicator covariances and crosscovariances for a particular model of the correlogram $\rho(\mathbf{h})$. Also, the principal component transformation (3.7) defined on the indicators will be investigated together with its effect on the covariances and crosscovariances of type (3.40).

Two cases are presented considering successively three and five cutoffs. Both cases consider the same isotropic spherical correlogram defined by:

$$\rho(\mathbf{h}) = \begin{cases} 1 - \frac{3}{2}\left(\frac{h}{a}\right) + \frac{1}{2}\left(\frac{h}{a}\right)^3 & h \le a \\ 0 & h > a \end{cases} \quad (3.59)$$

with range $a = 10$ units and unit sill.

The integral (3.24) is numerically solved via cautious Romberg extrapolation, yielding indicator covariance and crosscovariance values. The numerical integration was checked at $\|\mathbf{h}\| = 0$, by comparing the exact value with the numerical result.
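A direct Python transcription of the spherical correlogram (3.59), assuming range $a = 10$ and unit sill as in both examples:

```python
import numpy as np

def spherical_correlogram(h, a=10.0, sill=1.0):
    """Isotropic spherical correlogram of expression (3.59)."""
    h = np.asarray(h, dtype=float)
    return np.where(h <= a,
                    sill*(1.0 - 1.5*(h/a) + 0.5*(h/a)**3),
                    0.0)

print(spherical_correlogram([0.0, 5.0, 10.0, 12.0]))  # [1. 0.3125 0. 0.]
```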
From expression (3.3) it is known that:

$$C_I(0; z_k, z_{k'}) = \mathrm{Prob}\{Z(\mathbf{x}) \le z_k,\, Z(\mathbf{x}) \le z_{k'}\} - F(z_k)F(z_{k'})$$

with:

$$\mathrm{Prob}\{Z(\mathbf{x}) \le z_k,\, Z(\mathbf{x}) \le z_{k'}\} = F(\min\{z_k, z_{k'}\})$$

therefore:

$$C_I(0; z_k, z_{k'}) = F(\min\{z_k, z_{k'}\}) - F(z_k)F(z_{k'}) \quad (3.60)$$

Expression (3.60) can be used to check the numerical integration of (3.24) at $\|\mathbf{h}\| = 0$. Table 3.1 presents this comparison, with $C_I^*(0; z_k, z_{k'})$ being the result obtained by numerical integration. The relative precision is about 0.2%, which is acceptable for the purpose of the analysis.

Table 3.1: Check of the numerical integration at $\|\mathbf{h}\| = 0$.

    z_k   z_k'   C_I(0; z_k, z_k')   C_I*(0; z_k, z_k')
    -2    -2     0.022280            0.022232
    -2    -1     0.019181            0.019140
    -2     0     0.011400            0.011375
    -2     1     0.003618            0.003609
    -2     2     0.000510            0.000517
    -1    -1     0.133514            0.133483
    -1     0     0.079350            0.079327
    -1     1     0.025185            0.025171
     0     0     0.250000            0.249999
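The exact column of Table 3.1 can be reproduced from (3.60) with the standard normal cdf; the sketch below does so, with small discrepancies from the tabulated values reflecting the rounding of the cdf used in the thesis.

```python
from scipy.stats import norm

def exact_c0(zk, zkp):
    """Exact indicator (cross)covariance at h = 0, expression (3.60)."""
    return norm.cdf(min(zk, zkp)) - norm.cdf(zk)*norm.cdf(zkp)

for zk, zkp in [(-2, -2), (-2, -1), (-2, 0), (-2, 1), (-2, 2),
                (-1, -1), (-1, 0), (-1, 1), (0, 0)]:
    print(zk, zkp, round(exact_c0(zk, zkp), 6))
```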
3.5.1 The Three Cutoffs Case:

The three cutoffs considered for this example are: -1, 0, 1. Figures 3.3 to 3.5 show the corresponding three covariances. It can be observed that, due to the symmetry of expression (3.24), the covariances for the first and third cutoffs are the same. Figures 3.6 to 3.8 show the indicator crosscovariances. These figures do not have the same scale, hence any direct comparison is difficult. Figures 3.9 to 3.11 show the corresponding indicator correlograms $\rho_I(\mathbf{h}; z)$, which present a range (10 units) identical to the $z$-correlogram range. Define the integral range of the indicator correlogram as:

$$b_I(z) = \int_0^{\infty} \rho_I(\mathbf{h}; z)\, dh = \int_0^{10} \rho_I(\mathbf{h}; z)\, dh \quad (3.61)$$

From figures 3.9 to 3.11 the following inequality is observed:

$$b_I(0) > b_I(z), \quad \forall\, z \ne 0 \quad (3.62)$$
Figure 3.3: Indicator covariance for the cutoff z = -1.0. Observe that the range is 10 units.
Figure 3.4: Indicator covariance for the median cutoff z = 0. The range is 10 units.
Figure 3.5: Indicator covariance for the cutoff z = 1.0. Note that it is identical to the indicator covariance for z = -1.0.
Figure 3.6: Indicator crosscovariance for the cutoffs z = -1.0 and z' = 0.0.
Figure 3.7: Indicator crosscovariance between the extreme cutoffs z = 1.0 and z = -1.0.
Figure 3.8: Indicator crosscovariance for the cutoffs z = 0.0 and z' = 1.0. This crosscovariance is equal to the crosscovariance for the cutoffs z = 0.0 and z' = -1.0.
Figure 3.9: Indicator correlogram for the cutoff z = -1.0.
and:

$$b_I(|z|) \ge b_I(|z'|), \quad \forall\, |z| \le |z'| \quad (3.63)$$

Relations (3.62) and (3.63) express the destructuration effect of the indicator correlograms as the cutoff $z$ departs more from the median value 0. For $z \to \pm\infty$, the integral range $b_I(\pm\infty)$ vanishes, indicating zero practical autocorrelation of the extreme indicators.

For the crosscorrelograms $\rho_I(\mathbf{h}; z, z')$ of figures 3.12 to 3.14, significant correlation is present only for the first 4 units; beyond that distance, for practical purposes, there is no correlation. The crosscorrelation at the extreme values (-1, 1) clearly shows the least correlation.

Once the indicator covariances and crosscovariances are known, it is possible to compute the indicator covariance matrix $\Sigma_I(\mathbf{h})$ for all $\mathbf{h}$. For $\mathbf{h} = 0$ it is:

$$\Sigma_I(0) = \begin{bmatrix} 0.1335 & 0.0793 & 0.0251 \\ 0.0793 & 0.2500 & 0.0793 \\ 0.0251 & 0.0793 & 0.1335 \end{bmatrix} \quad (3.64)$$
Observe that $\Sigma_I(0)$ satisfies the symmetry and persymmetry properties. This matrix can be decomposed as in (3.4) to obtain an orthogonal matrix $A$, allowing calculation of the principal component covariances and crosscovariances (3.40).
Figure 3.10: Indicator correlogram for the median cutoff z = 0.0. Note that $\rho_I(\mathbf{h}; 0) = \frac{2}{\pi}\arcsin \rho(\mathbf{h})$.
Figure 3.11: Indicator correlogram for the cutoff z = 1.0. Due to the symmetry of expression (3.24), $\rho_I(\mathbf{h}; 1.0) = \rho_I(\mathbf{h}; -1.0)$.
Figure 3.12: Indicator crosscorrelogram for z = -1.0 and z' = 0.0.
Figure 3.13: Indicator crosscorrelogram for z = 0.0 and z' = 1.0.
Figure 3.14: Indicator crosscorrelogram for the extreme cutoffs z = -1.0 and z' = 1.0. The correlation values are very small and for practical purposes it is possible to consider that the extreme cutoffs are uncorrelated.
After the spectral decomposition (3.4) the computed matrix $A$ is:

$$A = \begin{bmatrix} -0.3946 & 0.7071 & 0.5867 \\ -0.8297 & 0.0 & -0.5580 \\ -0.3946 & -0.7071 & 0.5867 \end{bmatrix} \quad (3.65)$$
This matrix verifies theorem 3.1 and its columns verify lemma 1. The columns of $A$ are used to produce the respective covariances and crosscovariances (3.40), or correlograms. Figures 3.15 to 3.17 present the principal component correlograms $\rho_Y(\mathbf{h}; l)$. The actual range, like for all indicator correlograms, is 10 units, but the practical correlation magnitude is drastically reduced after 4 units of separation for the second and third principal components. Furthermore, the correlograms display a smooth and slow decay for the first principal component, progressing towards a severe and fast decay for the second and third principal components. Such a situation can be expressed in terms of integral range as:

$$b_Y(1) = \int_0^{10} \rho_Y(\mathbf{h}; 1)\, dh \ge b_Y(k), \quad \forall\, k > 1 \quad (3.66)$$

In contrast, the only nonzero principal component crosscorrelogram $\rho_Y(\mathbf{h}; l, k)$, between
Figure 3.15: First principal component correlogram. Note that $b_Y(1) \ge b_Y(k)\ \forall\, k > 1$; the equality is satisfied only in the case that two or more eigenvalues are equal.
Figure 3.16: Second principal component correlogram.
Figure 3.17: Third principal component correlogram. Note how the correlation values decay rapidly over the first two units.
Figure 3.18: First and third principal component crosscorrelogram. For practical purposes both variables are uncorrelated.
the first and third principal components (figure 3.18), shows a slight negative correlation for the first 7 units, although of insignificant magnitude when compared to the direct correlograms.

Observe that for this three cutoffs case, with cutoffs symmetric around the median, after the principal component transformation (3.7) only three correlograms need to be considered: indeed, the crosscorrelation magnitudes can be considered null. If the orthogonalization had not been made, then six indicator correlograms would have had to be considered.
3.5.2 The Five Cutoffs Case:
The number of cutoffs is increased to five by adding the symmetric cutoffs -2 and 2. The new covariance matrix $\Sigma_I(0)$ remains symmetric and persymmetric:

$$\Sigma_I(0) = \begin{bmatrix} 0.0222 & 0.0191 & 0.0113 & 0.0036 & 0.0005 \\ 0.0191 & 0.1335 & 0.0793 & 0.0251 & 0.0036 \\ 0.0113 & 0.0793 & 0.2500 & 0.0793 & 0.0113 \\ 0.0036 & 0.0251 & 0.0793 & 0.1335 & 0.0191 \\ 0.0005 & 0.0036 & 0.0113 & 0.0191 & 0.0222 \end{bmatrix} \quad (3.67)$$

The corresponding orthogonal matrix $A$ is:

$$A = \begin{bmatrix} -0.0602 & 0.1211 & -0.1068 & -0.6966 & -0.6963 \\ -0.3952 & 0.6966 & -0.5734 & 0.1211 & 0.1221 \\ -0.8248 & 0.0000 & 0.5651 & 0.0000 & -0.0153 \\ -0.3952 & -0.6966 & -0.5734 & -0.1211 & 0.1221 \\ -0.0602 & -0.1211 & -0.1068 & 0.6966 & -0.6963 \end{bmatrix} \quad (3.68)$$

Figure 3.19 shows the additional indicator correlogram for the cutoff $z = -2$, which is equal to its symmetric counterpart at cutoff $z = 2$. Note that:

$$b_I(-1) > b_I(-2)$$

according to relation (3.63).
Figures 3.20 to 3.23 show the crosscorrelation between the indicator at $z = -2$ and all other cutoffs. Observe how the correlation decreases as the second cutoff increases. The least correlation appears for the pair of extreme cutoffs (-2, 2), and the correlation levels between
Figure 3.19: Indicator correlogram for the cutoff z = -2.0.
Figure 3.20: Indicator crosscorrelogram for the cutoffs z = -2.0 and z' = -1.0.
Figure 3.21: Indicator crosscorrelogram for the cutoffs z = -2.0 and z' = 0.0. Note that the correlation values are decaying in proportion to the separation of the cutoffs.
Figure 3.22: Indicator crosscorrelogram for the cutoffs z = -2.0 and z' = 1.0. For practical purposes both variables are uncorrelated.
Figure 3.23: Indicator crosscorrelogram for the extreme cutoffs z = -2.0 and z' = 2.0. The correlation magnitude is insignificant and therefore both variables can be considered uncorrelated.
cutoff -2.0 and 0 or &#177;1 are small compared with the direct correlations. Thus the simple indicator cokriging matrix $U$ can be approximated by ignoring those small correlations, and it can be reduced to only six different covariance blocks, as sketched below:

$$U = \begin{bmatrix} C_{-2,-2} & C_{-2,-1} & 0 & 0 & 0 \\ C_{-2,-1} & C_{-1,-1} & C_{-1,0} & C_{-1,1} & 0 \\ 0 & C_{-1,0} & C_{0,0} & C_{-1,0} & 0 \\ 0 & C_{-1,1} & C_{-1,0} & C_{-1,-1} & C_{-2,-1} \\ 0 & 0 & 0 & C_{-2,-1} & C_{-2,-2} \end{bmatrix} \quad (3.69)$$

where $C_{z_i, z_j}$ is the block crosscovariance matrix between indicators at cutoffs $z_i$ and $z_j$. Note that both symmetries have been accounted for in expression (3.69).

Figure 3.24 shows the five principal component correlograms and figure 3.25 those principal component crosscorrelograms different from zero. In figure 3.24 there are no distinctive features between the fourth and fifth principal component correlograms. The reason for that similarity is the equality, in absolute value, of the fourth and fifth columns of matrix (3.68). Both columns are the eigenvectors derived from almost equal eigenvalues, which entails almost identical correlograms for the principal components.
Figure 3.24: Principal component correlograms for the five cutoffs case. There is no distinction between the fourth and fifth principal component correlograms.
Figure 3.25: Principal component crosscorrelograms different from zero. Note the low values of correlation.
As in the three cutoffs case, the magnitude of the crosscorrelograms is small, thus an approximation to the simple principal component cokriging system is the kriging of each principal component ($Y_k$). If the significant crosscorrelations were to be used in the cokriging system, for example the correlations between the second and fourth principal components and between the third and fifth principal components, the corresponding simple cokriging matrix ($U_Y$) would look as follows:

$$U_Y = \begin{bmatrix} C_{1,1} & 0 & 0 & 0 & 0 \\ 0 & C_{2,2} & 0 & C_{2,4} & 0 \\ 0 & 0 & C_{3,3} & 0 & C_{3,5} \\ 0 & C_{2,4} & 0 & C_{4,4} & 0 \\ 0 & 0 & C_{3,5} & 0 & C_{5,5} \end{bmatrix} \quad (3.70)$$
where $C_{i,j}$ is the block crosscovariance matrix between the $i$th principal component ($Y_i$) and the $j$th principal component ($Y_j$). A comparison with the cokriging matrix $U$ derived from the indicators (3.69) reveals that the principal component system requires a smaller number of block covariances and therefore less storage and less inference. Also an important fact is that the cokriging of the principal components can be separated, i.e. the simple cokriging system (3.70) can be decomposed into one system corresponding to $C_{1,1}$ (since $Y_1$ is not correlated with the other principal components), plus two systems corresponding to the covariances $C_{2,2}$, $C_{2,4}$ and $C_{4,4}$, and $C_{3,3}$, $C_{3,5}$ and $C_{5,5}$ respectively (since $Y_2$ and $Y_4$ are spatially crosscorrelated with each other but are uncorrelated with $Y_1$, $Y_3$ and $Y_5$). Moreover, given the low crosscorrelation levels of the principal components, their cokriging system can be approximated by a series of simple kriging systems.

These two examples show that indicator orthogonalization, assuming that $Z(\mathbf{x})$ and $Z(\mathbf{x}+\mathbf{h})$ are jointly bigaussian, presents definite advantages over the traditional IK and CoIK (Coindicator Kriging) when used for modeling conditional cumulative distribution functions.
Chapter 4

IK based on Principal Component Analysis
The last two chapters have presented the IK theory and the application of PCA to orthogonalize the indicator vectors $\mathbf{I}(\mathbf{x}; \mathbf{z})$. The bigaussian model was considered as the joint distribution of $Z(\mathbf{x})$, $Z(\mathbf{x}+\mathbf{h})$, and indicator covariances and crosscovariances were computed assuming such bivariate distribution. Examples considering three and five cutoffs were shown which indicate that the indicator crosscorrelation magnitudes cannot be assumed null. They play an important role in the indicator cokriging system.

Orthogonalization of $\mathbf{I}(\mathbf{x}; \mathbf{z})$ entails that some of the crosscorrelations between the elements of the transformed vector $\mathbf{Y}(\mathbf{x})$ are exactly zero, and those different from zero are shown to be negligible. This property can be capitalized upon to build an estimator of the conditional distribution with the advantage over IK of considering more bivariate information via the transformed vector $\mathbf{Y}(\mathbf{x})$.
4.1 An Estimator Based on PCA:
The CoIndicator Kriging estimator (2.31) is the best bivariate-type estimator of the conditional distribution in the least squares sense. However, practical problems inhibit the modeling of $O(K^2)$ covariances.

Orthogonalization of the indicator vector $\mathbf{I}(\mathbf{x}; \mathbf{z})$ assuming a bigaussian model yields a new vector $\mathbf{Y}(\mathbf{x})$ which holds the properties of theorem 3.1. A set of $n'$ samples $\mathbf{Y}(\mathbf{x}_\alpha)$ is defined using the transformation (3.6) applied on the original indicator vectors, and the
following average can be defined:

$$\Phi(V; Y_k) = \frac{1}{V} \int_V Y_k(\mathbf{x})\, d\mathbf{x}, \quad k = 1, \dots, K \quad (4.1)$$

This integral can be estimated by the standard cokriging estimator (2.31):

$$\Phi^*(V; Y_k) = \sum_{k'=1}^{K} \sum_{\alpha=1}^{n'} \lambda_{k'\alpha}\, Y_{k'}(\mathbf{x}_\alpha) \quad (4.2)$$

but from theorem 3.1 it is known that the crosscorrelations:

$$C_Y(\mathbf{h}; k, k') = 0, \quad \forall\, k\ \text{odd and}\ k'\ \text{even, or vice versa} \quad (4.3)$$

The other crosscorrelations, although different from zero, are ignored, therefore:

$$C_Y(\mathbf{h}; k, k') \approx 0, \quad \forall\, k, k'\ \text{both even or both odd},\ k \ne k' \quad (4.4)$$

since their magnitude is relatively small compared with the direct correlations. Thus, rather than using the heavy cokriging estimator (4.2), a simplification is possible through the kriging estimator (2.26):
$$\Phi^*(V; Y_k) = \sum_{\alpha=1}^{n'} \lambda_{k\alpha}\, Y_k(\mathbf{x}_\alpha), \quad k = 1, \dots, K \quad (4.5)$$

and the weights $\lambda_{k\alpha}$ are derived from the constrained normal equation system (2.29):

$$\sum_{\beta=1}^{n'} \lambda_{k\beta}\, C_Y(\mathbf{x}_\alpha - \mathbf{x}_\beta; k) + \mu_k = \bar{C}_Y(\mathbf{x}_\alpha, V; k), \quad \alpha = 1, \dots, n' \quad (4.6)$$

with the unbiasedness condition:

$$\sum_{\beta=1}^{n'} \lambda_{k\beta} = 1, \quad k = 1, \dots, K \quad (4.7)$$
Note that the formulation (4.6) includes a Lagrange multiplier $\mu_k$ corresponding to the assumption that $E[Y_k]$ is unknown. By considering a local neighborhood to estimate $\Phi^*(V; Y_k)$, the mean $E[Y_k]$ is allowed to change from one location to another, which is inconsistent with the assumption of stationarity. Moreover, it is inconsistent with the unique and orthogonal matrix $A$ of expression (3.6). The straightforward solution would be to consider a simple kriging estimator with the weights $\lambda'_{k\alpha}$ given by:
$$\sum_{\beta=1}^{n'} \lambda'_{k\beta}\, C_Y(\mathbf{x}_\beta - \mathbf{x}_\alpha; k) = \bar{C}_Y(\mathbf{x}_\alpha, V; k), \quad \alpha = 1, \dots, n' \quad (4.8)$$

The corresponding simple kriging estimator is:

$$\Phi^*(V; Y_k) = \left(1 - \sum_{\alpha=1}^{n'} \lambda'_{k\alpha}\right) E[Y_k] + \sum_{\alpha=1}^{n'} \lambda'_{k\alpha}\, Y_k(\mathbf{x}_\alpha) \quad (4.9)$$
Unfortunately this approach requires knowledge of $E[Y_k]$, which has to be inferred from the data. In presence of data clustering such inference can be difficult and possibly biased.

The estimator based on (4.9) ensures consistency with the orthogonal matrix $A$ and stationarity, therefore it satisfies the theoretical assumptions. On the other hand, the estimator based on (4.5) is inconsistent with $A$ and stationarity, but presents a definite advantage in presence of local departures from stationarity.
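A minimal Python sketch of the simple kriging system (4.8) and estimator (4.9) for one principal component; the exponential covariance model standing in for $C_Y(\mathbf{h}; k)$ and all data values are hypothetical, since the real model would be fitted to the principal component correlograms of chapter 3. Point support is assumed, so the right-hand side of (4.8) reduces to point covariances.

```python
import numpy as np

def cov_y(h, var=1.0, a=4.0):
    # Hypothetical stand-in for C_Y(h; k): exponential model with
    # practical range a.
    return var*np.exp(-3.0*np.abs(h)/a)

def simple_krige_pc(coords, y, x0, mean_y):
    """Simple kriging estimate (4.9) of Y_k at x0 from samples y located
    at coords (1D for brevity), with known stationary mean E[Y_k]."""
    d = np.abs(coords[:, None] - coords[None, :])
    Cmat = cov_y(d)                          # left-hand side of (4.8)
    c0 = cov_y(np.abs(coords - x0))          # right-hand side of (4.8)
    lam = np.linalg.solve(Cmat, c0)          # weights lambda'_{k,alpha}
    return (1.0 - lam.sum())*mean_y + lam @ y   # estimator (4.9)

coords = np.array([0.0, 1.5, 3.0, 5.0])       # hypothetical locations
y = np.array([0.2, -0.1, 0.05, 0.3])          # hypothetical Y_k data
print(simple_krige_pc(coords, y, x0=2.0, mean_y=0.0))
```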
4.2 Unbiasedness:
Estimators (4.5) and (4.9) are both unbiased estimators of the stochastic integral (4.1). However, the goal is to estimate integral (2.21):

$$\Phi(V; z_k) = \frac{1}{V} \int_V i(\mathbf{x}; z_k)\, d\mathbf{x}, \quad k = 1, \dots, K \quad (4.10)$$
From (3.6) an inverse transformation can be applied to obtain:

$$\Phi^*(V; \mathbf{z}) = A\, \Phi^*(V; \mathbf{Y}) \quad (4.11)$$

where the vector $\Phi^*(V; \mathbf{z})$ is defined as:

$$\Phi^*(V; \mathbf{z}) = [\Phi^*(V; z_1)\ \cdots\ \Phi^*(V; z_K)]^T \quad (4.12)$$

and vector $\Phi^*(V; \mathbf{Y})$ is:

$$\Phi^*(V; \mathbf{Y}) = [\Phi^*(V; Y_1)\ \cdots\ \Phi^*(V; Y_K)]^T \quad (4.13)$$

Estimator (4.11) will be called hereafter the IKPCA estimator, or Indicator Kriging based on Principal Component Analysis. Observe that this particular estimator is no longer a simple linear combination of indicator data associated with a single cutoff. It is a linear combination of indicator data associated with multiple cutoffs.
Unbiasedness calls for:

$$E[\Phi(V; \mathbf{z})] = E[A\, \Phi^*(V; \mathbf{Y})] \quad (4.14)$$

Using estimator (4.5) and expression (2.24), expression (4.14) can be written as:

$$\mathbf{F}(\mathbf{z}) = A\, E \begin{bmatrix} \sum_{\alpha=1}^{n'} \lambda_{1\alpha}\, Y_1(\mathbf{x}_\alpha) \\ \vdots \\ \sum_{\alpha=1}^{n'} \lambda_{K\alpha}\, Y_K(\mathbf{x}_\alpha) \end{bmatrix} \quad (4.15)$$

where:

$$\mathbf{F}(\mathbf{z}) = [F(z_1)\ \cdots\ F(z_K)]^T$$
F(z) =
The principal components Y;(r.o) are expressed in terms of the original indicator vector
I(x";z).
Thus relation (4.15) becomes:
Il" ff=, Efr )r.a; J(x,; zi)
r(z)=AEl
L
:
DL, D[, lroo ;xl(t<-; z)
Taking the orpected value and assuming stationarity:
ll=tDfr)roo;rF(') I
I
F(z)=Al
:
L DL, Df, )roo;xF(d
I
J
Finally by doing the matrix multiplication:
Isr(s
tal=r )ii=rDf,
I
F(z)=l
L
filt
:
)1.o,
osF(Q f
I
Dl=r D;K=r\t'ox1a51F(z;) )
This expression is simplified by taking in account that the matrix
matrix such that:
AAT=I
A
is an orthogonal
Thus expression (4.14) is finally written as:

$$\mathbf{F}(\mathbf{z}) = \begin{bmatrix} \sum_{\alpha=1}^{n'} \lambda_{1\alpha}\, F(z_1) \\ \vdots \\ \sum_{\alpha=1}^{n'} \lambda_{K\alpha}\, F(z_K) \end{bmatrix} \quad (4.16)$$

Since the weights $\lambda_{k\alpha}$ satisfy (4.7), $A\,\Phi^*(V; \mathbf{Y})$ is an unbiased estimator of $\Phi(V; \mathbf{z})$. Similar treatment for estimator (4.9) proves that this estimator is also unbiased.
4.3 Estimation Variance
The estimation variance of estimator $\Phi^*(V; Y_k)$ is defined as:

$$\sigma^2_{IKPCA} = E[\Phi(V; z_k) - \mathbf{a}_k^T\, \Phi^*(V; \mathbf{Y})]^2 \quad (4.17)$$

where the vector $\mathbf{a}_k$ is the $k$th row of the orthogonal matrix $A$.

Without loss of generality, suppose that $V$ is a point at $\mathbf{x}_0$; therefore expression (4.17) is written as:

$$\sigma^2_{IKPCA} = E[I^2(\mathbf{x}_0; z_k)] - 2E[\mathbf{a}_k^T \mathbf{Y}^*(\mathbf{x}_0)\, I(\mathbf{x}_0; z_k)] + E[\mathbf{a}_k^T \mathbf{Y}^*(\mathbf{x}_0)\, \mathbf{Y}^{*T}(\mathbf{x}_0)\, \mathbf{a}_k] \quad (4.18)$$

The first term of expression (4.18) is:

$$E[I^2(\mathbf{x}_0; z_k)] = E[I(\mathbf{x}_0; z_k)] = F(z_k) \quad (4.19)$$
The second term is expanded as follows:

$$-2E[\mathbf{a}_k^T \mathbf{Y}^*(\mathbf{x}_0)\, I(\mathbf{x}_0; z_k)] = -2 \sum_{l=1}^{K} \sum_{\alpha=1}^{n'} \sum_{j=1}^{K} a_{k,l}\, a_{j,l}\, \lambda_{l\alpha} \left[ C_I(\mathbf{x}_0 - \mathbf{x}_\alpha; z_j, z_k) + F(z_j)F(z_k) \right]$$

and considering the orthogonality of $A$:

$$-2E[\mathbf{a}_k^T \mathbf{Y}^*(\mathbf{x}_0)\, I(\mathbf{x}_0; z_k)] = -2 \sum_{l=1}^{K} \sum_{\alpha=1}^{n'} \sum_{j=1}^{K} a_{k,l}\, a_{j,l}\, \lambda_{l\alpha}\, C_I(\mathbf{x}_0 - \mathbf{x}_\alpha; z_j, z_k) - 2F^2(z_k) \quad (4.20)$$

The third term of (4.18) is:

$$E[\mathbf{a}_k^T \mathbf{Y}^*(\mathbf{x}_0)\, \mathbf{Y}^{*T}(\mathbf{x}_0)\, \mathbf{a}_k] = \sum_{l=1}^{K} a_{k,l}^2 \left[ \sum_{\alpha=1}^{n'} \sum_{\beta=1}^{n'} \lambda_{l\alpha}\, \lambda_{l\beta}\, C_Y(\mathbf{x}_\alpha - \mathbf{x}_\beta; l) + m_l^2 \right] + \sum_{l=1}^{K} \sum_{j=1, j \ne l}^{K} a_{k,l}\, a_{k,j}\, m_l\, m_j \quad (4.21)$$

where:

$$m_l = \sum_{j=1}^{K} a_{j,l}\, F(z_j) \quad (4.22)$$
Finally, the estimation variance is the combination of expressions (4.19) to (4.21). Note that such IKPCA estimation variance depends on the whole set of weights associated with the principal components ($Y_k$) and on both indicator covariances and crosscovariances ($C_I(\mathbf{h}; z_l, z_k)$). Contrarily, the IK estimation variance ($\sigma^2_{IK}$) depends only on the single indicator covariance $C_I(\mathbf{h}; z_k)$ and the weights associated with that particular cutoff ($z_k$). This can be seen from the IK estimation variance expression:
$$\sigma^2_{IK} = C_I(0; z_k) - 2\sum_{\alpha=1}^{n'} \lambda_{k\alpha}\, C_I(\mathbf{x}_0 - \mathbf{x}_\alpha; z_k) + \sum_{\alpha=1}^{n'} \sum_{\beta=1}^{n'} \lambda_{k\alpha}\, \lambda_{k\beta}\, C_I(\mathbf{x}_\alpha - \mathbf{x}_\beta; z_k) \quad (4.23)$$
This difference can be explained, using Projection Theory (Luenberger, 1969), by the fact that each estimator is projected onto different linear manifolds. The IKPCA estimator is a linear combination of indicator data corresponding to different cutoffs, while the IK estimator is a linear combination of indicators corresponding exclusively to one cutoff. Therefore the following inequality holds true:

$$\sigma^2_{IKPCA} \le \sigma^2_{IK} \quad (4.24)$$

Thus the IKPCA estimator is a better estimator than the IK estimator in the sense of estimation variance.
A comparison of the IKPCA estimator with the CoIK estimator shows that, notwithstanding the improvement brought by expressing the IKPCA estimator as a linear combination of indicators for different cutoffs, the approximation (4.4) amounts to not using the whole linear manifold as CoIK does. Therefore, the corresponding estimation variance order relations hold:

$$\sigma^2_{CoIK} \le \sigma^2_{IKPCA} \le \sigma^2_{IK} \quad (4.25)$$

As approximation (4.4) becomes more exact, the IKPCA estimator tends towards the CoIK estimator, and their estimation variances become equal.
4.4 Practice of IKPCA:
A successful application of IKPCA is based on the assumption that the principal component crosscorrelations are negligible compared with the direct correlations. In the case of a bigaussian joint distribution of $Z(\mathbf{x})$ and $Z(\mathbf{x}+\mathbf{h})$, it has been shown in the last chapter that the principal component crosscorrelations are indeed negligible, therefore the IKPCA approach is entirely justified. For bivariate distributions different from the bigaussian, a situation more likely in Earth Sciences, the principal component crosscorrelations should be checked to compare their level with that of the direct correlations. Such check is essential prior to applying IKPCA.

This section will focus on the practical implementation of IKPCA and address the problem of checking the previous constitutive hypothesis. The question of checking whether a bivariate distribution is bigaussian or not is discussed and its relevance in the practical implementation of IKPCA is considered.
4.4.1 Declustering the Univariate CDF:
The IKPCA is based on the orthogonalization of the indicator vector defined for different cutoffs. Such orthogonalization is accomplished by multiplying the indicator vector by an orthogonal matrix $A^T$ derived from the spectral decomposition of the indicator covariance matrix $\Sigma_I(\mathbf{h})$. For the case of $|\mathbf{h}| = 0$, the elements of the indicator covariance matrix are given by:

$$C_I(0; z_k, z_{k'}) = F_Z(\min\{z_k, z_{k'}\}) - F_Z(z_k)\, F_Z(z_{k'})$$

with $F_Z(z)$ being the $Z$-univariate cdf. Therefore, knowledge of the univariate distribution is a requirement to compute the orthogonal matrix $A$. From the data an experimental $F_Z^*(z)$ can be inferred; unfortunately, in presence of clusters that inference of $F_Z(z)$ is not a trivial task. In such case $F_Z^*(z)$ has to be declustered in order to limit the bias introduced by preferential location of the data. This problem is discussed in Journel (1983), where an easy and efficient solution is proposed. The basic idea is to build cells or blocks, with different sizes, over the area being investigated. Each sample is weighted in inverse proportion to the
Figure 4.1: Choice of symmetric $F_Z^*(z)$ quantiles does not entail symmetric cutoffs for $Z(\mathbf{x})$.
number of samples found in each cell, and the corresponding statistics are computed. For example, if the sampling campaign has been preferentially focused towards the high grades, the declustered mean would be that which is minimum; the corresponding weights provide a declustered estimate for the cdf $F_Z(z)$. This procedure minimizes the impact of clustering at the univariate level but not at the bivariate level.
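A minimal Python sketch of the cell declustering weights described above, for a single cell size in 2D; locations and grades are hypothetical, and in practice several cell sizes are tried, retaining the one that minimizes (for preferentially high-grade sampling) the declustered mean.

```python
import numpy as np

def cell_decluster_weights(xy, cell):
    """Weights inversely proportional to the number of samples
    falling in the same square cell of side `cell`."""
    ij = np.floor(xy / cell).astype(int)
    _, inv, counts = np.unique(ij, axis=0,
                               return_inverse=True, return_counts=True)
    w = 1.0 / counts[inv]
    return w / w.sum()                 # normalized declustering weights

rng = np.random.default_rng(2)
xy = rng.random((200, 2)) * 40.0       # hypothetical sample locations
grades = rng.lognormal(size=200)       # hypothetical grades
w = cell_decluster_weights(xy, cell=5.0)
print("naive mean:", grades.mean(), "declustered mean:", w @ grades)
```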
4.4.2 Selection of Cutoffs:
One of the characteristics of the bigaussian distribution is that the indicator covariance matrix $\Sigma_I(\mathbf{h})$ is symmetric and persymmetric, and as a consequence theorem 3.1 indicates that the crosscorrelation for certain principal components is zero at all $\mathbf{h}$ if the cutoffs correspond to symmetric quantiles. Therefore, the choice of cutoffs affects the structure of the indicator covariance matrix and the corresponding orthogonal matrix $A$.

The decision of choosing symmetric quantiles does not entail symmetric cutoffs on the declustered $F_Z^*(z)$, a situation which can be observed in figure 4.1. If the main interest lies around the high quantiles, it is necessary to specify enough cutoffs to discretize the local $\Phi(V; z_k)$ around those high quantiles. If symmetry and persymmetry are to be maintained in the indicator covariance matrix, symmetric quantiles must be considered. Such a situation increases the number of cutoffs and therefore the number of correlograms to model.
However, at least for the bigaussian case, this problem is alleviated thanks to condition (3.66), which indicates that the higher the principal component, the lesser the direct correlation. This result entails that the last principal component correlograms are likely to be pure nugget effect. This condition should be checked to retain only the significant principal component correlations. For those principal components whose correlograms are practically pure nugget effect, an estimate of $\Phi(V; Y_k)$ can be obtained by a simple arithmetic average of the samples in the neighborhood.
4.4.3 Computation of $\Sigma_I(\mathbf{h})$:
Spectral decomposition of the indicator covariance matrix $\Sigma_I(0)$ allows computation of the orthogonal matrix $A$. A numerical description of that procedure is given in Appendix A. Note that there is no particular reason to orthogonalize the indicator covariance matrix at $\mathbf{h} = 0$; in fact that orthogonalization can be done for any $\mathbf{h}$. Lemma 1 and theorem 3.1 do not depend on the choice of a particular $\mathbf{h}$; they depend on the selection of the cutoffs and on the bigaussian hypothesis.

For any bivariate distribution, the particular $\mathbf{h}$ should be chosen so as to minimize the crosscorrelation of the resulting principal components. For example, if the average distance between samples is $h_1$, then orthogonalization of $\Sigma_I(\mathbf{h})$ should be done at $|\mathbf{h}| = h_1$, which entails that any principal component crosscorrelation at $h_1$ is exactly zero; beyond that distance the increase in crosscorrelation is assumed to be negligible. If orthogonalization is done at $|\mathbf{h}| = 0$, the principal component crosscorrelation at the average distance $h_1$ between samples may be different from zero, and this could impact the approximation level of IKPCA.

All elements of matrix $\Sigma_I(0)$ can be obtained through expression (3.60) once the cutoffs have been selected. There is no need of complex computations to obtain the corresponding values, which can be read directly from the plot of the declustered $F_Z^*(z)$ (figure 4.2). For the case of orthogonalization at $|\mathbf{h}| \ne 0$, the elements of the indicator covariance matrix can be computed from expression (3.24).
Figure 4.2: The elements of matrix $\Sigma_I(0)$ can be read from the declustered $F_Z^*(z)$.
4.4.4 Checking for Bigaussianity:
One easy way is to apply the normal score transformation to the original data and, from the normal scores, obtain indicator correlations or crosscorrelations for different cutoffs. Expression (3.24) gives the exact analytical expression for any indicator correlation or crosscorrelation assuming a bigaussian model. The code to obtain such theoretical covariances and crosscovariances is given in Appendix B.

A comparison can then be made between the experimental indicator correlations or crosscorrelations derived from the normal scores and the theoretical correlations and crosscorrelations derived from formula (3.24). In Earth Sciences the most common situation will be a mismatch between theoretical and experimental correlations or crosscorrelations, hence the typical answer will be that bigaussianity is not satisfied.

It would seem at this point that any technique based on a Gaussian hypothesis is hopeless, since in Earth Sciences such condition is the exception rather than the rule. However, the point to check is not whether the original distribution is bigaussian or not; the important point is to evaluate the impact of any departure on the estimates or any other goal of the study.
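A minimal Python sketch of the check described above, assuming data on a regular 1D grid for simplicity: compute an experimental indicator correlation from the normal scores and compare it with the theoretical bigaussian value implied by (3.24). The AR(1) series and all parameter values are hypothetical.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm, rankdata

def theo_indicator_corr(rho, z):
    """Theoretical indicator correlation at cutoff z under the bigaussian
    model: covariance (3.24) with z = z', normalized by F(z)(1 - F(z))."""
    g = lambda th: np.exp(-z*z / (1.0 + np.sin(th)))
    cov, _ = quad(g, 0.0, np.arcsin(rho))
    F = norm.cdf(z)
    return cov / (2.0*np.pi) / (F*(1.0 - F))

def exp_indicator_corr(values, lag, z):
    """Experimental indicator correlation at an integer lag,
    after a normal score transform of the data."""
    ns = norm.ppf((rankdata(values) - 0.5) / len(values))
    ind = (ns <= z).astype(float)
    return np.corrcoef(ind[:-lag], ind[lag:])[0, 1]

# Hypothetical 1D data: an AR(1) series whose correlogram is phi**lag
rng = np.random.default_rng(3)
phi, n = 0.8, 4000
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi*x[t-1] + np.sqrt(1.0 - phi*phi)*rng.standard_normal()
lag, z = 2, -1.0
print(exp_indicator_corr(x, lag, z), theo_indicator_corr(phi**lag, z))
```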
Bigaussianity provides a particular case for successful application of IKPCA. The consequences of departure from that bigaussian model are not yet fully appreciated. The IKPCA goal is to approximate the CoIK system, therefore the relevant check consists in evaluating the relative magnitude of the principal component crosscorrelations with respect to the direct principal component correlations. If that relative magnitude is small, then IKPCA can be applied safely, whatever the bivariate distribution, bigaussian or not. If the relative magnitude is large, then IKPCA is not recommended.

For the cases when the bivariate distribution of $Z(\mathbf{x})$ and $Z(\mathbf{x}+\mathbf{h})$ is shown to be close to bigaussianity, techniques like Multigaussian Kriging (Verly, 1984) should be considered instead of IKPCA.
4.4.5 Order Relations Problems:

The estimate $F_Z^*(\mathbf{x}; z_k \mid (n'))$ does not necessarily satisfy the classical order relations:

$$F_Z^*(\mathbf{x}; z_k \mid (n')) \in [0, 1]$$

$$F_Z^*(\mathbf{x}; z_k \mid (n')) \le F_Z^*(\mathbf{x}; z_{k+1} \mid (n')), \quad \forall\, z_k$$
The first condition is not satisfied because the kriging-type estimates are nonconvex linear combinations of the conditioning data; therefore the weights can be negative and the estimate can be outside of the limits defined by the maximum and minimum of the conditioning data. The second type of order relations problem is due to the fact that $F_Z^*(\mathbf{x}; z_k \mid (n'))$ is built independently of the estimator of $F_Z^*(\mathbf{x}; z_{k+1} \mid (n'))$; indeed, the two respective kriging systems do not impose any such constraint.

Correction of these order relations can be accomplished in different ways. Sullivan (1984) and Journel (1987) propose different procedures to correct for order relations. The following chapter presents an application of one such correction procedure to IKPCA-derived probability estimates.
4.4.6 Probability Intervals:
Probability intervals can be computed from the estimated conditional cumulative distribution function:

$$\mathrm{Prob}\{a < Z(\mathbf{x}) \le b \mid (n')\} \approx F_Z^*(\mathbf{x}; b \mid (n')) - F_Z^*(\mathbf{x}; a \mid (n')) \quad (4.26)$$
where $F_Z^*(\mathbf{x}; \cdot \mid (n'))$ is the estimate of the conditional cumulative distribution as obtained by IKPCA, and $n'$ is the number of conditioning data.

Probability of exceedence of a given threshold $c$ is:

$$\mathrm{Prob}\{Z(\mathbf{x}) > c \mid (n')\} \approx 1 - F_Z^*(\mathbf{x}; c \mid (n')) \quad (4.27)$$

Note from expressions (4.26) and (4.27) that probability intervals and probability of exceedence are independent of the choice of any particular estimate $z^*(\mathbf{x})$. Therefore these measures of uncertainty are dissociated from the estimate $z^*(\mathbf{x})$ retained.
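A minimal Python sketch of (4.26)-(4.27) on a discretized conditional cdf, interpolating linearly between cutoffs; the ccdf values are hypothetical IKPCA outputs after order relation correction.

```python
import numpy as np

cutoffs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
ccdf = np.array([0.03, 0.18, 0.55, 0.86, 0.97])   # hypothetical F*(x; z_k | (n'))

def F(z):
    # Linear interpolation of the ccdf, constant 0/1 beyond the tails
    return np.interp(z, cutoffs, ccdf, left=0.0, right=1.0)

a, b, c = -0.5, 1.5, 1.0
print("P{a < Z <= b | (n')} ~", F(b) - F(a))      # (4.26)
print("P{Z > c | (n')}      ~", 1.0 - F(c))       # (4.27)
```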
4.4.7 Optimal Estimates:
Various estimates can be derived from the conditional cumulative distribution function depending on the criteria of optimality established from a loss function concept (Journel, 1984b). There can be different loss functions and for each one an estimate exists, optimal in the sense that it minimizes the expected value of the loss function. The following expression shows the functional to minimize in order to obtain an optimal estimate $z_L^*(\mathbf{x})$ for a given loss function $L(\cdot)$:

$$E[L(z^*(\mathbf{x}) - Z(\mathbf{x})) \mid (n')] = \int_{-\infty}^{\infty} L(z^*(\mathbf{x}) - z)\, dF_Z^*(\mathbf{x}; z \mid (n')) \quad (4.28)$$

Minimization of expression (4.28) can be accomplished by numerical optimization (Fletcher, 1988) and its numerical evaluation can be obtained by numerical integration. One simple way to proceed is:

$$\int_{-\infty}^{\infty} L(z^*(\mathbf{x}) - z)\, dF_Z^*(\mathbf{x}; z \mid (n')) \approx \sum_{k=0}^{K} L(z^*(\mathbf{x}) - \bar{z}_k(\mathbf{x})) \left[ F_Z^*(\mathbf{x}; z_{k+1} \mid (n')) - F_Z^*(\mathbf{x}; z_k \mid (n')) \right] \quad (4.29)$$

with $\bar{z}_k(\mathbf{x})$ being an estimate of the conditional class mean, $K$ the total number of cutoffs, and:

$$F_Z^*(\mathbf{x}; z_0) = 0, \quad F_Z^*(\mathbf{x}; z_{K+1}) = 1.0$$
Expressions (4.28) or (4.29) can be minimized to provide the optimal estimate $z_L^*(\mathbf{x})$ such that:

$$\int_{-\infty}^{\infty} L(z_L^*(\mathbf{x}) - z)\, dF_Z^*(\mathbf{x}; z \mid (n')) \quad (4.30)$$

is minimum. The loss function should be chosen to match the goal of the study. However, and traditionally, selection of the loss function has been based on simplicity criteria. The numerical optimization (4.29) is possible, with solutions independent of the particular form of $F_Z^*(\mathbf{x}; z \mid (n'))$, for very specific loss functions.

Different loss functions may entail very different estimated values, hence it is important to define carefully the loss function to be used.
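A minimal Python sketch of the discrete minimization (4.29), scanning a grid of candidate estimates under a hypothetical asymmetric linear loss; the conditional class means and ccdf values are placeholders.

```python
import numpy as np

cutoffs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
ccdf = np.array([0.03, 0.18, 0.55, 0.86, 0.97])     # hypothetical
# Class probabilities from the ccdf, with F*(z_0)=0 and F*(z_{K+1})=1:
p = np.diff(np.concatenate(([0.0], ccdf, [1.0])))
# Hypothetical conditional class means, one per class:
zbar = np.array([-2.5, -1.5, -0.5, 0.5, 1.5, 2.5])

def loss(e):
    # Asymmetric linear loss: overestimation penalized twice as much
    return np.where(e > 0, 2.0*e, -e)

grid = np.linspace(-3, 3, 601)                      # candidate z*(x)
exp_loss = [(loss(zs - zbar) * p).sum() for zs in grid]   # (4.29)
print("optimal estimate z*_L(x) ~", grid[int(np.argmin(exp_loss))])
```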
Chapter 5

A Case Study
The IKPCA approach will be applied to a data set resulting from a two-dimensional nonconditional simulation based on the spectral turning bands method (Mantoglou and Wilson, 1981) and a normal score transformation. The spatial correlation considered in the simulation has a geometric anisotropy of ratio 2 to 1, with direction of maximum continuity the $x$ direction and direction of minimum continuity the $y$ direction. Figure 5.1 shows a greyscale map of the 1600 simulated values, where the horizontal to vertical anisotropy is observed.

IKPCA is applied to evaluate conditional cdf's for point values and composite point conditional cdf's within panels. Comparison of its performance is made against the IK and MG approaches.
5.1 Structural Analysis:
For this particular study, the problem of inference of the semivariogram or correlogram has been dissociated from the estimation problem. For inference of the spatial correlation structure, exhaustive knowledge of the 1600 data points is assumed. The exhaustive correlogram is computed through the classic relation:

$$\rho(\mathbf{h}) = 1.0 - \frac{1}{2N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} \left( z(\mathbf{x}_\alpha + \mathbf{h}) - z(\mathbf{x}_\alpha) \right)^2 \quad (5.1)$$

where $N(\mathbf{h})$ is the number of pairs $z(\mathbf{x}_\alpha + \mathbf{h})$, $z(\mathbf{x}_\alpha)$ found for the separation vector $\mathbf{h}$. Recall that the mean and variance of the 1600 data points have been standardized to 0 and 1.
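A minimal Python sketch of (5.1) for an axis-aligned lag on a regular grid of standardized values; the random field below is only a stand-in for the 1600 simulated values of figure 5.1.

```python
import numpy as np

def correlogram(field, lag, axis=0):
    """Exhaustive correlogram (5.1) for an axis-aligned lag on a
    regular grid of standardized values (mean 0, variance 1)."""
    a = np.moveaxis(field, axis, 0)
    d = a[lag:] - a[:-lag]
    return 1.0 - 0.5*np.mean(d*d)

rng = np.random.default_rng(4)
field = rng.standard_normal((40, 40))   # stand-in for the 40 x 40 data
print([round(correlogram(field, l, axis=0), 3) for l in (1, 2, 4)])
```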
Figure 5.1: Exhaustive data set considered to analyze the performance of IKPCA.
Figure 5.2 shows a greyscale map of the $z$-correlogram. At the center of the map, $\rho(\mathbf{h}) = 1.0$; the color code gives the value of $\rho(\mathbf{h})$ at distance $|\mathbf{h}|$ away from the center in any particular direction. Spatial continuity along the $x$ direction is largest. The anisotropy is geometric, with a range of about 12 units in the $x$ direction and 6 units in the $y$ direction.

Application of IKPCA does not require a structural analysis of the original $Z(\mathbf{x})$ variable; however, such analysis is recommended to detect patterns of variability and potential problems in the inference of the principal component correlograms, e.g. clustering of data.
5.1.1 Indicator Correlograms and Crosscorrelograms:
One goal of IKPCA is to use more bivariate information than the IK approach. Short of a full CoIK, the indicator crosscorrelations are introduced through the principal component correlograms. From the exhaustive data set of figure 5.1, nine symmetric quantile cutoffs were selected. Table 5.1 gives the respective cutoffs, which entail a symmetric and persymmetric indicator covariance matrix. Observe that the quantile symmetry for this particular case yields symmetry of the cutoff values $z_k$. This situation is only true for symmetric univariate distributions.

A total of 45 indicator correlograms and crosscorrelograms were computed from the
Figure 5.2: Z-correlogram derived from the exhaustive information. Observe that the direction of major continuity is along the x axis.
exhaustive data set. Figure 5.3 presents the indicator correlogram for the first cutoff, then figures 5.4 to 5.11 the corresponding indicator crosscorrelograms between that first cutoff and the eight others. Geometric anisotropy is observed along the $x$ (solid line) and $y$ (dashed line) directions in all figures. The magnitude of crosscorrelation diminishes as the second cutoff increases. However, the level of crosscorrelation, when compared with the direct correlation of figure 5.3, cannot be assumed null at least until the eighth cutoff. The IK implicit assumption that indicator crosscorrelations are negligible is seen to be not satisfied for this case.

Figure 5.12 shows that the indicator correlogram at the second cutoff has a structure similar to the first one: range and anisotropy remain the same. Figure 5.13 presents the crosscorrelogram between the second and third cutoffs. It is observed that the crosscorrelation is not negligible and that the assumption of null crosscorrelation should not be considered. Figures 5.14 to 5.19 show indicator crosscorrelograms between contiguous cutoffs. There is a persistent and significant crosscorrelation which should be incorporated as a source of information in the estimation process.

Figures 5.20 to 5.26 present the indicator correlograms from the third cutoff to the ninth cutoff. A comparison of the first and last cutoff correlograms, figure 5.3 and figure 5.26, with the median cutoff, figure 5.22, shows the destructuration effect: as the cutoff goes away
Figure 5.3: Indicator correlogram for the cutoff z = -1.28. The solid line presents the correlation along the E-W direction and the dashed line along the N-S direction.
Figure 5.4: Indicator crosscorrelogram for the cutoffs z = -1.28 and z' = -0.84. Observe that the crosscorrelation is not negligible.
Figure 5.5: Indicator crosscorrelogram for the cutoffs z = -1.28 and z' = -0.52. As the second cutoff increases the crosscorrelation decreases.
Figure 5.6: Indicator crosscorrelogram for the cutoffs z = -1.28 and z' = -0.25.
Figure 5.7: Indicator crosscorrelogram for the cutoffs z = -1.28 and z = 0.0.
Figure 5.8: Indicator crosscorrelogram for the cutoffs z = -1.28 and z = 0.25.
Figure 5.9: Indicator crosscorrelogram for the cutoffs z = -1.28 and z = 0.52.
Figure 5.10: Indicator crosscorrelogram for the cutoffs z = -1.28 and z = 0.84. The crosscorrelation can be considered as pure nugget effect.
Figure 5.11: Indicator crosscorrelogram for the cutoffs z = -1.28 and z = 1.28. The crosscorrelation is pure nugget effect.
Figure 5.12: Indicator correlogram for the cutoff z = -0.84. Range and geometric anisotropy are similar to the correlogram of the first cutoff.
Figure 5.13: Indicator crosscorrelogram for the cutoffs z = -0.84 and z = -0.52. The relative size of the crosscorrelogram with respect to the direct correlogram is not insignificant.
Figure 5.14: Indicator crosscorrelogram for the cutoffs z = -0.52 and z = -0.25.
Figure 5.15: Indicator crosscorrelogram for the cutoffs z = -0.25 and z = 0.0.
Figure 5.16: Indicator crosscorrelogram for the cutoffs z = 0.0 and z = 0.25.
Figure 5.17: Indicator crosscorrelogram for the cutoffs z = 0.25 and z = 0.52.
Bigaussianity:
The data set was generated by a gaussian-related technique which imposes that the nonconditional simulation presents a multivariate distribution close to multigaussianity. Verly (1984) discusses several tests to check multigaussianity, and more recently the use of indicator correlograms to test bigaussianity has been recommended.
Expression (3.24) indicates that, for a binormal distribution, the correlograms of symmetric cutoffs are equal, and that the crosscorrelogram between the median cutoff and a positive cutoff is equal to the crosscorrelogram between the median cutoff and the negative of that same cutoff. Figures 5.3 and 5.26 show that at symmetric cutoffs the correlograms are reasonably similar. Figures 5.15 and 5.16 present similar crosscorrelograms, as expected from a bigaussian distribution.
Figures 5.27 and 5.28 present the crosscorrelograms for the first and second cutoffs, and for the eighth and ninth cutoffs. The greyscale maps appear to show significant differences; the patterns of anisotropy appear different, but only for large distances, i.e. for low correlation values. However, figures 5.4 and 5.19, corresponding to the same crosscorrelograms along the two main directions of anisotropy, appear similar, which is the expected result from a bigaussian model.
Figure 5.18: Indicator crosscorrelogram for the cutoffs z = 0.52 and z = 0.84.
Figure 5.19: Indicator crosscorrelogram for the cutoffs z = 0.84 and z = 1.28.
Figure 5.20: Indicator correlogram for the cutoff z = -0.52.
Figure 5.21: Indicator correlogram for the cutoff z = -0.25.
Figure 5.22: Indicator correlogram for the cutoff z = 0.0.
Figure 5.23: Indicator correlogram for the cutoff z = 0.25.
Figure 5.24: Indicator correlogram for the cutoff z = 0.52.
Figure 5.25: Indicator correlogram for the cutoff z = 0.84.
Figure 5.26: Indicator correlogram for the cutoff z = 1.28.
Figure 5.27: Greyscale map of the indicator crosscorrelogram for the cutoffs z = -1.28 and z = -0.84.
Figure 5.28: Greyscale map of the indicator crosscorrelogram for the cutoffs z = -1.28 and z = 0.84.
From these latter figures the conclusion is that their departure from bigaussianity is small. Note that an analysis based on extreme-cutoff correlograms could be misleading and inconclusive. The decision to reject bigaussianity from indicator correlograms alone cannot yet be considered a reliable way to decide whether a data set is close to bigaussianity or not.
5.1.2 Principal Component Correlograms:
Principal component correlograms are based on the orthogonalization of the indicator covariance matrix Σ_I(h) and the multiplication of the indicator vector I(x; z) by the orthogonal matrix A^T. In this exercise, the orthogonalization is done at |h| = 0, i.e. for the covariance matrix:

            | 0.09  0.08  0.07  0.06  0.05  0.04  0.03  0.02  0.01 |
            |       0.16  0.14  0.12  0.10  0.08  0.06  0.04  0.02 |
            |             0.21  0.18  0.15  0.12  0.09  0.06  0.03 |
            |                   0.24  0.20  0.16  0.12  0.08  0.04 |
  Σ_I(0) =  |                         0.25  0.20  0.15  0.10  0.05 |   (5.2)
            |       Symm.                   0.24  0.18  0.12  0.06 |
            |                                     0.21  0.14  0.07 |
            |                                           0.16  0.08 |
            |                                                 0.09 |
A"
is obtained by singular rnlue decompaition (Appendix A). Nine
principal components (yr) were obtained at the 1600 locations and their correlogra,ms and
The orthogonal matrix
crosscorrelograms were computed.
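A minimal sketch of this orthogonalization and transformation (Python with numpy; the 1600 standard normal values below are simulated stand-ins for the exhaustive data set, not the actual realization):

import numpy as np

# Indicator covariance at h = 0 for quantile cutoffs p_j = j/10
# (expression (5.2)): Cov{I(z_j), I(z_k)} = min(p_j, p_k) - p_j p_k.
p = np.arange(1, 10) / 10.0
sigma0 = np.minimum.outer(p, p) - np.outer(p, p)

# Spectral decomposition; the columns of A are orthonormal eigenvectors.
lam, A = np.linalg.eigh(sigma0)

# Transform the indicator vectors into principal components y = A^T i.
zk = np.array([-1.28, -0.84, -0.52, -0.25, 0.0, 0.25, 0.52, 0.84, 1.28])
z = np.random.default_rng(0).standard_normal(1600)
ind = (z[:, None] <= zk).astype(float)   # 1600 x 9 indicator matrix
y = (ind - p) @ A                        # centered PCs, one row per location

# Crosscovariances of the y's at h = 0 are zero by construction:
print(np.round(np.cov(y.T), 3))          # approximately diagonal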
Figure 5.29 presents the first principal component correlogram; the original geometric anisotropy is preserved and the correlation magnitude appears greater than that of any indicator correlogram. It seems that the first component synthesizes much of the variability of the indicators. Figure 5.30 shows a greyscale map which appears quite similar to that of figure 5.2. This implies that inference of the first principal component correlogram will have the same advantages and disadvantages as inference of the Z-correlogram.
Theorem 3.1 proves that the crosscorrelograms between even and odd principal components are null for all h if the data set is bigaussian and the cutoffs are quantile-symmetric. Figure 5.31 presents the crosscorrelogram between the first and second principal components, with a clear zero correlation level. Therefore, for practical purposes theorem 3.1 holds for bivariate distributions whose departure from bigaussianity is not dramatic.
The IKPCA approach does not require bigaussianity; it requires that the level of crosscorrelations be negligible. Figure 5.32 presents the first and third principal component crosscorrelogram, again with a quasi-zero correlation level. Note that orthogonalization at |h| = 0 entails zero crosscorrelation only at the origin. The crosscorrelations between the first component and all other components are practically null, as can be observed in figures 5.33 to 5.38.
Figures 5.39 to 5.46 are the principal component correlograms for the second to the ninth principal components. The second principal component shows the same geometric anisotropy and a shorter practical range than the first component correlogram.
Figure 5.29: First principal component correlogram.
Figure 5.30: Greyscale map of the first principal component.
Figure 5.31: First and second principal component crosscorrelogram. A null correlation is observed, in accordance with theorem 3.1.
Figure 5.32: First and third principal component crosscorrelogram. A null correlation is observed for practical purposes.
Figure 5.33: First and fourth principal component crosscorrelogram. A null correlation is observed.
Figure 5.34: First and fifth principal component crosscorrelogram. A null correlation is observed.
Figure 5.35: First and sixth principal component crosscorrelogram. A null correlation is observed.
Figure 5.36: First and seventh principal component crosscorrelogram. A null correlation is observed.
Figure 5.37: First and eighth principal component crosscorrelogram. A null correlation is observed.
Figure 5.38: First and ninth principal component crosscorrelogram. A null correlation is observed.
Figure 5.39: Second principal component correlogram. A severe destructuration can be observed.
From the third to the ninth principal component a strong destructuration of the correlograms is observed, as expected from relation (3.66). For practical purposes there is no sense in considering such correlograms as a source of correlation; their magnitudes and practical ranges are small enough to warrant considering them as pure nugget effect.
As for verification of the constitutive hypothesis of IKPCA, figures 5.31 to 5.38 have already presented crosscorrelations which are consistent with that hypothesis, and partial verification of theorem 3.1 is given by figures 5.31 and 5.38. All remaining crosscorrelations show the same pattern of very small correlation level. For example, figure 5.47 presents the principal component crosscorrelogram for the third and fifth components, whose magnitude can be considered null and whose effect on the constitutive hypothesis is therefore insignificant. As the principal components get higher the crosscorrelations become almost nonexistent.
The greyscale map of figure 5.48 shows that the crosscorrelation between the fourth and sixth principal components can also be considered as nought. There are no important crosscorrelations between any of the principal components. Thus, the main assumption required for application of IKPCA is seen to be verified to a very good approximation.
Figure 5.40: Third principal component correlogram. A very short practical range is observed.
Figure 5.41: Fourth principal component correlogram, which can be considered as pure nugget effect.
Figure 5.42: Fifth principal component correlogram, which can be considered as pure nugget effect.
Figure 5.43: Sixth principal component correlogram, which can be considered as pure nugget effect.
Figure 5.44: Seventh principal component correlogram, which can be considered as pure nugget effect.
Figure 5.45: Eighth principal component correlogram, which can be considered as pure nugget effect.
Figure 5.46: Ninth principal component correlogram, which can be considered as pure nugget effect.
Figure 5.47: Third and fifth principal component crosscorrelogram, which can be considered as a null correlation.
Figure 5.48: Greyscale map of the fourth and sixth principal component crosscorrelogram. The crosscorrelation magnitude is so small that it can be considered null.
5.2 Estimation of Points:
Once the principal component crosscorrelograms have been computed and their level of correlation checked as null, IKPCA can be applied by solving the constrained normal equation system (4.6). According to relation (4.24) its performance should be superior or equal to that of IK, at least in terms of estimation variance.
IKPCA being a nonparametric technique, its effectiveness relies on a good inference of the spatial variability (correlogram or variogram) and enough informing data. From the 1600 known locations, 1224 points will be estimated, each with an identical configuration of 22 local data. Figure 5.49 shows one point to be estimated and the corresponding 22 data locations. Since the data configuration is the same for all 1224 points, only one constrained normal system needs to be solved per component, as sketched below.
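A sketch of that single solve (Python with numpy; the exponential correlogram, its range and the 22 data coordinates are hypothetical stand-ins for the modeled principal component correlograms and the configuration of figure 5.49):

import numpy as np

def rho(h, a=10.0):
    # Hypothetical isotropic exponential correlogram (a stand-in for
    # the principal component correlograms of the case study).
    return np.exp(-h / a)

def ok_weights(K, k):
    # Constrained normal system: the covariance matrix is bordered by
    # the unbiasedness condition (weights summing to one).
    n = K.shape[0]
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = K
    A[n, n] = 0.0
    b = np.append(k, 1.0)
    sol = np.linalg.solve(A, b)
    return sol[:n], sol[n]                   # weights, Lagrange multiplier

# Hypothetical 22-point configuration around the point to estimate.
rng = np.random.default_rng(1)
xdata = rng.uniform(-5.0, 5.0, size=(22, 2))
x0 = np.zeros(2)

d = np.linalg.norm(xdata[:, None, :] - xdata[None, :, :], axis=-1)
K = rho(d)                                    # 22 x 22 data-to-data
k = rho(np.linalg.norm(xdata - x0, axis=-1))  # data-to-unknown

w, mu = ok_weights(K, k)
# Because the configuration is identical at all 1224 points, w is
# computed once per principal component and reused everywhere:
# y*(x0) = w @ y_data.
print(w.sum())                                # 1.0 (unbiasedness)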
5.2.1 Modeling Correlograms:
Solution of equation system (4.6) requires knowledge of the correlogram function. However, if the exhaustive information is made available (only for the purpose of that inference), there is no need to model it with a positive definite function: the coefficients of the kriging system can be read directly from the exhaustive correlogram.
Figure 5.49: Data configuration used by IKPCA.
A data base of exhaustive correlogram values for all possible separation vectors h was generated. This information is then input into the kriging system (4.6). In order to ensure positive estimation variances, the corresponding covariance matrix (without the constraint) has been factorized to compute its eigenvalues (see Appendix A). It was found that for the chosen configuration (figure 5.49) all eigenvalues are positive. In the hypothetical case of negative eigenvalues, a modification of the original matrix can be accomplished by adding a diagonal matrix whose norm is minimum (Gill and Murray, 1978), preserving almost totally the original structure of variability; a sketch of such a check follows.
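A sketch of that check (Python with numpy; K stands for the unconstrained covariance matrix of the chosen configuration):

import numpy as np

def ensure_positive_definite(K, eps=1e-8):
    # Factorize the symmetric covariance matrix and inspect eigenvalues.
    lam = np.linalg.eigvalsh(K)
    if lam.min() > 0.0:
        return K                    # all eigenvalues positive: nothing to do
    # Otherwise add the smallest diagonal matrix restoring positivity
    # (in the spirit of Gill and Murray, 1978): shift by |lam_min| + eps.
    shift = -lam.min() + eps
    return K + shift * np.eye(K.shape[0])

# Usage with a hypothetical 22 x 22 covariance matrix K:
# K_pd = ensure_positive_definite(K)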
5.2.2 Performance of the Conditional cdf F*(x; z|(n')):
Using the data configuration shown in figure 5.49, the conditional cdf F*(x; z|(n')) has been estimated at 1224 points. Several criteria have been considered to measure the IKPCA performance: reliability of the corresponding estimates of proportions, quantity of metal recovery factor, and tonnage recovery factor. The IK and IKPCA estimates of the conditional cdf have been compared for the above estimations.
Furthermore, a simplification of the IKPCA approach was considered, noting that from the third to the ninth principal component correlogram the direct correlation is negligible. This approximation is based on the results shown in the last section, where the higher the component the lesser the correlation. Consequently, the third to the ninth components were estimated by taking an arithmetic average of the corresponding 22 samples.
Predicting Quantiles:
For each one of the 1224 locations at which the conditional cdf has been estimated, the p-quantile q*_p(x_a) is retrieved from F*(x_a; z|(n')), such that:

F*(x_a; q*_p(x_a)|(n')) = p   (5.3)

Since knowledge of the conditional cdf F*(x_a; z|(n')) is limited to only nine cutoffs, the p-quantile q*_p(x_a) needs to be interpolated, here through linear interpolation. Linear interpolation assumes that the interclass distribution is uniform, which is usually not true. However, in this exercise that approximation is applied equally to all the techniques being compared; thus the errors introduced by this assumption are shared fairly by all three techniques studied: IK, IKPCA and the approximated IKPCA.
If the measure of uncertainty F*(x_a; z|(n')) is reliable, then on average over all points x_a in the set of 1224 the actual proportion p* of values z(x_a) ≤ q*_p(x_a) should be close to p. Therefore:

p* = (1/1224) Σ_{a=1}^{1224} i(x_a; p)   (5.4)

should be close to p, with:

i(x_a; p) = 1 if z(x_a) ≤ q*_p(x_a), and 0 otherwise   (5.5)
Note that p is the predicted proportion and p* is the actual proportion.
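A sketch of the quantile retrieval (Python with numpy; cdf_vals stands for the nine estimated values F*(x_a; z_k|(n')) at one location, already corrected for order relations):

import numpy as np

zk = np.array([-1.28, -0.84, -0.52, -0.25, 0.0, 0.25, 0.52, 0.84, 1.28])

def p_quantile(cdf_vals, p):
    # Linear interpolation of the estimated conditional cdf between
    # cutoffs; this assumes a uniform distribution within each class.
    # Values of p outside [cdf_vals[0], cdf_vals[-1]] are clamped to
    # the extreme cutoffs.
    return np.interp(p, cdf_vals, zk)

# Example with a hypothetical set of nine estimated cdf values:
cdf_vals = np.array([0.08, 0.19, 0.31, 0.42, 0.50, 0.61, 0.72, 0.81, 0.92])
print(p_quantile(cdf_vals, 0.5))   # interpolated median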
Table 5.2 shows the scores (p*, p) for the three estimation algorithms: IK, IKPCA, and the simplified version of IKPCA assuming negligible correlations for the third to the ninth components.
From table 5.2, it is observed that for values of p less than 0.5 all the estimators overestimate the respective actual proportion. IK is slightly the best estimator; IKPCA and its approximated version that uses only two correlograms are almost identical. The situation changes for proportions p greater than 0.5, with all estimators again overestimating the actual proportion.
Figure 5.50: Scatterplot of the predicted and actual proportions. (*) IK estimator, (+) the exact IKPCA and (o) the simplified version of IKPCA.
Figure 5.50 shows a scattergram of the predicted versus actual proportions.
For this particular case, the use of more bivariate information does not yield a better estimator of the actual proportion. Note, however, that the simplified version of IKPCA saves a considerable amount of computational effort, calling only for the solution of two systems and, in a practical situation, the modelling of only two correlograms. The precision of IK in practical terms is comparable to the precision of any version of IKPCA; therefore the less demanding approximated IKPCA is an excellent alternative to IK.
Predicting Proportions:
In this second test two quantiles q*_{p1}(x_a) and q*_{p2}(x_a) are retrieved from the conditional cdf F*(x_a; z|(n')). The probability for z(x_a) to be in the interval [q*_{p1}(x_a), q*_{p2}(x_a)] is predicted to be:

p = p2 - p1   (5.6)

with: p2 = 1 - p1
Figure 5.51: Scatterplot of the predicted and actual proportions. (*) IK estimator, (+) the exact IKPCA and (o) the simplified version of IKPCA.
The actual proportion of values falling in this interval is:

p* = (1/1224) Σ_{a=1}^{1224} [i(x_a; p2) - i(x_a; p1)]   (5.7)

Therefore:

p* = (1/1224) Σ_{a=1}^{1224} i(x_a; p2) - (1/1224) Σ_{a=1}^{1224} i(x_a; p1) = p2* - p1*
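A sketch of this coverage computation (Python with numpy; z_true, q1 and q2 stand for the 1224 true values and the two interpolated quantiles at each point):

import numpy as np

def interval_coverage(z_true, q1, q2):
    # Actual proportion of true values falling in (q1, q2], i.e.
    # p* = p2* - p1* averaged over the 1224 estimated points.
    inside = (z_true > q1) & (z_true <= q2)
    return inside.mean()

# Hypothetical usage: predicted p = p2 - p1 = 0.75 - 0.25 = 0.5;
# coverage = interval_coverage(z_true, q25, q75); compare with 0.5.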
Table 5.3 presents a comparison between the proportions predicted by IK, IKPCA and the simplified version of IKPCA. For this data set, all estimators underestimate the actual proportions p*, which is consistent with the last exercise. Figure 5.51 shows a scattergram of actual and predicted proportions. It appears from that figure that the proportions for extreme cutoffs are better estimated than for median cutoffs. The reason is found in figure 5.50: for example, the overestimation in figure 5.51 is maximum when p = 0.5, and that interval corresponds to p2 = 0.75 and p1 = 0.25 in figure 5.50, where the maximum deviations are precisely found.
There is no clear advantage in estimating probability intervals using IK, IKPCA or the simplified IKPCA. However, the last and least demanding estimator achieves equivalent results.
Quantity of Metal Recovery Factor:
The point support quantity of metal recovery factor within a domain A is defined as:

Q(A; z) = (1/|A|) ∫_A (1 - i(x; z)) z(x) dx   (5.8)

with the indicator defined traditionally as:

i(x; z) = 1 if z(x) ≤ z, and 0 otherwise

Therefore, for this particular case the exact quantity of metal factor is:

Q(A; z) = (1/1224) Σ_{a=1}^{1224} (1 - i(x_a; z)) z(x_a)   (5.9)

and the corresponding estimator of the quantity of metal, using an estimated conditional cdf, is:

Q*(A; z_k) = (1/1224) Σ_{a=1}^{1224} Σ_{j=k}^{K} m_j [F*(x_a; z_{j+1}|(n')) - F*(x_a; z_j|(n'))]   (5.10)

with m_j being the true exhaustive class mean for grades valued within [z_j, z_{j+1}]. In this exercise there is no need to interpolate the conditional cdf; therefore, additional errors due to the assumption of an interclass distribution are not introduced. In practice, m_j can be estimated from a global or a local measure of class central tendency, e.g. by the local class mean estimated from neighborhood data, or by the global class mean estimated from the entire data set. The decision to take the exhaustive true class means is particular to this application, and those class means are shared by all the techniques being compared. In practice the m_j would have to be inferred from the data.
IK, IKPCA and the reduced version of IKPCA are compared. Figure 5.52 shows a scattergram of the results derived from the three techniques. Again, IK appears to outperform the IKPCA estimators. There are no dramatic deviations from the true values, and the reduced version of IKPCA is almost equal to the full implementation of IKPCA using all nine correlograms. Table 5.4 presents the corresponding numerical results.
Figure 5.52: Scatterplot of the predicted quantity of metal recovery factor and the actual values. (*) IK estimator, (+) the exact IKPCA and (o) the simplified version of IKPCA.
Tonnage Recovery Factor:
The tonnage recovery factor is defined as:

T(A; z) = 1 - (1/|A|) ∫_A i(x; z) dx   (5.11)

or, in a discrete form:

T(A; z) = 1 - (1/1224) Σ_{a=1}^{1224} i(x_a; z)   (5.12)

An estimator of the tonnage recovery factor based on the inferred conditional cdf is:

T*(A; z) = 1 - (1/1224) Σ_{a=1}^{1224} F*(x_a; z|(n'))   (5.13)
Figure 5.53 and table 5.5 present the results for the nine cutoffs. Note that for this particular test the conditional cdf need not be interpolated. The actual and predicted values are almost equal. The approximated version of IKPCA provides the same score with the least effort.
Figure 5.53: Scatterplot of the predicted tonnage recovery factor and the actual values. (*) IK estimator, (+) the exact IKPCA and (o) the simplified version of IKPCA.
A Comparison with the MG Approach:
It can be shown that if a data set is multivariate gaussian, the conditional cdf can be obtained by simple kriging: the conditional distribution is then gaussian, with mean and variance provided by simple kriging. For the realization (1600 points) studied here, it is known that the multivariate distribution is close to multigaussianity. Certainly, there is some departure from multigaussianity, which is evident from the previous comparison of indicator correlograms and crosscorrelograms and from the satisfaction of theorem 3.1, but this departure is not enough to invalidate the multigaussian hypothesis.
The MG estimates were obtained by solving a kriging system with a data configuration identical to that used for IK and IKPCA; the exhaustive z-correlogram and mean were used.
The performances of MG, IK, IKPCA and the approximated IKPCA are compared for the estimation of the following integral:

S(A; z) = (1/|A|) ∫_A i(x; z) dx   (5.14)

whose true value is:
S(A; z) = (1/1224) Σ_{a=1}^{1224} i(x_a; z)   (5.15)

Figure 5.54: Scatterplot of the predicted S(A; z) factors and the MG factors. (*) IK estimator, (+) the exact IKPCA and (o) the simplified version of IKPCA.
The four techniques estimate integral (5.14) using the following expression:

S*(A; z) = (1/1224) Σ_{a=1}^{1224} F*(x_a; z|(n'))   (5.16)

which is the composite estimated conditional cdf over A, and is an estimate of the actual proportion S(A; z).
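A sketch of the MG step at a single location (Python with numpy/scipy; K, k and z_data stand for the covariance matrix, covariance vector and data values of the 22-point configuration, built from the exhaustive z-correlogram):

import numpy as np
from scipy.stats import norm

def mg_cdf(z_cut, z_data, K, k):
    # MG estimate at one location: simple kriging with zero mean on the
    # standard normal z-values; the conditional cdf is gaussian with
    # the SK mean and variance.
    w = np.linalg.solve(K, k)
    m_sk = w @ z_data
    s2_sk = max(1.0 - w @ k, 1e-12)   # C(0) = 1 for a correlogram
    return norm.cdf((z_cut - m_sk) / np.sqrt(s2_sk))

# Hypothetical usage with the nine cutoffs z_k of table 5.1:
# F_mg = [mg_cdf(zc, z_data, K, k) for zc in zk]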
Figure 5.54 shows that globally there is no distinction between the four estimators. Table 5.6 presents the numerical results. The approximated version of IKPCA is almost equal to the MG estimator; therefore the global performance of a bivariate estimator is comparable, for this particular test, to the performance of a multivariate estimator.
Figures 5.55 to 5.57 show scattergrams comparing the IK, IKPCA and approximated IKPCA estimates of type (5.15) with the corresponding MG estimates for the first cutoff (z = -1.28). Observe that the IKPCA estimators yield less dispersion around the 45-degree line. However, these regression lines depart from the 45-degree line, i.e. a conditional bias is apparent for both the higher and lower probabilities. The conditional bias of the IK is small and partially hidden by the dispersion.
Figure 5.55: Scatterplot of the IK and MG estimates of F*(x_0; -1.28).
Figures 5.58 to 5.60 present the scattergrams for the median cutoff (z = 0.0). The IK estimates show the same dispersion around the 45-degree line, and alternating vertical bands can be seen. Sullivan (1984) noted the same banding and explained that for a given data configuration there is a finite number of combinations of 0's and 1's; thus the IK estimates for different locations are likely to be equal. Contrarily, the IKPCA estimates show less dispersion than the IK estimates and do not present bands. Recall that the IKPCA estimator is a linear combination of indicators at different cutoffs. Significant conditional bias is not observed for any estimator.
Figures 5.61 to 5.63 present the corresponding scattergrams for the extreme cutoff (z = 1.28). The IK estimates show the same dispersion around the 45-degree line and a conditional bias for the lower probabilities. The IKPCA estimates contain less dispersion than the IK estimates, but the conditional bias for the lower probabilities appears larger.
It seems at this point that the conditional bias increases as the absolute difference between the cutoff and the median increases. An explanation could be that IK and any form of IKPCA are only bivariate-type estimators, and their structural information level is not enough to yield estimates of a full multivariate distribution.
Figure 5.56: Scatterplot of the IKPCA and MG estimates of F*(x_0; -1.28).
Figure 5.57: Scatterplot of the approximated IKPCA and MG estimates of F*(x_0; -1.28).
Figure 5.58: Scatterplot of the IK and MG estimates of F*(x_0; 0.0).
Figure 5.59: Scatterplot of the IKPCA and MG estimates of F*(x_0; 0.0).
Figure 5.60: Scatterplot of the approximated IKPCA and MG estimates of F*(x_0; 0.0).
Another source of approximation is that the indicator formalism amounts to discretizing the range of z(x) into a finite number of cutoffs; hence its resolution is highly dependent on that number of cutoffs. If the number of cutoffs for this example were increased and a new comparison study done, it is likely that for the same extreme cutoffs (z = ±1.28) the conditional bias would be smaller than when using only nine cutoffs. Thus, the problem of the number of cutoffs deserves further study. Regarding this problem, IKPCA offers an excellent solution, because the number of cutoffs can be increased without much additional work. The larger number of cutoffs will impact only the expression of the first components: relation (3.66) ensures, at least for bivariate distributions close to bigaussianity, that the higher principal component correlograms are likely to be pure nugget effect; therefore, no additional computational work or modeling of correlograms is required.
5.3 Estimation of Panels:
This exercise consists in the estimation of the following integral:

φ(V; z) = (1/|V|) ∫_V i(x; z) dx   (5.17)

which represents the true proportion within V of values z(x) ≤ z, where V is a square panel of dimensions 8 x 8 units.
Figure 5.61: Scatterplot of the IK and MG estimates of F*(x_0; 1.28).
Figure 5.62: Scatterplot of the IKPCA and MG estimates of F*(x_0; 1.28).
Figure 5.63: Scatterplot of the approximated IKPCA and MG estimates of F*(x_0; 1.28).
The entire domain, constituted by 1600 points, has been divided into 25 such panels, each containing 64 points.
Recall that IK and IKPCA are methods developed precisely to estimate spatial integrals of type (5.17). With the data configuration shown in figure 5.64 and using the nine cutoffs of table 5.1, the integral (5.17) is estimated. The true value is known from the 64 points available within each panel:

φ(V; z) = (1/64) Σ_{a=1}^{64} i(x_a; z),  x_a ∈ V   (5.18)

Different estimators of (5.17) are built using IK, IKPCA and the approximated version of IKPCA. Table 5.7 shows the average proportions over the 25 panels and compares them with the true proportions. There is no surprise: IK and IKPCA are globally unbiased estimators. The approximated IKPCA, which solves only two systems of equations, is also globally unbiased. Note that the data configuration used for this test is different from the data configuration used to compare MG with IK and IKPCA, where only 1224 points were considered, as opposed to the 1600 points defining the 25 panels; thus their statistics are different.
Figures 5.65 to 5.73 show the local performance for each panel compared with the true values. It seems that for all the cutoffs no significant conditional bias is present and that IK, IKPCA and its approximated version give almost identical estimates.
Figure 5.64: Data configuration for the estimation of panels.
The conclusion again is that the approximated IKPCA achieves results comparable to IK or the exact IKPCA at a fraction of their cost; thus it can be recommended as a fast and reliable alternative to IK or the exact IKPCA.
5.3.1 Simple IKPCA and Ordinary IKPCA:
In the fourth chapter it was said that doing ordinary IKPCA, which assumes that E[I(x; z)] is unknown and varies in space, implies that the indicator covariance matrix Σ_I(h) changes from one location to another, a situation inconsistent with the stationarity hypothesis. It was also argued that, notwithstanding the inconsistency, OK could be appropriate to account for local departures from stationarity.
This section presents a comparison between the simplified version of IKPCA using the simple and ordinary approaches, as applied to the estimation of integral (5.17). Table 5.8 presents the comparison for the average of the 25 panels. Three significant digits are used to emphasize the small differences.
Inspection of table 5.8 reveals that there are no significant differences and that both estimators can be considered equivalent for this case. Recall that the generation of this data set was based on an unconditional simulation with a stationary multivariate gaussian distribution; hence outliers are unlikely and local departures from stationarity are not expected.
Figure 5.65: Scatterplot of the composite distribution for z = -1.28 and the actual value. (*) IK estimator, (+) the exact IKPCA and (o) the simplified version of IKPCA.
Figure 5.66: Scatterplot of the composite distribution for z = -0.84 and the actual value. (*) IK estimator, (+) the exact IKPCA and (o) the simplified version of IKPCA.
Figure 5.67: Scatterplot of the composite distribution for z = -0.52 and the actual value. (*) IK estimator, (+) the exact IKPCA and (o) the simplified version of IKPCA.
Figure 5.68: Scatterplot of the composite distribution for z = -0.25 and the actual value. (*) IK estimator, (+) the exact IKPCA and (o) the simplified version of IKPCA.
Figure 5.69: Scatterplot of the composite distribution for z = 0.0 and the actual value. (*) IK estimator, (+) the exact IKPCA and (o) the simplified version of IKPCA.
Figure 5.70: Scatterplot of the composite distribution for z = 0.25 and the actual value. (*) IK estimator, (+) the exact IKPCA and (o) the simplified version of IKPCA.
Figure 5.71: Scatterplot of the composite distribution for z = 0.52 and the actual value. (*) IK estimator, (+) the exact IKPCA and (o) the simplified version of IKPCA.
Figure 5.72: Scatterplot of the composite distribution for z = 0.84 and the actual value. (*) IK estimator, (+) the exact IKPCA and (o) the simplified version of IKPCA.
Figure 5.73: Scatterplot of the composite distribution for z = 1.28 and the actual value. (*) IK estimator, (+) the exact IKPCA and (o) the simplified version of IKPCA.
Figures 5.74 to 5.76 show scattergrams of simple IKPCA (SIKPCA) estimates versus ordinary IKPCA (OIKPCA) estimates for the extreme cutoffs (z = ±1.28) and for the median cutoff (z = 0). The estimates of SIKPCA and OIKPCA appear identical, without any persistent pattern of overestimation or underestimation. No marked conditional bias is observed.
Figure 5.74: Composite distribution for z = -1.28 obtained by IK, OIKPCA and SIKPCA. (*) IK estimator, (+) the approximated OIKPCA and (o) the approximated SIKPCA.
Figure 5.75: Composite distribution for z = 0.0 obtained by IK, OIKPCA and SIKPCA. (*) IK estimator, (+) the approximated OIKPCA and (o) the approximated SIKPCA.
Figure 5.76: Composite distribution for z = 1.28 obtained by IK, OIKPCA and SIKPCA. (*) IK estimator, (+) the approximated OIKPCA and (o) the approximated SIKPCA.
Table 5.1: The selected nine cutoffs.

  p     z_k
  0.1   -1.28
  0.2   -0.84
  0.3   -0.52
  0.4   -0.25
  0.5    0.00
  0.6    0.25
  0.7    0.52
  0.8    0.84
  0.9    1.28
Table 5.2: Predicted proportions for expression (5.4).

  p     p*_IK   p*_IKPCA   p*_IKPCA(approx.)
  0.1   0.056   0.045      0.038
  0.2   0.134   0.116      0.1086
  0.3   0.237   0.237      0.227
  0.4   0.362   0.361      0.363
  0.5   0.502   0.500      0.500
  0.6   0.643   0.650      0.641
  0.7   0.767   0.776      0.783
  0.8   0.862   0.884      0.884
  0.9   0.930   0.9u       0.950

Table 5.3: Predicted proportions for expression (5.7).

  p     p*_IK   p*_IKPCA   p*_IKPCA(approx.)
  0.1   0.156   0.140      0.134
  0.2   0.281   0.288      0.277
  0.3   0.416   0.417      0.424
  0.4   0.530   0.539      0.556
  0.5   0.641   0.662      0.684
  0.6   0.728   0.767      0.776
  0.7   0.814   0.835      0.849
  0.8   0.874   0.899      0.910
  0.9   0.924   0.943      0.953
Table 5.4: Predicted quantity of metal recovery factor.

  z       Q(A;z)   Q*_IK(A;z)   Q*_IKPCA(A;z)   Q*_IKPCA(approx.)
  -1.28   0.11     0.12         0.13            0.15
  -0.84   0.22     0.22         0.24            0.24
  -0.52   0.29     0.29         0.30            0.31
  -0.25   0.33     0.33         0.34            0.35
   0.00   0.34     0.35         0.36            0.36
   0.25   0.33     0.33         0.34            0.35
   0.52   0.29     0.29         0.31            0.31
   0.84   0.23     0.23         0.25            0.25
   1.28   0.13     0.13         0.14            0.16
Table 5.5: Predicted tonnage recovery factor.

  z       T(A;z)   T*(A;z)
  -1.28   0.77     -
  -0.84   0.78     -
  -0.52   0.68     0.68
  -0.25   0.57     0.57
   0.00   0.47     0.47
   0.25   0.36     0.36
   0.52   0.26     0.27
   0.84   0.17     0.18
   1.28   0.08     0.09
Table 5.6: Comparison between MG, IK, IKPCA and the approximated IKPCA.

  z       S*_IK(A;z)   S*_sim(A;z)
  -1.28   0.11         -
  -0.84   0.21         0.22
  -0.52   0.32         0.32
  -0.25   0.42         0.42
   0.00   0.53         0.53
   0.25   0.63         0.63
   0.52   0.73         0.73
   0.84   0.82         0.81
   1.28   0.91         0.90

Table 5.7: Panel estimation: comparison between IK, IKPCA and the approximated IKPCA.

  z       true φ(V;z)   IK     IKPCA   IKPCA(approx.)
  -1.28   0.1           0.09   0.09    0.09
  -0.84   0.2           0.20   0.20    0.20
  -0.52   0.3           0.29   0.29    0.29
  -0.25   0.4           0.43   0.43    0.43
   0.00   0.5           0.54   0.54    0.54
   0.25   0.6           0.64   0.64    0.64
   0.52   0.7           0.72   0.72    0.72
   0.84   0.8           0.80   0.80    0.80
   1.28   0.9           0.90   0.90    0.90
Table 5.8: Panel estimation: comparison between ordinary IKPCA and simple IKPCA.

  z       true φ(V;z)   φ*_OIKPCA   φ*_SIKPCA
  -1.28   0.1           0.088       -
  -0.84   0.2           0.199       0.196
  -0.52   0.3           0.294       0.292
  -0.25   0.4           0.429       0.427
   0.00   0.5           0.543       0.540
   0.25   0.6           0.640       0.636
   0.52   0.7           0.721       0.717
   0.84   0.8           0.808       0.804
   1.28   0.9           0.900       0.899
Chapter 6

Conclusions and Recommendations:

Indicator Kriging based on Principal Component Analysis, through the modelling of principal component correlograms, incorporates more bivariate information than the Indicator Kriging approach. That is precisely the key idea of IKPCA.
Like IK, CoIK or PK, IKPCA is a nonparametric technique where the bivariate distribution needs to be inferred from data and/or ancillary information. Its main hypothesis is the assumption that the principal component crosscorrelograms are negligible. Satisfaction of that condition and the availability of enough data are the requirements for applying the technique satisfactorily.
6.1 Bivariate Distribution:
This research shows that the bigaussian case is a very favorable bivariate distribution which satisfies the IKPCA constitutive hypothesis to an excellent degree of approximation. However, this does not mean that the application of IKPCA is restricted to bigaussian distributions. Its application is restricted only by the assumption that the principal component crosscorrelograms are almost null. Checking the performance of IKPCA for bivariate distributions strongly different from the bigaussian remains to be done. For distributions without significant departure from bigaussianity, the case study shows that IKPCA can be applied with reasonable results.
The exact implementation of IKPCA can be approximated by assuming that the autocorrelation of the higher principal components is almost null. This additional hypothesis can, and in practice must, be verified. This second approximation cuts the number of correlograms to model and the computational costs to a fraction of the effort required by the full IKPCA. It is conjectured that this second hypothesis is likely to hold whatever the bivariate distribution: recall that the first principal component is defined as the variable which contains most of the variability of all indicators at h = 0; hence the spatial variability expressed by the higher principal component correlograms should be small. This latter approximation also needs further investigation.
6.2 Number of Cutoffs:
The performance of IK or IKPCA on a multigaussian field is affected by the number of cutoffs. Increasing the number of cutoffs means increasing the resolution level of the indicator approach and, therefore, increasing the number of correlograms to model and the computational costs. Such an increase, which in theory is possible, is not recommendable in practice. However, if a few principal component correlograms can express and explain almost totally the spatial variability of all indicators, the number of cutoffs could be increased without any increase in modelling and computational times. If this is the case, the conditional cdf can be discretized into a large number of cutoffs, and the errors introduced by interpolation of the estimated conditional cdf's would be minimized.
6.3 Inference of Principal Component Correlograms:
Practical application of any geostatistical tool relies on inference of some measure of spatial correlation. IKPCA is no exception and requires inference of the principal component correlograms and crosscorrelograms, which appear as linear combinations of the original indicator correlograms and crosscorrelograms (expression 3.40).
For the case studied here there is a strong similarity between the z-correlogram and the first principal component correlogram. Thus the inference of the first principal component correlogram will face the same problems as the inference of the z-correlogram: in particular, spatial clustering. However, a principal component correlogram being a weighted average of indicator correlograms and crosscorrelograms, its structure is likely to be better defined. Recall that the z-covariance can be expressed in terms of an infinite sum of indicator covariances and crosscovariances (Matheron, 1982; Alabert, 1987).
Prior although limited experience (Rossi, 1988; Schofield, 1988) has also shown that a structure observed on the z-correlogram does not entail observation of a similar structure on the indicator correlograms: it is possible to observe a reasonable structure on the z-correlogram and none on the indicator correlograms. Spatial clustering and the codification of the original variable into a finite number of 0's and 1's are possible causes for such inference problems.
The conjecture is that inference of spatial correlation for the principal components should be easier than inference of indicator correlograms. Such conjecture should be verified under different conditions of spatial clustering and for different bivariate distributions.
6.4 Indicator Kriging:
One of the points stressed throughout this investigation is that IK is an extreme simplification of CoIK, where the crosscorrelation is assumed null and irrelevant for the goal at hand. For the case study, the performance of IK was surprisingly good for all the tests considered. The assumption of null indicator crosscorrelation, although not supported by the data, does not yield poor performance. Therefore, for bivariate distributions close to binormality, IK appears as an efficient technique.
The conditional bias observed for some tests may be explained by the finite number of cutoffs, whose importance has been underlined above. Here lies precisely the advantage of the approximated IKPCA: increase the number of cutoffs and possibly decrease the level of conditional bias.
6.5 Indicator Kriging based on Principal Component Analysis:
There is no question that this approach uses more bivariate information than the IK approach. The case study has shown a reasonable performance of IKPCA in estimating the quantity of metal recovery factor, the tonnage recovery factor and proportions. The deviations that occur when quantiles and probability intervals are estimated may be due to the inappropriate linear interpolation, i.e. the assumption of a uniform distribution within classes. Estimation of the quantity of metal and tonnage factors, which do not require such interpolation, shows excellent performance.
Unfortunately, the conditional bias observed for IK remains for IKPCA when estimating the conditional distribution. The above argument about lack of resolution generating such conditional bias may again apply. One study could consist in increasing the number of cutoffs to check whether the conditional bias is reduced significantly.
6.6 Nonlinear Combinations of Indicators:
The principal components are expressed as linear combinations of indicators, with weights ensuring that the crosscovariance is exactly zero for h = 0. This property entails that the transformed variables are referred to new axes which are straight lines.
Applying the same idea, a transformed variable using nonlinear combinations would imply that the new axes are not necessarily straight lines. For the particular case of indicators, valued as 0's or 1's, nonlinear combinations of indicators yield back linear combinations of indicators: any function, linear or not, of indicator values is a linear combination of the same indicators (Journel, 1986). Thus, the idea of using nonlinear combinations of indicator data would not bring additional advantages over the linear principal component approach. However, one may consider improving the estimation by introducing products of two or more indicator data, which would call for trivariate-type covariances. The present frontier of bivariate-type estimators must be broken.
Appendix A

Spectral Decomposition of Σ_I(h)

Principal Component Analysis requires the decomposition of the indicator covariance matrix Σ_I(h) as:

Σ_I(h) = A Λ A^T   (A.1)

with A being an orthogonal matrix whose columns are the covariance matrix eigenvectors, and Λ a diagonal matrix whose diagonal elements are the corresponding eigenvalues. Therefore, the numerical problem consists in either the direct computation of the decomposition (A.1) or the computation of the eigenvalues of the covariance matrix Σ_I(h).
Solution for the eigenvalues need not call for solving the traditional characteristic polynomial:

det[Σ_I(h) - λI] = 0   (A.2)

with λ being an eigenvalue of the covariance matrix. The very reason to avoid equation (A.2) is that whenever its order is greater than 3 its solution calls for iterative methods which can demand large CPU time, possibly without convergence.
The solution is to turn the original problem into an easier yet equivalent formulation. This simplification is achieved through the concept of similarity transformations (Stoer and Bulirsch, 1980), where the original matrix Σ_I(h) is premultiplied and postmultiplied by matrices T^{-1} and T defined such that the eigenvalues of the new matrix remain the same. The proof consists in showing that the characteristic polynomial of the transformed matrix is equal to that of the original matrix:
det[T^{-1} Σ_I(h) T - λI] = det[T^{-1} (Σ_I(h) - λI) T]

therefore:

det[T^{-1} Σ_I(h) T - λI] = det[T^{-1}] det[Σ_I(h) - λI] det[T] = det[Σ_I(h) - λI]   (A.3)
A.1 Householder Transformations:
Householder matrices are symmetric and orthogonal, and are defined by the following expression:

P = I - (2 / v^T v) v v^T   (A.4)

with the vector v being defined to ensure:

P x = ||x|| e_i   (A.5)

with e_i being the ith column of the identity matrix. The goal of transformation (A.5) is to yield some zero entries for a given matrix.
This last relation is used to build a matrix similar to the indicator covariance matrix. Premultiply and postmultiply Σ_I(h) by a Householder matrix as follows:

P_1^T Σ_I(h) P_1

The resultant new matrix is symmetric and its eigenvalues are equal to the eigenvalues of the original covariance matrix. Repeated premultiplication and postmultiplication of Σ_I(h) by Householder matrices yields a tridiagonal matrix H (Stoer and Bulirsch, 1980):

H = P_{K-2}^T ... P_1^T Σ_I(h) P_1 ... P_{K-2}   (A.6)

with K being the number of cutoffs and P_k the Householder matrix defined by:

P_k = | I_k   0                           |
      | 0     I_{K-k} - (2 / v^T v) v v^T |   (A.7)
where I_k is the k x k identity matrix, and the vector v is defined such that the kth column of the transformed matrix is reduced to a single nonzero entry below the diagonal:

[ * ... * C_I(h; z_{k+1}, z_k) 0 ... 0 ]^T

where the symbol * indicates a nonzero value and C_I(h; z_{k+1}, z_k) is the element at position k+1, k of the transformed matrix after the (K-2) Householder transformations. Note that matrix (A.6) presents a structure which is more amenable to the extraction of eigenvalues; a sketch of the reduction follows.
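A sketch of the reduction (A.6) (Python with numpy, a modern stand-in for the procedure described here):

import numpy as np

def householder_tridiagonalize(S):
    # Reduce a symmetric matrix to tridiagonal form by similarity
    # transformations H = P_{K-2}^T ... P_1^T S P_1 ... P_{K-2} (A.6);
    # the eigenvalues are preserved at every step.
    A = S.copy().astype(float)
    n = A.shape[0]
    for k in range(n - 2):
        x = A[k + 1:, k]
        v = x.copy()
        v[0] += np.sign(x[0] if x[0] != 0 else 1.0) * np.linalg.norm(x)
        if np.linalg.norm(v) < 1e-14:
            continue
        v /= np.linalg.norm(v)
        P = np.eye(n)
        P[k + 1:, k + 1:] -= 2.0 * np.outer(v, v)   # Householder block (A.7)
        A = P @ A @ P                                # P is symmetric orthogonal
    return A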
A.2 The QR Algorithm:
This is a stable, iterative method that uses Householder transformations and the concept of similarity to obtain the eigenvalues and eigenvectors. Convergence and numerical stability have been proved in Francis (1961) and Wilkinson (1965).
The idea of the algorithm is to apply Householder transformations to obtain a triangular matrix. This is done by premultiplying Σ_I(h), (K-1) times, as follows:

P_{K-1} ... P_1 Σ_I(h) = R (upper triangular)   (A.8)

The resulting triangular matrix is called R_i, i being an index referring to the number of iterations. The Householder matrices also define the orthogonal matrix Q_i:

Q_i^T = P_{K-1} ... P_1

Since the Householder matrices are symmetric, it comes:

Q_1 = P_1 ... P_{K-1}   (A.9)

Therefore, the QR decomposition of the indicator covariance matrix is:

Σ_I^1(h) = Q_1 R_1

where the superscript on the indicator covariance matrix refers to the original matrix. This can be generalized to obtain the relations:

Σ_I^i(h) = Q_i R_i   (A.10)

and:

Σ_I^{i+1}(h) = R_i Q_i   (A.11)

Both expressions (A.10) and (A.11) define a family of similar matrices (Francis, 1961), since:

Σ_I^{i+1}(h) = Q_i^T Σ_I^i(h) Q_i

Applying the recurrence relation (A.11) backward, the following equation is obtained:

Σ_I^{i+1}(h) = (Q_1 ... Q_i)^T Σ_I^1(h) (Q_1 ... Q_i)   (A.12)

and this last matrix converges to a diagonal matrix whose elements are the eigenvalues. The form (A.12) is similar to the familiar relation:

Λ = A^T Σ_I(h) A

thus, by identification:

A^T = (Q_1 ... Q_i)^T and A = (Q_1 ... Q_i)   (A.13)

Application of the QR decomposition (A.10) and the recurrence relation (A.11) provides the numerical procedure to obtain the eigenvalues and eigenvectors of the indicator covariance matrix, and more generally of any covariance matrix; a sketch follows.
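A sketch of the iteration (Python with numpy; the plain, unshifted form shown here suffices for a small positive definite matrix such as Σ_I(0)):

import numpy as np

def qr_eigenvalues(S, iters=200):
    # QR iteration (A.10)-(A.11): S_{i+1} = R_i Q_i converges, for a
    # symmetric matrix, to a diagonal matrix of eigenvalues; the
    # accumulated product of the Q_i is the eigenvector matrix A (A.13).
    A_i = S.copy().astype(float)
    V = np.eye(S.shape[0])
    for _ in range(iters):
        Q, R = np.linalg.qr(A_i)
        A_i = R @ Q
        V = V @ Q
    return np.diag(A_i), V

# Check against the direct decomposition:
# lam, A = qr_eigenvalues(sigma0); compare with np.linalg.eigh(sigma0)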
A.3 The Singular Value Decomposition:
This decomposition, like the QR factorization, decomposes the indicator covariance matrix Σ_I(h) as:

Σ_I(h) = U S V^T   (A.14)

with U and V being orthogonal matrices and S a diagonal matrix. This decomposition is related to the spectral decomposition: the elements σ_i of S, called singular values, are the nonnegative square roots of the eigenvalues of Σ_I^2(h). Indeed, from the singular value decomposition (A.14) of Σ_I(h) the following relation is derived:

Σ_I^2(h) = U S^2 U^T   (A.15)

a relation which corresponds to the spectral decomposition of Σ_I^2(h). The squares of the singular values σ_i correspond to the squares of the eigenvalues of that matrix. Therefore:

σ_i = |λ_i|   (A.16)

with λ_i being the eigenvalues of the original indicator covariance matrix. Note that if the matrix is positive definite the singular values are equal to the eigenvalues. If the matrix is nonsingular, the singular values will be positive even if some of the eigenvalues are negative; therefore the singular values are useful to obtain the orthogonal matrix U but are useless to test positive definiteness.
The numerical procedure (Golub and Reinsch, 1970) yields from the original matrix Σ_I(h) a bidiagonal matrix J_0 such that:

J_0 = P_K ... P_1 Σ_I(h) Q_1 ... Q_{K-2}   (A.17)

where the P_k and Q_k are Householder matrices (Golub and Kahan, 1965). Defining:

P = P_K ... P_1 and Q = Q_1 ... Q_{K-2}

the following matrix is tridiagonal:
J_0^T J_0 = Q^T Σ_I^2(h) Q   (A.18)

This matrix, by an argument of similarity, shows that the singular values of J_0 are equal to the singular values of the indicator covariance matrix. Again, the Householder transformations have simplified the original problem to a simpler one with a concise structure.
The bidiagonal matrix J_0 is diagonalized by a special QR algorithm such that the sequence:

J_{i+1} = G_i^T J_i H_i

converges to a diagonal matrix. Note that the matrices G_i and H_i are orthogonal matrices and i is an iteration index. Thus, the singular values are the elements of the diagonal matrix J_{i+1}, and the orthogonal matrices U and V are:

U = P H_1 ... H_n   (A.19)

V^T = G_n ... G_1 Q^T   (A.20)

with n being the total number of iterations. For the particular case of Σ_I(h) the orthogonal matrices U and V are identical.
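A sketch of relation (A.16) (Python with numpy; the 5 x 5 symmetric test matrix is an arbitrary stand-in, generally indefinite):

import numpy as np

# For a symmetric matrix the singular values equal the absolute values
# of the eigenvalues, relation (A.16).
rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
S = (B + B.T) / 2.0
U, s, Vt = np.linalg.svd(S)
lam = np.linalg.eigvalsh(S)
print(np.allclose(np.sort(s), np.sort(np.abs(lam))))   # True
# Hence the SVD alone cannot certify positive definiteness.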
App"ndix B
Computation of Indicator
Crosscovariances
c**********************************************************************
c
c     Computation of Indicator Covariances and Crosscovariances
c     assuming a bigaussian model.
c
c     Vinicio Suro-Perez
c     Applied Earth Sciences Department
c     Stanford University
c
c     December, 1988
c
c**********************************************************************
      common/strucl/nst,c0,c(4),aa(4),it(4),cosx(3),cosy(3),cosz(3),
     *anix(4),aniy(4),aniz(4)
      double precision nam
      common inp,iout,testk,dum,nam,q90
      dimension x(600),y(600),vr(600),ok(35,35),okv(35,35)
      double precision cov,zl(10)
c
c     Output file with indicator correlograms and crosscorrelograms.
c
      open(8,file='crosscova.nor')
      rewind(8)
c
c     Input file with the parameters of the z-variogram.
c
      open(9,file='cova.par')
      rewind(9)
c
c     Reading number of structures and nugget effect.
c
      read(9,*)nst,c0
c
c     Reading sill, range, type of model and anisotropy ratios
c     in x and y.
c
      do 250 k=1,nst
      read(9,*)c(k),aa(k),it(k),anix(k),aniy(k)
c
c     Only the 2D case is considered.
c
      aniz(k)=1.0
 250  continue
c
c     Anisotropy directions.
c
      read(9,*)(cosx(k),k=1,3)
      read(9,*)(cosy(k),k=1,3)
      read(9,*)(cosz(k),k=1,3)
c
c     Number of cutoffs, number of lags required in the
c     crosscorrelogram, and spacing of lags.
c
      read(9,*)nzc,nlags,dxxx
c
c     The cutoffs.
c
      read(9,*)(zl(i),i=1,nzc)
      x(1)=0.0
      do 90 l1=1,nzc
      do 95 k1=l1,nzc
      write(8,*)l1,k1
      do 901 kjh=1,nlags
      delta=dxxx*float(kjh-1)
      x11=x(1)+delta
c
c     Computing the indicator correlogram or crosscorrelogram.
c
      call cova(x(1),0.0,0.,x11,0.0,0.,cov,
     *zl,l1,k1,iflag)
      write(8,902)delta,cov
 902  format(2(3x,f18.9))
 901  continue
 95   continue
 90   continue
      stop
      end
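
The read statements above imply the following free-format layout for the parameter file
cova.par. The values shown are only an illustration (a single isotropic spherical structure
and three cutoffs at the standard normal quartiles and median), not a file from the thesis:

    1   0.1                     nst, c0 (nugget effect)
    0.9  10.0  1  1.0  1.0      c, a, it (1 = spherical), anix, aniy
    1.  0.  0.                  cosx
    0.  1.  0.                  cosy
    0.  0.  1.                  cosz
    3   20   1.0                nzc, nlags, dxxx
    -0.6745  0.0  0.6745        cutoffs zl(1)...zl(nzc)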
c**********************************************************************
      subroutine cova(x1,y1,z1,x2,y2,z2,cov,zl,l1,k1,iflag)
c**********************************************************************
c
c     Calculates the covariance between two points
c     given a variogram model.
c
c     ***input***
c
c     x1,y1,z1  -real coordinates of first point
c     x2,y2,z2  -real coordinates of second point
c     common    -structural variables (see below)
c
c     ***output***
c
c     cov       -calculated covariance
c
c     ***common***
c
c     common/strucl/ - covariance parameters
c
c     nst       -number of nested structures (max. 4).
c     c0        -nugget constant (isotropic).
c     c(4)      -multiplicative factor of each nested
c                structure. (sill-c0) for spherical,
c                exponential and gaussian models;
c                slope for linear model.
c     aa(4)     -parameter a of each nested structure.
c     it(4)     -type of each nested structure:
c                =1 spherical model of range a;
c                =2 exponential model of parameter a,
c                   i.e. practical range is 3a;
c                =3 gaussian model of parameter a,
c                   i.e. practical range is a*sqrt(3);
c                =4 power model of power a (a must be
c                   gt. 0 and lt. 2). if linear model,
c                   a=1, c=slope.
c
c     warning:  the linear model cannot be used for simple
c               kriging or estimation of the drift.
c     warning:  cmax must be supplied in a data statement in
c               subroutine cova. cmax is the maximum variogram
c               value needed for kriging when using the power
c               model. any it value other than 1, 2 or 3 is
c               treated as the power model (it=4).
c
c     cosx(3)   -direction cosines of the rectangular system
c     cosy(3)    of anisotropy axes (these cosines are relative
c     cosz(3)    to the initial system of axes, and are the
c                same for all nst structures).
c                if no rotation is needed then:
c                cosx = 1. 0. 0.
c                cosy = 0. 1. 0.
c                cosz = 0. 0. 1.
c
c     anix(4)   -anisotropy ratios to be applied to each of
c     aniy(4)    the anisotropy axes and each nested structure
c     aniz(4)    (can be different for each of the nst
c                structures). if no anisotropy is needed then
c                anix=aniy=aniz=1.
c
c     warning:  the direction of the previous geometric
c               anisotropy is identical for all nst structures.
c
c     ***working parameters not in common***
c
c     dx,dy,dz    -components of the distance between points
c                  along the anisotropy axes
c     dx1,dy1,dz1 -components of the distance after compensating
c                  for geometrical anisotropies
c     cmax        -maximum variogram value needed for kriging
c                  when using the power model. its value is
c                  defined in a data statement
c
      common/strucl/nst,c0,c(4),aa(4),it(4),cosx(3),cosy(3),cosz(3),
     *anix(4),aniy(4),aniz(4)
      double precision rho,cov,zl(1)
      data cmax/60.0/
c
c*****rotate axes
c
      dx=(x2-x1)*cosx(1)+(y2-y1)*cosx(2)+(z2-z1)*cosx(3)
      dy=(x2-x1)*cosy(1)+(y2-y1)*cosy(2)+(z2-z1)*cosy(3)
      dz=(x2-x1)*cosz(1)+(y2-y1)*cosz(2)+(z2-z1)*cosz(3)
      h=dx*dx+dy*dy+dz*dz
      cov=0.0
c
c*****covariance for very short distances
c
      if(h.gt.0.0001) go to 10
      cov=c0
      do 1 i=1,nst
      if(it(i) .ne. 4) go to 2
      cov=cov+cmax
      go to 1
 2    cov=cov+c(i)
 1    continue
      go to 120
c
c*****covariance for longer distances, structure by structure
c
 10   do 100 i=1,nst
c
c*****structural distance
c
      dx1=dx*anix(i)
      dy1=dy*aniy(i)
      dz1=dz*aniz(i)
      h=sqrt(dx1*dx1+dy1*dy1+dz1*dz1)
      if(it(i) .ne. 1) go to 20
c
c*****spherical model
c
      h=h/aa(i)
      if(h .ge. 1.0) go to 100
      cov=cov+c(i)*(1.-h*(1.5-.5*h*h))
      go to 100
 20   if(it(i) .ne. 2) go to 30
c
c*****exponential model
c
      cov=cov+c(i)*exp(-h/aa(i))
      go to 100
 30   if(it(i) .ne. 3) go to 40
c
c*****gaussian model
c
      hh=-(h*h)/(aa(i)*aa(i))
      cov1=c(i)*exp(hh)
      cov=cov+cov1
      go to 100
c
c*****power model of power aa(i)
c
 40   cov1=cmax-c(i)*(h**aa(i))
      cov=cov+cov1
c
 100  continue
c
c**********rho(h)
c
 120  rho=c0
      do 110 i=1,nst
      if(it(i) .eq. 4)then
      rho=rho+cmax
      else
      rho=rho+c(i)
      endif
 110  continue
c
c     Computing the z-correlogram.
c
      rho=cov/rho
c
c     Computing the indicator covariance or crosscovariance.
c
      call transcova(zl(l1),zl(k1),rho,cov,iflag)
      return
      end
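
For reference, the covariance contribution of each nested structure, as coded above, is
(h denotes the anisotropy-corrected distance; this restates the code, it adds nothing to it):

    C(h) = c [ 1 - 1.5(h/a) + 0.5(h/a)^3 ],  h < a; 0 otherwise    (it=1, spherical)
    C(h) = c \exp(-h/a)                                            (it=2, exponential)
    C(h) = c \exp(-h^2/a^2)                                        (it=3, gaussian)
    C(h) = c_{max} - c h^a,  0 < a < 2                             (it=4, power)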
c**********************************************************************
      subroutine transcova(zl,zm,rho,value,iflag)
c**********************************************************************
c
c     Indicator Covariance or Crosscovariance.
c
c     zl:    First cutoff.
c     zm:    Second cutoff.
c     rho:   z-correlogram.
c     value: Indicator crosscovariance C(h;zl,zm).
c     iflag: if it has the value 4 the numerical integration is
c            not reliable, and therefore, neither is the
c            indicator covariance.
c
c**********************************************************************
      double precision value,b,rho,zl,zm,cadre
      external f
      b=dasin(rho)
c
c     Calling the function to evaluate integral (3.24).
c
      value=cadre(f,0.0d00,b,iflag,zl,zm,rho)
 100  continue
      return
      end
c**********************************************************************
      double precision function f(x,zl,zm,rho)
c**********************************************************************
c
c     Integrand of expression (3.24).
c
c     x:    Independent variable.
c     zl:   First cutoff.
c     zm:   Second cutoff.
c     rho:  z-correlogram.
c
c**********************************************************************
      implicit real*8 (a-h,o-z)
      pi=3.141592654
      pi2=(2.*pi)
      a=dsin(x)
      if(zl.ne.zm)then
      f=(1.d0/pi2)*dexp(-(zl*zl+zm*zm-2.d0*a*zl*zm)/
     *(2.d0*(dcos(x))*(dcos(x))))
      else
      f=(1.d0/pi2)*dexp(-(zl**2)/(1.d0+dsin(x)))
      endif
      return
      end
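
Read back from the code, the integral that transcova and f evaluate (expression (3.24) of
Chapter 3) is the bigaussian indicator (cross)covariance:

    C_I(h; z_l, z_m) = \frac{1}{2\pi} \int_0^{\arcsin\rho(h)}
        \exp\left( - \frac{z_l^2 + z_m^2 - 2 z_l z_m \sin\theta}
                         {2 \cos^2\theta} \right) d\theta

For z_l = z_m the integrand reduces to (1/2\pi) \exp(-z_l^2 / (1+\sin\theta)), which is the
special case coded in the else branch of f.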
c**********************************************************************
      double precision function cadre(f,a,b,iflag,zl,zm,rho)
c**********************************************************************
c
c     Numerical integration of (3.24).
c
c     f:     external function to integrate.
c     a:     inferior limit.
c     b:     superior limit.
c     iflag: 4 .... no convergence on the solution.
c     zl:    first cutoff.
c     zm:    second cutoff.
c     rho:   z-correlogram.
c
c     Author: de Boor, C., 1971, CADRE: An algorithm for
c     numerical quadrature, in Rice, J.R. (ed.), Mathematical
c     Software, p. 201, Academic Press, New York.
c
c**********************************************************************
      implicit real*8 (a-h,o-z)
      dimension t(10,10),r(10),ait(10),dif(10),rn(4)
      dimension ts(2049),ibegs(30),begin(30),finis(30),est(30)
      double precision length,jumptl
      logical h2conv,aitken,right,reglar,reglsv(30)
      double precision alg402
      data tolmch,aitlow,h2tol,aittol,jumptl,maxts,maxtbl,
     *mxstge/2.e-16,1.1,0.15,0.1,0.01,2049,10,30/
      data rn/0.7142005,0.3466282,0.8437510,0.1263305/
      data alg402/0.3010299956639795/
      aerr=0.000000001
      rerr=0.000000000001
      cadre=0.0
      error=0.0
      iflag=1
      length=dabs(b-a)
      if(length.eq.0.) return
      errr=dmin1(0.1d0,dmax1(dabs(rerr),10.*tolmch))
      erra=dabs(aerr)
      stepmn=dmax1(length/2**mxstge,
     *dmax1(length,dabs(a),dabs(b))*tolmch)
      stage=0.5
      istage=1
      curest=0.0
      fnsize=0.0
      prever=0.0
      reglar=.false.
      beg=a
      fbeg=f(beg,zl,zm,rho)/2.0
      ts(1)=fbeg
      ibeg=1
      end=b
      fend=f(end,zl,zm,rho)/2.0
      ts(2)=fend
      iend=2
 5    right=.false.
 6    step=end-beg
      astep=dabs(step)
      if(astep.lt.stepmn) go to 950
      t(1,1)=fbeg+fend
      tabs=dabs(fbeg)+dabs(fend)
      l=1
      n=1
      h2conv=.false.
      aitken=.false.
      go to 10
 9    continue
 10   lm1=l
      l=l+1
      n2=n*2
      fn=n2
      istep=(iend-ibeg)/n
      if(istep.gt.1) go to 12
      ii=iend
      iend=iend+n
      if(iend.gt.maxts) go to 900
      hovn=step/fn
      iii=iend
      do 11 i=1,n2,2
      ts(iii)=ts(ii)
      ts(iii-1)=f(end-i*hovn,zl,zm,rho)
      iii=iii-2
 11   ii=ii-1
      istep=2
 12   istep2=ibeg+istep/2
      sum=0.
      sumabs=0.
      do 13 i=istep2,iend,istep
      sum=sum+ts(i)
 13   sumabs=sumabs+dabs(ts(i))
      t(l,1)=t(l-1,1)/2.+sum/fn
      tabs=tabs/2.+sumabs/fn
      absi=astep*tabs
      n=n2
      it=1
      vint=step*t(l,1)
      tabtlm=tabs*tolmch
      fnsize=dmax1(fnsize,dabs(t(l,1)))
      ergoal=dmax1(astep*tolmch*fnsize,
     *stage*dmax1(erra,errr*dabs(curest+vint)))
      fextrp=1.
      do 14 i=1,lm1
      fextrp=fextrp*4.
      t(i,l)=t(l,i)-t(l-1,i)
 14   t(l,i+1)=t(l,i)+t(i,l)/(fextrp-1.)
      errer=astep*dabs(t(1,l))
      if(l.gt.2) go to 15
      if(dabs(t(1,2)).le.tabtlm) go to 60
      go to 10
 15   do 16 i=2,lm1
      diff=0.0
      if(dabs(t(i-1,l)).gt.tabtlm) diff=t(i-1,lm1)/t(i-1,l)
 16   t(i-1,lm1)=diff
      if(dabs(4.-t(1,lm1)).le.h2tol) go to 20
      if(t(1,lm1).eq.0.) go to 18
      if(dabs(2.-dabs(t(1,lm1))).lt.jumptl) go to 50
      if(l.eq.3) go to 9
      h2conv=.false.
      if(dabs((t(1,lm1)-t(1,l-2))/t(1,lm1)).le.aittol)
     *go to 30
 17   if(reglar) go to 18
      if(l.eq.4) go to 9
 18   if(errer.le.ergoal) go to 70
      go to 91
 20   if(h2conv) go to 21
      aitken=.false.
      h2conv=.true.
 21   fextrp=4.
 22   it=it+1
      vint=step*t(l,it)
      errer=dabs(step/(fextrp-1.)*t(it-1,l))
      if(errer.le.ergoal) go to 80
      if(it.eq.lm1) go to 40
      if(t(it,lm1).eq.0.) go to 22
      if(t(it,lm1).le.fextrp) go to 40
      if(dabs(t(it,lm1)/4.-fextrp)/fextrp.lt.aittol)
     *fextrp=fextrp*4.
      go to 22
 30   if(t(1,lm1).lt.aitlow) go to 91
      if(aitken) go to 31
      h2conv=.false.
      aitken=.true.
 31   fextrp=t(l-2,lm1)
      if(fextrp.gt.4.5) go to 21
      if(fextrp.lt.aitlow) go to 91
      if(dabs(fextrp-t(l-3,lm1))/t(1,lm1).gt.h2tol)
     *go to 91
      sing=fextrp
      fextm1=fextrp-1.
      ait(1)=0.
      do 32 i=2,l
      ait(i)=t(i,1)+(t(i,1)-t(i-1,1))/fextm1
      r(i)=t(1,i-1)
 32   dif(i)=ait(i)-ait(i-1)
      it=2
 33   vint=step*ait(l)
      errer=errer/fextm1
      if(errer.gt.ergoal) go to 34
      alpha=dlog10(sing)/alg402
      iflag=max0(iflag,2)
      go to 80
 34   it=it+1
      if(it.eq.lm1) go to 40
      if(it.gt.3) go to 35
      h2next=4.
      singnx=2.*sing
 35   if(h2next.lt.singnx) go to 36
      fextrp=singnx
      singnx=2.*singnx
      go to 37
 36   fextrp=h2next
      h2next=4.*h2next
 37   do 38 i=it,lm1
      r(i+1)=0.
 38   if(dabs(dif(i+1)).gt.tabtlm) r(i+1)=dif(i)/dif(i+1)
      h2tfex=-h2tol*fextrp
      if(r(l)-fextrp.lt.h2tfex) go to 40
      if(r(l-1)-fextrp.lt.h2tfex) go to 40
      errer=astep*dabs(dif(l))
      fextm1=fextrp-1.
      do 39 i=it,l
      ait(i)=ait(i)+dif(i)/fextm1
 39   dif(i)=ait(i)-ait(i-1)
      go to 33
 40   fextrp=dmax1(prever/errer,aitlow)
      prever=errer
      if(l.lt.5) go to 10
      if(l-it.gt.2 .and. istage.lt.mxstge) go to 90
      if(errer/fextrp**(maxtbl-l).lt.ergoal) go to 10
      go to 90
 50   if(errer.gt.ergoal) go to 90
      diff=dabs(t(1,l))*2.*fn
      go to 80
 60   slope=(fend-fbeg)*2.
      fbeg2=fbeg*2.
      do 61 i=1,4
      diff=dabs(f(beg+rn(i)*step,zl,zm,rho)-fbeg2-rn(i)*slope)
      if(diff.gt.tabtlm) go to 72
 61   continue
      go to 80
 70   slope=(fend-fbeg)*2.
      fbeg2=fbeg*2.
      i=1
 71   diff=dabs(f(beg+rn(i)*step,zl,zm,rho)-fbeg2-rn(i)*slope)
 72   errer=dmax1(errer,astep*diff)
      if(errer.gt.ergoal) go to 91
      i=i+1
      if(i.le.4) go to 71
      iflag=3
 80   cadre=cadre+vint
      error=error+errer
      if(right) go to 85
      istage=istage-1
      if(istage.eq.0) return
      reglar=reglsv(istage)
      beg=begin(istage)
      end=finis(istage)
      curest=curest-est(istage+1)+vint
      iend=ibeg-1
      fend=ts(iend)
      ibeg=ibegs(istage)
      go to 94
 85   curest=curest+vint
      stage=stage*2.
      iend=ibeg
      ibeg=ibegs(istage)
      end=beg
      beg=begin(istage)
      fend=fbeg
      fbeg=ts(ibeg)
      go to 5
 90   reglar=.true.
 91   if(istage.eq.mxstge) go to 950
      if(right) go to 95
      reglsv(istage+1)=reglar
      begin(istage)=beg
      ibegs(istage)=ibeg
      stage=stage/2.
 94   right=.true.
      beg=(beg+end)/2.
      ibeg=(ibeg+iend)/2
      ts(ibeg)=ts(ibeg)/2.
      fbeg=ts(ibeg)
      go to 6
 95   nnleft=ibeg-ibegs(istage)
      if(iend+nnleft.ge.maxts) go to 900
      iii=ibegs(istage)
      ii=iend
      do 96 i=iii,ibeg
      ii=ii+1
 96   ts(ii)=ts(i)
      do 97 i=ibeg,ii
      ts(iii)=ts(i)
 97   iii=iii+1
      iend=iend+1
      ibeg=iend-nnleft
      fend=fbeg
      fbeg=ts(ibeg)
      finis(istage)=end
      end=beg
      beg=begin(istage)
      begin(istage)=end
      reglsv(istage)=reglar
      istage=istage+1
      reglar=reglsv(istage)
      est(istage)=vint
      curest=curest+est(istage)
      go to 5
 900  iflag=4
      go to 999
 950  iflag=5
      cadre=curest+vint
 999  return
      end
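
A quick way to exercise these routines is the one case with a closed form: for zl = zm = 0
the integrand f is the constant 1/2π, so integral (3.24) reduces to arcsin(ρ)/2π. The driver
below is not part of the thesis code; it is a minimal sketch that assumes the routines of this
appendix are compiled and linked with it:

      program chkcad
c     Hypothetical test driver (not from the thesis): compares the
c     cadre result against the closed form C(h;0,0)=arcsin(rho)/(2*pi).
      implicit real*8 (a-h,o-z)
      rho=0.5d0
      zl=0.0d0
      zm=0.0d0
c     transcova sets value to integral (3.24) as evaluated by cadre.
      call transcova(zl,zm,rho,value,iflag)
      exact=dasin(rho)/(2.0d0*3.141592653589793d0)
      write(6,*)'cadre :',value,'  iflag =',iflag
      write(6,*)'exact :',exact
      stop
      end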
Appendix C

Computation of Principal Component Crosscovariances
c**********************************************************************
c
c     Computation of Principal Component Covariances.
c
c     Vinicio Suro-Perez
c     Applied Earth Sciences Department
c     Stanford University
c
c     December, 1988
c
c**********************************************************************
      common/strucl/nst,c0,c(4),aa(4),it(4),cosx(3),cosy(3),cosz(3),
     *anix(4),aniy(4),aniz(4)
      double precision nam
      common inp,iout,testk,dum,nam,q90
      dimension x(600),y(600),vr(600),ok(35,35),okv(35,35)
      double precision zl(10)
c
c     Input file with the parameters of the z-variogram.
c
      open(9,file='cova.par')
      rewind(9)
c
c     Number of structures and nugget effect.
c
      read(9,*)nst,c0
      do 250 k=1,nst
c
c     Sill, range, type of variogram and anisotropy ratios.
c
      read(9,*)c(k),aa(k),it(k),anix(k),aniy(k)
c
c     Only the 2D case is considered.
c
      aniz(k)=1.0
 250  continue
c
c     Anisotropy directions.
c
      read(9,*)(cosx(k),k=1,3)
      read(9,*)(cosy(k),k=1,3)
      read(9,*)(cosz(k),k=1,3)
c
c     Number of cutoffs, number of lags and spacing between lags.
c
      read(9,*)nzc,nlag,dxxx
c
c     Cutoffs.
c
      read(9,*)(zl(i),i=1,nzc)
c
c     Computation of principal component crosscovariances.
c
      call findo(nzc,nlag,dxxx,zl)
      stop
      end
c**********************************************************************
      subroutine findo(nzc,nlag,dxxx,zl)
c**********************************************************************
c
c     Computation of Principal Component Crosscovariances.
c
c     nzc:  number of cutoffs.
c     nlag: number of lags.
c     dxxx: spacing between lags.
c     zl:   cutoffs.
c
c**********************************************************************
      dimension x(1),y(1),vr(1),xa(64),ya(64),ta(64),
     *ok(35,35),okv(35,35),
     *dist(4,17),nog(4,17),nhole(4),dis(30)
      double precision nam,tmat(10,10)
      double precision covmat(10,10),covmat1(10,10),covmat2(10,10),
     *vect(10,10),veco(10)
      common inp,iout,testk,dum,nam,q90
      common/strucl/nst,c0,c(4),aa(4),it(4),cosx(3),cosy(3),
     *cosz(3),anix(4),aniy(4),aniz(4)
      double precision tnom,tmax,zl(1),tcov(30,30),cov
c
c     Input file with the orthogonal matrix A.
c
      open(15,file='covmatsvd.dat')
      rewind(15)
c
c     Output file with the principal component crosscorrelograms.
c
      open(18,file='crsvd.dat')
      rewind(18)
c
c     Reading the orthogonal matrix.
c
      do 25 l1=1,nzc
      read(15,*)(covmat(l1,k1),k1=1,nzc)
 25   continue
      ns=nst
c**********************************************************************
c*****Loop to compute the nzc*(nzc+1)/2 block covariances
c
c     Computing the l1,k1 block covariance
c
      do 90 l1=1,nzc
      do 95 k1=1,l1
      write(18,*)l1,k1
c
c     Computing the covariance between y(x),y(x+h)
c
      do 80 kk=1,nlag
      delta=dxxx*float(kk-1)
c
c     Computing the Sigma matrix.
c
      do 70 jnz=1,nzc
      do 65 inz=1,jnz
      call cova(0.0,0.0,0.,delta,0.0,0.,cov,
     *zl,jnz,inz,iflag)
      tmat(inz,jnz)=cov
      tmat(jnz,inz)=cov
c
c     Testing convergence of the numerical integration.
c
      if(iflag.gt.3)write(14,*)l1,k1,inz,jnz,iflag
 65   continue
 70   continue
c
c     Computing the principal component crosscovariance.
c     The assumption is that z(x) is bigaussian.
c
      do 66 inz=1,nzc
      vect(1,inz)=covmat(inz,l1)
 66   continue
c
c     a(l)'*SIGMA
c
      call mat(vect,1,nzc,tmat,nzc,covmat1)
      do 67 inz=1,nzc
      vect(inz,1)=covmat(inz,k1)
 67   continue
c
c     a(l)'*SIGMA*a(m)
c
      call mat(covmat1,1,nzc,vect,1,covmat2)
      cov=covmat2(1,1)
c
c     Writing the principal component covariance.
c
      write(18,73)delta,cov
 73   format(2(3x,f15.7))
 80   continue
 95   continue
 90   continue
      return
      end
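
In matrix terms, each pass of the loop above evaluates, for the current lag h, the bilinear
form below (a restatement of what the two calls to subroutine mat compute, with a_l the
l1-th column of the orthogonal matrix A read from covmatsvd.dat):

    C_{Y_l Y_m}(h) = a_l^T \Sigma_I(h) a_m

where \Sigma_I(h) = [ C_I(h; z_i, z_j) ], i,j = 1,...,nzc, is the indicator covariance matrix
assembled in tmat from the cova and transcova routines of Appendix B.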
c**********************************************************************
      subroutine mat(a,m,n,b,n1,c)
c**********************************************************************
c
c     Matrix Multiplication.
c
c     a:  matrix with dimensions m x n.
c     m:  number of rows of a.
c     n:  number of columns of a.
c     b:  matrix with dimensions n x n1.
c     n1: number of columns of b.
c     c:  result of the matrix multiplication: c=ab.
c
c**********************************************************************
      implicit real*8(a-h,o-z)
      double precision a(10,10),b(10,10),c(10,10)
      do 100 i=1,m
      do 200 l=1,n1
      c(i,l)=0.0
 200  continue
 100  continue
      do 300 i=1,m
      do 400 j=1,n
      do 500 l=1,n1
      c(i,l)=c(i,l)+a(i,j)*b(j,l)
 500  continue
 400  continue
 300  continue
      return
      end
c**********************************************************************
Bibliography

[1] Alabert, F. G., 1987, Stochastic Imaging of Spatial Distributions using Hard and Soft
Information, M.Sc. Thesis, Stanford University, 197 pp.

[2] Anderson, T. W., 1984, Multivariate Statistical Analysis, Wiley and Sons, New York,
675 pp.

[3] Atkinson, K. E., 1978, Introduction to Numerical Analysis, Wiley and Sons, New York,
520 pp.

[4] Borgman, L. and Frahme, R. B., 1976, Multivariate properties of bentonite in north-
eastern Wyoming, in Guarascio, M. et al. (eds.), Advanced Geostatistics in the Mining
Industry, D. Reidel Publishing, Dordrecht.

[5] Bryan, R. C. and Roghani, F., 1982, Application of conventional and advanced meth-
ods to uranium ore reserve estimation and the development of a method to adjust for
disequilibrium problems, in Johnson, T. B. et al. (eds.), Proceedings of the 17th APCOM,
p. 109.

[6] Dagbert, M., 1981, The simulation of space-dependent data in geology, in Craig, R. G.
and Labovitz, M. L. (eds.), Future Trends in Geomathematics, Pion, London, 318 pp.

[7] Davis, B. and Greenes, K., 1983, Estimation using spatially distributed multivariate
data: An example with coal quality data, Mathematical Geology, Vol. 15, p. 145.

[8] de Boor, C., 1971, CADRE: An algorithm for numerical quadrature, in Rice, J. R. (ed.),
Mathematical Software, p. 201, Academic Press, New York.

[9] Dennis, J. E. and Schnabel, R. B., 1983, Numerical Methods for Unconstrained Opti-
mization and Nonlinear Equations, Prentice-Hall, 378 pp.

[10] Eisner, E., 1967, Numerical integration of a function that has a pole, Communications
of the ACM, Vol. 10, p. 239.

[11] Francis, J. F. G., 1961, The QR transformation: A unitary analogue to the LR trans-
formation, Computer J., Vol. 4, p. 265.

[12] Gill, P. E., Murray, W. and Wright, M. H., 1981, Practical Optimization, Academic
Press, London, 401 pp.

[13] Golub, G. and Kahan, W., 1965, Calculating the singular values and pseudoinverse of
a matrix, SIAM J. Num. Anal., p. 205.

[14] Golub, G. and Reinsch, C., 1971, Singular value decomposition and least squares solu-
tions, in Wilkinson, J. H. et al. (eds.), Handbook of Automatic Computations, Springer-
Verlag.

[15] Golub, G. and Van Loan, Ch., 1983, Matrix Computations, The Johns Hopkins Uni-
versity Press, 476 pp.

[16] Guibal, D. and Remacre, A., 1984, Local estimation of recoverable reserves: Comparing
various methods with the reality on a porphyry copper deposit, p. 435, in Verly, G. et al.
(eds.), Geostatistics for Natural Resources Characterization, D. Reidel Publishing, Dor-
drecht.

[17] Journel, A. G., 1980, The lognormal approach to predicting local distribution of selec-
tive mining unit grades, Mathematical Geology, Vol. 12, p. 285.

[18] Journel, A. G., 1982, The indicator approach to estimation of spatial distributions, in
Johnson, T. B. et al. (eds.), Proceedings of the 17th APCOM Symposium.

[19] Journel, A. G., 1983, Nonparametric estimation of spatial distributions, Mathematical
Geology, Vol. 15, p. 445.

[20] Journel, A. G., 1984, The place of nonparametric geostatistics, in Verly, G. et al. (eds.),
p. 307, Geostatistics for Natural Resources Characterization, D. Reidel Publishing, Dor-
drecht.

[21] Journel, A. G., 1986, Constrained interpolation and soft kriging, in Ramani, R. V.
(ed.), p. 15, Proceedings of the 19th APCOM Symposium.

[22] Journel, A. G., 1987, Geostatistics for Environmental Sciences, Project No. CR 811893,
Environmental Protection Agency, Las Vegas, 135 pp.

[23] Kim, Y. C., Performance comparison of local recoverable reserve estimates using dif-
ferent kriging techniques, in Lemmer, I. C. et al. (eds.), p. 65, Proceedings of the 20th
APCOM Symposium.

[24] Lynch, R. E., 1967, Generalized trapezoid formulas and errors in Romberg quadrature,
in Mond, B. (ed.), Blanch Anniv. Vol., Office of Aerospace Research, Washington, D.C.

[25] Lyness, J. N. and Ninham, B. W., 1967, Numerical quadrature and asymptotic expan-
sions, Math. Comp., Vol. 21, p. 162.

[26] Luenberger, D., 1969, Optimization by Vector Space Methods, Wiley and Sons, New
York, 326 pp.

[27] Luster, G. L., 1985, Raw Materials for Portland Cement: Applications of conditional
simulation of coregionalization, Ph.D. Thesis, Stanford University, 532 pp.

[28] Ma, Y. H. and Royer, J. J., 1988, Local geostatistical filtering: Application to remote
sensing, Sciences de la Terre, No. 27, p. 17.

[29] Mantoglou, A. and Wilson, J., 1982, The turning bands method for simulation of ran-
dom fields using line generation by a spectral method, Water Resources Research, Vol. 18,
p. 1379.

[30] Marcotte, D. and David, M., 1985, The bigaussian approach: a simple method for
recovery estimation, Mathematical Geology, Vol. 17, p. 625.

[31] Marechal, A., 1984, Recovery estimation: a review of models and methods, in Verly,
G. et al. (eds.), p. 385, Geostatistics for Natural Resources Characterization, D. Reidel
Publishing, Dordrecht.

[32] Matheron, G., 1976, A simple substitute to conditional expectation: The disjunctive
kriging, in Guarascio, M. et al. (eds.), D. Reidel Publishing, Dordrecht.

[33] Matheron, G., 1982, Pour une analyse krigeante des donnees regionalisees, N-782,
Fontainebleau.

[34] Matheron, G., 1982, La destructuration des hautes teneurs et le krigeage des indica-
trices, N-761, Fontainebleau.

[35] Ortega, J. and Rheinboldt, W. C., 1970, Iterative Solution of Nonlinear Equations in
Several Variables, Academic Press, 572 pp.

[36] Rossi, M. E., 1988, Impact of Spatial Clustering on Geostatistical Analysis, M.Sc. The-
sis, Stanford University, 113 pp.

[37] Rozanov, A. Y., 1982, Markov Random Fields, Springer-Verlag, New York, 201 pp.

[38] Sandjivy, L., 1983, Analyse krigeante de donnees geochimiques, Sciences de la Terre,
No. 18, p. 141.

[39] Sandjivy, L., 1984, The factorial kriging analysis of regionalized data: its application
to geochemical prospecting, in Verly, G. et al. (eds.), Geostatistics for Natural Resources
Characterization, D. Reidel Publishing, Dordrecht.

[40] Schofield, N., 1988, The Porgera Gold Deposit, Papua New Guinea: A Geostatistical
Study of Underground Ore Reserves, M.Sc. Thesis, Stanford University, 219 pp.

[41] Sullivan, J., 1984, Conditional recovery estimation through probability kriging, in
Verly, G. et al. (eds.), p. 365, Geostatistics for Natural Resources Characterization, D.
Reidel Publishing, Dordrecht.

[42] Stoer, J. and Bulirsch, R., 1980, Introduction to Numerical Analysis, Springer-Verlag,
609 pp.

[43] Strang, G., 1980, Linear Algebra and its Applications, Academic Press, London, 414 pp.

[44] Switzer, P. and Parker, B., 1975, The problem of ore versus waste discrimination for
individual blocks: The lognormal model, in Guarascio, M. et al. (eds.), D. Reidel Pub-
lishing, Dordrecht.

[45] Switzer, P., 1977, Estimation of spatial distributions from point sources with applica-
tion to air pollution measurements, Bull. Int. Statist. Inst., Vol. 47, p. 123.

[46] Verly, G., 1983, The multigaussian approach and its applications to the estimation of
local reserves, Mathematical Geology, Vol. 15, p. 263.

[47] Verly, G., 1984, The block distribution given a point multivariate normal distribution,
in Verly, G. et al. (eds.), Geostatistics for Natural Resources Characterization, D. Reidel,
Dordrecht.

[48] Wackernagel, H., 1985, The inference of the linear model of coregionalization in the
case of a geochemical data set, Sciences de la Terre, No. 24, p. 81.

[49] Wackernagel, H., 1988, Geostatistical techniques for interpreting multivariate spatial
information, in Fabbri, A. (ed.), Quantitative Analysis of Mineral and Energy Resources,
D. Reidel Publishing, Dordrecht.

[50] Wilkinson, J. H., 1965, The Algebraic Eigenvalue Problem, Oxford University Press,
662 pp.

[51] Xiao, H., 1985, A description of the behavior of indicator variograms for a bivariate
normal distribution, M.Sc. Thesis, Stanford University, 51 pp.