Talk:Kriging
Repeated attempts to break the Neutral point of view rule
All articles and policies must follow Neutral point of view, Verifiability, and No original research.
This article should:
- describe what Kriging is
- tell where it comes from
- say how it is used, and by whom
- say how it is connected to other interpolation and approximation methods
This article should not:
- express the point of view of one particular person
- say that Kriging is good or bad
- be understandable only to specialists
—The preceding unsigned comment was added by 160.228.120.4 (talk) 10:23, 8 February 2007 (UTC).
Neutral Point of View
I agree that this article violates the neutral point of view rule. There should be NO content talking about "controversy" regarding Kriging and/or its validity. Kriging makes certain assumptions and if those assumptions are valid, Kriging is valid. Case closed. There's nothing wrong with Kriging or its validity. Every single mathematical model makes certain assumptions and is valid only when those assumptions are met. Just because a model is used incorrectly once in a while (i.e., applied when its assumptions are not valid) does not mean there is anything wrong with the model. —Preceding unsigned comment added by 67.100.171.250 (talk) 16:31, 30 January 2008 (UTC)
Ongoing discussion with Merksmatrix about the NPOV
Dear Merksmatrix
First, I think you do not understand very well what linear prediction is and what Kriging means. In my opinion, you tend to confuse the data and the probabilistic model. Do you want to prevent people from fitting linear models because the underlying process that generated the data may not be that linear? Anyway, if people want to use Kriging, why do you want to prevent them?
Why do you persist in using Wikipedia to spread your own point of view, against the NPOV?
What I do understand is that assuming continued mineralization between boreholes does not make sense. You can do whatever you like but you ought to study Matheron's seminal work before you assume continuity between measured values in ordered sets, interpolate by kriging, select the least biased and most precise subset of some infinite set of kriged estimates, smooth its pseudo kriging variance to perfection and rig the rules of classical statistics in the process. Please do sign your message!!!--Merksmatrix 19:40, 8 February 2007 (UTC)
- First question: do you acknowledge that you are breaking the NPOV?
In my opinion, you are breaking the NPOV, for the very reason that you are claiming that Kriging is not statistically well-founded (which, in my opinion, is not an interesting point of view).
Whether or not you acknowledge it, I propose the article be reverted to a neutral form until a solution is settled. Any revert without justification may be considered vandalism. If you want to modify the article, do not break the NPOV. In particular, stop using serpentine ways such as cluttering the article with lingo that only specialists can understand.
- Second question: do you really think that Matheron's seminal work is important for explaining what Kriging is?
I would like to point out that I have read some of his work. I personally find a lot of his notes quite useless and, besides, very difficult to read (this is my point of view). What is important for someone who wants to know about Kriging is to understand what Kriging is, and why it is used.
- Third question: do you really consider yourself a scientist?
In science, if someone finds something not suited to his purpose, nobody will prevent this person from using something else. If you have something better to propose, publish it! Be a scientist, not a religionist.
Antro5 18:16, 9 February 2007 (UTC)
Answer to first question: I’m assisting the one and only person who is trying to find some sort of missing link between the theory of kriging and the practice of polynomial curve fitting by giving references to the literature.
Answer to second question: The objective of your exercise is to provide a historical perspective of polynomial curve fitting. In your opinion, the theory of kriging plays a role in the practice of polynomial curve fitting. Agterberg, Matheron, Koch, Link, and scores of other scholars do not agree with you. Surely, you would not want to push your own view on those who want to know what kriging is all about, would you? Matheron dabbled in classical statistics before drifting into geostatistics. His work remains relevant because it shows the earliest contortions of the most seminal of geostatistically gifted minds.
Answer to third question: If you really want to know what the united geostatocracy and the krigeologists of the world think about my work, you should visit my website.--Merksmatrix 23:30, 9 February 2007 (UTC)
Your answer shows precisely what is wrong with your posture here on Wikipedia. You want to defend an opinion about Kriging, which is your opinion, by the way. You do not understand that you cannot do that on Wikipedia, because of the NPOV. You want to make a link to your own website. This is not possible. You are not an institution, you do not refer to well-established publicly available information, and your website is not neutral. Please read the NPOV.
Besides, you have reverted the article without justification, and prior to any discussion. Your attitude does not respect fairness and amounts to POV pushing (see http://en.wikipedia.org/wiki/WP:POVPUSH).
I propose to revert, once again, to a neutral form (ie. without your POV). If you do not agree with the content, please, do not revert. Explain precisely the changes you intend to make and give justifications about the NPOV.
You can, if you want, issue a warning (see WP:TD). But please give reasons.
About your answers. I understand you do not agree with the usage of Kriging in Geostatistics. This article should describe what Kriging is. I do not see the point of discussing on Wikipedia whether it is moral or not to use Kriging in Geostatistics. I am not a Geostatistician and I do not want to quarrel with you on this point. There is already a section in the article dealing with this point. What is very problematic, in my opinion, is that you want to clutter the first paragraph of the article with technical assertions, for an unfair purpose. The other problem is the reference to your website.
A solution? Since your POV is not about Kriging itself but about its usage in Geostatistics, maybe you should discuss your view in another article.
Antro5 12:07, 10 February 2007 (UTC)
Proposal for revision (OLD?)
This article gives a brief overview of what Kriging is and describes it using many links to other (complex) entities. I would like to make this article more self-contained and give some insight into the ideas behind Kriging and what its pros and cons are.
I propose the following sections:
- Idea(s) behind Kriging
- Does each kriged estimate have its own variance?
- Simple Kriging
- Best Linear Unbiased Estimator
- Pros and Cons
- Extensions of Simple Kriging
- Software
-- Scheidtm 19:59, 15 March 2006 (UTC)
Comments
- Sounds good, but isn't the Best Linear Unbiased Estimator a consequence of the Gauss-Markov theorem ? Do you need a whole section to explain it? -- hike395 02:22, 16 March 2006 (UTC)
-
- Hmmm, I am not that familiar with Gaussian processes. But "Locality" would be a good substitute anyway. -- Scheidtm 21:16, 16 March 2006 (UTC)
Can any Wikipedian tell me whether or not each distance-weighted average had its own variance before it was reborn as variance-deprived but honorific kriged estimate? That’s the crux of the matter! The rest are details! Please be concise and succinct for a change because I've been fed circular logic and opaque dogma by the geostatistical fraternity since the early 1990s.
I know spatial dependence may be assumed because Journel said so in 1992. The original reference behind Journel’s cryptic remark (“a decision rather”) ought to be posted under References where the first three seminal textbooks on geostatistical fiction should be similarly honored. Another work of sublime interest is Armstrong and Champigny's A Study of Kriging Small Blocks, in which the authors caution against oversmoothing. Apparently, the requirement of functional independence can be violated a little but not a lot. What I enjoy more than most people is fuzzy logic. Invoking WP’s vanity policy when authors refer to their own reviewed and published works reflects a subtle sense of humor.--Iconoclast 17:45, 13 April 2006 (UTC)
- We're not invoking the vanity policy, but WP:NOR. You have read it, yes?
- With a bit of reflection, you will see that it is impossible to write a collaborative encyclopedia, one which anyone can edit, without specifically disallowing original research from each contributor. By forcing all editors to provide verifiable sources, attributable to others, not themselves, and to cite them, we have in place a mechanism which avoids endless, frustrating, back-and-forth edit wars.
- Can you provide a source for your assertions, which is not written by yourself? That is the crux of the matter. Antandrus (talk) 00:45, 14 April 2006 (UTC)
A question about the variance of "samples with different weights" was posed on AI-Geostats Open Website on October 7, 2005, and the formula was posted on October 10, 2005. The webmaster didn't post the entire exchange in which several subscribers took part. Plain logic dictates that this variance formula applies not only to area, count, density, length, mass and volume-weighted averages but also to distance-weighted averages aka kriged estimates. I would have been aware if some geostatistical scholar had issued an exclusion edict for kriged estimates. However, tenets tend to change fast when common sense threatens geostatistics. Journel postulated that spatial dependence may be assumed "unless proven otherwise" but was troubled that somebody would apply "Fischerian" [sic!] statistics to prove otherwise. Please let me know if more references are required. --Iconoclast 16:38, 14 April 2006 (UTC)
Comments of the author of the figure
Dear all,
I think that the last version of this article has introduced confusion and inexactness. For instance, in the first paragraph, it is claimed that Krige developed Kriging. This is false. Matheron did, in the 60s, using Krige's ideas published in his MSc report.
About the controversy, I would say that this is irrelevant. I do not think that this article should be the place to discuss the validity of modeling by random processes.
The references are irrelevant too. Good references are Matheron's published work, Cressie, Chiles and Delfiner, Wackernagel and Stein.
Finally, I would say that it is an error to think that Kriging can only be used for spatial modeling. There is no theoretical restriction against considering other types of phenomena depending on one, two or more factors.
Belated hello to the Author of the Figure, Please let the readers of this page know whether it makes sense to replace the variance of the single-distance-weighted average with the kriging variance of a set of kriged estimates? Is it possible that this practice violates the requirement of functional independence and ignores the concept of degrees of freedom? Does the data set in your figure display a significant degree of spatial dependence? Thanks for your response! JWM --Iconoclast 22:30, 10 July 2006 (UTC)
The Author of the Figure should peruse Matheron's introduction to Journel and Huijbregts's Mining Geostatistics to find out who coined the term geostatistics and why! It would be useful if the primary data for the Figure were posted to allow the application of a proper test for spatial dependence. JWM. --Iconoclast 18:30, 3 August 2006 (UTC)
-- Maybe we do not agree on what Kriging is exactly. Kriging starts with the hypothesis that the observations (the data) are sample values of a random process with known or unknown mean m(x) and covariance k(x,y). Note that the covariance need not be stationary. Then, Kriging is just a linear predictor. Nothing more. The practical question is: when can we make the assumption that the observations are sample values of a random process? The answer is, in my opinion, that it can always be done. A random process is just a model and statistics can tell us if the chosen model is probable or not.
Further revision proposal by Scheidtm
Kriging is a regression technique used in geostatistics to approximate or interpolate data. The theory of Kriging was developed from the seminal work of its inventor, Danie G. Krige, by the French mathematician Georges Matheron in the early sixties. In the statistical community, it is also known as Gaussian process regression. Kriging is also a reproducing kernel method (like splines and support vector machines).
Figure: example of one-dimensional data interpolation by Kriging, with confidence intervals
Idea Behind Kriging
As Kriging was developed in mining, it will be explained in this setting here. It can be, and is, used in other contexts too. Please keep this in mind when reading this article.
Kriging is often used to predict the distribution of some interesting quantity in a geological survey. For example one wants to determine the gold concentration in a mine field from a limited number of exploratory diggings.
Each of the results could be regarded as a single draw from an unknown random distribution, whose form is determined by the geological processes moving and layering the material in the neighbourhood of the place of mining. But as different places would have different geological neighbourhoods and histories, the random distributions would also (slightly) differ, so that a general prediction of ore content would be difficult, because one does not know the differences between these random distributions.
Kriging escapes from these difficulties by using the prior knowledge that these random distributions differ only slightly. It does this by treating all measurements as one draw from a single probability distribution, which is then called a random process or, better, a random field. The additional assumptions made about this process encode this prior knowledge, and not only allow the wanted quantity to be predicted, but also allow confidence intervals to be given for the predictions.
Simple Kriging
- Give assumptions of simple kriging, develop formulas for prediction, confidence intervals.
- correlation and standard forms (gaussian, exponential, spherical).
- discontinuity at origin (Nugget Effect) => interpolating or smoothing
- differentiability at origin => roughness.
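To make the outline items above concrete, here is a minimal Python sketch of the three standard covariance forms and of a nugget term. It is only an illustration: the function names, the parameter names (sill, range_, nugget) and the exact parameterizations are assumptions chosen for this example, and several other conventions exist in the literature.

```python
import numpy as np

def gaussian(h, sill=1.0, range_=1.0):
    """Gaussian model: smooth (differentiable) at the origin."""
    h = np.asarray(h, dtype=float)
    return sill * np.exp(-((h / range_) ** 2))

def exponential(h, sill=1.0, range_=1.0):
    """Exponential model: continuous but not differentiable at the origin."""
    h = np.abs(np.asarray(h, dtype=float))
    return sill * np.exp(-h / range_)

def spherical(h, sill=1.0, range_=1.0):
    """Spherical model: exactly zero beyond the range."""
    h = np.abs(np.asarray(h, dtype=float))
    s = np.minimum(h / range_, 1.0)
    return sill * (1.0 - 1.5 * s + 0.5 * s**3)

def with_nugget(cov, nugget):
    """Wrap a covariance model with a jump of size `nugget` at h = 0."""
    def wrapped(h, **kwargs):
        h = np.asarray(h, dtype=float)
        return cov(h, **kwargs) + nugget * (h == 0)
    return wrapped

# Compare the three models at a few lags.
lags = np.array([0.0, 0.25, 0.5, 1.0, 2.0])
for model in (gaussian, exponential, spherical):
    print(model.__name__, model(lags))
```

With a nugget, the covariance is discontinuous at the origin, so the predictor smooths the data instead of interpolating them. Without a nugget, the behaviour of the model near h = 0 controls how rough the predicted surface looks.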
Best Linear Unbiased Estimator
- Describe features of Kriging
Pros and Cons
- to be developed
Extensions of Simple Kriging
- Describe how assumptions are relaxed, what is predicted by each of the advanced Kriging methods.
Software implementing Kriging
- Give list (does not strive to be exhaustive).
- The Stanford Geostatistical Modeling Software ( S-GeMS )
I agree with Scheidtm's proposed reorganization of this article. However, I think it is clear that we need a better diagram that more clearly illustrates the application of the technique. Would Emmanuel be interested in producing a revised version of Example_krig.png? Matt 02:49, 22 August 2006 (UTC)
Confusing: "lost the correspondingly infinite set of variances"
I marked this article {{confusing}} because of the phrase, "lost the correspondingly infinite set of variances" in the introductory paragraph, which is not well-defined before it is used, nor wiki- or hyper-linked. I suggest that the first three paragraphs need a complete re-write as a better introduction, with less jargon and bias (2nd paragraph, hyperlinked to Geophys. web site, shows bias.) --James S. 19:16, 2 April 2006 (UTC)
I moved the two troubled paragraphs to "History" and added a {{SectPOV}} tag in front of the hyperlink. --James S. 19:20, 2 April 2006 (UTC)
- The two paragraphs seem to be pushing a POV that geostatistics is some sort of hoax. This is unlikely, considering that statisticians (other than geostatisticians) use Gaussian Process Regression, and have shown that it is a Bayesian technique (where the kernel function describes a Gaussian Process Prior over functions).
- I saved the list of methods named after Krige, but deleted the POV. -- hike395 21:16, 2 April 2006 (UTC)
- I think I finally understand Dr. Merks' objection --- in the Bayesian analysis, spatial dependence is an assumption, while Jan is advocating performing statistical tests on the spatial dependence before blindly using kriging. The latter is a frequentist viewpoint (as I understand it). I did some quick research on what statistical tests are commonly used in spatial statistics, found three, and cited them. -- hike395 16:00, 7 April 2006 (UTC)
In mathematical statistics, one-to-one correspondence between central values (the arithmetic mean and various weighted averages) and their variances is sine qua non. In geostatistics, however, one-to-one correspondence between distance-weighted averages-cum-kriged estimates and their variances is null and void. In other words, the infinite set of variances was lost on Krige's watch and the variance of the SINGLE distance-weighted average was replaced with the perfectly smoothed pseudo kriging variance of a SUBSET of some infinite set of kriged estimates! Geostatistics is a scientific fraud because spatial dependence between (temporally or in situ ) ordered sets is assumed! Remember Bre-X. That's all!--Iconoclast 00:53, 8 April 2006 (UTC)
- I believe I addressed your objections in a way that is NPOV and verifiable --- some people assume spatial dependence, other people test for it. Citations for both viewpoints are included in the article. -- hike395 21:29, 8 April 2006 (UTC)
Latest revert
Two problems with the article, that I reverted:
- The previous version claimed that Krige knew certain facts. This is very difficult to verify: a high standard is needed. Do we have any citations to show what Krige was thinking of?
- The paragraph about Fisher's F-test. Again, this seems like original research. I can only find material about applying that particular test from Dr. Merks himself (his web site [1], comments at ai-geostats [2], comments at amazon.com[3]) and no place else. Again, if this is supported in the common literature, I'd be happy to add it to the paragraph that lists common statistical tests applied to spatial data.
-- hike395 21:37, 8 April 2006 (UTC)
My two cents
I'm going to chime in here: while I appreciate Mr. Merk's contributions, I need to emphasise that our core policies include no original research, and in this case that means including information which is not verifiable by reference to published sources not by the contributing author. Kriging is accepted both by the scientific community and by policy makers worldwide. Continued insertion of the disputed material is in violation of our POV policy as well as NOR and V. Thanks! Antandrus (talk) 18:27, 10 April 2006 (UTC)
Fact or Fiction
Sir Ronald A Fisher was knighted in 1953 because of his work on analysis of variance, the essence of which is his F-test. It was Snedecor who called it Fisher's F-test. One might suggest that Fisher's F-test does not qualify as "original research" under WP's core policies. I don't know what Krige "knew" but what I do know is he didn't know each and every distance-weighted average had its own variance long before Fisher was knighted. It would be a lot worse if Krige did know about one-to-one correspondence between distance-weighted averages and variances but decided to ignore it. Neither do I know if Matheron and his students knew that its rebirth as an honorific kriged estimate would make its variance vanish without leaving a trace in geostatistical literature. In fact, I know very little because prominent geostatisticians rather assume, krige, smooth and rig the rules of mathematical statistics than respond to the simple question: Does or doesn't each kriged estimate have its own variance? What a pity that this question violates WP's core NPOV policy! So why not play Clark and the Kriging Game rather than waffle with weasel words? By the way, the ordered set of data in the above figure does not display a significant degree of spatial dependence. Wikipedians ought to check that out! --Iconoclast 16:17, 12 April 2006 (UTC)
- The description of the F-test is not original research, talking about Ronald A Fisher may not be. However, you yourself have said that the application of the F-test to spatial dependency is not generally accepted in geology. I can't find any other references to the use of the F-test applied to spatial dependency, other than your own work. Therefore, the application of the F-test is original research, according to the WP rules.
- Asking questions on Talk pages does not violate NPOV. WP:NPOV talks about the phrasing of the content of an article. If you say "Kriging is clearly invalid, because of blah blah blah", that's a POV phrasing. It's like journalism, you have to use "he said/she said" language. An NPOV phrasing, for example, would be:
- Kriging is a commonly applied technique to model distribution of ore.[1] However, some practitioners question the assumption that spatial dependence follows a stochastic process.[2] Other practitioners recommend using statistical tests to test the assumption of spatial dependency.[3][4][5]
- See what I mean? The article doesn't say that the field is invalid (that's a particular Point of View). Perhaps it should say that kriging is commonly used, but some people question the assumptions and/or use statistical tests to check the assumptions.
- -- hike395 09:43, 13 April 2006 (UTC)
References
- Cressie, Noel A.C. (1993). Statistics for Spatial Data. Wiley-Interscience.
- Philip, G. M.; Watson, D.F. (1986). "Matheronian Geostatistics --- Quo vadis?". Mathematical Geology 18 (1): 93-117.
- Fortin, Marie-Josee; Dale, Mark R.T. (2005). Spatial Analysis: A Guide for Ecologists.
- Ullah, Aman; Giles, David E.A. (eds.) (1998). Handbook of Applied Economic Statistics, 265.
- Schabenberger, Oliver; Pierce, Francis J. (2001). Contemporary Statistical Models for the Plant and Soil Sciences, 653.
Making this page useful - Give sources or get out
The continued resistance of the one "author" here to provide additional citations to back up his beefs has rendered this entry utterly useless. Quit trying to impose your squatter's rights on the discussion and abide by the request or leave it be. Using Wikipedia to direct people to your site is crappy - this is the ONLY page I've seen this problem persist by such stubborn dogma. Dogma is opinion, not informed, collaborative dissent and disagreement. You clearly are confusing your role here as an "educator" and instead are an impediment (and frankly a pariah in my eyes) to my understanding since I can't verify what you're saying because you can't be bothered.
This comment additionally applies to all the other connected concepts that you put under the umbrella of your disagreement with kriging (do you contest variograms and semi-variograms really or just kriging?). Please... GET ON WITH IT, or over it.
209.116.30.220 18:13, 24 July 2006 (UTC)
I'm attempting to do what needs to be done to ensure that scientific integrity and sound science prevail on Wikipedia. I'll post more references if and when required. Wouldn't it be of interest to verify whether the primary data set for the kriging figure displays a significant degree of spatial dependence? You were talking to the undersigned, weren't you? Anonymity is somewhat confusing! JWM. --Iconoclast 16:00, 25 July 2006 (UTC)
- I do not object to the inclusion of a section, 'Controversy', that questions the validity of the statistical technique, based on referenced sources. However, I don't think this article requires 8 references to your own published works (perhaps your user page would be a more appropriate place?). Furthermore, it is my opinion that the opening paragraph of this article should introduce the topic, Kriging, in a manner that is accessible to the encyclopedia reader. Launching straight into a discussion of "what Krige, Matheron and his following did not know in those days" seems to obfuscate rather than elucidate Matt 01:19, 22 August 2006 (UTC)
Sorry, Matt, but I question the validity of the geostatistical technique of assuming spatial dependence, interpolating by kriging, smoothing pseudo kriging variances, and rigging the rules of mathematical statistics. Why not have somebody explain what kriging is really all about? And what about verifying spatial dependence between the ordered set of measured values in the above Figure1? JWM. --Iconoclast 18:47, 22 August 2006 (UTC)
- Hi Jan, I didn't mean to imply that your contributions to this article are unimportant. However, in my opinion the Kriging article should primarily be aimed at introducing the topic to readers who are unfamiliar with the technique (and possibly with geostatistics in general). It is first required to explain exactly what kriging is, before its shortcomings can be adequately addressed. A prominent and detailed Controversy section serves the purpose of warning the reader to treat the technique with caution, and not to accept its conclusions at face value. --Matt 12:50, 27 August 2006 (UTC)
Could someone include usage in a sentence? I've found this useful on other WP pages that give it at the top when capitalization is a question. Didn't want to screw it up, so I'll let one of the many debating experts here decide whether to include it.
Make Information, not War
I came to the Kriging page in order to understand what kriging is, since I encountered the term in a software package (in non-geostatistical context -- it had to do with interpolating sampled elevation points). I expected to:
- learn how data are interpolated in the kriging method
- find at least one equation defining the method
- learn how kriging compares to other methods of interpolation: linear, quadratic, spline, etc.
- see a diagram of kriged data, preferably compared with diagrams of data interpolated by other means
- learn the relative strengths and shortcomings of this method of interpolation
But I was disappointed in that respect. On the other hand, I do not give a rat's fart about:
- the wickedness of prof. Krige
- the metaphysical issues of having one's own variance
- historical references
- name-calling among prominent geostatisticians
- correct capitalization of the word “kriging”
The only useful information I found was buried halfway down the page and read: “The Kriging estimate is a weighted linear combination of the data. The weights that are assigned to each known datum are determined by solving the Kriging system of linear equations, where the weights are the unknown regression parameters. The optimality criterion used to arrive at the Kriging system, as mentioned above, is a minimization of the error variance in the least-squares sense.” However, and very regrettably, the alluded-to set of linear equations was not given anywhere on the page.
Does anyone here have the discipline to adequately explain and illustrate the term in question before launching into controversies and edit wars? The article as it stands now consists of a lot of obscure discussion of abstruse side-issues, with regard to a main topic that is not even decently summarized. I do realize that the editors are all expert geostatisticians, who know kriging like the back of their hand; but most encyclopedia readers have no such prior knowledge, and expect to find it in the article. Respectfully yours, Freederick 15:16, 6 November 2006 (UTC)
A short tutorial on Kriging
The following paragraphs come from a paper that I started to write but never finished.
-- The author of the Figure --
The objective of this section is to present Kriging, a method to interpolate or approximate scattered observed data, which can be used to model non-linear phenomena or complex systems in engineering. The interpolation (or approximation) is obtained by linear prediction of a spatial random process. Kriging is computationally practical and easy to implement, since it consists of solving a system of linear equations. This presentation explains the theory of the method and the fundamental connections between Kriging and other similar methods based on the theory of reproducing kernels, namely radial basis functions (RBF) [1], splines \citep{schoenberg64:_splin, duchon76:_inter, Wah90} and methods related to support vector machines (SVM) \citep{vapnik95nature,smola98tutorial,schol02}. The choice of a kernel will also be discussed.
History
Kriging originates from the early 1950s work of D.G. Krige, a South African mining engineer whose aim was to elaborate maps of ore grade from scattered samples \citep{krige51:_witwat}. The method was adapted and formalized by the French mathematician Georges Matheron, who gave it its present name \citep{Mat63}. Kriging is nowadays one of the basic tools of \emph{geostatistics}, a branch of statistics that deals with the description of phenomena involving spatial factors, such as ore prospecting, meteorology, oceanology, etc. In this context, Kriging cannot be dissociated from geostatistical concepts such as \emph{structural analysis}, which is the step that consists in choosing a covariance function from the observed data. Geostatisticians have long experience with data modeling, and this experience proves helpful for the choice of a kernel, a fundamental practical issue in reproducing kernel methods. We shall also consider \emph{Intrinsic Kriging}, an extension of Kriging also developed by geostatisticians, which makes it possible to deal with non-stationary processes, more specifically, random processes comprising unknown trends. An overview of the history of Kriging in the context of geostatistics can be found in \citep{cressie90origin}; see also \citep{chiles99,cressie93statistics} for comprehensive references on the subject. Because of its spatial origin, Kriging has long been restricted to problems where there were only two or three factors -- corresponding to a position -- and it took quite some time to realize that it could also be used in the world of engineering, with more factors of a more diverse nature (see, e.g., \citep{Sac89}). Kriging also has strong connections with the theory of time series, and basically uses the same concepts. Note also that in the pattern recognition community, Kriging is better known under the name of \emph{Gaussian processes} \citep{Wil95}.
Linear prediction and Kriging
Consider a \emph{system} with output denoted by $f(\x)$. The output depends on the values taken by the system inputs, denoted by a vector $\x \in \RR^d$. The components of this input vector will be referred to as the \emph{factors} and can be any quantities that characterize the conditions under which the system operates. The objective of Kriging is to predict the output of the system for a given $\x$. For this purpose, a \emph{black-box model} is built based on a finite set of observations $f_{{\x}_i}$, $i \in \{1,\cdots,n\}$ of the output of this system, for various values $\x_i$, $i \in \{1,\cdots,n\}$. An observation $f_{{\x}_i}$ is not necessarily equal to $f(\x_i)$ since the output may be corrupted by noise. Mathematically, the problem of predicting $f(\x)$, based on the observation set $(\x_i, f_{{\x}_i})$, $i=1,\cdots, n$ can be formulated as one of function approximation or interpolation.
Since the system remains uncertain despite the observations, a natural idea is to model the output of the system by a random process, denoted by $F(\x)$. The observed outputs $f_{\x_i}, i=1,\cdots,n$ are thus considered to be realizations of the random variables $F(\x_i)$. The observation noise, which can corrupt the output, is not taken into account in this first section. With this probabilistic formulation, a first approach to predict the system could be to simulate the output \emph{conditionally} on the observed random variables (see conditional simulation in annexes). Such an approach is shown in Figure~\ref{fig:simu}, where several simulated realizations, or trajectories, of the process are represented. Since each conditional trajectory interpolates the data, the simulation can be seen as one possible way of predicting the system. However, it is often preferred to choose \emph{one relevant prediction}, for instance an ``average trajectory'', smoother than the realizations of the random process, in order to minimize the risk of a wrong prediction.
The kriging method is to choose the \emph{best linear predictor}, which is explained in the remainder of this section. \emph{Linearity} implies that for all $\x$ the predictor $\hat{F}(\x)$ of $F(\x)$ is obtained as a linear projection on the space $\HH_S = \mathsf{span} \{F({\x}_1), \cdots, F({\x}_n)\}$, \emph{i.e.} a linear combination written as \begin{equation}
\label{eq:1} \hat{F}(\x) = \sum_{i=1}^n \lambda_i(\x) F({\x}_i)\,,
\end{equation} where $\forall i \in \{1,\cdots,n\}$, $\lambda_i(\x)\in \RR$. The \emph{best} approximation corresponds to choosing an orthogonal projection. In order to define this orthogonal projection, it is assumed that the space of random variables is endowed with the classical scalar product, the expectation of the product of two random variables, that is, $(X,Y) = \EE[XY]$. The hypotheses on $F(\x)$ must also be specified at this stage. $F(\x)$ is assumed to be a stationary, second-order random process defined by its \emph{mean} $b=\EE[F(\x)]$ and \emph{auto-covariance function}, or in short \emph{covariance}, written as \begin{equation} \label{eq:2} R(\x,\vb{y}) = \cov [F(\x), F(\vb{y})]\,. \end{equation}
This covariance plays a fundamental role in Kriging since the prediction mainly depends on the choice of a given covariance, as will be discussed in Section~\ref{sec:choosing-covariance}. Note that the hypothesis of stationarity will be discussed in Section~\ref{sec:regul-krig} when introducing intrinsic Kriging. For the time being, it will also be assumed that $F(\x)$ is a \emph{zero-mean} process. If $b$ is known and differs from zero, it can be subtracted from $F(\x)$.
Orthogonal projection is obtained when the prediction error $\hat{F}(\x)-F(\x)$ is orthogonal to $\HH_S$, \emph{i.e.} \begin{equation} \label{eq:3} \EE[(\hat{F}(\x) - F(\x))F({\x}_i) ] = 0\,, \forall i \in \{1,\cdots , n\}\,, \end{equation} or equivalently, when the variance of the prediction error, written as $\var[\hat{F}(\x) - F(\x)]$, is minimized. This is a classical least-squares regression problem and its solution can be written using the well-known linear prediction formula (see Annex~1) \begin{equation} \label{eq:4} \hat{F}(\x) = \bm{\lambda}\tr \vb{F} = \vb{r}\tr(\x) \vb{R}^{-1} \vb{F}\,, \end{equation} where $\bm{\lambda}(\x)\tr = [\lambda_1(\x), \cdots, \lambda_n(\x)]$, $\vb{r}\tr(\x)$ is the row vector of covariances, $$ \vb{r}\tr(\x) = [R({\x}_1, \x), \cdots, R({\x}_n ,\x)]\,,$$ and $\vb{R}$ is the covariance matrix of the random vector $$ \vb{F} = [F({\x}_1), \cdots, F({\x}_n)]^{\mathsf{T}}\,. $$ The covariance matrix $\vb{R}$ is in general full rank, so that its inverse exists (of course, one should not invert the matrix explicitly to solve the linear system). However, when the number of observations increases, the matrix can become ill-conditioned, which leads to numerical instabilities.
Note that the predictor (\ref{eq:4}) is unbiased since the mean of $F(\x)$ is known. A simple example of linear prediction is illustrated by Figure~\ref{fig:ex_krig}, which represents an interpolation with the output depending on one factor only. Thus, Kriging gives the possibility to predict a system for values of factors that have not been observed. The interpolation property means that when the factors are assigned values corresponding to past observations, the prediction is equal to the already observed output. It should be also intuitive that the more observations are made the more precise the prediction becomes, which is explained below.
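As an illustration of the linear prediction formula (\ref{eq:4}), here is a minimal Python sketch, not a reference implementation. It assumes a single factor ($d=1$), a zero-mean process, and a Gaussian covariance with hand-picked parameters; the function names (\texttt{gaussian\_cov}, \texttt{solve}, \texttt{kriging\_weights}, \texttt{predict}) and the data values are hypothetical and chosen only for the example. The weights are obtained by solving $\vb{R}\bm{\lambda} = \vb{r}(\x)$ rather than by inverting $\vb{R}$.

```python
import math


def gaussian_cov(x, y, sigma2=1.0, length=1.0):
    """Assumed covariance R(x, y) = sigma2 * exp(-(x - y)^2 / (2 * length^2))."""
    return sigma2 * math.exp(-((x - y) ** 2) / (2.0 * length ** 2))


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b]; copies are made so the inputs are not modified.
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]
    # Back substitution.
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def kriging_weights(xs, x, cov=gaussian_cov):
    """Weights lambda(x), solution of R lambda = r(x)."""
    R = [[cov(a, b) for b in xs] for a in xs]  # covariance matrix of the observations
    r = [cov(a, x) for a in xs]                # covariances between observations and x
    return solve(R, r)


def predict(xs, fs, x, cov=gaussian_cov):
    """Linear prediction lambda(x)^T f for a zero-mean process."""
    lam = kriging_weights(xs, x, cov)
    return sum(l * f for l, f in zip(lam, fs))


# Hypothetical observation sites and observed values.
xs = [0.0, 1.0, 2.5]
fs = [0.3, -0.2, 0.8]

print(predict(xs, fs, 0.5))  # prediction at an unobserved point
print(predict(xs, fs, 1.0))  # prediction at an observed point
```

Evaluating \texttt{predict} at one of the observed $\x_i$ returns the observed value up to rounding error, which is the interpolation property mentioned above.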
The main properties of Kriging are best explained by the behavior of the variance of the error of the prediction, which is given by the Pythagorean relation \begin{eqnarray}
\label{eq:var_error} \var(\hat{F}(\x) -F(\x)) &=& \var F(\x) - \var \hat{F}(\x) \\ &=& R(\x,\x) - \bm{\lambda}(\x)\tr \vb{R} \bm{\lambda}(\x) \\ &=& R(\x,\x) - \vb{r}\tr(\x) \vb{R}^{-1} \vb{r}(\x)\,.
\end{eqnarray} It is then straightforward to assess the quality of the prediction with confidence intervals (error bars) deduced from the square root of the variance of the error (error bars are also shown in Figure~\ref{fig:ex_krig}).
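As a small worked example of (\ref{eq:var_error}), consider the simplest case of a single observation ($n=1$) of a zero-mean process. Assume, for illustration only, a covariance of the form $R(\x,\vb{y}) = \sigma^2 \rho(\|\x-\vb{y}\|)$ with $\rho(0)=1$; the symbols $\sigma^2$, $\rho$ and $h$ are not part of the text above and are introduced only for this example. Then $\vb{R} = \sigma^2$ and $\vb{r}(\x) = \sigma^2 \rho(h)$ with $h = \|\x - {\x}_1\|$, so that $$ \hat{F}(\x) = \rho(h)\, F({\x}_1)\,, \qquad \var(\hat{F}(\x) - F(\x)) = \sigma^2 - \sigma^2\rho(h)^2 = \sigma^2 \left(1 - \rho(h)^2\right)\,. $$ The error variance vanishes at the observation point ($h=0$), where the predictor interpolates the data, and it tends to the prior variance $\sigma^2$ when $\rho(h) \to 0$, that is, when the observation carries no information about $F(\x)$.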
[edit] To be continued
—Preceding unsigned comment added by Antro5 (talk • contribs)
[edit] References
[1] Powell, M. J. D., Radial basis functions for multivariable interpolation: A Review, Algorithms for Approximation of Functions and Data, Oxford University Press, J.C. Mason and M.G. Cox Eds, pp 143-167, 1987
—Preceding unsigned comment added by 160.228.95.69 (talk • contribs)
[edit] Where is the meat?
Quoting from the article: “The Kriging estimate is a weighted linear combination of the data. The weights that are assigned to each known datum are determined by solving the Kriging system of linear equations,...”
Quoting from the last (anonymous) edit on the Talk Page: “Kriging is very computationally practical and its implementation is easy, since it consists in solving a system of linear equations.”
Where are the goddamn equations? Are they legendary? IIUC, they should be the main point of the article, which is well-nigh useless without them. Freederick 19:45, 2 December 2006 (UTC)
- Maybe you can read Portuguese?
- No. Freederick 22:45, 18 January 2007 (UTC)
References to Matheronian voodoo statistics ought not to be removed!--Merksmatrix 22:21, 3 February 2007 (UTC)
- Duh? Is that slogan somehow related to my request? What I was asking is that some critical data be added, not removed. Voodoo will do, for lack of better, as long as I can write a program realistically interpolating non-gridded elevation values based on that voodoo. I'm an engineer, not a mathematician; I'm comfortable working with empirical equations. Freederick 13:45, 2 March 2007 (UTC)
Dear Mr Nick Didlick aka Merksmatrix,
First, I think you do not understand very well what linear prediction is about and what Kriging means. In my opinion, you tend to confuse the data and the probabilistic model. Do you want to prevent people from fitting linear models because the underlying process that generated the data may not be that linear? Anyway, if people want to use Kriging, why do you want to prevent them from doing that?
Why do you persist in using Wikipedia to spread your own point of view, against the NPOV?
If you have business in telling revisionist stories against Kriging, good for you. But not on Wikipedia.
[edit] History section
The introduction as of now contains too much history in my opinion. I think the origin of the method should be postponed until after the method has been described, and in a dedicated History section. Berland 05:54, 6 February 2007 (UTC)
The first half of the history section is essentially a repetition of the introduction, but the second part is incorrect and polemical.
I will discuss the incorrect parts of the history section as it stands now (March 2007) in detail, quoting the current text, with each quoted passage marked with a >:
>Matheron, in this Note Géostatistique No 28, derives k*, his 'estimateur' and a precursor to the kriged estimate or kriged estimator.
The estimator is not called k* in the contribution. 'Estimateur' is just the French word for estimator. The terms 'kriged estimate' and 'kriged estimator' are not normally used. I suspect that these strange terms were intended to make Matheron look ridiculous. Furthermore, the kriging estimator has several forerunners in publications by Matheron and Krige.
>In classical statistics, Matheron’s k* is the length-weighte average grade of each panneau in his set.
In classical statistics the kriging estimator is the best linear unbiased predictor. The object to be estimated in these early publications is an area-weighted average. However, the estimator and the object to be estimated are still different concepts. The kriging estimator is not length-weighted in any sense. Neither the estimator nor Matheron has any specific set other than a dataset. 'Panneau' is French and probably not understood by most readers of the English Wikipedia, especially since it is used here as a technical term from the mining industry. The sentence is thus only a strange and doubtful description of the kriging estimator itself.
>What Matheron failed to derive was var(k*), the variance of his estimateur.
Matheron was well able to compute variances of linear combinations of observations from a random field (such as the kriging estimator), as can be seen explicitly, e.g., in his script on stochastic processes [[4]] on page 108 (page 328 of the PDF, using E[X]=0 as stated before).
>On the contrary, Matheron computed the length-weighted average grade of each panneau but did not compute the variance of this central value.
Again, it is not length-weighted, so the first part of the sentence is wrong. More importantly, the computation and minimisation of the estimation variance (i.e. the variance of the difference between the estimator and the true value) is the central core of the whole theory developed by Matheron. The estimation variance is given, e.g., in Matheron (1971), The theory of regionalized variables and its applications [[5]], on page 65, formula 2-15. Thus the second part of the sentence is misleading.
>In time, Matheron replaced length-weighted average grades for sampling units such as blocks of ore with more abundant distance-weighted average grades for sample spaces where spatial dependence need not be verified but may be assumed.
The content of this is unclear. Matheron used neither length-weighted nor distance-weighted averages for kriging. Indeed, in earlier publications he always looked directly at blocks of ore, while in later publications he used the easier approach based on regionalized variables. Maybe the path from a more applied to a more mathematical theory is also meant to be outlined here. However, I don't think that anybody who does not know the details will read this into the sentence. By the way, the sentence would have to be categorized as original research, since the author gives no citations for these claims.
>In Matheron's new science of geostatistics, both central values metamorphosed into either a kriged estimate or a kriged estimator.
Unclear: Which two central values? What is the meaning of metamorphosing here: "The values were called kriging estimators", or "The values were modified to become kriging estimators", ...?
>Matheron’s 1967 Kriging, or Polynomial Interpolation Procedures? A contribution to polemics in mathematical geology, praises the precise probabilistic background of kriging and finds least-squares polynomial interpolation wanting.
A very well designed polemic can be found here in Wikipedia. Indeed, there was a polemical discussion going on back in 1967 between Prof. Krige and Prof. Whitten. Matheron, opposing this polemical style (hence the title), settled the problem scientifically by giving clear arguments and a numerical example. There is, however, no polemic in the contribution itself, contrary to what the sentence above suggests. There is no praising, but a statement of the probabilistic model, and there is also a clear description of the field of application of polynomial interpolation. The conclusions are left to the reader. In any case, this contribution is not a relevant milestone in the history of kriging.
>In fact, Matheron preferred kriging because it gives infinite sets of kriged estimates or kriged estimators in finite three-dimensional sample spaces.
We cannot know why Matheron preferred kriging (indeed, I never saw an inventor of a theory not supporting his own theory), but there is certainly no hint of infinite sets or of three-dimensional space. I did not even find 'kriged estimates' or 'kriged estimator', only the usual term 'kriging estimator'. It is very strange, then, to hear that Matheron preferred kriging because of things he never mentioned.
>Infinite sets of points on polynomials were rather restrictive for Matheron’s new science of geostatistics.
Again, infinite sets are not even mentioned in the cited publication. The only occurrence of 'finite' is in the discussion of finite variances (which refers to the value of the variance, not to the number of variances).
In conclusion the history section in its current shape is a contribution to polemics and should be rewritten or removed.
Boostat 16:44, 25 March 2007 (UTC)
[edit] Let's improve the article by adding the meat.
In my view, the article, its history, and the discussion look more like a battlefield than like an encyclopaedic definition or a wiki-type collaboration.
It obviously needs some major revisions.
In my opinion the structure proposed by Scheidtm seems a good starting point, but still needs to be filled in and completed. The short tutorial part provided by someone seems more suitable for Wikibooks or as part of an external tutorial. It is very important to put in the information Freederick requested.
I would therefore propose to fill in more relevant material loosely following the Scheidtm scheme. And we could rearrange the article to a nice form afterwards.
Some issues with the content:
- Does anyone have a reliable reference for the claim that Krige's master's thesis, and not his work at the mine, was the seed?
- Kriging is known in mathematical statistics as the Best Linear Unbiased Predictor or Estimator, or as Kolmogorov-Wiener prediction, and in geodesy as collocation. It is related to splines in reproducing kernel Hilbert spaces and to radial basis function interpolation, can be related to regression under the assumption of a multivariate Gaussian distribution, might be used with polynomial regression surfaces, and has approximately 20 other relations to mathematical techniques in approximation, numerics, functional analysis, ... But is it really necessary to try to mention all these relations in the first paragraph? Especially because the relations are not all as simple as the article suggests. E.g. only simple Kriging is directly translatable to a standard Bayesian technique.
- The "Black Box Modelling" section seems to hint at a non-standard application and is confusing, since kriging is a linear technique while the section uses it for non-linear estimation, and it lacks any details that would help the reader understand what this is about.
- The notation in the section on kriging interpolation is not really understandable to those not already familiar with the standard notations for random fields. The confidence limits in the graphics are not really explained and only hold in the special case of Gaussian random fields.
- The article lacks information on:
  - Variogram modelling
  - Assumptions and prerequisites of Kriging
  - The zoo of kriging techniques. Kriging is not one method, but a family of methods.
  - The kriging equations!!!!
  - The concept of the kriging error, the kriging variance and the errors in the errors and maybe the human errors on the errors of the errors. :-)
  - A hint to alternatives to kriging
- The "Controversy" section is very narrow in scope, using an argument like: there exists a hypothetical and a manipulated example in which kriging is not applicable because no spatial correlation exists. This is very little for a technique for which many other truly critical issues need to be checked before kriging can be applied reliably, e.g. trend, stationarity, reliability of variogram estimation, Gaussianity, quality of data, ...
- The "Related Terms" section seems to be an unsorted, random list of terms that have been used somewhere, sometime, by someone. The only real piece of information, that "conditional simulation" is used as a substitute, is simply wrong (unless one refers to "Multiple Point Statistics", and that would be the POV of Stanford), since most conditional simulations are indeed based on kriging. We need to introduce a hierarchy and some hint as to what is what here.
- The second part of the History section claims that Matheron was not able to compute the variance of the estimator. This is not true, since he proved in [1] that the variance of the estimator plus the kriging variance is the variance of the random field for simple kriging of a second-order stationary random field, which is an application of nonequivariate Gauss-Markov Theory. The true quarrel is about which one of "the variance of the kriging estimator" and "the variance of the difference of the kriging estimator and the true unknown value" is the right measure of uncertainty for the kriging estimator. The choice of Matheron was the second one.
- The reference section could be used to hint to a set of useful books like the books of Chiles and Delfiner, Clark, Deutsch and Journel, and Journel and Huijbregts ... at least.
- We should reference free software such as GSLIB and the R packages
- The link section should be NPOVed. E.g. the Library of the Ecole des Mines de Paris is really not a chronicle of any journey, but an online library of the publications of the whole school.
That's all for the moment.
Boostat 12:01, 2 March 2007 (UTC)
Right on Boostat. I like what you've done so far. This article actually says something. SCmurky 22:32, 6 March 2007 (UTC)
Just corrected some small formatting issues, typos and the like. Unfortunately, I'm new to Wikipedia, and didn't notice the "this is a minor edit" option until it was too late. Sorry. Tolosimplex 12:35, 14 March 2007 (UTC)
[edit] Nice
The Mathematical Details section is quite clear and it seems to describe nicely what Kriging is. It does not look too technical to me at all, even though I am not a statistician. Maybe after some general smoothing the tag can be removed. Jmath666 21:05, 18 March 2007 (UTC)
- I took off the flag. The article is much better now than it was when the tag was put up. :-) Freederick 00:48, 19 March 2007 (UTC)
[edit] Pronunciation
How does one pronounce "kriging"? Soft or hard g?
Also, is the "i" long or short? Does it rhyme with Blitzkrieg or bridge? —Preceding unsigned comment added by 65.125.90.222 (talk) 18:12, 19 September 2007 (UTC)
Since posting the above question about the "i", I read the following:
"Pronunciation: Hard “g” (as in Danie Krige) or soft “g” (á là Georges Matheron), take your pick" [6] —Preceding unsigned comment added by 65.125.90.222 (talk) 18:52, 19 September 2007 (UTC)
[edit] My god you guys can't half waffle on :-)
Why oh why can't you people give a straight answer?! It's infuriating! I'm sure this subject is simple to the mathematicians out there, but for those without a mathematical background this is fairly heavy going. It would really help to have a plain English step-by-step guide to this subject that, after much reading around, doesn't appear to be all that difficult and shouldn't be that much trouble to do. Might be wrong though, and feel free to shoot me down in flames... 90.204.128.225 21:26, 20 June 2007 (UTC) Adam
[edit] A Typographical Error?
In the section Simple kriging error the following appears:
Should this actually be:
In case the difference is not immediately obvious, appears below the rightmost term in the former versus in the latter.
R. A. Hicks (talk) 08:36, 24 January 2008 (UTC)
[edit] Two comments on the mathematical details section
1) I agree with R. A. Hicks about the typo in the simple kriging error expression. The only way that the subsequent line
- which leads to the generalised least squares version of the Gauss-Markov theorem (Chiles & Delfiner 1999, p. 159):
follows is if Hicks' proposed change is made.
2) I worked through a derivation of the RHS of the line:
- :
in the section "General equations of kriging", and it seems to me that this formula is only valid if it is assumed that E[Z(x_0)] = E[Z(x_1)] = ... = E[Z(x_n)] (i.e. using the assumptions of ordinary kriging). This is not necessarily a problem, except that the expression immediately below,
implies that they may potentially be different (i.e. an assumption of universal kriging). Currently, the various types of kriging are not introduced until the next subsection. I propose shuffling the subsection order to present the kriging types first, then presenting the correct series formulas in their respective subsections (this includes adding them to the simple kriging section, since some people may not be familiar with the fact that quadratic forms can be re-written as dual summations).
If there are no objections (I'll naturally wait a couple days), I'm going to effect these changes.
Fun with aluminum (talk) 14:52, 2 March 2008 (UTC)
[edit] How's it pronounced?
It's not obvious from the spelling how this term should be pronounced. Should it be "crigging", "cryging", "kreeging", or "k-rigging", or something else? -- 80.168.224.207 (talk) 19:46, 6 March 2008 (UTC)
- Oh: I see the question has already been asked above. The answer seems to be "people can't agree on a single pronunciation." -- 80.168.224.207 (talk) 20:15, 6 March 2008 (UTC)