Below we answer some questions that may arise about Markov chain geostatistics. The purpose is to avoid further misunderstandings.
1. What are the supporting theories of Markov chain geostatistics?
Bayes' theorem (particularly Bayesian updating) and Markov chain theory. Other assumptions, such as the spatial conditional independence
assumption of nearest data within a neighborhood (extended from a cardinal-neighbor property of Pickard random fields to the sparse sample
data situation; see Li 2007a, appendix) and the spatial stationarity assumption, are also needed for simplifying the MCRF full solution
and implementing the approach. As a stationary Markov field in which any data sequence along a monotone path is a Markov chain, the
Pickard random field is important for deriving the single-chain-based multi-D Markov chain (i.e., MCRF) model.
2. Is Markov chain geostatistics a form of Bayesian spatial statistics?
There is no doubt about this point. The MCRF model (including all MCRF models, such as the general full model, general simplified model,
and specific models for specific purposes) was derived using Bayes' theorem (using spatial sequential Bayesian updating). It represents
a new way to deal with nearest data within a neighborhood and thus should be a fundamental Bayesian spatial model at the neighborhood
nearest data level. Although it looks simple, developing such a fundamental geostatistical model for spatial categorical sample data was
never a straightforward task. It might have been unimaginable in some people's eyes. We did not have the special capability to get there in one
step, given our previously limited knowledge of statistics. But we did it the long, hard way, because our research and personal experience
brought us to this road. After the MCRF approach was proposed, published and developed into useful techniques, some people suddenly found
the MCRF model very simple and wanted to repropose it repeatedly in different ways under different names. That was really a historical
joke and a shame created by some people in geo/spatial-statistics. The webpages "From the CMC model to the MCRF model", "The
single-Markov-chain random-field idea", and "Markov chain random field theory" explain the motivation and the derivation process of the
MCRF model and also illustrate its relation with and difference
from the coupled Markov chain (CMC) model. Although the MCRF model was derived using Bayes' theorem and the factorization based on the
relationship of joint probability and conditional probability, the MCRF approach is different from the usual Bayesian inference methods in
statistics, because the latter focus on using Bayes' theorem to infer parameters. In the MCRF model, the parameters are estimated
from sample data through transiograms (transition probabilities expressed as functions of the lag distance) rather than inferred through
Bayesian analysis. We later found that the MCRF model, which can be visualized as a probabilistic directed acyclic graph, conforms to
Bayesian networks. However, constructing such a neighborhood-based Bayesian network on spatial data and unobserved locations through
reasoning is difficult, if not impossible, due to the lack of causality in spatial data. Fortunately, the single-Markov-chain
random-field idea and the spatial sequential Bayesian updating idea helped overcome
the challenge of causality.
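To make concrete what estimating parameters "through transiograms" can look like, here is a minimal Python sketch. It assumes a regularly spaced 1-D sequence of integer class labels and simply counts, for each lag h, how often class i at a location is followed by class j at h steps away, then normalizes each row. The function name `experimental_transiogram` and the toy sequence are illustrative assumptions and are not taken from the MCRF publications or software.

```python
import numpy as np

def experimental_transiogram(seq, n_classes, max_lag):
    """Estimate p_ij(h) for h = 1..max_lag from a regularly spaced 1-D
    sequence of integer class labels (0 .. n_classes-1).

    Returns an array of shape (max_lag, n_classes, n_classes) in which
    result[h-1, i, j] is the estimated probability of finding class j
    at lag h given class i at the starting location.
    """
    seq = np.asarray(seq)
    result = np.zeros((max_lag, n_classes, n_classes))
    for h in range(1, max_lag + 1):
        tails = seq[:-h]   # classes at the starting locations
        heads = seq[h:]    # classes h steps further along the sequence
        counts = np.zeros((n_classes, n_classes))
        np.add.at(counts, (tails, heads), 1)   # tally tail -> head pairs
        row_sums = counts.sum(axis=1, keepdims=True)
        # Rows with no observed pairs are left as zeros.
        result[h - 1] = np.divide(counts, row_sums,
                                  out=np.zeros_like(counts),
                                  where=row_sums > 0)
    return result

# Hypothetical toy data with three classes (0, 1, 2).
toy_sequence = [0, 0, 0, 1, 1, 0, 0, 2, 2, 2, 0, 0, 1, 1, 1, 0]
transiograms = experimental_transiogram(toy_sequence, n_classes=3, max_lag=3)
```

In practice, experimental transiograms estimated from sparse samples would still need to be fitted with models before use; that step is outside this sketch.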
3. Is a Markov chain random field a Markov chain?
Yes. We think so. At its simplest, an MCRF is a conventional 1-D (continuous or jumping) Markov chain. But in other situations, an
MCRF is not a conventional 1-D Markov chain. It is a LOCALLY-CONDITIONED Markov chain in a space, or we may say it is a Markov chain
locally conditioned by data at the nearest locations in different directions within a neighborhood. Although the MCRF model is
derived using Bayes' theorem and thus should be a Bayesian spatial model, this does not affect its Markov chain nature. Indeed,
Markov networks (including Markov random fields (MRFs)) and Bayesian networks are different. A conventional Markov chain is a simple
Bayesian network, and an MCRF, as a locally-conditioned Markov chain, can be described as a complex neighborhood-based spatial Bayesian
network. Apparently, both a locally-conditioned Markov chain and a neighborhood-based Bayesian network on spatial data points are
novel in spatial statistics.
The MCRF models were initially called "spatial Markov chain" models when the related
manuscripts were submitted to journals in 2004 (the MCRF term was mentioned only in the text of some manuscripts and was not used in
a manuscript title or for referring to the whole approach). Consequently, the idea was also called "spatial Markov chain" theory. The
idea covers a series of specific single-chain-based spatial Markov chain models for different dimensions or data situations. Later, I
felt that the term "spatial Markov chain" might be too general to express the unique nature of the new model clearly, so I decided to use
the more distinctive term "Markov chain random field" (MCRF) in the revised version of the manuscript of Li (2007a) to represent the general
solution and all of the single-chain-based Markov chain spatial models. The term "spatial Markov chain" was still kept in Li (2007a)
to refer to specific spatial Markov chains in one to multiple dimensions within the framework of the MCRF theory. It is unfortunate that,
over more than the last ten years, the irrational use of the term "spatial Markov chain" by some people for incorrect or irrelevant models
has muddled it.
4. Are there any conflicts between the CMC model and the MCRF model?
We don't think there are any conflicts between the two models. We had no arguments with Elfeki and Dekking openly or privately on
any scientific issues related to the CMC and MCRF models. In our paper (Li et al. 2004), where we modified and extended the CMC model
and tried to make it more practical for predictive soil mapping, we pointed out that the model has some deficiencies in simulating
categorical soil variables (mainly the small class underestimation problem and the layer/patch inclination problem), based on our
extensive tests and research records on soil data. The small class underestimation problem was actually demonstrated clearly in two
simulation cases by Elfeki and Dekking (2001). Their paper did not explicitly discuss the small class underestimation problem, probably
because their understanding of the problem was not the same as ours at that time. As for the layer inclination problem, Elfeki and
Dekking (2001, 2005, 2007) also showed it more or less in their simulation cases, although not as clearly as in our simulation cases.
There are differences between soil
type/layer patterns and geological lithofacies layer patterns (e.g., the lithofacies layers they simulated are extremely straight and
long), which might cause some difference in simulated results. Changes in grid cell size (length and width) and the use of subjective
parameters (i.e., improperly-set horizontal transition probability values) in some simulation cases, which may implicitly alter class
proportions and correlation ranges, may be other reasons. We neither wrote wrong equations nor provided wrong case studies with improper
interpretations to mess up their model. Our case studies used widely existing soil patterns and objectively-estimated parameters (i.e.,
transition probabilities estimated from real data). In fact, I suggested forming an international collaboration group with Elfeki (he
further recommended N. Park) in Fall 2005 when he contacted me to express his intention to collaborate. However, things did not develop
as expected later (probably due to interference and misunderstandings by others).
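The remark above that improperly set transition probability values may implicitly alter class proportions and correlation ranges can be checked numerically. The following is a minimal Python sketch under the assumption of a stationary first-order Markov chain on a regular grid: the class proportions implied by a one-step transition matrix are its stationary distribution, and the mean run length of class i (in cells) is 1/(1 - p_ii). Both matrices and both function names are hypothetical and chosen only for illustration; they do not come from any cited study.

```python
import numpy as np

def implied_proportions(P):
    """Stationary distribution pi of a transition matrix P,
    i.e. pi @ P = pi with the entries of pi summing to 1."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    # Stack the balance equations (P^T - I) pi = 0 with the
    # normalization constraint sum(pi) = 1 and solve by least squares.
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def mean_run_length(P):
    """Expected number of consecutive cells spent in each class."""
    return 1.0 / (1.0 - np.diag(np.asarray(P, dtype=float)))

# Hypothetical matrix in which class 2 is a minor, short-range class.
P_data = np.array([[0.80, 0.15, 0.05],
                   [0.30, 0.65, 0.05],
                   [0.50, 0.30, 0.20]])

# Same matrix with only the row of class 2 altered.
P_edit = P_data.copy()
P_edit[2] = [0.02, 0.03, 0.95]

pi_data = implied_proportions(P_data)
pi_edit = implied_proportions(P_edit)
runs_data = mean_run_length(P_data)
runs_edit = mean_run_length(P_edit)
```

In this toy case the implied proportion of class 2 rises from 1/17 to 1/2 and its mean run length from 1.25 to 20 cells, although only one row of the input matrix was changed. Comparing `pi_data` with the class proportions observed in the sample data is one simple way to see whether an input matrix is in conflict with the data.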
Our study didn't aim to expose mistakes of others. The CMC model was initially proposed
in Elfeki's PhD dissertation (Chapter 3, p. 64-98) published in 1996, but it was not well tested (Elfeki only simulated a hypothetical
formation of inclined layers without data analysis, and focused his research on solute transport). My postdoc report was done and written
in 1999. The purpose of the postdoc research was to apply the model to a field water study rather than to detect its defects. Using the CMC
model was suggested by one of my collaborators in Belgium. I developed a Fortran computer program for the model based on the model
description in Elfeki's PhD dissertation. Although our simulated results were not good (contrary to our expectation) and the postdoc research
did not reach its goal, the report did not say anything bad when presenting the simulated results. We didn't know Elfeki was still in the
Netherlands at that time; otherwise, we would have contacted him after we ran into problems with the model. The report was neither published
nor put online before 2015 (we had to put it online after 2015 to clarify confusion caused by some persistent misunderstandings).
We never submitted a manuscript to any journal on the topic before we believed we had basically understood the defects of the CMC model
and made our own progress (contribution) in spring 2003. We never reviewed a manuscript for any journal before summer 2005. They had
sufficient time to publish any progress they made on the model without our influence, if they had made any. So it is difficult to say that our
research affected their research on the CMC model. Even after our papers that pointed out the defects of the CMC model were published and we
demonstrated, proved, and solved the defects of the CMC model in publications, the CMC model was continuously used by some researchers in
journal publications. In fact, the CMC model was even used and developed in a series of publications in engineering geology since
2016, by some Chinese researchers in China and in the Netherlands. We have not said anything so far (we think we have proved the defects
of the CMC model sufficiently and also made sufficient clarification), although we could see (could others and the authors themselves not
see?) the unreasonable artificial parameters they used in order to obtain better results (they either modified the values of horizontal
transition probability matrices used as input parameters or set the values improperly, causing them to be in conflict with real data, that
is, the class proportions and correlation ranges implied by the input transition probabilities are thoroughly different from those in the
real data. For example, a minor class with a short autocorrelation range may become a major class with a long autocorrelation range if some
related transition probability values are changed in the input transition probability matrix). If some people are determined to mess up
the scientific field by tricks, nobody can persuade them. We made large efforts to try to solve the major defects of the CMC model with
adherence to its basic ideas (i.e., use two or more 1-D Markov chains and assume they are independent of each other), but could not solve
the small class underestimation problem. Finally, it was the single-Markov-chain random-field idea with the spatial sequential Bayesian
updating idea that solved the major problem. In any case, the MCRF model is related to the CMC model, and we recognized the contribution of
the CMC model to the MCRF model. But theoretically the two models are very different. They use different statistical assumptions and
principles to construct different models. Although the CMC model and the simplified MCRF model look similar in model formula, they are
different in the head-tail directions of some transition probability terms, and it is exactly the difference that makes them generate
different results.
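The head-tail difference mentioned above can be illustrated with a minimal Python sketch for one unobserved cell with a known left neighbor and a known top neighbor. The two formulas below are our reading of the CMC model (Elfeki and Dekking 2001) and of the simplified MCRF model (Li 2007a) for this two-neighbor case and should be checked against those papers: in the CMC form both transition terms point from a neighbor to the unknown cell, whereas in the MCRF form the second term points from the unknown cell to the neighbor. The function names, the toy matrix, and the use of one matrix for both directions are assumptions made only for illustration.

```python
import numpy as np

def cmc_local(P_x, P_y, left, top):
    """CMC-style local distribution: p(k) ~ p_x[left, k] * p_y[top, k].
    Both terms are transitions INTO the unknown class k."""
    w = P_x[left, :] * P_y[top, :]
    return w / w.sum()

def mcrf_local(P_x, P_to_top, left, top):
    """Simplified-MCRF-style local distribution:
    p(k) ~ p_x[left, k] * p_to_top[k, top].
    The second term is a transition FROM the unknown class k
    to the class observed at the top neighbor."""
    w = P_x[left, :] * P_to_top[:, top]
    return w / w.sum()

# Hypothetical one-step transition matrix, reused for every direction
# purely to keep the example small (an assumption, not a recommendation).
P = np.array([[0.80, 0.15, 0.05],
              [0.30, 0.65, 0.05],
              [0.50, 0.30, 0.20]])

# Both neighbors belong to class 0.
p_cmc = cmc_local(P, P, left=0, top=0)
p_mcrf = mcrf_local(P, P, left=0, top=0)
```

With these toy numbers the two local distributions differ, which is all the sketch is meant to show; it makes no claim about simulated class proportions, which depend on the full simulation algorithm and on real transiogram models.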
All of the defects of the CMC model we mentioned and solved (e.g., layer inclination
tendency, small class underestimation) were displayed by Elfeki and Dekking themselves in their published papers (see Elfeki and Dekking
2001, 2005, and 2007), and all of the defects have their scientific reasons (explained and shown more clearly in this website) and also
occurred in some other models, for example, in a Markov random field model (see Norberg et al. 2002 for small class underestimation), in
some Markov mesh and autoregressive models (see Gray et al. 1994 and Sharp and Aroian 1985 for pattern inclination). However, after
our major papers about the MCRF approach were accepted for publication in 2006, some open misunderstandings soon arose: The CMC model
suddenly became a perfect and solely correct Markov chain spatial model in some people's eyes or hands. The MCRF model was then misdescribed
as the CMC model and a simplification of other different spatial models and taken as a new model by some people in open publications.
Unimaginably, after we clarified the misunderstandings, somebody without experience in geostatistical research even described the MCRF
simulation algorithms as path-based methods without models. And a research group (Mr. Z.Z. Wang and his graduate student X. Huang in
Central South University) in China without experience and knowledge in geostatistics even redescribed the published coMCRF model as a
spatial hidden Markov chain model with a case study using the CMC model and claimed it as their original new model, while publishing a
series of misleading cheating articles. It seems that wherever this group saw an existing nonspatial statistical model with conditional
probabilities [e.g., linear Bayesian updating model for integrating expert opinions (e.g, for diagnosis of a disease), and feature
variable-based (e.g., spectral value) (Naive) Bayes/Bayesian classifiers], they wanted to claim a transition probability-based spatial
statistical model on it, even though the conditional probabilities have nothing to do with spatial transition probabilities. Probably
in their eyes, conditional probabilities are equal to Markov transition probabilities and further equal to spatial Markov transition
probabilities. At the same time, some researchers (mainly Chinese researchers) were feverishly using
and developing the CMC model to produce articles with various tricks for concealing its defects, as if the MCRF model were built on lies.
Because these things misled and scared many people, they harmed us very much while serving the fights in geostatistics (as well as in
pedometrics, environmental informatics, and GIS) and the fights in China. Whatever their unknown reasons were, using such means to
mislead people and trouble others was improper. Where were their conscience and integrity!? Did they know how, when, and under what
circumstances the MCRF model was developed, proposed and published? Did they know how long it takes to develop a new geostatistical
approach with practical methods from the initial exploration of the scientific issues encountered? ......
We have no problems with any studies that attempted to explore the possible relationships
between our model/method and other spatial/nonspatial models/methods, or even explain our model/method from different perspectives, as
long as their studies are honest, objective and respectful. However, those publications that provided false data and wrong interpretation
or irrationally claimed a new model on our model by tricks were improper, because they were cheating, misleading and consequently
troublemaking. In addition, we don't think that purely playing mathematical symbol and equation games without physical meanings can be
claimed to be geostatistics or spatial statistics. Even a pure statistical theorem/principle has its physical meaning; otherwise, Bayes'
theorem and detailed balance principle have no difference from the conditional probability formula. The MCRF model as a nonlinear
geostatistical model has its rationality and unique physical meanings for spatial data.
5. Why was Markov chain geostatistics (mainly the MCRF model) misunderstood repeatedly?
We do not know exactly. There seemed to be no convincing reasons for those misunderstandings. If they had read our articles carefully, they
should have been able to understand the important scientific progress we made in solving existing specific scientific issues. Maybe they heard
some rumors from others or were encouraged by others; or maybe they were really confused about the progress we made. After all, most people are
not experts in geo/spatial-statistics. Some researchers might not have expertise in both geostatistics and Markov chains. But the
spatial sequential Bayesian updating over nearest data and the spatial conditional independence assumption of nearest data within a
neighborhood are not difficult to understand. It is also not difficult to make a test to see the difference in results between the
simplified MCRF model and the CMC model using the same neighborhood (see Zhang and Li 2008). Our papers on the MCRF approach were openly
published. Whoever has problems with them or evidence of scientific misconduct can write comment letters; however, so far nobody has written
a single one. Under such a situation, any attempt to trouble the MCRF approach in their publications is irrational. As to some people
committing scientific fraud in articles (e.g., intentional cheating in data), that has nothing to do with misunderstandings. No
misunderstanding of others can justify one's own fraud. In fact, some people immediately found that the CMC model underestimates
small classes in their testing studies published in 2008 and in 2014, respectively. But they did not clearly tell the truth; on the
contrary, they mixed up the MCRF model and the CMC model. It seemed that other things played a role in the "misunderstandings",
and it was obvious that a thirst for publishing articles in some good journals was also one of the major reasons.
There seemed to be some irrational accusations since spring 2004 (or as early as winter 2003),
such as "Elfeki and Dekking did not say their model had deficiencies, why did you say so?", "Used others' technology but did not provide a
good evaluation", "Impacted others' research". These were typical misunderstandings. If they had read the article of Elfeki and Dekking (2001)
carefully, they should have been able to see that our research on this topic, including the MCRF model, published since 2004, had no real
conflicts with the publications of Elfeki and Dekking on their CMC model, published since 2001, except for some different emphases. As
to the MCRF model, it is completely different from the CMC model both in equations and in theory; in addition, the MCRF model
solved the major deficiency of the CMC model, namely small class underestimation. All the deficiencies of the CMC model we mentioned and solved
were displayed by Elfeki and Dekking themselves in their published articles (see small
class underestimation and layer/parcel inclination; in addition, transiogram joint modeling
methods solved the problem of estimating transition probabilities from sample data for implementing multidimensional Markov chain models).
Elfeki and Dekking never expressed any disagreement with our research to us. The CMC
model should be able to generate valuable results in some cases in geology, but with obvious deficiencies in most situations, as shown
by Elfeki and Dekking themselves. But for soil or land cover patterns, it does not work well in most situations due to the complex and
heterogeneous landscape patterns (there are always small classes) as well as the wide use of scattered point sample data; that was why we
made efforts to solve the existing problems shown in our studies. Some things that occurred around the MCRF approach over more than the last
decade were unreasonable and have gone beyond science and rationality. They not only ruined the chances of collaborative research and blocked
the normal advancement of geostatistics, but also created a series of chaos, dilemmas and reputation loss. There are always some
opportunists who want to create chaos.
6. Why did you write comment letters? Why not just ignore them and do your research?
We don't like to write comment letters. Before 2011, we never wrote a comment letter on others' published articles. But later we
had to do so. Our comment letters aimed to clarify misunderstandings, rather than attack or stain others. So far we never wrote a
comment letter on any article that did not seriously misunderstand our research in Markov chain spatial simulation and cause serious
problems for our reputation and career. If we did not clarify some misunderstandings, who knows what would happen to us next. If one
has problems with our research and publications in Markov chain spatial simulation, he/she should deal with it through a normal/formal
way, for example, communicating with us face to face or through emails, or writing a comment letter on our published papers and letting
us respond, which would be welcome and solve problems. It is not suitable to harass others through improper manners, purely based on
one's imaginations or prejudices. In addition, we could be polite to the first misunderstanding and the second misunderstanding in
open publications, but we might not be so polite to further "misunderstandings" that were essentially frauds and/or aimed to harm us.
However, after repeated clarifications and exposing a couple of scientific frauds, there is really no necessity to argue further on
scientific issues with those intentional troublemakers. If some people have such a strong desire to show off their tricks in troublemaking
or plagiarising, and some journals want to publish their rubbish, let them publish. Sooner or later, people will see the truth.
7. What is the difference between the MCRF model and the MRF model?
The MCRF model and the conventional Markov random field (MRF) model are different in both mathematical formulas and implementation
methods, although both are based on the Markov property. The MRF model is equivalent to the Gibbs distribution. To our knowledge
at the time we were developing the MCRF model more than ten years ago (i.e., 2004), the MRF model was mainly used as a typical lattice
model (it does not matter whether the lattice is regular or irregular) (see Cressie 1993), usually using a 3x3 grid cell square
neighborhood on image data, and it had existed since 1974 as a well-established spatial statistical theory (in fact,
studies on MRFs began in the 1950s). It was often used for image processing without conditioning on sample data. The MRF model was normally
implemented using iterative algorithms, such as the Gibbs sampler or the Metropolis-Hastings algorithm. When it was used with sample data,
an initial image was still needed as the starting point of iterations. For some reasons (see Tjelmeland and Besag 1998, Norberg
et al. 2002), it was previously not regarded as a good method for conditional simulations on sparse sample data in earth sciences. That
might be one reason why people in earth sciences continuously made efforts to develop multi-D Markov chain models after the 1970s. MRFs were
expanded to Markov networks, which might deal with nonspatial data. But it seems difficult to say that Markov networks are a geostatistical
model for local probability distribution estimation based on sparse spatial data. This did not mean that the conventional MRF models or
Gibbs distribution could not be extended or simplified based on some ideas from other methods or under the inspiration of other progresses,
including the ideas of the simplified MCRF models. [Looking back at the things that occurred during the last 14 years (since winter 2003 or
spring 2004), it is not difficult to guess that some misled influential people had probably called repeatedly for international actions
to check the research in multi-D Markov chain modeling. However, it is obvious that some checkers were just foolers and/or takers, rather
than truth seekers.]
However, the MCRF model (Li 2007a) is a spatial Bayesian updating model at neighborhood
nearest data level. It was initially proposed to solve the small class underestimation problem of the coupled Markov chain (CMC) model
(see From the CMC model to the MCRF model) and then generalized as a geostatistical model for
simulating categorical spatial variables. The MCRF model was initially based on the single-Markov-chain random-field idea (in order to
avoid the conflicting transitions that occur but are excluded in constructing the CMC model). The mathematical derivation
procedure of the MCRF model is simple: It assumed a locally-conditioned Markov chain and the local conditioning on nearest data
in different directions within a neighborhood followed a sequential Bayesian updating process (it was done by decomposing the joint
probability distribution of nearest data and the central unobserved random variable into a series of likelihoods and a prior probability
using the relationship of joint probability and conditional probability). To simplify the full solution with multi-point likelihoods, I
further applied the spatial conditional independence assumption of nearest data (extended from a property of Pickard random fields) to
the MCRF model. This derivation process has no relation with maximum entropy or MRF (i.e., Gibbs distribution), and the model formula
of MCRFs is also unique. There was no experiential addition or elimination of any components. Although the entire derivation process
looks simple, it was based on spatial thinking of neighborhoods and spatial auto/cross-correlations, which make it thoroughly different
from nonspatial statistical models. Therefore, the MCRF model is a typical fundamental Bayesian spatial statistical model at neighborhood
nearest data level. These points were clearly presented in Li (2007a), Li and Zhang (2008), Li et al. (2013) and Li et al. (2015).
Visually, the MCRF model is a probabilistic directed acyclic graph, and we later found that the MCRF model essentially conforms to a
special Bayesian network built over the neighborhood of spatial data. But this does not mean that Bayesian networks are a geostatistical
model or automatically become geostatistics. At most, we may say that the Bayesian network idea might be used to construct or interpret
a geostatistical model like the MCRF model, or that the MCRF model does not violate the rules of Bayesian networks. Because the MCRF model was
derived using Bayes' theorem, it is natural that, as a Bayesian model, it conforms to Bayesian networks. However, because spatial locations
(data points and unobserved locations) do not have causality, it is difficult, if not impossible, to construct a Bayesian network on spatial
data points and unobserved locations through reasoning. In addition, the MCRF model does not need to use iterative algorithms to perform
simulations (but if we shrink its neighborhood to immediately adjacent grid cells on a lattice, it may be run with iterative algorithms).
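The derivation described in this paragraph can be written out as follows. This is a sketch in notation chosen here, not a quotation of Li (2007a): i_0 denotes the class at the unobserved location, i_1, ..., i_m the classes at its m nearest data locations in different directions, f_0 a dummy class index, and p_{ij}(h) the transition probability from class i to class j over the lag h (a transiogram value). The first line is the factorization of the joint probability into likelihoods and a prior, with Pr(i_1) cancelling between numerator and denominator; the second line applies the spatial conditional independence assumption of nearest data, Pr(i_g | i_0, i_1, ..., i_{g-1}) = Pr(i_g | i_0); the third line notes that with a single nearest datum the denominator sums to one, so that the model reduces to an ordinary 1-D Markov chain transition.

```latex
% Full solution: sequential Bayesian updating over the nearest data
\Pr(i_0 \mid i_1,\dots,i_m)
  = \frac{\Pr(i_m \mid i_0, i_1,\dots,i_{m-1}) \cdots \Pr(i_2 \mid i_0, i_1)\,\Pr(i_0 \mid i_1)}
         {\sum_{f_0} \Pr(i_m \mid f_0, i_1,\dots,i_{m-1}) \cdots \Pr(i_2 \mid f_0, i_1)\,\Pr(f_0 \mid i_1)}

% Simplified form under conditional independence of nearest data given i_0
\Pr(i_0 \mid i_1,\dots,i_m)
  = \frac{p_{i_1 i_0}(h_{10}) \prod_{g=2}^{m} p_{i_0 i_g}(h_{0g})}
         {\sum_{f_0} p_{i_1 f_0}(h_{10}) \prod_{g=2}^{m} p_{f_0 i_g}(h_{0g})}

% Special case m = 1: a conventional 1-D Markov chain transition
\Pr(i_0 \mid i_1) = p_{i_1 i_0}(h_{10}),
  \qquad \text{since } \sum_{f_0} p_{i_1 f_0}(h_{10}) = 1
```

Here h_{10} is the lag from the first nearest datum to the unobserved location and h_{0g} the lag from the unobserved location to the g-th nearest datum, which is where the head-tail direction of each transition term is fixed.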
Therefore, the MCRF model and the MRF model are theoretically different with different
formulas. However, this does not mean that the MCRF model has no relation with the MRF model. The Pickard random field model (which
was often regarded as a special kind of MRFs) is important for the development of the MCRF model. Considering the Markovian nature of
the neighborhood system of the MCRF model and the facts that it was developed based on the single-Markov-chain random-field idea, it
obeys the transition probability rules, and it was simplified using the extension of the cardinal-neighbor conditional independence property
of Pickard random fields, I even regarded the MCRF model as a "special MRF" or an extension of the MRF model toward the sparse data space
at the beginning (see Li 2007a). We are not good at manipulating mathematical symbols and equations. My further study of the MCRF
model found that it is consistent with Bayesian networks and may be regarded as a special Bayesian network on spatial data, and I also
found that such a special spatial Bayesian network could not be constructed by reasoning based on causality. Since the proposal of the
MCRF model more than ten years ago, it seemed that many people wanted to test or challenge it. We definitely welcome those sincere tests or
challenges with real data. But intentional troublemakings that aimed to create chaos and trouble us by playing tricks, such as intentionally
messing irrelevant models/methods together with wrong interpretations, providing false case study data generated by other models to stain
the MCRF model, always using extreme cases or improper parameters to demonstrate the CMC model works perfectly, and irrationally claiming
new spatial models on the MCRF model through irrelevant models or even nonspatial models, are improper. Even if we do not write comment
letters to point out the problems of some troublemaking articles, that does not mean they are doing the right things and may escape from
the eyes of readers forever. Although our knowledge in mathematics and statistics was limited, we did our research honestly and carefully.
We believe that every geo/spatial-statistical model should first have its rationality and physical meaning. As to whether the MCRF model
can be regarded as a special kind of MRF model or not, let the real experts in statistics or mathematics judge. This problem actually has
never been our concern. It is undeniable that the MCRF model/theory, including the spatial conditional independence assumption of nearest
data in a neighborhood, has had a large effect on the field of geostatistics during the last decade, no matter whether the contribution
of the MCRF model was properly recognized in some publications.
In general, the MCRF approach represents a new geostatistical approach: (1) When the MCRF idea was initially proposed in 2004 and then formally published in 2007, it was different from any existing spatial/geo-statistical model. (2) Its initial purpose was to solve the small class underestimation problem of the CMC model. It reached this goal by using the single-chain idea to avoid the conflicting transitions that occur in the CMC model. (3) Its full solution was rationally derived using Bayes' theorem and sequential Bayesian updating on nearest data within a neighborhood, and its simplified formula was obtained by further applying the spatial conditional independence assumption of nearest data within a neighborhood. (4) Its simplified form was well tested by us and shown to be workable. (5) It was scientific progress made by us after a long-term exploration of Markov chain spatial modeling. Without such a long-term exploration and extensive computer programming for testing various ideas, there would have been no way to make the progress, no matter how simple it looks.