
Explanations of some possible questions


Below we provide explanations for some possible questions about Markov chain geostatistics. The purpose is to avoid further misunderstandings.

1. What are the supporting theories of Markov chain geostatistics?
Bayes' theorem (particularly Bayesian updating) and Markov chain theory. Others, such as the spatial conditional independence assumption of nearest data within a neighborhood (extended from a cardinal-neighbor property of Pickard random fields to the sparse sample data situation; see Li 2007a, appendix) and the spatial stationarity assumption, are also needed for simplifying the MCRF full solution and implementing the approach. As a stationary Markov field in which any data sequence along a monotone path is a Markov chain, the Pickard random field is important for deriving the single-chain-based multi-D Markov chain (i.e., MCRF) model.

2. Is Markov chain geostatistics a Bayesian spatial statistics?
There is no doubt on this point. The MCRF model (including all MCRF models, such as the general full model, the general simplified model, and specific models for specific purposes) was derived using Bayes' theorem (through spatial sequential Bayesian updating). It represents a new way to deal with nearest data within a neighborhood and thus should be a fundamental Bayesian spatial model at the neighborhood nearest data level. Although it looks simple, developing such a fundamental geostatistical model for spatial categorical sample data was never a plain, easy task. It might be unimaginable in some people's eyes. We did not have the special capability to get there in one step, given our previously limited knowledge of statistics. But we did it the long, hard way, because our research and personal experience brought us to this road. After the MCRF approach was proposed, published and developed into useful techniques, some people suddenly found that the MCRF model is very simple and wanted to repropose it repeatedly in different ways under different names. That was really a historical joke and a shame created by some people in geo/spatial-statistics. The webpages (From the CMC model to the MCRF model, The single-Markov-chain random-field idea, Markov chain random field theory) explain the motivation and the derivation process of the MCRF model and also illustrate its relation to and difference from the coupled Markov chain (CMC) model. Although the MCRF model was derived using Bayes' theorem and the factorization based on the relationship between joint probability and conditional probability, the MCRF approach is different from the usual Bayesian inference methods in statistics, because the latter focus on using Bayes' theorem to infer parameters. In the MCRF model, the parameters are estimated from sample data through transiograms rather than inferred through Bayesian analysis.
We later found that the MCRF model, which can be visualized as a probabilistic directed acyclic graph, conforms to Bayesian networks. However, constructing such a neighborhood-based Bayesian network on spatial data and unobserved locations through reasoning is difficult, if not impossible, due to the lack of causality in spatial data. Fortunately, the single-Markov-chain random-field idea and the spatial sequential Bayesian updating idea helped overcome the challenge of causality.
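The statement above that MCRF parameters are estimated from sample data through transiograms, rather than inferred by Bayesian analysis, can be illustrated with a minimal sketch. It assumes a regularly spaced 1-D sequence of class labels and estimates p_ij(h) as the fraction of pairs with class i at the tail location and class j at the head location h steps away. The function name `experimental_transiogram` and its arguments are hypothetical, chosen for illustration only; this is not the authors' software, and real applications work with scattered data and fitted transiogram models.

```python
import numpy as np


def experimental_transiogram(labels, n_classes, max_lag):
    """Estimate transition probabilities p_ij(h) for h = 0..max_lag.

    labels    : 1-D sequence of integer class codes on a regular transect
                (an assumption of this sketch)
    n_classes : number of classes, coded 0..n_classes-1
    max_lag   : largest lag, in numbers of sampling steps
    Returns an array of shape (max_lag + 1, n_classes, n_classes).
    """
    labels = np.asarray(labels, dtype=int)
    n = len(labels)
    out = np.zeros((max_lag + 1, n_classes, n_classes))
    for h in range(max_lag + 1):
        tails = labels[: n - h]   # class at location x
        heads = labels[h:]        # class at location x + h
        counts = np.zeros((n_classes, n_classes))
        np.add.at(counts, (tails, heads), 1.0)  # count (tail, head) pairs
        row_totals = counts.sum(axis=1, keepdims=True)
        # Divide each row by the number of pairs whose tail is class i;
        # rows with no pairs are left as zeros.
        np.divide(counts, row_totals, out=out[h], where=row_totals > 0)
    return out


if __name__ == "__main__":
    transect = [0, 0, 1, 1, 0, 0, 1, 1]
    t = experimental_transiogram(transect, n_classes=2, max_lag=2)
    print(t[1])  # estimated one-step transition probabilities
```

Each row of each lag matrix sums to one whenever the tail class occurs at that lag, which is the basic property a transition probability estimate should satisfy.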

3. Is a Markov chain random field a Markov chain?
Yes. We think so. At its simplest, an MCRF is a conventional 1-D (continuous or jumping) Markov chain. But in other situations, an MCRF is not a conventional 1-D Markov chain. It is a LOCALLY-CONDITIONED Markov chain in a space, or we may say it is a Markov chain locally conditioned by data at the nearest locations in different directions within a neighborhood. Although the MCRF model is derived using Bayes' theorem and thus should be a Bayesian spatial model, this does not affect its Markov chain nature. Indeed, Markov networks (including Markov random fields (MRFs)) and Bayesian networks are different. A conventional Markov chain is a simple Bayesian network, and an MCRF, as a locally-conditioned Markov chain, can be described as a complex neighborhood-based spatial Bayesian network. Apparently, both a locally-conditioned Markov chain and a neighborhood-based Bayesian network on spatial data points are novel in spatial statistics.
       The MCRF models were initially called "spatial Markov chain" models when related manuscripts were submitted to journals in 2004 (the MCRF term was only mentioned in the contents of some manuscripts but was not used in manuscript titles or for referring to the whole approach). Consequently the idea was also called "spatial Markov chain" theory. The idea covers a series of specific single-chain-based spatial Markov chain models at different dimensions or in different data situations. Later, I felt that the term "spatial Markov chain" might be too general to clearly express the unique nature of the new model, so I decided to use the more distinctive term "Markov chain random field" (MCRF) in the revised version of the manuscript of Li (2007a) to represent the general solution and all of the single-chain-based Markov chain spatial models. The term "spatial Markov chain" was still kept in Li (2007a) to refer to specific spatial Markov chains in one to multiple dimensions within the framework of the MCRF theory. It is unfortunate that over more than the last ten years, the irrational use of the term "spatial Markov chain" by some people for incorrect or irrelevant models has messed it up.

4. Are there any conflicts between the CMC model and the MCRF model?
We don't think there are any conflicts between the two models. We had no arguments with Elfeki and Dekking, openly or privately, on any scientific issues related to the CMC and MCRF models. In our paper (Li et al. 2004), where we modified and extended the CMC model and tried to make it more practical for predictive soil mapping, we pointed out that the model has some deficiencies in simulating categorical soil variables (mainly the small class underestimation problem and the layer/patch inclination problem), based on our extensive tests and research records on soil data. The small class underestimation problem was actually demonstrated clearly in two simulation cases by Elfeki and Dekking (2001). Their paper did not explicitly discuss the small class underestimation problem, probably because their understanding of the problem was not the same as ours at that time. As for the layer inclination problem, Elfeki and Dekking (2001, 2005, 2007) also showed it more or less in their simulation cases, although not as clearly as in our simulation cases. There are differences between soil type/layer patterns and geological lithofacies layer patterns (e.g., the lithofacies layers they simulated are extremely straight and long), which might cause some differences in simulated results. Changes in grid cell size (length and width) and the use of subjective parameters (i.e., improperly set horizontal transition probability values) in some simulation cases, which may implicitly alter class proportions and correlation ranges, may be other reasons. We neither wrote wrong equations nor provided wrong case studies with improper interpretations to mess up their model. Our case studies used widely existing soil patterns and objectively estimated parameters (i.e., transition probabilities estimated from real data). In fact, I suggested forming an international collaboration group with Elfeki (he further recommended N. Park) in fall 2005, when he contacted me to express his intention to collaborate. However, things did not develop as expected later (probably due to interference and misunderstandings of others).
      Our study didn't aim to expose the mistakes of others. The CMC model was initially proposed in Elfeki's PhD dissertation (Chapter 3, p. 64-98) published in 1996, but it was not well tested (Elfeki only simulated a hypothetical formation of inclined layers without data analysis, and focused his research on solute transport). My postdoc report was done and written in 1999. The purpose of the postdoc research was to apply the model to a field water study rather than to detect its defects. Using the CMC model was suggested by one of my collaborators in Belgium. I developed a Fortran computer program for the model based on the model description in Elfeki's PhD dissertation. Although our simulated results were not good (contrary to our expectations) and the postdoc research did not reach its goal, the report did not say anything bad when presenting the simulated results. We didn't know Elfeki was still in the Netherlands at that time; otherwise, we would have contacted him after we met problems with the model. The report was neither published nor put online before 2015 (we had to put it online after 2015 to clarify the confusion caused by some persistent misunderstandings). We never submitted a manuscript to any journal on the topic until spring 2003, when we believed we had basically understood the defects of the CMC model and made our own progress (contribution). We never reviewed a manuscript for any journal before summer 2005. They had sufficient time to publish any progress they had made on the model without our impact, if there was any. So it is difficult to say that our research impacted their research on the CMC model. Even after our papers that pointed out the defects of the CMC model were published and we had demonstrated, proved, and solved those defects in publications, the CMC model continued to be used by some researchers in journal publications.
In fact, the CMC model has even been used and developed in a series of publications in engineering geology since 2016, by some Chinese researchers in China and in the Netherlands. We have not said anything so far (we think we have proved the defects of the CMC model sufficiently and also made sufficient clarification), although we could see (others and the authors themselves could not see?) the unreasonable artificial parameters they used in order to obtain better results. They either modified the values of the horizontal transition probability matrices used as input parameters or set the values improperly, causing them to conflict with the real data; that is, the class proportions and correlation ranges implied by the input transition probabilities are thoroughly different from those in the real data. For example, a minor class with a short autocorrelation range may become a major class with a long autocorrelation range if some related transition probability values are changed in the input transition probability matrix. If some people are determined to mess up the scientific field by tricks, nobody can persuade them. We made large efforts to solve the major defects of the CMC model while adhering to its basic ideas (i.e., use two or more 1-D Markov chains and assume they are independent of each other), but could not solve the small class underestimation problem. Finally, it was the single-Markov-chain random-field idea together with the spatial sequential Bayesian updating idea that solved the major problem. Anyway, the MCRF model is related to the CMC model, and we recognized the contribution of the CMC model to the MCRF model. But theoretically the two models are very different. They use different statistical assumptions and principles to construct different models.
Although the CMC model and the simplified MCRF model look similar in model formula, they are different in the head-tail directions of some transition probability terms, and it is exactly the difference that makes them generate different results.
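The point about head-tail directions can be illustrated with a toy calculation. The sketch below is only an illustration under stated assumptions, not the published implementation of either model: `transition_matrix` is a hypothetical, deliberately simple transiogram model (the same in all directions), neighbors are given as (class, distance) pairs, `cmc_style` multiplies terms that all point from a neighbor to the unknown location, and `mcrf_simplified` points from the first neighbor to the unknown location and from the unknown location to every other neighbor.

```python
import numpy as np


def transition_matrix(proportions, lag, corr_range):
    """Hypothetical transiogram model, assumed for illustration only:
    P(h) = exp(-3h/a) * I + (1 - exp(-3h/a)) * 1 p^T, so rows sum to 1,
    P(0) is the identity and P(h) tends to the class proportions."""
    p = np.asarray(proportions, dtype=float)
    decay = np.exp(-3.0 * lag / corr_range)
    return decay * np.eye(len(p)) + (1.0 - decay) * np.tile(p, (len(p), 1))


def mcrf_simplified(proportions, neighbors, corr_range):
    """Single-chain form: the first neighbor is the tail of a transition
    into the unknown cell; for all other neighbors the unknown cell is
    the tail and the neighbor is the head."""
    (c1, h1), others = neighbors[0], neighbors[1:]
    w = transition_matrix(proportions, h1, corr_range)[c1, :].copy()
    for c, h in others:
        w *= transition_matrix(proportions, h, corr_range)[:, c]
    return w / w.sum()


def cmc_style(proportions, neighbors, corr_range):
    """Coupled-chain style product: every neighbor is the tail and the
    unknown cell is the head of its transition term."""
    w = np.ones(len(proportions))
    for c, h in neighbors:
        w *= transition_matrix(proportions, h, corr_range)[c, :]
    return w / w.sum()


if __name__ == "__main__":
    proportions = [0.7, 0.2, 0.1]               # class 2 is the small class
    far_neighbors = [(0, 1000.0), (1, 1000.0)]  # lags far beyond the range
    print(mcrf_simplified(proportions, far_neighbors, corr_range=10.0))
    print(cmc_style(proportions, far_neighbors, corr_range=10.0))
```

With neighbors so far away that they carry almost no information, the single-chain form returns approximately the class proportions, while the product of terms that all point to the unknown location returns approximately the normalized squared proportions, which shrinks the small classes. This toy setting only shows how the direction of the terms can matter; it does not reproduce the published comparisons.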
      All of the defects of the CMC model that we mentioned and solved (e.g., layer inclination tendency, small class underestimation) were displayed by Elfeki and Dekking themselves in their published papers (see Elfeki and Dekking 2001, 2005, and 2007). All of these defects have their scientific reasons (explained and shown more clearly on this website) and also occurred in some other models, for example, in a Markov random field model (see Norberg et al. 2002 for small class underestimation) and in some Markov mesh and autoregressive models (see Gray et al. 1994 and Sharp and Aroian 1985 for pattern inclination). However, after our major papers about the MCRF approach were accepted for publication in 2006, some open misunderstandings soon arose: the CMC model suddenly became a perfect and solely correct Markov chain spatial model in some people's eyes or hands. The MCRF model was then misdescribed as the CMC model and as a simplification of other different spatial models, and taken as a new model by some people in open publications. Unimaginably, after we clarified the misunderstandings, somebody without experience in geostatistical research even described the MCRF simulation algorithms as path-based methods without models. And a research group (Mr. Z.Z. Wang and his graduate student X. Huang at Central South University) in China without experience and knowledge in geostatistics even redescribed the published coMCRF model as a spatial hidden Markov chain model with a case study using the CMC model and claimed it as their original new model, while publishing a series of misleading cheating articles.
It seems that wherever this group saw an existing nonspatial statistical model with conditional probabilities [e.g., the linear Bayesian updating model for integrating expert opinions (e.g., for diagnosis of a disease), and feature variable-based (e.g., spectral value) (Naive) Bayes/Bayesian classifiers], they wanted to claim a transition probability-based spatial statistical model on it, even though the conditional probabilities have nothing to do with spatial transition probabilities. Probably in their eyes, conditional probabilities are equal to Markov transition probabilities and further equal to spatial Markov transition probabilities. At the same time, some researchers (mainly Chinese researchers) were crazily using and developing the CMC model to produce articles with various tricks for concealing its defects, as if the MCRF model had been built on lies. Because these things misled and scared many people, they harmed us very much while serving the fights in geostatistics (as well as in pedometrics, environmental informatics, and GIS) and the fights in China. Whatever unknown reasons they had, using such means to mislead people and trouble others was improper. Where were their conscience and integrity!? Did they know how, when, and under what circumstances the MCRF model was developed, proposed and published? Did they know how long it takes to develop a new geostatistical approach with practical methods from the initial exploration of encountered scientific issues? ......
      We have no problems with any studies that attempt to explore the possible relationships between our model/method and other spatial/nonspatial models/methods, or even explain our model/method from different perspectives, as long as the studies are honest, objective and respectful. However, those publications that provided false data and wrong interpretations or irrationally claimed a new model on our model by tricks were improper, because they were cheating, misleading and consequently troublemaking. In addition, we don't think that purely playing mathematical symbol and equation games without physical meanings can be claimed to be geostatistics or spatial statistics. Even a pure statistical theorem/principle has its physical meaning; otherwise, Bayes' theorem and the detailed balance principle would be no different from the conditional probability formula. The MCRF model as a nonlinear geostatistical model has its rationality and unique physical meanings for spatial data.

5. Why was Markov chain geostatistics (mainly MCRF model) misunderstood repeatedly?
We don't know much. There seemed to be no convincing reasons for those misunderstandings. If they had read our articles carefully, they should have been able to understand the important scientific progress we made in solving existing specific scientific issues. Maybe they heard some rumors from others or were encouraged by others; or maybe they were really confused about the progress we made. After all, most people are not experts in geo/spatial-statistics. Some researchers might not have expertise in both geostatistics and Markov chains. But the spatial sequential Bayesian updating over nearest data and the spatial conditional independence assumption of nearest data within a neighborhood are not difficult to understand. It is also not difficult to run a test to see the difference in results between the simplified MCRF model and the CMC model using the same neighborhood (see Zhang and Li 2008). Our papers on the MCRF approach were openly published. Whoever has problems with them or evidence of scientific misconduct can write comment letters; however, so far nobody has ever written one. Under such a situation, any attempt to trouble the MCRF approach in their publications is irrational. As for some people committing scientific fraud in articles (e.g., intentional cheating in data), that no longer has anything to do with misunderstandings. No misunderstanding of others can justify one's own fraud. In fact, some people immediately found that the CMC model underestimates small classes in their testing studies published in 2008 and in 2014, respectively. But they did not clearly tell the truth; on the contrary, they mixed the MCRF model and the CMC model together. It seemed that other things played a role in the "misunderstandings", and it was obvious that the thirst for publishing articles in some good journals was also one of the major reasons.
      There seemed to be some irrational accusations since spring 2004 (or as early as winter 2003), such as "Elfeki and Dekking did not say their model had deficiencies, so why did you say so?", "Used others' technology but did not provide a good evaluation", and "Impacted others' research". These were typical misunderstandings. If they had read the article of Elfeki and Dekking (2001) carefully, they should have been able to see that our research on this topic including the MCRF model, published since 2004, had no real conflicts with the publications of Elfeki and Dekking on their CMC model, published since 2001, except for some different emphases. As for the MCRF model, it is completely different from the CMC model both in equations and in theories, in addition to the fact that the MCRF model solved the major deficiency of the CMC model: small class underestimation. All the deficiencies of the CMC model we mentioned and solved were displayed by Elfeki and Dekking themselves in their published articles (see small class underestimation and layer/parcel inclination; in addition, transiogram joint modeling methods solved the problem of estimating transition probabilities from sample data for implementing multidimensional Markov chain models).
      Elfeki and Dekking never expressed any disagreement with our research to us. The CMC model should be able to generate valuable results in some cases in geology, but with obvious deficiencies in most situations, as shown by themselves. But for soil or land cover patterns, it does not work well in most situations due to the complex and heterogeneous landscape patterns (there are always small classes) as well as the wide use of scattered point sample data; that was why we made efforts to solve the existing problems shown in our studies. Some things that occurred around the MCRF approach over more than the last decade were unreasonable and have gone beyond science and rationality. They not only ruined the chances of collaborative research and blocked the normal advancement of geostatistics, but also created a series of chaos, dilemmas and reputation loss. There are always some opportunists who want to create chaos.

6. Why did you write comment letters? Why not just ignore them and do your research?
We don't like to write comment letters. Before 2011, we never wrote a comment letter on others' published articles. But later we had to do so. Our comment letters aimed to clarify misunderstandings, rather than to attack or stain others. So far we have never written a comment letter on any article that did not seriously misunderstand our research in Markov chain spatial simulation and cause serious problems for our reputation and career. If we did not clarify some misunderstandings, who knows what would happen to us next. If one has problems with our research and publications in Markov chain spatial simulation, he/she should deal with them in a normal/formal way, for example, by communicating with us face to face or through emails, or by writing a comment letter on our published papers and letting us respond, which would be welcome and would solve problems. It is not appropriate to harass others through improper means, purely based on one's imaginations or prejudices. In addition, we could be polite about the first and the second misunderstanding in open publications, but we might not be so polite about further "misunderstandings" that were essentially frauds and/or aimed to harm us. However, after repeated clarifications and the exposure of a couple of scientific frauds, there is really no need to argue further on scientific issues with those intentional troublemakers. If some people have such a strong desire to show off their tricks in troublemaking or plagiarising, and some journals want to publish their rubbish, let them publish. Sooner or later, people will see the truth.

7. What is the difference between the MCRF model and the MRF model?
The MCRF model and the conventional Markov random field (MRF) model are different in both mathematical formulas and implementation methods, although both are based on the Markov property. The MRF model is equivalent to the Gibbs distribution. To our knowledge at the time we were developing the MCRF model more than ten years ago (i.e., 2004), the MRF model was mainly used as a typical lattice model (it does not matter whether the lattice is regular or irregular) (see Cressie 1993), usually using a 3x3 grid cell square neighborhood on image data, and it had existed since 1974 as a well-established spatial statistical theory (in fact, studies on MRFs began in the 1950s). It was often used for image processing without conditioning on sample data. The MRF model was normally implemented using iterative algorithms, such as the Gibbs sampler or the Metropolis-Hastings algorithm. When it was used with sample data, an initial image was still needed as the starting point of the iterations. For some reasons (see Tjelmeland and Besag 1998, Norberg et al. 2002), it was previously not regarded as a good method for conditional simulations on sparse sample data in earth sciences. That might be one reason why people in earth sciences continuously made efforts to develop multi-D Markov chain models after the 1970s. MRFs were expanded to Markov networks, which might deal with nonspatial data. But it seems difficult to say that Markov networks are a geostatistical model for local probability distribution estimation based on sparse spatial data.
This does not mean that the conventional MRF models or the Gibbs distribution could not be extended or simplified based on some ideas from other methods or under the inspiration of other progress, including the ideas of the simplified MCRF models [Looking back at the things that occurred during the last 14 years (since winter 2003 or spring 2004), it is not difficult to guess that some misled influential people probably had repeatedly called for international actions to check the research in multi-D Markov chain modeling. However, it is obvious that some checkers were just foolers and/or takers, rather than truth seekers].
      However, the MCRF model (Li 2007a) is a spatial Bayesian updating model at the neighborhood nearest data level. It was initially proposed to solve the small class underestimation problem of the coupled Markov chain (CMC) model (see From the CMC model to the MCRF model) and then generalized as a geostatistical model for simulating categorical spatial variables. The MCRF model was initially based on the single-Markov-chain random-field idea (in order to avoid the conflict transitions that occurred but were excluded in constructing the CMC model). The mathematical derivation procedure of the MCRF model is simple: it assumes a locally-conditioned Markov chain in which the local conditioning on nearest data in different directions within a neighborhood follows a sequential Bayesian updating process (this was done by decomposing the joint probability distribution of the nearest data and the central unobserved random variable into a series of likelihoods and a prior probability, using the relationship between joint probability and conditional probability). To simplify the full solution with multi-point likelihoods, I further applied the spatial conditional independence assumption of nearest data (extended from a property of Pickard random fields) to the MCRF model. This derivation process has no relation to maximum entropy or MRFs (i.e., the Gibbs distribution), and the model formula of MCRFs is also unique. There was no empirical addition or elimination of any components. Although the entire derivation process looks simple, it was based on spatial thinking about neighborhoods and spatial auto/cross-correlations, which makes it thoroughly different from nonspatial statistical models. Therefore, the MCRF model is a typical fundamental Bayesian spatial statistical model at the neighborhood nearest data level. These points were clearly presented in Li (2007a), Li and Zhang (2008), Li et al. (2013) and Li et al. (2015).
Visually, the MCRF model is a probabilistic directed acyclic graph, and we later found that the MCRF model essentially conforms to a special Bayesian network built over the neighborhood of spatial data. But this does not mean that Bayesian networks are a geostatistical model or automatically become geostatistics. At most, we may say that the Bayesian network idea might be used to construct or interpret a geostatistical model like the MCRF model, or that the MCRF model does not violate the rules of Bayesian networks. Because the MCRF model was derived using Bayes' theorem, as a Bayesian model it is normal that it can conform to Bayesian networks. However, because spatial locations (data points and unobserved locations) do not have causality, it is difficult, if not impossible, to construct a Bayesian network on spatial data points and unobserved locations through reasoning. In addition, the MCRF model does not need iterative algorithms to perform simulations (but if we shrink its neighborhood to immediately adjacent grid cells on a lattice, it may be run with iterative algorithms).
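The derivation steps described above (decomposition of the joint probability into a prior and a series of likelihoods, followed by the conditional independence assumption) can be sketched in symbols as follows. The notation is an assumption of this sketch rather than a quotation of the published equations: i_0 is the class at the unobserved location, i_1, ..., i_m are the classes of the nearest data in the neighborhood, h_{10} and h_{0g} are the corresponding separation vectors, and p_{ij}(h) is a transition probability over lag h.

```latex
% Step 1: Bayes' theorem with the chain rule (sequential updating).
\begin{align}
p(i_0 \mid i_1,\dots,i_m)
  &= \frac{p(i_m \mid i_0, i_1,\dots,i_{m-1}) \cdots p(i_2 \mid i_0, i_1)\,
           p(i_0 \mid i_1)\, p(i_1)}{p(i_1,\dots,i_m)} \\
  &\propto p(i_0 \mid i_1) \prod_{g=2}^{m} p(i_g \mid i_0, i_1,\dots,i_{g-1})
\end{align}

% Step 2: conditional independence of the nearest data given i_0,
%         p(i_g | i_0, i_1, ..., i_{g-1}) = p(i_g | i_0).
\begin{equation}
p(i_0 \mid i_1,\dots,i_m)
  = \frac{p_{i_1 i_0}(h_{10}) \prod_{g=2}^{m} p_{i_0 i_g}(h_{0g})}
         {\sum_{f_0} \Bigl[\, p_{i_1 f_0}(h_{10})
            \prod_{g=2}^{m} p_{f_0 i_g}(h_{0g}) \Bigr]}
\end{equation}
```

In this sketch the first factor plays the role of the prior (a transition from one nearest datum into the unobserved location) and the remaining factors play the role of likelihoods (transitions from the unobserved location to the other nearest data); the sum over f_0 in the denominator runs over all candidate classes and normalizes the result.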
      Therefore, the MCRF model and the MRF model are theoretically different and have different formulas. However, this does not mean that the MCRF model has no relation to the MRF model. The Pickard random field model (which was often regarded as a special kind of MRF) is important for the development of the MCRF model. Considering the Markovian nature of the neighborhood system of the MCRF model and the facts that it was developed based on the single-Markov-chain random-field idea, that it obeys the transition probability rules, and that it was simplified using the extension of the cardinal-neighbor conditional independence property of Pickard random fields, I even regarded the MCRF model as a "special MRF" or an extension of the MRF model toward the sparse data space at the beginning (see Li 2007a). We are not good at manipulating mathematical symbols and equations. My further study of the MCRF model found that it is consistent with Bayesian networks and may be regarded as a special Bayesian network on spatial data, and I also found that such a special spatial Bayesian network could not be constructed by reasoning based on causality. Since the MCRF model was proposed more than ten years ago, it has seemed that many people wanted to test or challenge it. We definitely welcome sincere tests or challenges with real data. But intentional troublemaking that aims to create chaos and trouble us by playing tricks, such as intentionally mixing irrelevant models/methods together with wrong interpretations, providing false case study data generated by other models to stain the MCRF model, always using extreme cases or improper parameters to demonstrate that the CMC model works perfectly, and irrationally claiming new spatial models on the MCRF model through irrelevant models or even nonspatial models, is improper.
Even if we do not write comment letters to point out the problems of some troublemaking articles, that does not mean their authors are doing the right things and may escape the eyes of readers forever. Although our knowledge of mathematics and statistics was limited, we did our research honestly and carefully. We believe that every geo/spatial-statistical model should first have its rationality and physical meaning. As to whether the MCRF model can be regarded as a special kind of MRF model or not, let the real experts in statistics or mathematics judge. This question has actually never been our concern. It is inevitable that the MCRF model/theory, including the spatial conditional independence assumption of nearest data in a neighborhood, has had a large effect on the field of geostatistics during the last decade, no matter whether the contribution of the MCRF model was properly recognized in some publications.


In general, the MCRF approach represents a new geostatistical approach: (1) When the MCRF idea was initially proposed in 2004 and then formally published in 2007, it was different from any existing spatial/geo-statistical model. (2) Its initial purpose was to solve the small class underestimation problem of the CMC model. It reached this goal by using the single-chain idea to avoid the conflict transitions occurring in the CMC model. (3) It was rationally derived using Bayes' theorem and sequential Bayesian updating on nearest data within a neighborhood for its full solution, and further simplified using the spatial conditional independence assumption of nearest data within a neighborhood for its simplified formula. (4) Its simplified form was well tested by us and shown to be workable. (5) It was scientific progress made by us after a long-term exploration of Markov chain spatial modeling. Without such a long-term exploration and extensive computer programming for testing various ideas, there would have been no way to make this progress, no matter how simple it looks.


