
A note about the MCRF approach

The MCRF approach, or Markov chain geostatistics, is a geospatial statistical approach based on the Markov chain random field (MCRF) model. The central idea of the MCRF model is a single-Markov-chain random field or, more precisely, a locally conditioned Markov chain: a single Markov chain moves or jumps in a space (of one or more dimensions), and its local conditional probability distribution at any unobserved location is conditional on the nearest data in different directions within a neighborhood. The local conditioning of the Markov chain is realized through a local spatial sequential Bayesian updating process on the nearest data within a neighborhood. A spatial conditional independence assumption for the nearest data within a neighborhood, extended from the cardinal-neighbor conditional independence property of Pickard random fields (i.e., given the central cell, its adjacent cells in the cardinal directions are conditionally independent), is used to simplify the MCRF multi-point solution into a spatial model that involves only transition probabilities. The transiogram is an indispensable part of the MCRF approach because it provides the parameters needed to implement simplified MCRF models.
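
The sequential Bayesian updating described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the cited papers: the exponential transiogram model, the class proportions, the correlation range, and the function names `transiogram` and `mcrf_local_probs` are all assumptions introduced here. The prior is taken from the transition probability from the first nearest datum to each candidate class; each further nearest datum then updates it through the transition probability from the candidate class to that datum, under the conditional independence assumption.

```python
import math


def transiogram(i, j, h, proportions, corr_range):
    """Hypothetical exponential transiogram model p_ij(h).

    Assumed form: p_ij(h) decays from the identity matrix at h = 0
    toward the class proportions at large lags. Each row sums to 1.
    """
    decay = math.exp(-3.0 * h / corr_range)
    if i == j:
        return proportions[j] + (1.0 - proportions[j]) * decay
    return proportions[j] * (1.0 - decay)


def mcrf_local_probs(neighbors, proportions, corr_range):
    """Local conditional distribution at an unobserved location.

    neighbors: list of (class_index, distance) pairs, one per nearest
    datum in a different direction; the first pair is treated as the
    datum the chain moves from.
    """
    n = len(proportions)
    first_cls, first_h = neighbors[0]
    # Prior: transition from the first nearest datum to each candidate class k.
    probs = [transiogram(first_cls, k, first_h, proportions, corr_range)
             for k in range(n)]
    # Update on each remaining datum: transition from candidate k to that datum.
    for cls, h in neighbors[1:]:
        probs = [probs[k] * transiogram(k, cls, h, proportions, corr_range)
                 for k in range(n)]
        total = sum(probs)
        probs = [p / total for p in probs]
    return probs


if __name__ == "__main__":
    # Three classes with assumed proportions; three nearest data at lags 1, 2, 5.
    example = mcrf_local_probs([(0, 1.0), (0, 2.0), (1, 5.0)],
                               [0.5, 0.3, 0.2], corr_range=10.0)
    print(example)
```

In this sketch only transition probabilities read from the (assumed) transiogram model enter the calculation, which is the point of the simplification; a real application would fit transiogram models to sample data rather than assume them.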

The MCRF model uses a single Markov chain to handle a whole random field with sample data by conditioning the chain on the nearest data in each local neighborhood. Because the MCRF model considers the Markov property in its neighborhood design, is based on the single-Markov-chain random-field idea, and obeys the transition probability rules, Li (2007a) suggested that it might be regarded as an extension of conventional Markov random fields (MRFs) toward the space of sparse data points, where the distances or distant interactions among data locations within a flexible neighborhood system need to be considered (this does not mean that it was derived from the conventional MRF model or the Gibbs distribution). This generalized single-chain idea (Li 2007a) was extended from the specific single-chain idea used in the single-chain-based multidimensional Markov chain model for subsurface characterization (see Li and Zhang 2008; that paper was written earlier but published later than Li 2007a), a model that corrected the defects of the coupled Markov chain (CMC) model of Elfeki and Dekking (2001) for subsurface characterization. The MCRF model is essentially a Bayesian spatial model: its local conditional probability function is factorized by repeatedly applying the relationship between conditional probability and joint probability, and the resulting factors can be interpreted intuitively as a local sequential Bayesian updating of the prior probability on the nearest data around the unobserved location being estimated within a neighborhood. The general solution and some 2-D to 3-D simplified MCRF models with nearest data only in cardinal directions were provided in Li (2007a), and cosimulation models emphasizing the general full solution and the sequential Bayesian updating view were presented in Li et al. (2013, 2015).
The MCRF idea has no conflicts with existing statistical and spatial statistical theories [e.g., Markov chain theory, Bayesian theory, Markov random field theory, and conventional geostatistics (kriging), among others], and case studies have shown it to be workable, with advantages over some popular existing methods. A typical characteristic of the MCRF model is that it can generate quite aggregative patterns, which are preferred for categorical spatial simulation. As spatial statistical methods, the MCRF model and the MCRF approach did not exist before they were proposed. Therefore, the MCRF approach represents a new approach in geo/spatial statistics.

The MCRF approach is the result of a decades-long exploration of Markov chain spatial modeling that began in the 1990s. I began to study and use Markov chains and conventional geostatistics in the early 1990s. Our 2-D Markov chain modeling study started with my postdoctoral research (with Professor Jan Feyen in soil science) at Leuven University (KU Leuven) in 1999, which did not generate publishable results because of the defects of the 2-D CMC model used. [The unconditional 2-D CMC model suggested by Elfeki (in his 1996 dissertation) generated inclined soil class patterns and underestimated small soil classes in our case study, and the parameter estimation method (i.e., estimating one-step transition probability matrices from the original image for validation) was also not practical for application purposes. In fact, the unconditional 2-D CMC model was not effectively tested with different real data sets and patterns in the dissertation.] Because I was provided with only the dissertation book at that time, I developed the Fortran computer program (source code) for the model myself from its description in the dissertation, which was easy to do given the simplicity of the model and my experience in computer programming. The code was simple and was checked carefully and repeatedly to make sure it was correct. Our initial purpose was to use the model to simulate soils so as to deal with the spatial uncertainty in distributed hydrology caused by soil variability. Unfortunately, the study did not reach that goal (for postdoctoral research, a report must be written as a research record whether the results are positive or negative). Although the CMC model did not work well for soil simulation, we thought it had some merits (e.g., it generated aggregative patterns) and potential (the defects might be solvable). Later we tried to develop ideas to extend and modify the model and eliminate the defects for practical use.
In the course of exploring the scientific issues we had encountered, we gradually came to understand the causes of the defects of the 2-D CMC model, worked out some ideas, and solved a series of issues that had impeded the evolution of Markov chains toward an independent geospatial statistics for conditional simulation on sample data. The study had no intention of exposing the mistakes of others (my postdoc report was never published, and we never submitted a manuscript on the topic to any journal before we believed, in spring 2003, that we had basically understood the scientific issues and made our own progress). In fact, our research had no conflicts with the article of Elfeki and Dekking (2001).

Unfortunately, some misunderstandings have arisen around this research over the last decade, since spring 2004 [or perhaps earlier, in fall 2003, after we modified the CMC model of Elfeki and Dekking (2001) for subsurface vertical lithology simulation into a model for predictive mapping of soil classes in two horizontal dimensions (see Li et al. 2004)]. Some people quickly became irrational. The misunderstandings later went far beyond science and rationality [first after the single-chain-based multidimensional Markov chain model and its generalized version, plus a transiogram method for parameter estimation, were proposed in the second half of 2004, and then after the major articles were published in 2007], and repeatedly pushed me into difficulty. It seemed that some people mistakenly thought that this research conflicted with the CMC model proposed by Elfeki and Dekking (2001). However, our early publications had no real conflicts with the article of Elfeki and Dekking (2001) on the CMC model: (1) Li et al. (2004) cited Elfeki’s dissertation (1996), but Elfeki and Dekking (2001) had cited it first. (2) While Li et al. (2004) pointed out that the unconditional CMC model had some defects in soil simulation, Elfeki and Dekking (2001, figure 6) had already demonstrated that it worked poorly (in simulating geologic layers, it generated only a little information on small class layers along wells). (3) While Li et al. (2004) pointed out that the conditional CMC model of Elfeki and Dekking (2001) “largely improved the practicality of the method but the problems still exist to some extent when the density of boreholes is not high enough”, Elfeki and Dekking (2001, figures 6 and 8) had already demonstrated that the conditional CMC model could capture only a part of the small class layers (note that the major layer class is class 8, mudstone, shown in white in their figures).
The difference was that we explicitly pointed out the problems in sentences, provided data analysis, and discussed the reasons, whereas they did not. We had to discuss the defects of the CMC model explicitly because we were making efforts to solve them. There must be reasons for the defects of the CMC model, and one can solve them only after understanding the problems. Therefore, there was nothing improper in making an effort to explore and solve the defects after we accidentally encountered them in our previous study, or in publishing our progress after they had also shown the problems. The Fortran computer programs I developed for the CMC model and the modified CMC models were verified to be correct in fall 2003. In fact, in fall 2005, when Elfeki contacted me to express an intention to collaborate, I even suggested forming an international collaborative group (with A.M. Elfeki and E. Park) for further research. However, things did not develop as expected for some reasons (probably interference from some people owing to misunderstandings). Our understanding of the defects of the CMC model was proven correct, and that was why we could solve the problems. If somebody had a different understanding of the defects of the CMC model, he or she could have shared a different explanation with readers by writing a comment letter or short communication; but using tricks or fraud (or encouraging others to do so) to mislead readers and the public, and to trouble or insult others in pursuit of other goals, is wrong!

Apparently the two sides (i.e., Elfeki's side and our side) had different emphases. Their emphasis seemed to be that the CMC model could work or generate valuable results in some cases in geology (with careful choices of pattern and cell size) and could be useful in hydrogeological modeling. Our emphasis was that the model showed apparent deficiencies in our soil simulation case studies and that solving those deficiencies was necessary to make it practical for predictive soil mapping. By facing the problems, our efforts eventually solved the defects of the CMC model and consequently led to a new model, the single-chain-based multidimensional Markov chain model (see Li and Zhang 2008), which in turn led to the more generalized MCRF model (see Li 2007a). All of the defects of the CMC model that we mentioned and solved were displayed, more or less, by Elfeki and Dekking themselves in the case studies in their published papers [see Elfeki and Dekking (2001, 2005, 2007) for small class underestimation and the layer inclination tendency in some simulation cases, although they did not explore or discuss the scientific reasons for the issues]; all of the defects have scientific reasons (see the detailed explanations on this website, or in Li and Zhang 2008 and Li 2007a) and also occur in some other models. Although it solves the defects of the CMC model, the MCRF model is theoretically very different from the CMC model: it is a much more generalized spatial statistical theory based on strict statistical derivation and new ideas. That is why a new geospatial statistical approach was proposed on its basis.

As normal scientific research, our work on multidimensional Markov chain modeling (mainly the development of the MCRF approach) has continued for a very long time (19 years since 1999) and has gone through a series of steps: (1) encountering scientific issues in an attempted application study; (2) exploring the encountered scientific issues; (3) gradually solving the scientific issues; (4) proposing a new fundamental model based on our progress; and (5) basically establishing a complete model/theory and approach framework. At present we are working on step (6): developing practical, specific methods for real applications. Computer code development has been conducted throughout the whole process, from the beginning. Such a process is long, and since 2004 it has often been hindered by irrational troublemaking. Developing a new geospatial statistical approach, from the initial exploration of encountered scientific issues (through the construction of an entire theoretical system) to the final release of practical software, is therefore not an easy thing. During the last 15 years, since spring 2004 (or winter 2003), the troublemaking against our research in multidimensional Markov chain modeling (first the modified CMC model and then the MCRF approach) has occurred round after round, secretly and openly, and has even developed into long-lasting personal insult and persecution of the model proposer. Some Chinese people were dragged into the campaign to make trouble blindly, and many people were confused or scared. Many misleading articles were published in journals, and since 2011 some were even published in well-known international journals. Although we commented on a few articles and made repeated clarifications, the whole research area was still messed up repeatedly. Distorted interpretations of the MCRF model, and trick-based claims to it or plagiarism of it, were made repeatedly. Why did some people intentionally write cheating or joking articles to mislead people and mess up the MCRF approach?
To make a joke of the scientific community? What did some people want to cover up? What goals did they want to reach? Could they simply not tolerate a new geostatistical approach proposed by a person without social power? Why didn't the authors of the cheating articles fear being punished? One can only wonder; it is thought-provoking. No matter who was behind the troublemaking campaign, those misunderstandings and that irrational troublemaking should not continue, and especially should not be used for other purposes! While they confused readers and caused difficulties for us, they also messed up the related scientific fields and could damage the reputation of the so-called "misunderstanders" themselves!

The general full and simplified solutions of MCRFs may support a variety of multidimensional MCRF models and may be extended into more integrative spatiotemporal models for specific applications. Various forms of MCRF models may be derived. Markov chain geostatistics (i.e., the MCRF approach) has become one of the research topics of our research team. Much work remains to be done to develop this new spatial statistical idea into a full-fledged, practical spatial/spatiotemporal modeling approach for various application purposes. We recognize the contributions of related pioneering studies to the MCRF approach. We thank the many experts, reviewers, and editors in applied statistics (mathematical geosciences and environmental statistics), soil science, GIS, and remote sensing for their help and support in the publication of our papers on the MCRF approach. We especially want to express our gratitude to Prof. D.M. Titterington, Prof. W.E. Sharp, and Prof. R.J. Martin for their support of the publication of the MCRF idea, and to the editors of SSSAJ (Soil Science Society of America Journal) and the anonymous reviewers for their support of the publication of the MCRF random-path sequential simulation algorithm and the full transiogram paper during 2005 to 2006.

