
Different interpretations of Markov chain random fields


It is widely accepted that a theoretically correct spatial statistical method should not conflict with existing statistical principles, theorems, or well-established models and methods. A theoretically sound spatial statistical model or method may therefore be explained from different perspectives and may have different interpretations from different angles. With the initial purpose of correcting the defects of earlier multi-dimensional Markov chain models (mainly the coupled Markov chain (CMC) model), the Markov chain random field (MCRF) model was derived using Bayes' theorem (based on the single-Markov-chain random-field idea) as a special Markov field model (a locally conditioned Markov chain) that accounts for distant data interactions of categorical spatial variables (see "From the CMC model to the MCRF model") (Li 2007a, Li and Zhang 2008). First, it has a Markov chain view. (Note: for a fixed continuous path, it is a 1-D Markov chain updated by the local nearest data in other directions; for a random path, it is a jumping Markov chain updated by the local nearest data.) On examination, an MCRF is always a Markov chain if one treats only one nearest datum in each neighborhood as part of the data sequence of the Markov chain and regards the other nearest data as new evidence that locally updates the transition probability distribution of the Markov chain. Second, it has a Bayesian view: it is a spatial Bayesian updating model and also a special, complex Bayesian network on spatial data at the neighborhood nearest data level. (It is widely known that a 1-D Markov chain is the simplest Bayesian network.) However one examines it, neither the full general solution nor the simplified general solution based on the spatial conditional independence assumption violates the Bayesian principle; because the MCRF model was derived using spatial sequential Bayesian updating, it cannot violate Bayes' theorem.
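The two views above can be written out in a compact form. The following is a sketch in our own shorthand, not a quotation of the published equations: i_0 denotes the class at the unobserved location, i_1 the nearest datum taken as the previous state of the chain, i_2, ..., i_m the other nearest data in the neighborhood, f_0 a candidate class, and p_{kl}(h) a transition probability from class k to class l over lag h. These symbols are assumptions introduced here for illustration. The first line is just Bayes' theorem with the chain rule (the full solution with multiple-point likelihoods); the second line follows when the spatial conditional independence assumption is applied to the likelihood terms.

```latex
\begin{align}
p(i_0 \mid i_1, \dots, i_m)
  &= \frac{p(i_0 \mid i_1)\,\prod_{g=2}^{m} p(i_g \mid i_0, i_1, \dots, i_{g-1})}
          {\sum_{f_0} p(f_0 \mid i_1)\,\prod_{g=2}^{m} p(i_g \mid f_0, i_1, \dots, i_{g-1})} \\
  &\approx \frac{p_{i_1 i_0}(h_{10})\,\prod_{g=2}^{m} p_{i_0 i_g}(h_{0g})}
          {\sum_{f_0} p_{i_1 f_0}(h_{10})\,\prod_{g=2}^{m} p_{f_0 i_g}(h_{0g})}
\end{align}
```

With m = 1 the products are empty and the expression reduces to the ordinary transition probability p_{i_1 i_0}(h_{10}), which is the Markov chain view; the remaining terms act as likelihoods that update it, which is the Bayesian view.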
There may also be some seemingly plausible but arguable or tricky views. However, no matter how many different views (including the tricky interpretations) exist, they do not negate the correctness and originality of the MCRF model as a spatial statistical model, because no such geo/spatial-statistical model existed before we proposed it. Otherwise, a single comment letter with evidence of misconduct from a real expert in geo/spatial statistics could have refuted it immediately after its publication. So far, nobody has been able to write a comment letter on Li (2007a, "Markov chain random fields for estimation of categorical variables". Mathematical Geology, 39(3): 321-335) that points out anything wrong with the MCRF model since its publication 10 years ago, and nobody has been able to do so for the coMCRF model presented in Li et al. (2015, Mathematical Geosciences, 47(2): 123-148) since its publication. All we have seen is irrational troublemaking.

In fact, anyone interested in examining other widely known geo/spatial-statistical models, such as the kriging model, the indicator kriging model, and the conventional Markov random field (MRF) model, will find that they too may be explained from multiple perspectives. For example, are they maximum entropy models? If they are not, are they so-called "restrained maximum entropy" models? As another example, is the 1-D Markov chain model a maximum entropy model or a "restrained maximum entropy" model? If it is, then is the coupled Markov chain (CMC) model also a maximum entropy or so-called "restrained maximum entropy" model, because it is composed of two 1-D Markov chains? One can continue thinking along these lines. In fact, there is no such thing as "restrained maximum entropy"! If a model can be rederived by maximizing entropy under constraints set for the model, it is a maximum entropy model. If it cannot be, it is not a maximum entropy model. Both simple kriging and the complete Gibbs distribution (i.e., the MRF model) are maximum entropy models, and both have been rederived by maximizing entropy under constraints set for them. But it is not proper to treat them as new original models with different names just because they are maximum entropy models. The simplified MCRF model is apparently not a maximum entropy model, because it is a simplified form of the full MCRF model with multiple-point likelihoods. Even the full MCRF model may not be a maximum entropy model, because it is very different from the Gibbs distribution, which is a maximum entropy distribution and the standard form of the MRF model. How, then, can the simplified MCRF model be a simplified maximum entropy model? Even if we assume that it could be regarded as a simplified maximum entropy model, that does not give anyone an excuse to claim it.
Therefore, no matter how many ways one can find to explain or reinterpret the simplified MCRF model, taking it over under a different name by playing tricks and staining it was a misleading joke and essentially plagiarism. Such behavior could well mislead people and get the MCRF model proposer crushed in China or the USA. So it is not difficult to see the real purpose of the intentional troublemakers: to persecute the MCRF model proposer.

Furthermore, probably because the conditional independence assumption was mistakenly thought to be improper in geo/spatial statistics, it was rarely used by geostatisticians in nonlinear spatial models. Li (2007a) introduced it from Pickard (1980) as a cardinal-neighbor property of Pickard random fields, extended it to sparse sample data, and then used it to reduce the full solution of MCRFs (with multiple-point likelihoods) to a simplified solution containing only transition probabilities. While rationalizing the conditional independence assumption for spatial data, that article also pointed out the possible scope of application of this assumption to spatial data. One should not interpret simplified MCRF models improperly and cause misunderstandings by ignoring this point and the derivation process (the facts that the MCRF model was derived using spatial sequential Bayesian updating over nearest data and that it has a full solution). Without the derivation process, one cannot explain the head-tail directions of the transition probability terms in the simplified MCRF model, which are exactly what distinguishes the simplified MCRF model from the CMC model of Elfeki and Dekking (2001). If one tried to derive a similar spatial model in other ways using heuristic rules, the resulting model would probably be a generalized form of the CMC model rather than the simplified MCRF model.
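The role of the head-tail directions in the simplified solution can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `simplified_mcrf_probs`, its argument layout, and all numeric transition probabilities are hypothetical. The term for the nearest datum on the chain runs from that datum (tail) to the candidate class at the unobserved location (head), while each other nearest datum contributes a term running from the candidate class (tail) to that datum (head), so those terms act as likelihoods.

```python
from typing import List, Sequence


def simplified_mcrf_probs(
    p_from_prev: Sequence[float],
    p_to_others: Sequence[Sequence[float]],
) -> List[float]:
    """Local conditional distribution at an unobserved location (sketch).

    p_from_prev[f]    : transition probability from the class of the nearest
                        datum on the chain (tail) to candidate class f (head).
    p_to_others[g][f] : transition probability from candidate class f (tail)
                        to the observed class of the g-th other nearest
                        datum (head); used as a likelihood term.
    All inputs are assumed to be already evaluated at the relevant lags.
    """
    n_classes = len(p_from_prev)
    weights = []
    for f in range(n_classes):
        w = p_from_prev[f]            # prior from the Markov chain step
        for likelihood in p_to_others:
            w *= likelihood[f]        # Bayesian update by one more datum
        weights.append(w)
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("all candidate classes have zero probability")
    # Normalize over the candidate classes (the denominator of the solution).
    return [w / total for w in weights]
```

For example, with three classes, `simplified_mcrf_probs([0.5, 0.3, 0.2], [[0.6, 0.2, 0.1], [0.4, 0.4, 0.2]])` gives unnormalized weights 0.12, 0.024, and 0.004, so the first class receives about 0.811 of the probability. With an empty `p_to_others`, the function returns the plain transition probabilities, which is the 1-D Markov chain case. Swapping the head and tail of the likelihood terms would give a different model, which is why the derivation matters.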

Now we can see that the MCRF model can be well explained as a spatial Bayesian network model. The MCRF model was derived using spatial sequential Bayesian updating, and it was initially suggested for correcting the defects of the CMC model in order to simulate categorical soil variables. Without this motivation and a long-term effort, it would have been impossible for us to develop a new spatial statistical model out of nothing. Even a pure mathematician or statistician with profound statistical knowledge is unlikely to do such a thing without a specific purpose (i.e., solving some problems) and a long-term effort. However some people try to explain the MCRF model, and whether or not their explanations are correct, neither they nor others should stain the MCRF approach or trouble the model proposer irrationally (e.g., by writing wrong or bizarre equations, or by implementing wrong algorithms to get wrong results with wrong explanations or claims, causing misunderstandings among readers and the public).

In geostatistics or spatial statistics, developing fundamental models at the neighborhood nearest data level is never a simple thing. This has nothing to do with how profound one's knowledge of statistics or mathematics is. First, there are large differences between spatial and nonspatial models and between spatial and nonspatial data. Without long-term exploration of specific issues with spatial data, breakthroughs in spatial thinking, and extensive computer programming for testing with spatial data, it is not easy to extend a nonspatial theory, model, or idea (e.g., least squares regression, the 1-D Markov chain, or Bayes' theorem) into a spatial theory or model, let alone to use it directly to account for the dependencies of spatial sample data. Second, the distant interaction problem of sparse spatial data is different from the data interaction problem of lattice data that the conventional MRF model deals with. Third, a new spatial model or approach must be proposed for a purpose: to solve existing scientific issues, rather than to attack or smear others, mess up or take over an existing model, publish some articles, or show off one's knowledge and capability. The distance from a nonspatial statistical idea or model to a practical spatial statistical model is huge. Covering it requires breakthroughs in spatial thinking and progress in parameter estimation and computation, and such breakthroughs and progress do not come simply or easily without long-term exploration of specific issues with spatial data. One cannot suddenly develop or propose a complex, theoretically sound, and practical spatial statistical approach without motivation and deep exploration, without solving any existing scientific issues, and even while insulting others. Scientific conscience and integrity are indispensable.

Many researchers spend decades or their whole lives working on existing geo/spatial-statistical models and approaches, using them in application research or developing new ideas to solve existing problems within the existing frameworks. Only a few of them eventually have the chance to suggest a new spatial statistical approach, if they encounter significant scientific issues (usually on an underdeveloped topic or in application studies) and solve them theoretically with new ideas. Such a process can be very long and difficult, and success is usually not guaranteed. Why did we propose the MCRF approach? Because Li studied conventional geostatistics and Markov chain modeling for decades through application studies, encountered some scientific issues in two-dimensional Markov chain modeling, and eventually solved them after a long-term exploration. When the random-path MCRF simulation algorithm was developed during the winter of 2005-2006, Li was staying at home recovering from illness, and when the major articles on the MCRF approach were published in 2007, Li was paralyzed at home. However, after the major articles on the MCRF approach were published in 2007 (or after the MCRF model article was accepted by Mathematical Geology in 2006), some people still organized and launched a series of open challenges to the MCRF approach using various tricks and even open scientific fraud (about 5 rounds of troublemaking occurred from 2007 to 2018). This irrational troublemaking caused severe harm to the life and career of the model proposer.

The MCRF model did not exist before we derived it, developed computer programs for it, and tested it. It was different from any existing spatial statistical model when its initial idea was proposed in 2004 and when it was finally published in 2007. To our knowledge, Li (2007a) was the first to rationalize the spatial conditional independence assumption for nearest data within a neighborhood, based on Pickard random fields, and also the first to apply sequential Bayesian updating to nearest spatial data within a neighborhood. The MCRF model not only achieved its initial goal of correcting the small-class underestimation problem that occurred in the CMC model, but also provided the theoretical and methodological foundation for a new geospatial statistical approach. Without a decade-long concentrated exploration, without the driving forces of correcting the defects of the CMC model and extending Markov chains into an independent geostatistical approach, or without the support of many people with scientific conscience, it would have been impossible for the MCRF model to be developed and published. After such a problem-solving model was proposed, any attempt to mess up or ruin it by tricks or by harming or humiliating the model developer is unreasonable.


References:

Elfeki, A.M., and F.M. Dekking. 2001. A Markov chain model for subsurface characterization: Theory and applications. Math. Geol., 33: 569-589.

Journel, A.G. 2002. Combining knowledge from diverse sources: an alternative to traditional data independence hypothesis. Math. Geol., 34: 573-596.

Li, W. 2007a. Markov chain random fields for estimation of categorical variables. Math. Geol., 39(3): 321-335.

Li, W., and C. Zhang. 2008. A single-chain-based multidimensional Markov chain model for subsurface characterization. Environ. Ecol. Stat., 15(2): 157-174.

Pickard, D.K. 1980. Unilateral Markov fields. Adv. Appl. Probab., 12(3): 655-671.

