

Parameter Estimation and Model Comparison

The posterior probability distributions of section 5.3.4 constitute a complete description of the author's beliefs about the models and their parameters, once the data (chapter 5) are taken into account. However, it is a very unwieldy description, consisting as it does of two scalar fields defined on a $47$-dimensional space. A summary is required for presentation. The author believes that the most useful summary will be to give, for each parameter $Q_m$, estimates of its posterior marginal expectations

\begin{displaymath}
<Q_m\vert D, M_N> = \int_{\textrm{All }\mathbf{Q}\textrm{
space}}Q_mP(\mathbf{Q}\vert D, M_N)\mathrm{d}^{47}\mathbf{Q}
\end{displaymath} (5.50)

and
\begin{displaymath}
<Q_m\vert D, M_M> = \int_{\textrm{All }\mathbf{Q}\textrm{
space}}Q_mP(\mathbf{Q}\vert D, M_M)\mathrm{d}^{47}\mathbf{Q}\textrm{,}
\end{displaymath} (5.51)

and of its posterior marginal standard deviations,
\begin{displaymath}
\sigma{}(Q_m\vert D, M_N) = \sqrt{<Q_m^2\vert D, M_N>-<Q_m\vert D, M_N>^2}
\end{displaymath} (5.52)

and
\begin{displaymath}
\sigma{}(Q_m\vert D, M_M) = \sqrt{<Q_m^2\vert D, M_M>-<Q_m\vert D, M_M>^2}\textrm{,}
\end{displaymath} (5.53)

where
\begin{displaymath}
<Q_m^2\vert D, M_n> = \int_{\textrm{All }\mathbf{Q}\textrm{
space}}Q_m^2P(\mathbf{Q}\vert D, M_n)\mathrm{d}^{47}\mathbf{Q}\textrm{.}
\end{displaymath} (5.54)

The summary should also include the estimates of the model posterior probabilities $P(M_n\vert D)$ (equations 5.47 and 5.48).

If a means can be found of drawing samples $\mathbf{Q}^{(\textrm{pos}, n,
i, j)}$ from the distribution $P(\mathbf{Q}\vert D, M_n)$, where $i$ runs from $1$ to $I$, $j$ runs from $1$ to $J$, and the total number of samples is, therefore, $IJ$, $<Q_m\vert D, M_n>$ can [48] be estimated by

\begin{displaymath}
\bar{Q_m}^{(\textrm{pos}, n)} = \frac{\sum_{i=1}^I\bar{Q_m}^{(\textrm{pos}, n, i)}}{I}\textrm{,}
\end{displaymath} (5.55)

where
\begin{displaymath}
\bar{Q_m}^{(\textrm{pos}, n, i)} = \frac{\sum_{j=1}^JQ_m^{(\textrm{pos}, n,
i, j)}}{J}\textrm{.}
\end{displaymath} (5.56)

Similarly, $\sigma{}(Q_m\vert D, M_n)$ can [48] be estimated by
\begin{displaymath}
\sigma_m^{(\textrm{pos}, n)}
=\sqrt{\frac{IJ(\bar{Q_m^2}^{(\textrm{pos}, n)}-(\bar{Q_m}^{(\textrm{pos}, n)})^2)}{IJ-1}}\textrm{,}
\end{displaymath} (5.57)

where
\begin{displaymath}
\bar{Q_m^2}^{(\textrm{pos}, n)} = \frac{\sum_{i=1}^I\bar{Q_m^2}^{(\textrm{pos}, n, i)}}{I}\textrm{,}
\end{displaymath} (5.58)

and
\begin{displaymath}
\bar{Q_m^2}^{(\textrm{pos}, n, i)} = \frac{\sum_{j=1}^J((Q_m^{(\textrm{pos}, n,
i, j)})^2)}{J}\textrm{.}
\end{displaymath} (5.59)

Similarly, if a means can be found of drawing samples $\mathbf{Q}^{(\textrm{pri}, n, i, j)}$ from the distribution $P(\mathbf{Q}\vert M_n)$, where $i$ runs from $1$ to $I$, $j$ runs from $1$ to $J$, and the total number of samples is, therefore, $IJ$, $P(D\vert M_n)$ can [48] be estimated by

\begin{displaymath}
\bar{L}^{(\textrm{pri}, n)} =
\frac{\sum_{i=1}^I\bar{L}^{(\textrm{pri}, n, i)}}{I}\textrm{,}
\end{displaymath} (5.60)

where
\begin{displaymath}
\bar{L}^{(\textrm{pri}, n, i)} = \frac{\sum_{j = 1}^JP(D\vert\mathbf{Q}^{(\textrm{pri}, n, i, j)})}{J}\textrm{.}
\end{displaymath} (5.61)
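A minimal Python sketch of equations 5.60 and 5.61 follows. It is an illustration under stated assumptions rather than the thesis's implementation: the names `estimate_evidence` and `gaussian_likelihood`, the scalar parameter, the standard normal prior and the single datum $d = 0.5$ are all hypothetical, and independent random draws stand in for the output of the sampler. The toy problem is chosen because its marginal likelihood is known in closed form, which allows the estimate to be checked.

```python
import math
import random

def estimate_evidence(prior_samples, likelihood):
    """Average the likelihood P(D|Q) over an I x J grid of prior samples."""
    I = len(prior_samples)
    J = len(prior_samples[0])
    # Equation 5.61: mean likelihood within iteration i.
    row_means = [sum(likelihood(q) for q in row) / J for row in prior_samples]
    # Equation 5.60: mean of the per-iteration means.
    return sum(row_means) / I

def gaussian_likelihood(q, d=0.5):
    # Toy likelihood (an assumption): one datum d with unit-variance Gaussian noise about q.
    return math.exp(-0.5 * (d - q) ** 2) / math.sqrt(2.0 * math.pi)

rng = random.Random(0)
# Independent draws from a standard normal prior stand in for the sampler's output.
toy_samples = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(20)]
toy_evidence = estimate_evidence(toy_samples, gaussian_likelihood)
# Closed form for this toy problem: d is normally distributed with mean 0 and variance 2.
exact_evidence = math.exp(-0.5 * 0.5 ** 2 / 2.0) / math.sqrt(2.0 * math.pi * 2.0)
```

In this toy case `toy_evidence` should lie close to `exact_evidence`, with the discrepancy shrinking as the number of samples grows.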

All of these estimators are [90] non-Bayesian, i.e. they coincide exactly with Bayesian posterior expectations of the quantities being estimated, given the samples, only for specific, but not specified, prior probability distributions over moments of the distribution $P(\mathbf{Q}\vert D, M_n)$. The estimators will, therefore, differ by an amount $\delta$ from the posterior expectations that would be obtained with explicit statements of plausible prior probability distributions over these moments, if this were technically feasible. Fortunately, however, the size of $\delta$ tends [10] to decrease with decreasing marginal likelihood, i.e. with increasing number of samples. Equally fortunately, when the inference process aims to estimate moments of a probability distribution, from samples from that distribution, a critical $\delta$ for each moment is made available, by the degree of variation implied by the higher moments, such that significantly smaller values of $\delta$ can be safely ignored. All this means that, for large numbers of samples, the non-Bayesian estimators are acceptable.

The means of obtaining samples is the leapfrog method. This is explained in detail in Information Theory, Inference and Learning Algorithms [90], where it is attributed to Skilling. Each iteration of the method consists of the application of a leapfrog proposal density, followed by the application of a Metropolis decision algorithm. For sampling from the posterior probability distribution, after the $i$th iteration, the method's state is a set of $J$ parameter vectors $\mathbf{Q}^{(\textrm{pos}, n,
i, j)}$, which are the samples to be used in the estimators. The leapfrog proposal density sets

\begin{displaymath}
\mathbf{Q}'^{(\textrm{pos}, n, i, j)} = 2\mathbf{Q}^{(\textrm{pos}, n, i,
k)}-\mathbf{Q}^{(\textrm{pos}, n, i, j)}\textrm{,}
\end{displaymath} (5.62)

where, for each $j$, a $k$ is chosen at random from the other $J-1$ integers from $1$ to $J$, with each of these being equally probable. The Metropolis decision algorithm then works as follows:

An analogous process is used to draw samples $\mathbf{Q}^{(\textrm{pri}, n, i, j)}$ from the prior probability distribution.
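One iteration of the leapfrog proposal of equation 5.62, followed by a Metropolis decision, might be sketched in Python as below. Several points are assumptions and are not taken from the thesis or its Perl code: the acceptance rule is the standard Metropolis rule for a symmetric proposal (accept with probability $\min(1, P(\mathbf{Q}')/P(\mathbf{Q}))$), the $J$ states are updated one after another in place, and the names `leapfrog_sweep` and `log_density` are hypothetical.

```python
import math
import random

def leapfrog_sweep(states, log_density, rng):
    """Apply one leapfrog proposal and Metropolis decision to each of the J states.

    states is a list of J parameter vectors (lists of floats), updated in place.
    log_density returns the log of the target density, which may be unnormalised.
    Returns the number of accepted proposals.
    """
    J = len(states)
    accepted = 0
    for j in range(J):
        # Choose k uniformly from the other J - 1 indices.
        k = rng.randrange(J - 1)
        if k >= j:
            k += 1
        # Equation 5.62: reflect state j through state k.
        proposal = [2.0 * qk - qj for qk, qj in zip(states[k], states[j])]
        # Assumed Metropolis rule: accept with probability min(1, P(Q') / P(Q)).
        log_ratio = log_density(proposal) - log_density(states[j])
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            states[j] = proposal
            accepted += 1
    return accepted
```

The proposal is symmetric, since reflecting $\mathbf{Q}'$ through the same state $k$ recovers the original $\mathbf{Q}$, which is why no proposal-density correction appears in the assumed acceptance ratio. Working with the logarithm of the density avoids numerical underflow when the density is a product of many small factors.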

As part of the preparation of the present thesis, the author has written Perl code that implements the leapfrog method; in the transparent copy of this thesis, the code for the null model $M_N$ is in the file null, and the code for the main model $M_M$ is in the file main. As far as the author has been able to discover, the present thesis is the first use of the leapfrog method to infer parameters and model likelihoods from real experimental data. Some further details are needed to define the particular version of the leapfrog method used in this thesis:


Daniel Christopher Hatton 2004-11-30