The deviance information criterion (DIC) is a metric used to compare Bayesian models. It is closely related to the Akaike information criterion (AIC), which is defined as $\mathrm{AIC} = 2k - 2\ln\hat{L}$, where $k$ is the number of parameters in a model and $\ln\hat{L}$ is the maximised log-likelihood. The DIC makes two changes to this formula: it replaces the maximised log-likelihood with the log-likelihood evaluated at the Bayes estimate, and it replaces $k$ with an alternative correction.
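For reference, one common way of writing these two changes down is sketched here. The notation is an assumption made for illustration: $\hat{\theta}_{\mathrm{Bayes}}$ is the posterior mean, $\theta^{(s)}$ are $S$ posterior draws, and $p_{\mathrm{DIC}}$ is the correction that takes the place of $k$. Other variants of the correction exist.

```latex
\mathrm{DIC} = 2\,p_{\mathrm{DIC}} - 2 \ln p\left(y \mid \hat{\theta}_{\mathrm{Bayes}}\right),
\qquad
p_{\mathrm{DIC}} = 2 \left( \ln p\left(y \mid \hat{\theta}_{\mathrm{Bayes}}\right)
  - \frac{1}{S} \sum_{s=1}^{S} \ln p\left(y \mid \theta^{(s)}\right) \right)
```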
These changes make it more suitable for a Bayesian model, but beware: it isn't a fully Bayesian metric in the philosophical sense. You are reducing probability distributions down to point estimates, so it loses some of its Bayesian credibility.
To demonstrate how we can calculate DIC, I simulate some data and draw from its posterior distribution. For simplicity, I use the Poisson distribution with a conjugate gamma prior, which lets me draw from the posterior distribution directly.
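A minimal sketch of that calculation might look like the following. The sample size, true rate and prior parameters are illustrative assumptions, and the DIC is computed with the common definition: minus twice the log-likelihood at the posterior mean, plus twice a correction `p_dic`.

```r
set.seed(42)

# Simulate Poisson data (n and true_lambda are illustrative assumptions)
n <- 100
true_lambda <- 3
y <- rpois(n, true_lambda)

# A Gamma(shape, rate) prior on the Poisson rate is conjugate, so the
# posterior is Gamma(shape + sum(y), rate + n). Prior values are assumptions.
prior_shape <- 1
prior_rate <- 1
post_shape <- prior_shape + sum(y)
post_rate <- prior_rate + n

# Draw directly from the conjugate posterior
lambda_draws <- rgamma(5000, shape = post_shape, rate = post_rate)

# Log-likelihood of the whole data set for a single value of lambda
log_lik <- function(lambda) sum(dpois(y, lambda, log = TRUE))

# Bayes estimate: the posterior mean of lambda
lambda_bayes <- mean(lambda_draws)

# Log-likelihood at the Bayes estimate, and averaged over the posterior draws
ll_bayes <- log_lik(lambda_bayes)
ll_mean <- mean(sapply(lambda_draws, log_lik))

# Correction term (effective number of parameters) and the DIC itself
p_dic <- 2 * (ll_bayes - ll_mean)
dic_value <- -2 * ll_bayes + 2 * p_dic
dic_value
```

With a single parameter, `p_dic` should come out close to 1, which is a handy sanity check on the calculation.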
This gives us a single DIC value, which is useless on its own, but given two models we can compare the DIC values and favour the model with the lowest DIC.
To demonstrate this, we simulate some data from a gamma distribution and fit two models using Stan: a gamma model and a lognormal model.
The Stan code for the gamma model is simple, and the other model just replaces the gamma distribution with a lognormal.
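As a rough illustration, the two Stan programs could be held in R strings like this. The variable names (`N`, `y`, `alpha`, `beta`, `mu`, `sigma`) and the absence of explicit priors are assumptions made for this sketch, not necessarily the models used here.

```r
# Hypothetical Stan program for the gamma model, stored as an R string
gamma_model_code <- "
data {
  int<lower=0> N;
  vector<lower=0>[N] y;
}
parameters {
  real<lower=0> alpha;
  real<lower=0> beta;
}
model {
  y ~ gamma(alpha, beta);
}
"

# The lognormal version: same data block, different parameters and likelihood
lognormal_model_code <- "
data {
  int<lower=0> N;
  vector<lower=0>[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  y ~ lognormal(mu, sigma);
}
"

# If rstan is installed, either string could be fitted with something like:
#   fit <- rstan::stan(model_code = gamma_model_code,
#                      data = list(N = length(y), y = y))
```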
I simulate 1000 data points and sample from Stan, before forming a two-column matrix of the posterior samples.
To construct a function for the DIC, you need to be able to pass in the data, the likelihood function and the posterior samples. Thankfully, R has a fairly standard interface for its probability distributions, so we can rely on both dgamma and dlnorm to take the same type of arguments. But the dic function is not flexible enough for any arbitrary distribution to be passed through.
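A minimal sketch of such a function is given here, assuming a two-parameter density whose second and third arguments are the parameters (as with dgamma and dlnorm) and a posterior matrix with one row per draw. The posterior draws in the usage example are stand-ins generated with rnorm, not real Stan output, and exist only to exercise the function.

```r
# Sketch of a DIC function for two-parameter distributions.
# y:    numeric vector of data
# dens: an R density function such as dgamma or dlnorm
# post: matrix of posterior samples, one row per draw, one column per parameter
dic <- function(y, dens, post) {
  # Log-likelihood of all of the data for one set of parameter values
  log_lik <- function(p) sum(dens(y, p[1], p[2], log = TRUE))

  # Bayes estimate: the posterior mean of each parameter
  theta_bayes <- colMeans(post)

  # Log-likelihood at the Bayes estimate, and averaged over the draws
  ll_bayes <- log_lik(theta_bayes)
  ll_mean <- mean(apply(post, 1, log_lik))

  # Correction term, then the DIC
  p_dic <- 2 * (ll_bayes - ll_mean)
  -2 * ll_bayes + 2 * p_dic
}

# Usage with simulated data and stand-in posterior draws (NOT Stan output)
set.seed(1)
y <- rgamma(1000, shape = 2, rate = 1)

gamma_post <- cbind(alpha = rnorm(2000, 2, 0.03),
                    beta = rnorm(2000, 1, 0.02))
lnorm_post <- cbind(mu = rnorm(2000, mean(log(y)), 0.03),
                    sigma = rnorm(2000, sd(log(y)), 0.02))

dic_gamma <- dic(y, dgamma, gamma_post)
dic_lnorm <- dic(y, dlnorm, lnorm_post)
c(gamma = dic_gamma, lognormal = dic_lnorm)
```

Relying on positional arguments keeps the function short, but it is also why it only works for densities that follow the same argument order.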
Applying the function to the sample data and calculating the values, we find that the true model (gamma) has the lower DIC, as expected. So everything is working!
You can also assess the DIC on some out-of-sample data. This is achieved by simply simulating from the same distribution and passing the new data through the same dic function.
Again, the gamma model has the lower DIC, which is more evidence that this is the correct model.
Overall, the DIC is a useful metric for assessing your models, and it is easy to calculate from your posterior samples.
Submitted to RWeekly