Douglas Bates excelled during my first tutorial session of the useR! 2008 conference. He gave a three-hour talk on mixed models, in which he managed both to give an overview of the theory and basic specification of this kind of model in R, and to address highly advanced and avant-garde issues. I’m impressed. During the break he was kind enough to answer a question about mixed models that had little to do with what he addressed during his talk. We even ended up having a short but nice chat about Dutch politics.
During what was basically his introduction, he gave a useful guideline for a discussion that we have been having at our own university: in which cases can we apply mixed models to grouped data, and in which cases can’t we? Although he didn’t really add anything that was new to me, his statements gave a lot of clarity to my thoughts on the subject. His basic argument was that we should model a grouping factor as a random effect only if it is reasonable to regard its levels as drawn from a larger collection of such levels. So, for instance, the distinction between male and female would not make a good random effect, because should we repeat the ‘experiment’, we would automatically end up with the same values (male and female) on our grouping factor. A good example would be the class that a school child is in, for when we repeat the experiment with a new sample, we would end up with students in different classes. More interesting, though, was his acknowledgment that there are simply grey areas. These are found at the two extremes of the same dimension. When the grouping factor has only a small number of levels, we run into problems estimating the model. On the other hand, if we have (almost) all existing levels (e.g. all American states in a survey research project), then we wouldn’t end up with different levels (states) if the project were repeated. I find it clarifying that these extremes are defined as a grey area, and more informative than simply taking one of the extreme positions (‘always estimate mixed models’ or ‘mixed models are completely flawed in such cases’).
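To make the ‘repeat the experiment’ argument concrete for myself, here is a minimal sketch in plain Python. It is my own illustration, not something from the tutorial, and every name and number in it (`simulate_scores`, `variance_components`, the means and standard deviations) is made up. Each class gets one effect drawn from a normal distribution, which is exactly the ‘levels come from a larger collection’ assumption, and the class-level variance is then recovered with a simple one-way ANOVA (method of moments) estimator for a balanced design. That estimator is a stand-in for the likelihood-based fitting that mixed model software actually uses.

```python
import random
import statistics


def simulate_scores(n_classes, n_students, mu, sd_class, sd_resid, rng):
    """Return one list of student scores per class (balanced design)."""
    data = []
    for _ in range(n_classes):
        # One draw per class: repeating the study would give new classes,
        # and therefore new values of this effect.
        class_effect = rng.gauss(0.0, sd_class)
        scores = [mu + class_effect + rng.gauss(0.0, sd_resid)
                  for _ in range(n_students)]
        data.append(scores)
    return data


def variance_components(data):
    """Method-of-moments estimates of (class variance, residual variance).

    Assumes every class has the same number of students.
    """
    n = len(data[0])
    class_means = [statistics.fmean(scores) for scores in data]
    # Pooled within-class variance (mean square within).
    ms_within = statistics.fmean([statistics.variance(s) for s in data])
    # Mean square between classes.
    ms_between = n * statistics.variance(class_means)
    # E[ms_between] = residual variance + n * class variance.
    var_class = max(0.0, (ms_between - ms_within) / n)
    return var_class, ms_within


rng = random.Random(2008)
many = simulate_scores(500, 25, 50.0, 2.0, 5.0, rng)
few = simulate_scores(3, 25, 50.0, 2.0, 5.0, rng)
print("500 classes:", variance_components(many))
print("3 classes:  ", variance_components(few))
```

With many classes the estimated class variance should land near the true value of 4 (a standard deviation of 2). With only three classes it rests on three class means, which is the ‘few levels’ end of the grey area described above.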
Following this introduction, a wide array of issues was addressed: longitudinal models with time as a covariate, interactions at the level of the grouping factor, the theory of generalized linear models, an example of these generalized linear models, and finally some attention was paid to non-linear mixed models.
What I found especially interesting, though, was the explanation of how item response models can be represented as generalized linear mixed models. Item response models are based on theory that basically states that the responses people give to a stimulus (e.g. a survey question) are due both to characteristics of the stimulus and to characteristics of the respondent. We thus need a method for disentangling both influences. Douglas Bates demonstrated a method of doing so by applying mixed models. For a long time, computers were not capable of properly estimating such models. Now, it has become possible to analyze them by treating the responses to the items as nested within individuals. Both item characteristics and person characteristics can then be added to this basic model.
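As a note to myself on what ‘responses nested within individuals’ looks like in practice, here is a small sketch in plain Python. It assumes the simplest item response model, the Rasch model, in which the log-odds of a correct response is the person’s ability minus the item’s difficulty. I do not know that this is the exact model shown in the tutorial, and the function names and the example abilities and difficulties are hypothetical.

```python
import math
import random


def rasch_prob(ability, difficulty):
    """Probability of a correct response: inverse logit of (ability - difficulty)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def simulate_long_format(abilities, difficulties, rng):
    """Return one (person, item, correct) row per response.

    This 'long' layout has the responses nested within persons:
    each person contributes one row per item.
    """
    rows = []
    for person, theta in enumerate(abilities):
        for item, b in enumerate(difficulties):
            correct = 1 if rng.random() < rasch_prob(theta, b) else 0
            rows.append((person, item, correct))
    return rows


rng = random.Random(42)
abilities = [rng.gauss(0.0, 1.0) for _ in range(5)]  # person effects, drawn at random
difficulties = [-1.0, 0.0, 1.0]                      # item effects, fixed here
for row in simulate_long_format(abilities, difficulties, rng):
    print(row)
```

In R, data in this layout is what I would expect to pass to lme4 with a call along the lines of `glmer(correct ~ 0 + item + (1 | person), family = binomial)`. The column names and that formula are my own assumption, not taken from the slides.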
To sum up, I found this session extremely fascinating. It gave a very good overview of mixed models, I learned some new things, and I saw things that I did not understand. At all. That’s the risk of taking a statistics course given by a mathematician. But, since we have the slides and books, these sections of the course will still serve as pointers to topics to study in the future.
Being in such an interdisciplinary setting as the useR! conference does that to you: you see topics and methods used in a completely different context than what you’re used to. From that you can easily gain a more general understanding of the techniques you work with within the safe confines of your own discipline. Very enriching and inspiring, and I think the applause was well deserved.
More to come this afternoon, when I will attend a session by Frank E. Harrell Jr. on regression modeling strategies.