R-Sessions 20: Plotting Multilevel Models


Plotting the results of a multilevel analysis, without use of the extension package ‘Lattice’ can be quite complicated while using R. Using only the basic packages, as well as the multilevel packages (nlme and lme4) there are no functions readily available for this task. So, this is a good point in this manual to put some of our programming skills to use. This makes exactly clear how the results of a multilevel analysis are stored in R as well.

In order to be able to plot a multilevel model, we first need such a model. We will estimate a model here, which we have seen before. We want to estimate the effect that a standardized test at school-entry has on a specific exam students made. Students are, of course, nested withing shools, which is taken into account in our analysis, as well as the average score in the intake test for each of the schools. The effects the test at entry of the school has on the test result is allowed to vary by school. In other words: we are estimating a random intercept model with a random slope.

In order to do so, we have to load the nlme package, as well as the mlmREV-package, which contains the data we will use. Then, for technical reasons, we have to get rid of the lme4-package. This is done by using the detach() function. This is what is done in the syntax below.

library(nlme)
library(mlmRev)
detach(“packages:lme4″)

model.01 <- lme(fixed = normexam ~ standLRT + schavg,
data = Exam,
random = ~ standLRT | school)

summary(model.01)

The requested output of the model.01 results in:

> summary(model.01)
Linear mixed-effects model fit by REML
 Data: Exam 
       AIC     BIC    logLik
  9337.885 9382.04 -4661.943

Random effects:
 Formula: ~standLRT | school
 Structure: General positive-definite, Log-Cholesky parametrization
            StdDev    Corr  
(Intercept) 0.2778443 (Intr)
standLRT    0.1237837 0.373 
Residual    0.7440440       

Fixed effects: normexam ~ standLRT + schavg 
                 Value  Std.Error   DF   t-value p-value
(Intercept) -0.0014224 0.03725472 3993 -0.038181  0.9695
standLRT     0.5522414 0.02035359 3993 27.132378  0.0000
schavg       0.2947588 0.10726821   63  2.747867  0.0078
 Correlation: 
         (Intr) stnLRT
standLRT  0.266       
schavg    0.089 -0.085

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max 
-3.82942010 -0.63168683  0.03258589  0.68512318  3.43634584 

Number of Observations: 4059
Number of Groups: 65 

Probably the best way to visualize this model is to create a plot of the relationship on individual level between the score a student had on an standardized intake test (standLRT) and the result on an specific exam (normexam). We want these relationship to be shown for each school specifically. To do this, several steps are needed:

  • A plot region has to be set
  • Coefficients for each school need to be extracted from the model
  • The minimum and maximum value on the x-axis for each school need to be extracted
  • Based on the model, the data actually needs to be plotted

Each of the steps shall be described explicitly below. Since many steps need to be performed, they will be gathered in a function, which we’ll call visualize.lme(). It will have three parameters: the model we want to use, the predictor and the group-variable that will be taken into account.

visualize.lme <- function (model, coefficient, group, ...)
{
r <- ranef(model)
f <- fixef(model)

effects <- data.frame(r[,1]+f[1], r[,2]+f[2])

number.lines <- nrow(effects)

Above, the first row defines the function visualize.lme(), an arbitrary name we chose ourselves. Between the brackets, four elements are placed. The three needed elements described above are named ‘model’, ‘coefficient’, and ‘group’. The fourth element ( … ) means that any other element can be added. These elements will be transfered to the plot() function that will be used to set the graphics device.
Then, four variables are created within the function, to which data from the model is extracted. First, ranef() extracts the random coefficients from the model, while fixef() does the same for the fixed elements.
The effects variable is created next. This will be a data.frame that contains information on the intercepts and the slopes that will be plotted. The number.lines variable contains the number of rows in the effects data.frame, that is equal to the number of groups in the model.

predictor.min <- tapply(model$data[[coefficient]], model$data[[group]], min)
predictor.max <- tapply(model$data[[coefficient]], model$data[[group]], max)

outcome.min <- min(predict(model))
outcome.max <- max(predict(model))

Before the plotting area can be set up, we will need four coordinates. This is done above. First, the minimum and maximum value of the predictor is gathered. Next, the minimum and maximum of the predicted values are determined, using predict(). The predict() function takes the data from the model specified and uses the original data and the estimated model formula (which are stored inside) and returns a vector of predicted values on the outcome-variable. Using min() and max(), the minimum and maximum values are obtained.

plot (c(min(predictor.min),max(predictor.max)),c(outcome.min,outcome.max), type=”n”, …)

for (i in 1:number.lines)
{
expression <- function(x) {effects[i,1] + (effects[i,2] * x) }
curve(expression, from=predictor.min[i], to=predictor.max[i], add=TRUE)
}
}

Finally, the plot is created above. First, the area in which to plot is set up. Using the four coordinates obtained before, a plot is created in such a way, that the lines the will be plotted next will fit in exactly. Specifying type=”n” results in setting up a plotting region on a graphics device, without an actual plot being made. Only the axes that are created can be seen. The … parameter in the specification of the plot() function transfers all the additional parameters that are given to the visualize.lme() we’re creating to the plot()-function that is used here. We can use it to give proper names to both axes, as well as to set a title for the plot.

Then, a loop is created using for(). For each of the groups, a line will be plotted. In order to do so, a function is created based on the intercept and the slopes extracted earlier. Then, this created function (called ‘expression’) is entered into the curve()-function. This curve()-function draws a graph based on a function (for instance, the one we just created) from a starting point to an end-point that both are specified. This process is repeated for every group in the model, resulting in a basic graph based on our multilevel, random intercept and random slope regression model.

Finnally, as well as for convenience, the complete syntax of this function will be shown below, as well as the result when used on the model we estimated at the start of this paragraph.

visualize.lme <- function (model, coefficient, group, ...)
{
r <- ranef(model)
f <- fixef(model)

effects <- data.frame(r[,1]+f[1], r[,2]+f[2])

number.lines <- nrow(effects)

predictor.min <- tapply(model$data[[coefficient]], model$data[[group]], min)
predictor.max <- tapply(model$data[[coefficient]], model$data[[group]], max)

outcome.min <- min(predict(model))
outcome.max <- max(predict(model))

plot (c(min(predictor.min),max(predictor.max)),c(outcome.min,outcome.max), type="n", ...)

for (i in 1:number.lines)
{
expression <- function(x) {effects[i,1] + (effects[i,2] * x) }
curve(expression, from=predictor.min[i], to=predictor.max[i], add=TRUE)
}
}

visualize.lme(model.01, "standLRT", "school", xlab="Student test at school-entry", ylab="Result on Exam", main="Exam results for 65 schools")

Multilevel Plot

– – — — —– ——–

– – — — —– ——–
R-Sessions is a collection of manual chapters for R-Project, which are maintained on Curving Normality. All posts are linked to the chapters from the R-Project manual on this site. The manual is free to use, for it is paid by the advertisements, but please refer to it in your work inspired by it. Feedback and topic requests are highly appreciated.
——– —– — — – –

3 comment on “R-Sessions 20: Plotting Multilevel Models

  • I can’t seem to get this code to work.

    I would really like to use the lmer function instead of lme.

    There seems to be a problem with the expression function near the bottom. The code “curve(expression,” does not send anything to the expression function. It also crashed when I didn’t have $school after the ‘r’ in the dataframe section. And, finally, the font selected causes a problem with R because of the smart quotes.

    But otherwise this is helpful.

  • Hi Rense,

    As I’ve already told you, thanks a lot for your R sessions. Really nice!

    Here is my proposition of the same function but for lmer.
    (note that it assumes that the “coefficient variable” is always in second position in the formula, i.e. after the tilde ~ , and the group variable in last position… cf. objects a and b in function code!)

    I hope it helps.

    Enjoy,

    Julie

    library(mlmRev)
    data(Exam)
    library(lme4)

    visualize.lmer <- function (model, coefficient, group,…)
    {
    r <- ranef(model)
    f <- fixef(model)
    attributes(r)[[1]] <- "r1"
    r1 <- ((r)$r1)[,1]
    attributes(r)[[1]] <- "r2"
    r2 <- ((r)$r2)[,2]
    effects <- data.frame(r1+f[1], r2+f[2])
    number.lines <- nrow(effects)
    a<- unlist((attributes(model)$frame[2]))
    n <- ncol(attributes(model)$frame)
    b<- unlist((attributes(model)$frame[n]))
    predictor.min <- tapply(a,b,min)
    predictor.max <- tapply(a,b,max)
    outcome.min <- min(attributes(model)$eta)
    outcome.max <- max(attributes(model)$eta)
    plot (c(min(predictor.min),max(predictor.max)),c(outcome.min,outcome.max),
    type="n",…)
    for (i in 1:number.lines)
    {
    expression <- function(x) {effects[i,1] + (effects[i,2] * x) }
    curve(expression, from=predictor.min[i], to=predictor.max[i], add=TRUE)
    }
    }

    par(mar=c(2.3, 2.0, 2.5, 0.3), bg="white", las=2)
    layout(matrix(c(1,2), nrow=1, ncol=2, byrow=T), heights=c(3,3), widths=c(3,3))

    model.b <- lmer (normexam ~ standLRT + schavg + (1 + standLRT|school), data=Exam)
    summary(model.b)

    visualize.lmer(model.b, "standLRT", "school",xlab="Student test at school-entry", ylab="Result on Exam", main="Exam results for 65 schools")

    model.01 <-lme(fixed=normexam ~standLRT + schavg, data = Exam, random=~standLRT|school)
    summary(model.01)

    visualize.lme(model.01, "standLRT", "school", xlab="Student test at school-entry", ylab="Result on Exam", main="Exam results for 65 schools")

  • Hello Rense,
    This is a very nice example of how to visualize models with nlme, but I am having trouble manipulating it piece by piece, probably because I am not very good with the function() part. I have tried to manipulate the script to fit by own data by first reproducing your work. If I essentially copy/paste and run the script, everything works with the dataset (Exam) and model provided. However, when going through, I keep running into errors that certain objects are not able to be found. For example:

    object ‘group’ not found

    For example, I have tried many variations of the following line to reproduce the exam data, bu nothing works.

    predictor.min <- tapply(model$data[[coefficient]], model$data[[group]], min)

    Changed into:
    tapply(model.01$Exam[[standLRT]], model.01$Exam[[school]], min)

    I have tried many variations of this, ie 1) not changing anything, removing $, or removing $Exam, keeping the words coefficients and group etc.

    Could you maybe provide an example without the function part that allows one to go step by step or alternatively, provide a bit more guidance as to what information in teh script needs to be changed to modify it to other data sets? Much appreciated!!

Leave a Reply to Kate Cancel reply