Although most introductions to regression assume data that is normally distributed, in practice this is often not the case. Many other types of distributions exist, such as the binomial, Poisson and gamma distributions. The lmer() function in the lme4 package can estimate models based on these distributions. This is done by adding the ‘family’ argument to the command syntax, which specifies that a generalized linear multilevel model is to be estimated rather than a linear one.
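As a small illustration of what the ‘family’ argument carries, the sketch below inspects a few of the family objects that ship with base R. It does not fit a model; it only shows that each family pairs a distribution with a default link function. The variable names are chosen here for illustration.

```r
# Family objects from base R's stats package; each pairs a distribution
# with a default link function.
fam_gaussian <- gaussian()
fam_binomial <- binomial()
fam_poisson  <- poisson()

# Collect the default links side by side for comparison.
default_links <- c(gaussian = fam_gaussian$link,
                   binomial = fam_binomial$link,
                   poisson  = fam_poisson$link)
print(default_links)
```

The binomial family defaults to the logit link, which is the one used for the logistic model in the next section.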
Logistic Multilevel Regression
Let us say we want to estimate the chance that a student in a specific school succeeds on a test. To do so, we can use the Exam data set in the mlmRev package, which contains standardized test scores. Here, we’ll define success on the test as having a standardized score of 0 or larger. This is recoded to a 0-1 variable below, using the ifelse() function. The recoding is then checked using summary(). The needed packages are loaded first, using the library() function.
library(lme4)
library(mlmRev)
names(Exam)
Exam$success <- ifelse(Exam$normexam >= 0, 1, 0)
summary(Exam$normexam)
summary(Exam$success)
> library(lme4)
Loading required package: Matrix
Loading required package: lattice
> library(mlmRev)
> names(Exam)
 [1] "school"   "normexam" "schgend"  "schavg"   "vr"       "intake"
 [7] "standLRT" "sex"      "type"     "student"
>
> Exam$success <- ifelse(Exam$normexam >= 0,1,0)
> summary(Exam$normexam)
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-3.6660000 -0.6995000  0.0043220 -0.0001138  0.6788000  3.6660000
> summary(Exam$success)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.0000  0.0000  1.0000  0.5122  1.0000  1.0000
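Because success is coded 0-1, its mean is simply the share of students who reached a score of 0 or larger, which is how the 0.5122 in the summary can be read. A minimal sketch of the same recoding on a made-up vector of scores (the values are hypothetical and not taken from Exam):

```r
# Hypothetical standardized scores, for illustration only.
scores <- c(-1.2, -0.3, 0, 0.7, 1.5)

# Same rule as in the text: a score of 0 or larger counts as success.
success <- ifelse(scores >= 0, 1, 0)

print(success)        # 0 0 1 1 1
print(mean(success))  # 0.6, the proportion of successes
```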
To model the binary ‘success’ variable created above, a logistic regression model needs to be estimated. This is done by specifying the binomial family with the logit as link function, using family = binomial(link = "logit"). The rest of the specification is exactly the same as for a normal linear multilevel regression model using the lmer() function. In current releases of lme4, generalized models are fitted with glmer(), which takes the same formula and family arguments; calling lmer() with a family argument is deprecated there.
lmer(success ~ schavg + (1|school), data=Exam, family=binomial(link = "logit"))
> lmer(success~ schavg + (1|school),
+ data=Exam,
+ family=binomial(link = "logit"))
Generalized linear mixed model fit using Laplace
Formula: success ~ schavg + (1 | school)
   Data: Exam
 Family: binomial(logit link)
  AIC  BIC logLik deviance
 5323 5342  -2658     5317
Random effects:
 Groups Name        Variance Std.Dev.
 school (Intercept) 0.23113  0.48076
number of obs: 4059, groups: school, 65

Estimated scale (compare to  1 )  0.9909287

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.08605    0.07009   1.228    0.220
schavg       1.60548    0.21374   7.511 5.86e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
       (Intr)
schavg 0.072
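The fixed effects in this output are on the logit (log-odds) scale. One way to read them, sketched here under the assumption that the random school effect is set to zero (that is, for a typical school), is to pass the linear predictor through the inverse logit, which is plogis() in base R. The coefficients are copied by hand from the output above, and the schavg values are picked purely for illustration.

```r
# Fixed effects copied from the model output above (log-odds scale).
b_intercept <- 0.08605
b_schavg    <- 1.60548

# Illustrative values of schavg; chosen by hand, not taken from the data.
schavg <- c(-0.5, 0, 0.5)

# Linear predictor for a school whose random intercept is zero.
eta <- b_intercept + b_schavg * schavg

# The inverse logit turns log-odds into probabilities of success.
p_success <- plogis(eta)
print(round(p_success, 3))

# exp() of a coefficient gives the odds ratio per unit increase in schavg.
odds_ratio <- exp(b_schavg)
print(round(odds_ratio, 2))
```

Under these assumptions, a student in a school with schavg of 0 has a predicted chance of success of roughly 0.52, and the chance rises with the school average.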
R-Sessions is a collection of manual chapters for R-Project, which are maintained on Curving Normality. All posts are linked to the chapters from the R-Project manual on this site. The manual is free to use, for it is paid by the advertisements, but please refer to it in your work inspired by it. Feedback and topic requests are highly appreciated.