1 year ago
#190372
VLarsen
How do I use the value of the previous year-week as the comparison in a negative binomial regression in R?
I have a dataset of weekly counts from 2019 to 2021. I want to compare the count for each week in 2020 with the count for the same week in 2019, and likewise compare each week in 2021 with the same week in 2019. The data look like this:
set.seed(123)
df <- data.frame(count = sample(1:300, 156, replace = TRUE),
                 week = rep(seq(1, 52, by = 1), 3),
                 year = rep(2019:2021, each = 52))
In my real data, there is significant overdispersion, so I figured a negative binomial model may be best suited. I have run the following:
library(MASS)
nb <- glm.nb(count ~ factor(year)+factor(week), data = df)
summary(nb)
Call:
glm.nb(formula = count ~ factor(year) + factor(week), data = df,
init.theta = 2.193368056, link = log)
Deviance Residuals:
Min 1Q Median 3Q Max
-3.5012 -0.7180 -0.0207 0.4896 1.6763
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 5.180832 0.399664 12.963 < 2e-16 ***
factor(year)2020 0.077185 0.133587 0.578 0.563404
factor(year)2021 0.037590 0.133609 0.281 0.778448
factor(week)2 -0.642313 0.556042 -1.155 0.248028
factor(week)3 0.084091 0.554446 0.152 0.879451
factor(week)4 0.228528 0.554245 0.412 0.680103
factor(week)5 -0.275901 0.555095 -0.497 0.619165
factor(week)6 0.181420 0.554308 0.327 0.743448
factor(week)7 -0.331875 0.555218 -0.598 0.550015
factor(week)8 -0.018194 0.554608 -0.033 0.973829
factor(week)9 -0.093659 0.554737 -0.169 0.865926
factor(week)10 -0.260570 0.555062 -0.469 0.638753
factor(week)11 -0.165835 0.554871 -0.299 0.765038
factor(week)12 -0.003480 0.554583 -0.006 0.994993
factor(week)13 0.045328 0.554506 0.082 0.934850
factor(week)14 -0.420895 0.555429 -0.758 0.448581
factor(week)15 -0.288260 0.555121 -0.519 0.603570
factor(week)16 -1.719551 0.561984 -3.060 0.002215 **
factor(week)17 -0.339217 0.555235 -0.611 0.541237
factor(week)18 -0.770541 0.556464 -1.385 0.166141
factor(week)19 -0.088333 0.554728 -0.159 0.873483
factor(week)20 -0.595712 0.555901 -1.072 0.283893
factor(week)21 -2.010330 0.565001 -3.558 0.000374 ***
factor(week)22 -0.075819 0.554706 -0.137 0.891282
factor(week)23 0.298783 0.554157 0.539 0.589772
factor(week)24 0.114664 0.554401 0.207 0.836147
factor(week)25 0.089396 0.554439 0.161 0.871907
factor(week)26 -0.396060 0.555368 -0.713 0.475754
factor(week)27 -0.261789 0.555065 -0.472 0.637186
factor(week)28 -0.090157 0.554731 -0.163 0.870894
factor(week)29 0.210589 0.554269 0.380 0.703990
factor(week)30 -0.537967 0.555736 -0.968 0.333032
factor(week)31 -0.401567 0.555381 -0.723 0.469651
factor(week)32 0.108651 0.554410 0.196 0.844630
factor(week)33 -0.732234 0.556332 -1.316 0.188113
factor(week)34 -0.589688 0.555884 -1.061 0.288775
factor(week)35 -0.437695 0.555471 -0.788 0.430714
factor(week)36 -0.402218 0.555383 -0.724 0.468933
factor(week)37 -0.076802 0.554708 -0.138 0.889881
factor(week)38 -0.151350 0.554844 -0.273 0.785022
factor(week)39 0.272593 0.554189 0.492 0.622806
factor(week)40 -0.119806 0.554785 -0.216 0.829027
factor(week)41 -1.184984 0.558260 -2.123 0.033784 *
factor(week)42 -0.153762 0.554848 -0.277 0.781685
factor(week)43 -0.068443 0.554693 -0.123 0.901799
factor(week)44 -0.721053 0.556294 -1.296 0.194916
factor(week)45 0.102378 0.554419 0.185 0.853497
factor(week)46 -0.009142 0.554593 -0.016 0.986848
factor(week)47 -0.284169 0.555112 -0.512 0.608712
factor(week)48 -0.133066 0.554809 -0.240 0.810454
factor(week)49 -0.705118 0.556242 -1.268 0.204924
factor(week)50 -0.080921 0.554715 -0.146 0.884017
factor(week)51 0.152016 0.554348 0.274 0.783912
factor(week)52 -0.503605 0.555642 -0.906 0.364752
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Negative Binomial(2.1934) family taken to be 1)
Null deviance: 221.83 on 155 degrees of freedom
Residual deviance: 169.71 on 102 degrees of freedom
AIC: 1923.7
Number of Fisher Scoring iterations: 1
Theta: 2.193
Std. Err.: 0.243
2 x log-likelihood: -1813.701
The reference category for factor(year) is 2019, which is what I want it to be. However, I am struggling with the interpretation of the coefficients (and also IRR) for week, with week 1 being the reference category.
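For reference, the coefficients above are on the log scale, so exponentiating them gives the IRRs. The printed estimate 0.077185 for factor(year)2020 corresponds to exp(0.077185) ≈ 1.08, which is 2020 relative to 2019 and is assumed common to all weeks because the model has no interaction. Each factor(week) term is that week relative to week 1, pooled over all three years. A minimal sketch of pulling these out of the model, assuming the simulated df from above (the name irr and the choice of Wald intervals via confint.default are illustrative, not part of the original code):

```r
library(MASS)

# Same simulated data and model as in the question.
set.seed(123)
df <- data.frame(count = sample(1:300, 156, replace = TRUE),
                 week = rep(seq(1, 52, by = 1), 3),
                 year = rep(2019:2021, each = 52))
nb <- glm.nb(count ~ factor(year) + factor(week), data = df)

# Exponentiate the log-scale coefficients to get incidence rate ratios,
# with Wald 95% intervals (an assumption; profile intervals are another option).
irr <- cbind(IRR = exp(coef(nb)), exp(confint.default(nb)))

# Year effects: each year vs 2019, the same ratio for every week.
print(irr[c("factor(year)2020", "factor(year)2021"), ])

# Week effects: each week vs week 1, the same ratio for every year.
print(head(irr[grep("^factor\\(week\\)", rownames(irr)), ]))
```

So this additive model yields one IRR per year rather than one per week and year.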
Is there a better way to do this, in order to achieve the week/year comparison? My main goal is to plot weekly IRR for 2020 and 2021, relative to 2019.
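One thing to keep in mind for the week-by-year comparison: with a single observation per week and year, a factor(week) * factor(year) interaction would be saturated, so the week-specific "IRR" it implies is just the raw ratio of the two counts, with no information left to estimate uncertainty. A rough sketch of that descriptive comparison, assuming the simulated df from above (the names ref, cmp, count_2019 and ratio are hypothetical, and this is a plain ratio plot rather than a fitted model):

```r
# Same simulated data as in the question.
set.seed(123)
df <- data.frame(count = sample(1:300, 156, replace = TRUE),
                 week = rep(seq(1, 52, by = 1), 3),
                 year = rep(2019:2021, each = 52))

# Attach the 2019 count for the same week to every 2020 and 2021 row.
ref <- df[df$year == 2019, c("week", "count")]
names(ref)[2] <- "count_2019"
cmp <- merge(df[df$year != 2019, ], ref, by = "week")

# Week-specific ratio relative to 2019.
# Note: a zero count in 2019 would give Inf here; the simulated counts are >= 1.
cmp$ratio <- cmp$count / cmp$count_2019

# Plot both years on a log axis so ratios above and below 1 are symmetric.
plot(ratio ~ week, data = cmp[cmp$year == 2020, ], type = "b", log = "y",
     ylim = range(cmp$ratio), xlab = "Week", ylab = "Count ratio vs 2019")
lines(ratio ~ week, data = cmp[cmp$year == 2021, ], type = "b", lty = 2, pch = 2)
abline(h = 1, col = "grey")
legend("topright", legend = c("2020", "2021"), lty = 1:2, pch = 1:2)
```

Getting model-based weekly IRRs with intervals would need either replication within each week and year (for example, several sites) or some smoothing over adjacent weeks.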
r
regression
glm
mass