1 year ago

#356823

test-img

MeowMix

Resetting a group's cumsum when a condition is met

First time poster here, so apologies if I get something wrong.

I have a data table that looks like this currently:

dt_achieved<-data.table(last_nm=c("raus","stroper", "degroat","degroat","degroat","degroat","degroat","degroat","piya","mazzy","mazzy","mazzy"),intake_date=c("2021-03-04","2021-06-18","2021-04-14","2021-06-10","2021-07-08","2021-08-09","2021-11-09","2021-12-08","2021-09-16","2021-04-15","2021-08-02","2021-08-09"))

dt_achieved$intake_date<-as.Date(dt_achieved$intake_date)

I would like it to look like this, where the data are grouped by last_nm and then, if a row's intake date is at least 90 days after the group's initial one, it's flagged/its count increases by one on a rolling basis, within group.

dt_ideal<-data.table(last_nm=c("raus","stroper", "degroat","degroat","degroat","degroat","degroat","degroat","piya",
           "mazzy","mazzy","mazzy"),intake_date=c("2021-03-04","2021-06-18","2021-04-14","2021-06-10","2021-07-08","2021-08-09","2021-11-09","2021-12-08","2021-09-16","2021-04-15","2021-08-02","2021-08-09"),intake_round=c(1,1,1,1,1,2,3,3,1,1,2,2), Running_intake=c(1,1,1,2,3,4,5,6,1,1,2,3))

dt_ideal$intake_date<-as.Date(dt_ideal$intake_date)

I have gotten this far:

# order by last name and intake date (the real data is randomized)
setkeyv(dt_achieved, c("last_nm", "intake_date"))

dt_achieved[, intake_round := cumsum(c(TRUE, diff(as.Date(intake_date)) >= 90)), 
    .(last_nm)][, Running_intake := as.numeric(seq_len(.N)), .(last_nm)]

The issue is that the intake_round column currently reflects the difference from the date above rather than looking to the group's initial intake_date to see if it's at least 90 days later. ...I just can't figure out the script to get the cumsum(c(TRUE, ...) function to do this on a rolling basis, within group.

Have consulted a number of questions including:

incremental counter within dataframe only when a condition is met in r

Resetting the cumulative sum when a condition is met in R

Split into groups based on (multiple) conditions?

Cumulative sum that resets when the condition is no longer met

Please, any suggestions would be greatly, greatly appreciated!

r

group-by

conditional-statements

reset

cumsum

0 Answers

Your Answer

Accepted video resources