R/major_mutate_variations.R
mutate_subset.Rd

This function runs dplyr::summarize on the subset of the data selected by .filter. It then applies the result to all observations (or to all observations in the group, if the data is grouped), filling in columns of the data with the summarize results, as though dplyr::mutate had been run.
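The summarize-then-fill behavior described above can be sketched in plain dplyr. This is only an illustration of the idea, not pmdplyr's actual implementation; the `tickets` data, its column names, and the choice of `left_join()` for the fill step are assumptions made for the example.

```r
library(dplyr)

# Hypothetical toy data: two routes with several fare types each
tickets <- tibble(
  route = c("A", "A", "A", "B", "B"),
  fare  = c("Promo", "Flexible", "Promo", "Promo", "Flexible"),
  price = c(10, 30, 20, 40, 60)
)

# Step 1: summarize only the rows that pass the filter, within each group
promo <- tickets %>%
  filter(fare == "Promo") %>%
  group_by(route) %>%
  summarize(promo_price = mean(price, na.rm = TRUE), .groups = "drop")

# Step 2: attach the group-level result to every row of the original data,
# including rows that did not pass the filter
result <- tickets %>%
  left_join(promo, by = "route")
```

Every row in `result` carries the `promo_price` of its route, which is the effect mutate_subset() aims to produce in a single call.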
mutate_subset( .df, ..., .filter, .group_i = TRUE, .i = NULL, .t = NULL, .d = NA, .uniqcheck = FALSE, .setpanel = TRUE )
| Argument | Description |
|---|---|
| .df | Data frame or tibble. |
| ... | Specification to be passed to dplyr::summarize. |
| .filter | Unquoted logical condition selecting the observations that dplyr::summarize should be run on. |
| .group_i | By default, if |
| .i | Quoted or unquoted variables that identify the individual cases. Note that setting any one of |
| .t | Quoted or unquoted variable indicating the time. |
| .d | Number indicating the gap in |
| .uniqcheck | Logical parameter. Set to TRUE to always check whether |
| .setpanel | Logical parameter. |
One application of this is to partially widen data. For example, if your analysis uses childhood height as a control variable in all years, mutate_subset() can generate a height_age10 variable from a height variable, so that the age-10 value appears on every row for each person.
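A minimal sketch of that height example follows. The `kids` data frame and its columns (`id`, `age`, `height`) are hypothetical, and the call relies only on the arguments shown in the usage above, so treat it as an illustration rather than a tested recipe.

```r
library(dplyr)
library(pmdplyr)

# Hypothetical panel: two children observed at ages 9, 10, and 11
kids <- tibble(
  id     = c(1, 1, 1, 2, 2, 2),
  age    = c(9, 10, 11, 9, 10, 11),
  height = c(130, 135, 141, 128, 134, 139)
)

# Take each child's height in the age-10 row and copy it
# onto all of that child's rows as height_age10
kids <- kids %>%
  mutate_subset(
    height_age10 = dplyr::first(height),
    .filter = age == 10,
    .i = id
  )
```

The data stays in long format, with one row per child and age, while gaining a single wide-style column.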
```r
library(dplyr)    # provides the %>% pipe
library(pmdplyr)  # provides mutate_subset() and the SPrail data

data(SPrail)

# In preparation for fitting a choice model for how people choose ticket type,
# I'd like to know the price of a "Promo" ticket for a given route
# so that I can compare the price of each other ticket type to it.
SPrail <- SPrail %>%
  mutate_subset(
    promo_price = mean(price, na.rm = TRUE),
    .filter = fare == "Promo",
    .i = c(origin, destination)
  )
```