A suite of tools that extends the dplyr package for data manipulation, geared towards panel data and hierarchical data.

Details

Unlike other suites dealing with panel data, all functions in pmdplyr are designed to work even when considering a set of variables that do not uniquely identify rows. This is handy when working with any kind of hierarchical data, or panel data where there are multiple observations per individual per time period, like student/term/class education data.
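To make the point about non-unique identifiers concrete, the sketch below builds a small student/term/class data set and checks whether student and term together identify a single row. It uses only dplyr; the data set `enrollment` and its column names are hypothetical and chosen purely for illustration.

```r
library(dplyr)

# Hypothetical student/term/class data: one row per class taken,
# so a student can appear several times within the same term.
enrollment <- tibble(
  student = c("A", "A", "A", "B", "B"),
  term    = c(1, 1, 2, 1, 1),
  class   = c("Math", "Art", "Math", "Math", "Chem"),
  grade   = c(3.7, 3.0, 4.0, 2.3, 3.3)
)

# Count rows per student/term pair. Any count above 1 means that
# student and term together do not uniquely identify rows.
rows_per_cell <- enrollment %>%
  count(student, term, name = "n_rows")

# TRUE when at least one student/term pair covers several rows
has_duplicates <- any(rows_per_cell$n_rows > 1)
```

Data shaped like this is the case described above: a panel in which the individual and time variables alone are not a unique key.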

pmdplyr contains the following functions:

  • between_i and within_i: Standard between and within panel calculations.

  • fixed_check: Checks a list of variables for consistency within a panel structure.

  • fixed_force: Forces a list of variables to be constant within a panel structure.

  • id_variable: Takes a list of variables that make up an individual identifier and turns it into a single variable.

  • time_variable: Takes a time variable, or set of time variables, and turns them into a single well-behaved integer time variable of the kind required by most panel functions.

  • inexact_join: Wrapper for the dplyr join functions that allows a variable to be matched inexactly, for example joining a time variable in x to the most recent previous value in y.

  • safe_join: Set of wrappers for the dplyr join functions and the pmdplyr inexact_join functions that check, before merging, whether each data set is uniquely identified as expected.

  • pibble, as_pibble, and is_pibble: Set the panel structure for a data set, or check whether it is already set.

  • panel_convert: Converts between the panel data types pmdplyr::pibble, tsibble::tsibble, plm::pdata.frame, and panelr::panel_data.

  • mutate_cascade: A wrapper for dplyr mutate that runs one period at a time, allowing changes in one period to finalize before the next period is calculated.

  • mutate_subset: A wrapper for dplyr mutate that performs a calculation on a subset of the data and then applies the result to all observations (within group).

  • panel_fill: Fills in gaps in the panel. Can also fill in at the beginning or end of the data to create a perfectly balanced panel.

  • panel_locf: A last-observation-carried-forward function for panels. Fills in NAs with the most recent nonmissing observation.

  • tlag: Lags a variable in time.
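As a rough illustration of how a few of these functions might be combined, the sketch below declares a panel structure with as_pibble, then uses within_i and tlag inside a dplyr mutate call. It assumes that dplyr and pmdplyr are installed and that the panel identifiers are passed through the `.i` (individual) and `.t` (time) arguments. The data set `scores` and its columns are made up for the example. The calls sit inside a magrittr `%>%` pipeline on the assumption that within_i and tlag pick up the piped data from it.

```r
library(dplyr)
library(pmdplyr)

# Hypothetical panel: one GPA per student per term.
# Student 2 has no observation in term 2, so the panel has a gap.
scores <- tibble(
  student = c(1, 1, 1, 2, 2),
  term    = c(1, 2, 3, 1, 3),
  gpa     = c(3.0, 3.4, 3.8, 2.5, 2.9)
)

# Declare the panel structure and confirm that it is set.
scores_p <- as_pibble(scores, .i = student, .t = term)
is_pibble(scores_p)

# Within-student deviation from the student's own mean GPA,
# and the GPA from one term earlier.
result <- scores %>%
  mutate(
    gpa_within    = within_i(gpa, .i = student),
    gpa_last_term = tlag(gpa, .i = student, .t = term)
  )
```

Because tlag works with the time variable rather than with row position, the lag for student 2 in term 3 is expected to be missing here, since that student has no term 2 observation to draw on.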