Suite of tools extending the dplyr package to perform data manipulation. These tools are geared towards use with panel data and hierarchical data.
Unlike other suites dealing with panel data, all functions in pmdplyr are designed to work even when considering a set of variables that do not uniquely identify rows. This is handy when working with any kind of hierarchical data, or panel data where there are multiple observations per individual per time period, like student/term/class education data.
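To make "variables that do not uniquely identify rows" concrete, here is a minimal base R sketch. The enrollment data frame and its column names are hypothetical and made up for illustration; no pmdplyr functions are called.

```r
# Hypothetical student/term/class data: one row per class taken.
enrollment <- data.frame(
  student = c(1, 1, 1, 2, 2),
  term    = c(1, 1, 2, 1, 2),
  class   = c("math", "art", "math", "math", "art"),
  grade   = c(3.0, 4.0, 3.5, 2.0, 3.0)
)

# anyDuplicated() returns 0 when every row is distinct, otherwise the
# index of the first repeated row. Student 1 took two classes in term 1,
# so (student, term) alone does not uniquely identify rows.
has_duplicates <- anyDuplicated(enrollment[, c("student", "term")]) > 0

# Adding class to the key restores uniqueness in this example.
unique_with_class <-
  anyDuplicated(enrollment[, c("student", "term", "class")]) == 0

print(has_duplicates)
print(unique_with_class)
```

In data shaped like this, a tool that insists on one row per individual per period would reject the (student, term) key, which is the situation the paragraph above describes.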
pmdplyr contains the following functions:
between_i and within_i
Standard between and within panel calculations.
fixed_check
Checks a list of variables for consistency within a panel structure.
fixed_force
Forces a list of variables to be constant within a panel structure.
id_variable
Takes a list of variables that make up an individual identifier and turns it into a single variable.
time_variable
Takes a time variable, or set of time variables, and turns them into a single well-behaved integer time variable of the kind required by most panel functions.
inexact_join
Wrapper for the dplyr join functions which allows for a variable to be matched inexactly, for example joining a time variable in x to the most recent previous value in y.
safe_join
Set of wrappers for the dplyr::join and pmdplyr::inexact_join functions that check, before merging, whether each data set is uniquely identified as expected.
pibble, as_pibble, and is_pibble
Set the panel structure for a data set, or check if it is already set.
panel_convert
Converts between the panel data types pmdplyr::pibble, tsibble::tsibble, plm::pdata.frame, and panelr::panel_data.
mutate_cascade
A wrapper for dplyr mutate which runs one period at a time, allowing changes in one period to finalize before the next period is calculated.
mutate_subset
A wrapper for dplyr mutate that performs a calculation on a subset of data, and then applies the result to all the observations (within group).
panel_fill
Fills in gaps in the panel. Can also fill in at the beginning or end of the data to create a perfectly balanced panel.
panel_locf
A last-observation-carried-forward function for panels. Fills in NAs with recent nonmissing observations.
tlag
Lags a variable in time.
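As a rough illustration of three of the ideas in the list above (within and between calculations, and a lag in time), the following base R sketch works through them on a tiny panel. It deliberately avoids calling pmdplyr, since the function signatures are not given here. The data and column names are hypothetical, the between calculation uses one common textbook definition (individual mean minus grand mean), and pmdplyr's own definitions and handling of edge cases may differ.

```r
# Hypothetical panel: two individuals observed over three periods.
panel <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2),
  time = c(1, 2, 3, 1, 2, 3),
  x    = c(1, 2, 3, 10, 20, 30)
)

# Mean of x for each individual, repeated on every row of that individual.
group_mean <- ave(panel$x, panel$id, FUN = mean)

# Within calculation: deviation from the individual's own mean.
panel$x_within <- panel$x - group_mean

# Between calculation (assumed definition): individual mean minus grand mean.
panel$x_between <- group_mean - mean(panel$x)

# Lag in time: look up the same individual's value at time - 1.
# Matching on the time value rather than on row position leaves NA
# wherever the previous period is not in the data.
# Note: match() returns the first hit, so this sketch assumes that
# (id, time) uniquely identifies rows.
prev_row <- match(
  paste(panel$id, panel$time - 1),
  paste(panel$id, panel$time)
)
panel$x_lag <- panel$x[prev_row]

print(panel)
```

The lag is the part most sensitive to the panel structure: shifting by row position would give wrong answers whenever a period is missing or rows are unsorted, which is why the lookup is keyed on the time value.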