This function takes two variables of equal length, the first of which is a categorical variable, and performs a test of independence between them. It returns a character string containing the results of that test, suitable for placing in a table.

independence.test(
  x,
  y,
  w = NA,
  factor.test = NA,
  numeric.test = NA,
  star.cutoffs = c(0.01, 0.05, 0.1),
  star.markers = c("***", "**", "*"),
  digits = 3,
  fixed.digits = FALSE,
  format = "{name}={stat}{stars}",
  opts = list()
)

Arguments

x

A categorical variable.

y

A variable to test for independence with x. This can be a factor or numeric variable. If you want a numeric variable treated as categorical, convert to a factor first.

w

A vector of weights to pass to the appropriate test.

factor.test

Used when y is a factor. A function that takes x and y as its first arguments and returns a list with three elements: (1) the name of the test for printing, (2) the test statistic, and (3) the p-value. Defaults to a Chi-squared test if there are no weights, or a design-based F statistic (Rao & Scott adjustment, see survey::svychisq) with weights, which requires that the survey package be installed. WARNING: the Chi-squared test's assumptions fail with small sample sizes. This function will be attempted for all non-numeric y.

numeric.test

Used when y is numeric. A function that takes x and y as its first arguments and returns a list with three elements: (1) the name of the test for printing, (2) the test statistic, and (3) the p-value. Defaults to a group differences F test. If you only have two groups and would prefer an absolute t-statistic to an F-statistic, pass vtable:::groupt.it.
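
As a sketch of the interface described for factor.test and numeric.test, a custom test can be written as a function of x and y that returns a three-element list. The helper name kw.it is hypothetical, the Kruskal-Wallis test is only an illustration, and the extra w and ... arguments are an assumption so that the function tolerates additional arguments (such as weights) being passed to it.

```r
# Hypothetical custom test for numeric y: Kruskal-Wallis rank-sum test.
# Returns (1) a name for printing, (2) the test statistic, (3) the p-value.
# w and ... are assumptions: they absorb extra arguments and are ignored here.
kw.it <- function(x, y, w = NA, ...) {
  result <- stats::kruskal.test(y, as.factor(x))
  list("KW", unname(result$statistic), result$p.value)
}

data(mtcars)
res <- kw.it(mtcars$cyl, mtcars$mpg)
str(res)

# If vtable is installed, use it in place of the default F test
if (requireNamespace("vtable", quietly = TRUE)) {
  print(vtable::independence.test(mtcars$cyl, mtcars$mpg, numeric.test = kw.it))
}
```

The list is unnamed because the description above identifies the elements only by position.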

star.cutoffs

A numeric vector indicating the p-value cutoffs to use for reporting significance stars. Defaults to c(.01,.05,.1). If you don't want stars, remove them from the format argument.

star.markers

A character vector indicating the symbols to use to indicate significance cutoffs associated with star.cutoffs. Defaults to c('***','**','*'). If you don't want stars, remove them from the format argument.
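
To picture how star.cutoffs and star.markers pair up, here is a rough base-R illustration. It is not the package's internal code: the helper stars.for is hypothetical, and it assumes the cutoffs are given in ascending order and that a marker applies when the p-value is strictly below its cutoff.

```r
# Hypothetical helper: pick the marker for the smallest cutoff that p falls below.
# Assumes cutoffs are sorted ascending and line up one-to-one with markers.
stars.for <- function(p,
                      cutoffs = c(0.01, 0.05, 0.1),
                      markers = c("***", "**", "*")) {
  hit <- which(p < cutoffs)
  if (length(hit) == 0) return("")
  markers[min(hit)]
}

stars.for(0.001)  # below all three cutoffs
stars.for(0.03)   # below 0.05 and 0.1 only
stars.for(0.2)    # below none, so no marker

# If vtable is installed, stricter cutoffs can be passed directly
if (requireNamespace("vtable", quietly = TRUE)) {
  data(mtcars)
  print(vtable::independence.test(mtcars$cyl, mtcars$mpg,
                                  star.cutoffs = c(0.001, 0.01, 0.05),
                                  star.markers = c("***", "**", "*")))
}
```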

digits

Number of digits after the decimal to round the test statistic and p-value to.

fixed.digits

FALSE will cut off trailing 0s when rounding. TRUE retains them. Defaults to FALSE.

format

The way in which the four elements returned by (or calculated after) the test - {name}, {stat}, {pval}, and {stars} - will be arranged in the string output. Note that the default '{name}={stat}{stars}' does not contain the p-value, and also does not contain superscript for the stars since it doesn't know what markup language you're aiming for. For LaTeX you may prefer '{name}$={stat}^{{stars}}$', and for HTML '{name}={stat}<sup>{stars}</sup>'.
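
A short sketch of trying several format strings, assuming vtable is installed. The LaTeX and HTML strings are the ones suggested above; the with.p string is an illustrative assumption showing one way to include {pval}. Expected output is not shown because it depends on the rounding options in use.

```r
data(mtcars)

# Candidate format strings; the vector names are labels for this example only
formats <- c(
  default = "{name}={stat}{stars}",
  with.p  = "{name}={stat} (p={pval})",
  latex   = "{name}$={stat}^{{stars}}$",
  html    = "{name}={stat}<sup>{stars}</sup>"
)

# Run the same test once per format string and print each result
if (requireNamespace("vtable", quietly = TRUE)) {
  for (f in formats) {
    print(vtable::independence.test(mtcars$cyl, mtcars$mpg, format = f))
  }
}
```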

opts

The options listed above, entered in named-list format.
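
As an illustration, the same settings can be supplied either as individual arguments or bundled in opts. This sketch assumes vtable is installed and that, per the description above, the two calls are equivalent; the name my.opts is made up for the example.

```r
data(mtcars)

# Options collected in a named list; names match the arguments listed above
my.opts <- list(digits = 2, fixed.digits = TRUE)

if (requireNamespace("vtable", quietly = TRUE)) {
  # Settings passed as individual arguments
  print(vtable::independence.test(mtcars$cyl, mtcars$mpg,
                                  digits = 2, fixed.digits = TRUE))
  # The same settings passed through opts
  print(vtable::independence.test(mtcars$cyl, mtcars$mpg, opts = my.opts))
}
```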

Details

To allow (and perhaps encourage) the use of this function in weird ways, and because it's not really expected to be used directly, input is not sanitized. Have fun!

Examples


data(mtcars)
independence.test(mtcars$cyl,mtcars$mpg)
#> [1] "F=39.698***"
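
A further sketch covering the categorical case: as noted for y, a numeric variable that should be treated as categorical is converted to a factor first. This assumes vtable is installed. With only 32 rows, some cells of the cross-tabulation are small, so the caution about the Chi-squared test and small samples applies here.

```r
data(mtcars)

# am is stored as 0/1 numeric; convert it so it is tested as categorical
am.f <- as.factor(mtcars$am)

# Inspect the cell counts the default test would work from
print(table(mtcars$cyl, am.f))

if (requireNamespace("vtable", quietly = TRUE)) {
  print(vtable::independence.test(mtcars$cyl, am.f))
}
```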