Expands the covariate matrix, adding interactions and polynomials. This is particularly useful for penalized regressions.

expand_df(X, int_order = 2, poly_order = 4, threshold = 0)

Arguments

X

Covariate matrix (no intercept).

int_order

Order of interactions to be added. Set equal to one if no interactions are desired.

poly_order

Order of the polynomials to be added. Set equal to one if no polynomials are desired.

threshold

Drop binary variables representing less than threshold% of the population. Useful to speed up computation.
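The filtering described for threshold can be sketched in base R as follows. This is only an illustration, not the package's internal code: the toy data frame X_dummies and the helper is_binary are hypothetical, and threshold is assumed here to be a share between 0 and 1 (if it is read as a percentage, divide it by 100 first).

```r
# Toy expanded covariates: one continuous column and two 0/1 dummies.
X_dummies <- data.frame(
  age    = c(23, 35, 47, 51, 62, 29, 44, 58, 33, 40),
  rare   = c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  common = c(1, 0, 1, 1, 0, 1, 0, 1, 0, 1)
)

# Assumption: threshold is a share in [0, 1], not a percentage.
threshold <- 0.2

# Hypothetical helper: a column is binary when its unique values are 0 and 1.
is_binary <- function(x) setequal(unique(x), c(0, 1))

# Share of ones for binary columns; NA for everything else.
share_ones <- vapply(
  X_dummies,
  function(x) if (is_binary(x)) mean(x) else NA_real_,
  numeric(1)
)

# Drop only binary columns whose share of ones falls below the threshold.
drop_col <- !is.na(share_ones) & share_ones < threshold
X_kept <- X_dummies[, !drop_col, drop = FALSE]
```

With these toy values, rare equals one for 10% of the rows and is dropped, while age and common are kept.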

Value

The expanded covariate matrix, as a data frame.

Details

expand_df assumes that categorical variables are coded as factors and that X contains no missing values.

expand_df uses model.matrix to expand factors into a set of dummy variables. It then treats as continuous every covariate whose unique values are not 0 and 1.
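A minimal base R sketch of these two steps, under stated assumptions: the toy data frame X and the helper is_binary are hypothetical names, and dropping the intercept column is an assumption made because X is described as having no intercept. The actual implementation may differ.

```r
# Toy covariates: one numeric column, one factor, one 0/1 indicator.
X <- data.frame(
  age    = c(23, 35, 47, 51, 62),
  region = factor(c("north", "south", "south", "west", "north")),
  member = c(0, 1, 1, 0, 1)
)

# model.matrix() expands the factor into dummy columns.
# The first column is the intercept, which is dropped here (assumption).
X_dummies <- as.data.frame(model.matrix(~ ., data = X)[, -1, drop = FALSE])

# Hypothetical helper: a column is binary when its unique values are 0 and 1.
is_binary <- function(x) setequal(unique(x), c(0, 1))

# Split the expanded columns into binary and continuous ones.
binary_cols <- names(X_dummies)[vapply(X_dummies, is_binary, logical(1))]
continuous_cols <- setdiff(names(X_dummies), binary_cols)
```

Here the factor region becomes the dummies regionsouth and regionwest (north is the reference level), and age is the only column classified as continuous.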

expand_df first adds all the int_order-way interactions between the variables (using the expanded set of dummies), and then adds polynomials of order poly_order for the continuous covariates.
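The order of operations can be illustrated with a short base R sketch. Several things here are assumptions rather than facts about the package: the toy data, the column naming scheme (for example age_pow2), the use of the formula operator ^ to generate interactions up to the chosen order, and the use of raw powers (rather than, say, orthogonal polynomials).

```r
# Toy covariates, already free of factors: two continuous, one 0/1 dummy.
X_dummies <- data.frame(
  age    = c(23, 35, 47, 51, 62),
  income = c(1.2, 3.4, 2.8, 5.1, 4.0),
  member = c(0, 1, 1, 0, 1)
)
continuous_cols <- c("age", "income")
int_order  <- 2
poly_order <- 3

# Step 1: main effects plus all interactions up to int_order.
# In a formula, `.^2` expands to every variable and every pairwise product.
int_formula <- as.formula(paste0("~ .^", int_order))
X_int <- as.data.frame(
  model.matrix(int_formula, data = X_dummies)[, -1, drop = FALSE]
)

# Step 2: raw powers 2..poly_order of each continuous covariate.
# seq_len(poly_order)[-1] is empty when poly_order is 1, so nothing is added.
powers <- list()
for (v in continuous_cols) {
  for (p in seq_len(poly_order)[-1]) {
    powers[[paste0(v, "_pow", p)]] <- X_dummies[[v]]^p
  }
}

X_expanded <- X_int
if (length(powers) > 0) {
  X_expanded <- cbind(X_int, as.data.frame(powers))
}
```

In this toy case the result has 10 columns: 3 main effects, 3 pairwise interactions, and 4 polynomial terms. Setting int_order or poly_order to 1 leaves the corresponding step without any added columns, in line with the argument descriptions.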

Author

Riccardo Di Francesco