Estimates conditional choice probabilities for ordered non-numeric outcomes.
multinomial_ml(Y = NULL, X = NULL, learner = "forest", scale = TRUE)
Object of class mml.
Multinomial machine learning expresses conditional choice probabilities as expectations of binary variables:
$$p_m \left( X_i \right) = \mathbb{E} \left[ 1 \left( Y_i = m \right) | X_i \right]$$
Each expectation can then be estimated separately with any regression algorithm, yielding estimates of the conditional probabilities.
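As a minimal illustration of this strategy (a sketch, not the package's implementation), one can fit a separate regression of the binary indicator 1(Y = m) on the covariates for each class m, clip the fitted values to [0, 1], and normalize so the estimated probabilities sum to one. Here base-R linear probability models stand in for the forest or penalized-logit learners; all names below are illustrative:

```r
# Sketch of multinomial machine learning with synthetic data: one
# binary-indicator regression per class (linear probability models
# used only for illustration).
set.seed(1)
n <- 200
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
latent <- X$x1 + X$x2 + rnorm(n)
Y <- cut(latent, breaks = c(-Inf, -1, 1, Inf), labels = FALSE)  # classes 1, 2, 3

p_hat <- sapply(sort(unique(Y)), function(m) {
  d <- data.frame(ind = as.numeric(Y == m), X)
  fit <- lm(ind ~ ., data = d)       # estimate E[1(Y = m) | X] by regression
  pmin(pmax(predict(fit), 0), 1)     # clip fitted probabilities to [0, 1]
})
p_hat <- p_hat / rowSums(p_hat)      # normalize so each row sums to one
head(p_hat)
```

Any regression learner can replace lm() in the inner call; this is exactly the slot that the forest and L1 learners fill.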
multinomial_ml combines this strategy with either regression forests or L1-penalized logistic regressions, according to the user-specified parameter learner.
If learner == "l1", the penalty parameters are chosen via 10-fold cross-validation, and model.matrix is used to handle non-numeric covariates. Additionally, if scale == TRUE, the covariates are scaled to have zero mean and unit variance.
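The l1 branch described above can be sketched with the glmnet package, whose cv.glmnet() performs 10-fold cross-validation by default. This is an assumption about how such a learner is typically wired up, not a copy of the package internals; the data and variable names are illustrative:

```r
# Sketch of the "l1" learner: model.matrix expands non-numeric
# covariates into dummies, scale() standardizes them, and one
# L1-penalized logit per class is tuned by cross-validation.
library(glmnet)

set.seed(1986)
n <- 100
df <- data.frame(x1 = rnorm(n), x2 = factor(sample(c("a", "b", "c"), n, TRUE)))
Y <- sample(1:3, n, replace = TRUE)

X_mat <- scale(model.matrix(~ . - 1, data = df))  # dummies, zero mean, unit variance

fits <- lapply(1:3, function(m) {
  # alpha = 1 is the lasso (L1) penalty; cv.glmnet defaults to nfolds = 10
  cv.glmnet(X_mat, as.numeric(Y == m), family = "binomial", alpha = 1)
})
p_hat <- sapply(fits, function(f) {
  as.numeric(predict(f, X_mat, s = "lambda.min", type = "response"))
})
p_hat <- p_hat / rowSums(p_hat)  # normalize across classes
```

Predicted probabilities of type = "response" are already in (0, 1), so only the normalization across classes is needed.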
Di Francesco, R. (2023). Ordered Correlation Forest. arXiv preprint arXiv:2309.08755.
## Generate synthetic data.
set.seed(1986)
data <- generate_ordered_data(100)
sample <- data$sample
Y <- sample$Y
X <- sample[, -1]
## Training-test split.
train_idx <- sample(seq_len(length(Y)), floor(length(Y) * 0.5))
Y_tr <- Y[train_idx]
X_tr <- X[train_idx, ]
Y_test <- Y[-train_idx]
X_test <- X[-train_idx, ]
## Fit multinomial machine learning on training sample using two different learners.
multinomial_forest <- multinomial_ml(Y_tr, X_tr, learner = "forest")
multinomial_l1 <- multinomial_ml(Y_tr, X_tr, learner = "l1")
## Predict out of sample.
predictions_forest <- predict(multinomial_forest, X_test)
predictions_l1 <- predict(multinomial_l1, X_test)
## Compare predictions.
cbind(head(predictions_forest), head(predictions_l1))
#> P(Y=1) P(Y=2) P(Y=3) P(Y=1) P(Y=2) P(Y=3)
#> [1,] 0.3537709 0.4865778 0.15965128 0.37483081 0.4934319 0.13173734
#> [2,] 0.6324324 0.2491552 0.11841243 0.39553675 0.4512993 0.15316400
#> [3,] 0.1416059 0.4282620 0.43013203 0.06737991 0.3712126 0.56140745
#> [4,] 0.6299436 0.2814432 0.08861318 0.59561559 0.3571572 0.04722723
#> [5,] 0.4841232 0.2875252 0.22835164 0.32814211 0.3535764 0.31828152
#> [6,] 0.6875407 0.2334853 0.07897406 0.43330267 0.4505802 0.11611709