An R package that implements the ordered correlation forest, a machine learning estimator designed for predictive modeling of ordered non-numeric outcomes.

ocf provides forest-based estimation of the conditional choice probabilities and the covariates’ marginal effects. Under an “honesty” condition, the estimates are consistent and asymptotically normal, and standard errors can be obtained by leveraging the weight-based representation of the random forest predictions. If you use the package, please cite Di Francesco (2023).

To get started, please check the short online tutorial.
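
For a quick orientation, the workflow described above (choice probabilities, standard errors under honesty, marginal effects) can be sketched as follows. This is a minimal sketch on synthetic data, not the author's tutorial: the data-generating process and all variable names are made up for illustration, and the argument names (`honesty`, `inference`, `eval = "atmean"`) and output fields (`probabilities`, `standard.errors`) are written as assumed from the package documentation, so please verify them with `?ocf` and `?marginal_effects` before relying on them.

```r
library(ocf)
set.seed(1986)

## Synthetic data (illustrative only): three covariates and an outcome
## with three ordered classes obtained by cutting a latent variable.
n <- 200
X <- matrix(rnorm(n * 3), ncol = 3)
colnames(X) <- c("x1", "x2", "x3")
latent <- X[, 1] - 0.5 * X[, 2] + rnorm(n)
y <- as.numeric(cut(latent, breaks = c(-Inf, -0.5, 0.5, Inf))) # classes 1, 2, 3

## Training-test split.
train_idx <- sample(seq_len(n), n / 2)
y_tr <- y[train_idx]
X_tr <- X[train_idx, ]
X_test <- X[-train_idx, ]

## Fit the forests and predict conditional choice probabilities.
forests <- ocf(y_tr, X_tr)
predictions <- predict(forests, X_test)
head(predictions$probabilities)

## Standard errors are assumed to require honest forests.
honest_forests <- ocf(y_tr, X_tr, honesty = TRUE, inference = TRUE)
head(honest_forests$predictions$standard.errors)

## Marginal effects evaluated at the covariates' means.
me <- marginal_effects(forests, eval = "atmean")
print(me)
```

The honest fit with `inference = TRUE` is noticeably slower than the default fit, which is why the sketch keeps the sample small.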

Installation

The package can be downloaded from CRAN:

install.packages("ocf")

Alternatively, the current development version of the package can be installed using the devtools package:

devtools::install_github("riccardo-df/ocf") # run install.packages("devtools") if needed.

References

  • Athey, S., Tibshirani, J., & Wager, S. (2019). Generalized Random Forests. Annals of Statistics, 47(2). [paper]

  • Di Francesco, R. (forthcoming). Ordered Correlation Forest. Econometric Reviews. [paper]

  • Lechner, M., & Mareckova, J. (2022). Modified Causal Forest. arXiv preprint arXiv:2209.03744. [paper]

  • Lechner, M., & Okasa, G. (2019). Random Forest Estimation of the Ordered Choice Model. arXiv preprint arXiv:1907.02436. [paper]

  • Peracchi, F. (2014). Econometric methods for ordered responses: Some recent developments. In Econometric methods and their applications in finance, macro and related fields (pp. 133–165). World Scientific. [paper]

  • Wager, S., & Athey, S. (2018). Estimation and Inference of Heterogeneous Treatment Effects using Random Forests. Journal of the American Statistical Association, 113(523). [paper]

  • Wright, M. N., & Ziegler, A. (2017). ranger: A fast implementation of random forests for high dimensional data in C++ and R. Journal of Statistical Software, 77(1). [paper]