generate_qualitative_data_iv.Rd
Generate a synthetic data set with qualitative outcomes under an instrumental variables design. The data include a binary treatment indicator and a binary instrument. The instrument is independent of potential outcomes and potential treatments (exogeneity), affects potential outcomes only through treatment (exclusion restriction), changes the probability of treatment (relevance), and can only increase that probability (monotonicity).
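The design described above can be illustrated with a minimal base R sketch. It is not the data-generating process used by generate_qualitative_data_iv: the instrument probability, the unit types, and their shares are hypothetical values chosen for illustration only.

```r
set.seed(1)
n <- 1000

# Binary instrument, assigned at random (assumed probability 0.5).
Z <- rbinom(n, size = 1, prob = 0.5)

# Hypothetical unit types, drawn independently of Z so that potential
# treatments are independent of the instrument.
type <- sample(c("always", "complier", "never"), size = n,
               replace = TRUE, prob = c(0.2, 0.5, 0.3))

# Potential treatments D(0) and D(1). Monotonicity holds by construction:
# no unit has D(1) < D(0).
D0 <- as.integer(type == "always")
D1 <- as.integer(type != "never")

# Observed treatment: the potential treatment selected by the instrument.
D <- ifelse(Z == 1, D1, D0)

# Relevance: the instrument raises the share of treated units.
first_stage <- mean(D[Z == 1]) - mean(D[Z == 0])
```

In this sketch the exclusion restriction would be satisfied by generating potential outcomes as functions of the treatment arm and covariates only, never of Z.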
generate_qualitative_data_iv(n, outcome_type)
A list storing a data frame with the observed data, the true propensity score, the true instrument propensity score, and the true local probabilities of shift.
Potential outcomes are generated differently according to outcome_type. If outcome_type == "multinomial", generate_qualitative_data_iv computes linear predictors for each class using the covariates:
$$\eta_{mi}(d) = \beta^{d}_{m1} X_{i1} + \beta^{d}_{m2} X_{i2} + \beta^{d}_{m3} X_{i3}, \quad d = 0, 1,$$
and then transforms $\eta_{mi}(d)$ into valid probability distributions using the softmax function:
$$P(Y_i(d) = m \mid X_i) = \frac{\exp(\eta_{mi}(d))}{\sum_{m'} \exp(\eta_{m'i}(d))}, \quad d = 0, 1.$$
It then generates potential outcomes $Y_i(1)$ and $Y_i(0)$ by sampling from {1, 2, 3} with probabilities $P(Y_i(d) = m \mid X_i)$, $d = 0, 1$.
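The multinomial steps above can be sketched in base R for a single treatment arm d. The covariate distribution and the coefficient values below are assumptions made for illustration; the text does not state the values used by generate_qualitative_data_iv.

```r
set.seed(1)
n <- 5
n_classes <- 3

# Assumption: three standard-normal covariates.
X <- matrix(rnorm(n * 3), nrow = n, ncol = 3)

# Hypothetical coefficients for one treatment arm d:
# one row per class m, one column per covariate j.
beta_d <- matrix(c( 0.5, -0.2,  0.1,
                   -0.3,  0.4,  0.2,
                    0.0,  0.1, -0.5),
                 nrow = n_classes, byrow = TRUE)

# Linear predictors: eta[i, m] = sum_j beta_d[m, j] * X[i, j].
eta <- X %*% t(beta_d)

# Row-wise softmax. Subtracting each row's maximum leaves the result
# unchanged and guards against overflow in exp().
eta_shifted <- eta - apply(eta, 1, max)
probs <- exp(eta_shifted) / rowSums(exp(eta_shifted))

# Draw one class per unit from {1, 2, 3} using that unit's probabilities.
Y_d <- apply(probs, 1, function(p) sample.int(n_classes, size = 1, prob = p))
```

Repeating the same steps with a second coefficient matrix would give the potential outcome for the other treatment arm.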
If instead outcome_type == "ordered", generate_qualitative_data_iv first generates latent potential outcomes:
$$Y_i^*(d) = \tau d + X_{i1} + X_{i2} + X_{i3} + \mathcal{N}(0, 1), \quad d = 0, 1,$$
with $\tau = 2$. It then constructs $Y_i(d)$ by discretizing $Y_i^*(d)$ using threshold parameters $\zeta_1 = 2$ and $\zeta_2 = 4$, with the conventions $\zeta_0 = -\infty$ and $\zeta_3 = +\infty$. Then,
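$$P(Y_i(d) = m \mid X_i) = P(\zeta_{m-1} < Y_i^*(d) \le \zeta_m \mid X_i) = \Phi\Big(\zeta_m - \sum_j X_{ij} - \tau d\Big) - \Phi\Big(\zeta_{m-1} - \sum_j X_{ij} - \tau d\Big), \quad d = 0, 1,$$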
which allows us to analytically compute the local probabilities of shift.
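As an illustration, the closed-form class probabilities above can be evaluated with pnorm and checked against a Monte Carlo draw of the latent outcome. This is a sketch only: the helper ordered_probs is hypothetical and not part of the package, and the standard-normal covariates are an assumption.

```r
tau  <- 2
zeta <- c(-Inf, 2, 4, Inf)  # zeta_0, zeta_1, zeta_2, zeta_3

# Hypothetical helper: P(Y_i(d) = m | X_i) for m = 1, 2, 3.
# Returns a matrix with one row per unit and one column per class.
ordered_probs <- function(X, d) {
  mu  <- rowSums(X) + tau * d
  cdf <- pnorm(outer(-mu, zeta, "+"))  # Phi(zeta_m - mu_i), m = 0, ..., 3
  cdf[, -1, drop = FALSE] - cdf[, -ncol(cdf), drop = FALSE]
}

set.seed(1)
X <- matrix(rnorm(3), nrow = 1)  # one unit; assumed standard-normal covariates
p1 <- ordered_probs(X, d = 1)

# Monte Carlo check: simulate the latent outcome for this unit under d = 1
# and discretize it into the intervals (zeta_{m-1}, zeta_m].
y_star <- tau * 1 + sum(X) + rnorm(1e5)
y      <- cut(y_star, breaks = zeta, labels = FALSE)
freq   <- tabulate(y, nbins = 3) / length(y)
```

The simulated class frequencies in freq should be close to the analytic probabilities in p1, up to Monte Carlo error.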
## Generate synthetic data.
set.seed(1986)
data <- generate_qualitative_data_iv(100, outcome_type = "ordered")
data$local_pshifts
#> [1] -0.576468964 -0.003873771 0.580342735