Optimal matching for heterogeneous treatment effect estimation

    Yun Cai is currently a postgraduate student at the School of Management, University of Science and Technology of China. Her research mainly focuses on causal inference.

    Shuguang Zhang is currently a Professor at the School of Management, University of Science and Technology of China (USTC). He received his Ph.D. degree in Statistics from USTC in 1992. His research mainly focuses on stochastic partial differential equations, backward stochastic differential equations, mathematical finance, and financial engineering.

  Received Date: 06 March 2023
  Accepted Date: 17 April 2023
  Available Online: 30 June 2023
    In observational studies, identifying subgroups and exploring heterogeneity is of practical significance. However, causal inference at the individual level is a challenging problem due to the absence of counterfactual outcomes and the presence of selection bias. To address this issue, we propose a general framework called TRIMATCH for estimating heterogeneous treatment effects. First, we find the optimal matching by solving a minimum average cost flow optimization problem in a tripartite graph network structure. Second, with the pseudo individual treatment effects acquired from the previous step, we establish a nonparametric regression model to predict heterogeneous treatment effects for individuals with diverse characteristics. Our experiments demonstrate the effectiveness of the proposed matching method and the interpretability of the results.
    Advantages of the proposed framework for predicting heterogeneous treatment effects by matching.
    • Solving the multiobjective matching problem with the minimum average cost flow algorithm gives greater flexibility and accuracy than conventional matching methods.
    • Fitting an XGBoost tree to the acquired pseudo individual treatment effects gives better prediction accuracy than alternative regression-based methods.
    • Both theoretical and experimental results show that the proposed method has a tolerable upper bound on its estimation error while incurring minimal average matching costs.
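
The matching step summarized above can be illustrated with a much simpler network than the one the paper uses. The sketch below is not the TRIMATCH tripartite construction and not the authors' implementation: it assumes a plain source, treated, control, sink network with unit capacities, a single cost per treated-control pair, and a textbook successive-shortest-path solver. The names `min_cost_pair_matching` and `pseudo_ites`, and all numbers, are hypothetical. The second stage (regressing the pseudo individual treatment effects on covariates, e.g. with XGBoost) is not shown.

```python
from collections import deque


def min_cost_pair_matching(cost):
    """Match each treated unit to a distinct control at minimum total cost.

    cost[i][j] is an assumed distance between treated unit i and control j.
    Solved as a min-cost flow by successive shortest augmenting paths.
    """
    n_t, n_c = len(cost), len(cost[0])
    source, sink = 0, n_t + n_c + 1          # node ids: treated 1..n_t, then controls
    graph = [[] for _ in range(sink + 1)]    # adjacency lists of edge indices
    to, cap, weight = [], [], []

    def add_edge(u, v, capacity, w):
        # Forward edge gets an even index; its residual twin is index ^ 1.
        graph[u].append(len(to)); to.append(v); cap.append(capacity); weight.append(w)
        graph[v].append(len(to)); to.append(u); cap.append(0); weight.append(-w)

    for i in range(n_t):
        add_edge(source, 1 + i, 1, 0.0)
        for j in range(n_c):
            add_edge(1 + i, 1 + n_t + j, 1, float(cost[i][j]))
    for j in range(n_c):
        add_edge(1 + n_t + j, sink, 1, 0.0)

    while True:
        # Queue-based Bellman-Ford: residual edges may have negative cost.
        dist = [float("inf")] * (sink + 1)
        prev_edge = [-1] * (sink + 1)
        in_queue = [False] * (sink + 1)
        dist[source] = 0.0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            in_queue[u] = False
            for e in graph[u]:
                v = to[e]
                if cap[e] > 0 and dist[u] + weight[e] < dist[v] - 1e-12:
                    dist[v] = dist[u] + weight[e]
                    prev_edge[v] = e
                    if not in_queue[v]:
                        queue.append(v)
                        in_queue[v] = True
        if prev_edge[sink] == -1:
            break                             # no augmenting path left
        v = sink
        while v != source:                    # push one unit of flow along the path
            e = prev_edge[v]
            cap[e] -= 1
            cap[e ^ 1] += 1
            v = to[e ^ 1]

    pairs = []
    for i in range(n_t):
        for e in graph[1 + i]:
            # Saturated forward edges treated -> control carry the matching.
            if e % 2 == 0 and cap[e] == 0:
                pairs.append((i, to[e] - 1 - n_t))
    return pairs


def pseudo_ites(pairs, y_treated, y_control):
    """Pseudo individual treatment effect: treated outcome minus matched control outcome."""
    return [y_treated[i] - y_control[j] for i, j in pairs]


if __name__ == "__main__":
    # Hypothetical distances: rows are treated units, columns are controls.
    demo_cost = [[1, 2], [1, 100]]
    demo_pairs = min_cost_pair_matching(demo_cost)
    print(demo_pairs, pseudo_ites(demo_pairs, [3.0, 5.0], [1.0, 2.0]))
```

In the demo cost matrix, greedy nearest-neighbor matching would pair treated unit 0 with control 0 and leave treated unit 1 with a cost of 100; the flow formulation instead minimizes the total cost over all pairs, which is the property the highlights attribute to optimal matching.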

    Figure 1. Tripartite network structure.

    Figure 2. Comparison between different methods. (a) Mean absolute errors of 100 simulations for various methods. (b) Mean squared errors of 100 simulations for various methods.

    Figure 3. (a) Distribution of estimated heterogeneous treatment effects corresponding to the approximated propensity score percentiles. (b) Scatter plots of estimated treatment effect (averaged over the 100 iterations) against ASVAB cognitive ability.

    Figure 4. Prediction model of heterogeneous treatment effects.

