In causal inference, estimating heterogeneous treatment effects (HTE) is critical for identifying how different subgroups respond to interventions, with broad applications in fields such as precision medicine and personalized advertising. Although HTE estimation methods aim to improve accuracy, how to provide explicit subgroup descriptions remains unclear, hindering data interpretation and strategic intervention management. In this paper, we propose CURLS, a novel rule learning method leveraging HTE, which can effectively describe subgroups with significant treatment effects. Specifically, we frame causal rule learning as a discrete optimization problem that balances treatment effect against variance while accounting for rule interpretability. We design an iterative procedure based on the minorize-maximization algorithm and solve a submodular lower bound as an approximation of the original problem. Quantitative experiments and qualitative case studies verify that, compared with state-of-the-art methods, CURLS can find subgroups where the estimated and true effects are 16.1% and 13.8% higher, respectively, and the variance is 12.0% smaller, while maintaining similar or better estimation accuracy and rule interpretability. Code is available at https://osf.io/zwp2k/.
CausalPrism: A Visual Analytics Approach for Subgroup-based Causal Heterogeneity Exploration
Jiehui Zhou, Xumeng Wang, Kam-Kwai Wong, and 5 more authors
In causal inference, estimating Heterogeneous Treatment Effects (HTEs) from observational data is critical for understanding how different subgroups respond to treatments, with broad applications such as precision medicine and targeted advertising. However, existing work on HTE, subgroup discovery, and causal visualization is insufficient to address two challenges. First, the sheer number of potential subgroups and the need to balance multiple objectives (e.g., high effects and low variances) pose a considerable analytical challenge. Second, effective subgroup analysis has to follow the analysis goal specified by users and provide causal results with verification. To this end, we propose a visual analytics approach for subgroup-based causal heterogeneity exploration. Specifically, we first formulate causal subgroup discovery as a constrained multi-objective optimization problem and adopt a heuristic genetic algorithm to learn the Pareto front of optimal subgroups described by interpretable rules. Building on this model, we develop a prototype system, CausalPrism, that incorporates tabular visualization, multi-attribute rankings, and uncertainty plots to support users in interactively exploring and sorting subgroups and explaining treatment effects. Quantitative experiments validate that the proposed model can efficiently mine causal subgroups that outperform state-of-the-art HTE and subgroup discovery methods, and case studies and expert interviews demonstrate the effectiveness and usability of the system. Code is available at https://osf.io/jaqmf/?view_only=ac9575209945476b955bf829c85196e9.
AVA: An automated and AI-driven intelligent visual analytics framework
Jiazhe Wang, Xi Li, Chenlu Li, and 11 more authors
With the rapid growth in the scale and complexity of datasets, creating proper visualizations for users is becoming more and more challenging. Though several visualization recommendation systems have been proposed, the lack of practical engineering inputs is still a major concern regarding the usage of visualization recommendations in the industry. In this paper, we propose AVA, an open-source web-based framework for Automated Visual Analytics. AVA contains both empiric-driven and insight-driven visualization recommendation methods to meet the demands of creating aesthetic visualizations and understanding expressible insights, respectively. The code is available at https://github.com/antvis/AVA.
2023
FraudAuditor: A Visual Analytics Approach for Collusive Fraud in Health Insurance
Jiehui Zhou, Xumeng Wang, Jie Wang, and 7 more authors
IEEE Transactions on Visualization and Computer Graphics, 2023
Collusive fraud, in which multiple fraudsters collude to defraud health insurance funds, threatens the operation of the healthcare system. However, existing statistical and machine learning-based methods have limited ability to detect fraud in health insurance because fraudulent behaviors closely resemble normal medical visits and labeled data are scarce. To ensure the accuracy of the detection results, expert knowledge needs to be integrated into the fraud detection process. By working closely with health insurance audit experts, we propose FraudAuditor, a three-stage visual analytics approach to collusive fraud detection in health insurance. Specifically, we first allow users to interactively construct a co-visit network to holistically model the visit relationships of different patients. Second, an improved community detection algorithm that considers the strength of fraud likelihood is designed to detect suspicious fraudulent groups. Finally, through our visual interface, users can compare, investigate, and verify suspicious patient behavior with tailored visualizations that support different time scales. We conducted case studies in a real-world healthcare scenario, in which the approach helped locate an actual fraud group and exclude a false-positive group. The results and expert feedback demonstrate the effectiveness and usability of the approach.
2022
DPVisCreator: Incorporating Pattern Constraints to Privacy-preserving Visualizations via Differential Privacy
Jiehui Zhou, Xumeng Wang, Jason K Wong, and 8 more authors
IEEE Transactions on Visualization and Computer Graphics, 2022
Data privacy is an essential issue in publishing data visualizations. However, it is challenging to represent multiple data patterns in privacy-preserving visualizations. Prior approaches target specific chart types or apply an anonymization model uniformly without considering the importance of data patterns in visualizations. In this paper, we propose a visual analytics approach that helps data custodians generate multiple private charts while maintaining user-preferred patterns. To this end, we introduce pattern constraints to model users' preferences over data patterns in the dataset and incorporate them into the proposed Bayesian network-based Differential Privacy (DP) model, PriVis. A prototype system, DPVisCreator, is developed to assist data custodians in implementing our approach. The effectiveness of our approach is demonstrated with a quantitative evaluation of pattern utility under different levels of privacy protection, case studies, and semi-structured expert interviews.
2021
MedicareVis: a Joint Visual Analytics Approach for Anti-Fraud in Medical Insurance
Jiehui Zhou, Rongchen Zhu, Wei Zhang, and 4 more authors