Article: Causal discoveries for high dimensional mixed data

Title: Causal discoveries for high dimensional mixed data
Authors: Cai, Zhanrui; Xi, Dong; Zhu, Xuan; Li, Runze
Keywords: causal discoveries; latent Gaussian model; mixed data; PC algorithm; rank correlation
Issue Date: 2022
Citation: Statistics in Medicine, 2022, v. 41, n. 24, p. 4924-4940
Abstract: Causal relationships are of crucial importance for biological and medical research. Algorithms have been proposed for causal structure learning with graphical visualizations. While much of the literature focuses on biological studies where data often follow the same distribution, for example, the normal distribution for all variables, challenges emerge from epidemiological and clinical studies where data are often mixed with continuous, binary, and ordinal variables. We propose to use a mixed latent Gaussian copula model to estimate the underlying correlation structure via the rank correlation for mixed data. This correlation structure is then incorporated into a popular causal discovery algorithm, the PC algorithm, to identify causal structures. The proposed algorithm, called the latent-PC algorithm, is able to discover the true causal structure consistently under mild conditions in high dimensional settings. From simulation studies, the latent-PC algorithm delivers a competitive performance in terms of a similar or higher true positive rate and a similar or lower false positive rate, compared with other variants of the PC algorithm. In the high dimensional settings where the number of variables is more than the number of observations, the causal graphs identified by the latent-PC algorithm are closer to the true causal structures, compared to other competing algorithms. Further, we demonstrate the utility of the latent-PC algorithm in a real dataset for hepatocellular carcinoma. Causal structures for patient survival are visualized and connected with clinical interpretations in the literature.
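The two steps named in the abstract (a rank-based estimate of the latent correlation, then a conditional independence test of the kind the PC algorithm relies on) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes continuous variables only, for which the Gaussian copula identity r = sin(pi/2 * tau) links Kendall's tau to the latent correlation; the bridge functions the paper uses for binary and ordinal variables are not reproduced here. All function names are hypothetical, and the Fisher z-test with a fixed 1.96 cutoff is a common textbook choice, not a detail taken from the article.

```python
import math
import random


def kendall_tau(x, y):
    """Kendall's tau-a for two equal-length sequences (assumes no ties)."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return s / (n * (n - 1) / 2)


def latent_correlation(x, y):
    """Latent correlation of a continuous pair under a Gaussian copula."""
    return math.sin(math.pi / 2 * kendall_tau(x, y))


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y given Z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def fisher_z_independent(r, n, k, z_crit=1.96):
    """True if independence is NOT rejected; k is the conditioning set size."""
    r = max(min(r, 0.999999), -0.999999)  # keep the log finite
    z = 0.5 * math.log((1 + r) / (1 - r))
    return math.sqrt(n - k - 3) * abs(z) <= z_crit


def demo(n=300, seed=0):
    """Chain X -> Z -> Y observed through monotone transformations."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [0.8 * a + 0.6 * rng.gauss(0, 1) for a in x]
    y = [0.8 * b + 0.6 * rng.gauss(0, 1) for b in z]
    # Ranks are unchanged by monotone maps, so tau sees the latent scale.
    x_obs = [math.exp(a) for a in x]
    z_obs = [b ** 3 for b in z]
    r_xy = latent_correlation(x_obs, y)
    r_xz = latent_correlation(x_obs, z_obs)
    r_yz = latent_correlation(y, z_obs)
    print("X, Y marginally independent?", fisher_z_independent(r_xy, n, 0))
    print("X, Y independent given Z?",
          fisher_z_independent(partial_corr(r_xy, r_xz, r_yz), n, 1))


if __name__ == "__main__":
    demo()
```

In a full PC-style search, a test like `fisher_z_independent` would be applied to every pair of variables over conditioning sets of growing size, and an edge would be removed whenever independence is not rejected.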
Persistent Identifier: http://hdl.handle.net/10722/328835
ISSN: 0277-6715
2023 Impact Factor: 1.8
2023 SCImago Journal Rankings: 1.348
ISI Accession Number ID: WOS:000840415000001

 

DC Field: Value
dc.contributor.author: Cai, Zhanrui
dc.contributor.author: Xi, Dong
dc.contributor.author: Zhu, Xuan
dc.contributor.author: Li, Runze
dc.date.accessioned: 2023-07-22T06:24:29Z
dc.date.available: 2023-07-22T06:24:29Z
dc.date.issued: 2022
dc.identifier.citation: Statistics in Medicine, 2022, v. 41, n. 24, p. 4924-4940
dc.identifier.issn: 0277-6715
dc.identifier.uri: http://hdl.handle.net/10722/328835
dc.language: eng
dc.relation.ispartof: Statistics in Medicine
dc.subject: causal discoveries
dc.subject: latent Gaussian model
dc.subject: mixed data
dc.subject: PC algorithm
dc.subject: rank correlation
dc.title: Causal discoveries for high dimensional mixed data
dc.type: Article
dc.description.nature: link_to_subscribed_fulltext
dc.identifier.doi: 10.1002/sim.9544
dc.identifier.pmid: 35968913
dc.identifier.scopus: eid_2-s2.0-85136062041
dc.identifier.volume: 41
dc.identifier.issue: 24
dc.identifier.spage: 4924
dc.identifier.epage: 4940
dc.identifier.eissn: 1097-0258
dc.identifier.isi: WOS:000840415000001