Article: Neyman-Pearson classification algorithms and NP receiver operating characteristics

Title: Neyman-Pearson classification algorithms and NP receiver operating characteristics
Authors: Tong, Xin; Feng, Yang; Li, Jingyi Jessica
Issue Date: 2018
Citation: Science Advances, 2018, v. 4, n. 2, article no. eaao1659
Abstract: In many binary classification applications, such as disease diagnosis and spam detection, practitioners commonly face the need to limit type I error (that is, the conditional probability of misclassifying a class 0 observation as class 1) so that it remains below a desired threshold. To address this need, the Neyman-Pearson (NP) classification paradigm is a natural choice; it minimizes type II error (that is, the conditional probability of misclassifying a class 1 observation as class 0) while enforcing an upper bound, α, on the type I error. Despite its century-long history in hypothesis testing, the NP paradigm has not been well recognized and implemented in classification schemes. Common practices that directly limit the empirical type I error to no more than α do not satisfy the type I error control objective because the resulting classifiers are likely to have type I errors much larger than α, and the NP paradigm has not been properly implemented in practice. We develop the first umbrella algorithm that implements the NP paradigm for all scoring-type classification methods, such as logistic regression, support vector machines, and random forests. Powered by this algorithm, we propose a novel graphical tool for NP classification methods: NP receiver operating characteristic (NP-ROC) bands motivated by the popular ROC curves. NP-ROC bands will help choose α in a data-adaptive way and compare different NP classifiers. We demonstrate the use and properties of the NP umbrella algorithm and NP-ROC bands, available in the R package nproc, through simulation and real data studies.
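The abstract's point about empirical thresholds can be illustrated with a minimal Python sketch. It is not the nproc implementation. It assumes a scoring-type classifier that predicts class 1 when a score exceeds a threshold, independent held-out class 0 scores with no ties, and a user-chosen tolerance `delta` on the probability that the type I error exceeds the bound (`delta` and all function names are hypothetical, introduced here for illustration). Under those assumptions, using the k-th smallest of n class 0 scores as the threshold gives a violation probability equal to a binomial tail, and the rule picks the smallest k whose tail is at most `delta`.

```python
import math


def _log_term(j, n, alpha):
    """log of C(n, j) * (1 - alpha)^j * alpha^(n - j), in log space to avoid overflow."""
    return (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
            + j * math.log(1 - alpha) + (n - j) * math.log(alpha))


def violation_bound(k, n, alpha):
    """Probability that the type I error exceeds alpha when the threshold is
    the k-th smallest of n held-out class 0 scores (assumes i.i.d. scores, no ties)."""
    return sum(math.exp(_log_term(j, n, alpha)) for j in range(k, n + 1))


def np_threshold_rank(n, alpha, delta):
    """Smallest rank k whose violation bound is at most delta, or None if n is too small.
    `delta` is an assumed tolerance parameter, not named in the abstract."""
    tail, best = 0.0, None
    for k in range(n, 0, -1):          # the tail grows as k decreases
        tail += math.exp(_log_term(k, n, alpha))
        if tail > delta:
            break
        best = k
    return best


def np_threshold(class0_scores, alpha, delta):
    """Score threshold chosen by the order-statistic rule sketched above."""
    k = np_threshold_rank(len(class0_scores), alpha, delta)
    if k is None:
        raise ValueError("too few class 0 scores for this alpha and delta")
    return sorted(class0_scores)[k - 1]


if __name__ == "__main__":
    n, alpha, delta = 1000, 0.05, 0.05
    naive_k = math.ceil((1 - alpha) * n)   # rank that caps the *empirical* type I error at alpha
    print("naive rank:", naive_k, "violation bound:", violation_bound(naive_k, n, alpha))
    print("order-statistic rank:", np_threshold_rank(n, alpha, delta))
```

In this sketch the naive rank, which only caps the empirical type I error, leaves a violation probability far above a small `delta`, whereas the order-statistic rule moves the threshold to a higher rank to bring that probability under `delta`.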
Persistent Identifier: http://hdl.handle.net/10722/354116
ISI Accession Number ID: WOS:000426845500016

 

DC Field | Value | Language
dc.contributor.author | Tong, Xin | -
dc.contributor.author | Feng, Yang | -
dc.contributor.author | Li, Jingyi Jessica | -
dc.date.accessioned | 2025-02-07T08:46:34Z | -
dc.date.available | 2025-02-07T08:46:34Z | -
dc.date.issued | 2018 | -
dc.identifier.citation | Science Advances, 2018, v. 4, n. 2, article no. eaao1659 | -
dc.identifier.uri | http://hdl.handle.net/10722/354116 | -
dc.description.abstract | In many binary classification applications, such as disease diagnosis and spam detection, practitioners commonly face the need to limit type I error (that is, the conditional probability of misclassifying a class 0 observation as class 1) so that it remains below a desired threshold. To address this need, the Neyman-Pearson (NP) classification paradigm is a natural choice; it minimizes type II error (that is, the conditional probability of misclassifying a class 1 observation as class 0) while enforcing an upper bound, α, on the type I error. Despite its century-long history in hypothesis testing, the NP paradigm has not been well recognized and implemented in classification schemes. Common practices that directly limit the empirical type I error to no more than α do not satisfy the type I error control objective because the resulting classifiers are likely to have type I errors much larger than α, and the NP paradigm has not been properly implemented in practice. We develop the first umbrella algorithm that implements the NP paradigm for all scoring-type classification methods, such as logistic regression, support vector machines, and random forests. Powered by this algorithm, we propose a novel graphical tool for NP classification methods: NP receiver operating characteristic (NP-ROC) bands motivated by the popular ROC curves. NP-ROC bands will help choose α in a data-adaptive way and compare different NP classifiers. We demonstrate the use and properties of the NP umbrella algorithm and NP-ROC bands, available in the R package nproc, through simulation and real data studies. | -
dc.language | eng | -
dc.relation.ispartof | Science Advances | -
dc.title | Neyman-Pearson classification algorithms and NP receiver operating characteristics | -
dc.type | Article | -
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1126/sciadv.aao1659 | -
dc.identifier.pmid | 29423442 | -
dc.identifier.scopus | eid_2-s2.0-85042153781 | -
dc.identifier.volume | 4 | -
dc.identifier.issue | 2 | -
dc.identifier.spage | article no. eaao1659 | -
dc.identifier.epage | article no. eaao1659 | -
dc.identifier.eissn | 2375-2548 | -
dc.identifier.isi | WOS:000426845500016 | -