Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1016/j.jbi.2018.06.013
- Scopus: eid_2-s2.0-85048970420
- PMID: 29928997
- WOS: WOS:000445054800001
Article: Prediction task guided representation learning of medical codes in EHR
Title | Prediction task guided representation learning of medical codes in EHR |
---|---|
Authors | Cui, Liwen; Xie, Xiaolei; Shen, Zuojun |
Keywords | Medical code; Healthcare resource utilization; Word embedding; Natural language processing; Representation learning; Electronic health records |
Issue Date | 2018 |
Citation | Journal of Biomedical Informatics, 2018, v. 84, p. 1-10 |
Abstract | There have been rapidly growing applications using machine learning models for predictive analytics in Electronic Health Records (EHR) to improve the quality of hospital services and the efficiency of healthcare resource utilization. A fundamental and crucial step in developing such models is to convert medical codes in EHR to feature vectors. These medical codes are used to represent diagnoses or procedures. Their vector representations have a tremendous impact on the performance of machine learning models. Recently, some researchers have utilized representation learning methods from Natural Language Processing (NLP) to learn vector representations of medical codes. However, most previous approaches are unsupervised, i.e. the generation of medical code vectors is independent of prediction tasks. Thus, the obtained feature vectors may be inappropriate for a specific prediction task. Moreover, unsupervised methods often require a lot of samples to obtain reliable results, but most practical problems have very limited patient samples. In this paper, we develop a new method called Prediction Task Guided Health Record Aggregation (PTGHRA), which aggregates health records guided by prediction tasks, to construct a training corpus for various representation learning models. Compared with unsupervised approaches, representation learning models integrated with PTGHRA yield a significant improvement in the predictive capability of generated medical code vectors, especially for limited training samples. |
Persistent Identifier | http://hdl.handle.net/10722/296177 |
ISSN | 1532-0464 (2023 Impact Factor: 4.0; 2023 SCImago Journal Rankings: 1.160) |
ISI Accession Number ID | WOS:000445054800001 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Cui, Liwen | - |
dc.contributor.author | Xie, Xiaolei | - |
dc.contributor.author | Shen, Zuojun | - |
dc.date.accessioned | 2021-02-11T04:53:00Z | - |
dc.date.available | 2021-02-11T04:53:00Z | - |
dc.date.issued | 2018 | - |
dc.identifier.citation | Journal of Biomedical Informatics, 2018, v. 84, p. 1-10 | - |
dc.identifier.issn | 1532-0464 | - |
dc.identifier.uri | http://hdl.handle.net/10722/296177 | - |
dc.description.abstract | There have been rapidly growing applications using machine learning models for predictive analytics in Electronic Health Records (EHR) to improve the quality of hospital services and the efficiency of healthcare resource utilization. A fundamental and crucial step in developing such models is to convert medical codes in EHR to feature vectors. These medical codes are used to represent diagnoses or procedures. Their vector representations have a tremendous impact on the performance of machine learning models. Recently, some researchers have utilized representation learning methods from Natural Language Processing (NLP) to learn vector representations of medical codes. However, most previous approaches are unsupervised, i.e. the generation of medical code vectors is independent of prediction tasks. Thus, the obtained feature vectors may be inappropriate for a specific prediction task. Moreover, unsupervised methods often require a lot of samples to obtain reliable results, but most practical problems have very limited patient samples. In this paper, we develop a new method called Prediction Task Guided Health Record Aggregation (PTGHRA), which aggregates health records guided by prediction tasks, to construct a training corpus for various representation learning models. Compared with unsupervised approaches, representation learning models integrated with PTGHRA yield a significant improvement in the predictive capability of generated medical code vectors, especially for limited training samples. | - |
dc.language | eng | - |
dc.relation.ispartof | Journal of Biomedical Informatics | - |
dc.subject | Medical code | - |
dc.subject | Healthcare resource utilization | - |
dc.subject | Word embedding | - |
dc.subject | Natural language processing | - |
dc.subject | Representation learning | - |
dc.subject | Electronic health records | - |
dc.title | Prediction task guided representation learning of medical codes in EHR | - |
dc.type | Article | - |
dc.description.nature | link_to_OA_fulltext | - |
dc.identifier.doi | 10.1016/j.jbi.2018.06.013 | - |
dc.identifier.pmid | 29928997 | - |
dc.identifier.scopus | eid_2-s2.0-85048970420 | - |
dc.identifier.volume | 84 | - |
dc.identifier.spage | 1 | - |
dc.identifier.epage | 10 | - |
dc.identifier.isi | WOS:000445054800001 | - |
dc.identifier.issnl | 1532-0464 | - |