Prediction task guided representation learning of medical codes in EHR

Cui, Liwen; Xie, Xiaolei; Shen, Zuojun

Publisher Website: 10.1016/j.jbi.2018.06.013
Scopus: eid_2-s2.0-85048970420
PMID: 29928997
WOS: WOS:000445054800001

Citations:
- Scopus: 0
- Web of Science: 0
- PubMed Central: 0
Appears in Collections:
- Industrial & Manufacturing Systems Engineering: Journal/Magazine Articles
- President's Office: Journal/Magazine Articles

Title	Prediction task guided representation learning of medical codes in EHR
Authors	Cui, Liwen; Xie, Xiaolei; Shen, Zuojun
Keywords	Medical code; Healthcare resource utilization; Word embedding; Natural language processing; Representation learning; Electronic health records
Issue Date	2018
Citation	Journal of Biomedical Informatics, 2018, v. 84, p. 1-10. DOI: http://dx.doi.org/10.1016/j.jbi.2018.06.013
Abstract	Machine learning models are increasingly applied to predictive analytics on Electronic Health Records (EHR) to improve the quality of hospital services and the efficiency of healthcare resource utilization. A fundamental and crucial step in developing such models is to convert the medical codes in EHR, which represent diagnoses or procedures, into feature vectors. These vector representations have a tremendous impact on the performance of machine learning models. Recently, some researchers have used representation learning methods from Natural Language Processing (NLP) to learn vector representations of medical codes. However, most previous approaches are unsupervised, i.e., the medical code vectors are generated independently of the prediction task, so the resulting feature vectors may be inappropriate for a specific prediction task. Moreover, unsupervised methods often require many samples to obtain reliable results, whereas most practical problems have very limited patient samples. In this paper, we develop a new method called Prediction Task Guided Health Record Aggregation (PTGHRA), which aggregates health records guided by prediction tasks to construct a training corpus for various representation learning models. Compared with unsupervised approaches, representation learning models integrated with PTGHRA yield a significant improvement in the predictive capability of the generated medical code vectors, especially when training samples are limited.
Persistent Identifier	http://hdl.handle.net/10722/296177
ISSN	1532-0464 (2023 Impact Factor: 4.0; 2023 SCImago Journal Rankings: 1.160)
ISI Accession Number ID	WOS:000445054800001
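
The abstract describes PTGHRA only at a high level: health records are aggregated under the guidance of a prediction task, and the aggregates form the training corpus for a representation learning model. The sketch below is one possible reading of that idea and is not the authors' algorithm. It assumes each record is a list of medical codes paired with a numeric outcome, bins the outcome with hypothetical cut points, and concatenates the codes of all records in the same bin into one "document". The record shape, the names `bucket_outcome` and `aggregate_by_task`, the placeholder codes, and the cut point are all illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical record shape (an assumption, not taken from the paper):
# a list of medical codes plus the numeric outcome to be predicted.
Record = Tuple[List[str], float]


def bucket_outcome(value: float, edges: List[float]) -> int:
    """Return the index of the first edge that `value` is strictly below.

    Values at or above every edge fall into the last bucket, len(edges).
    """
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)


def aggregate_by_task(records: List[Record], edges: List[float]) -> Dict[int, List[str]]:
    """Concatenate the codes of all records that share an outcome bucket.

    Each bucket's code list can then serve as one training document for a
    word-embedding model, so codes with similar outcomes share context.
    """
    corpus: Dict[int, List[str]] = defaultdict(list)
    for codes, outcome in records:
        corpus[bucket_outcome(outcome, edges)].extend(codes)
    return dict(corpus)


# Toy data with placeholder codes (not real diagnosis or procedure codes).
toy_records: List[Record] = [
    (["DX_A", "PR_1"], 2.0),
    (["DX_B"], 9.0),
    (["DX_A", "DX_C"], 3.0),
]
corpus = aggregate_by_task(toy_records, edges=[5.0])
print(corpus)
```

Under these assumptions, the two low-outcome records end up in the same document, so a co-occurrence-based embedding model would see their codes as sharing context. How the paper actually defines the aggregation is not stated in this record.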

DC Field	Value	Language
dc.contributor.author	Cui, Liwen	-
dc.contributor.author	Xie, Xiaolei	-
dc.contributor.author	Shen, Zuojun	-
dc.date.accessioned	2021-02-11T04:53:00Z	-
dc.date.available	2021-02-11T04:53:00Z	-
dc.date.issued	2018	-
dc.identifier.citation	Journal of Biomedical Informatics, 2018, v. 84, p. 1-10	-
dc.identifier.issn	1532-0464	-
dc.identifier.uri	http://hdl.handle.net/10722/296177	-
dc.language	eng	-
dc.relation.ispartof	Journal of Biomedical Informatics	-
dc.subject	Medical code	-
dc.subject	Healthcare resource utilization	-
dc.subject	Word embedding	-
dc.subject	Natural language processing	-
dc.subject	Representation learning	-
dc.subject	Electronic health records	-
dc.title	Prediction task guided representation learning of medical codes in EHR	-
dc.type	Article	-
dc.description.nature	link_to_OA_fulltext	-
dc.identifier.doi	10.1016/j.jbi.2018.06.013	-
dc.identifier.pmid	29928997	-
dc.identifier.scopus	eid_2-s2.0-85048970420	-
dc.identifier.volume	84	-
dc.identifier.spage	1	-
dc.identifier.epage	10	-
dc.identifier.isi	WOS:000445054800001	-
dc.identifier.issnl	1532-0464	-
