Conference Paper: BioNumQA-BERT: Answering Biomedical Questions Using Numerical Facts with a Deep Language Representation Model
Title | BioNumQA-BERT: Answering Biomedical Questions Using Numerical Facts with a Deep Language Representation Model |
---|---|
Authors | Wu, Y; Ting, HF; Lam, TW; Luo, R |
Keywords | Text Mining; Biomedical Question Answering; BERT; Numerical encoding |
Issue Date | 2021 |
Publisher | Association for Computing Machinery (ACM). |
Citation | The 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB 2021). Virtual Conference, 1-4 August 2021 |
Abstract | Biomedical question answering (QA) is playing an increasingly significant role in medical knowledge translation. However, current biomedical QA datasets and methods have limited capacity, as they commonly neglect the role of numerical facts in biomedical QA. In this paper, we constructed BioNumQA, a novel biomedical QA dataset that answers research questions using relevant numerical facts for biomedical QA model training and testing. To leverage the new dataset, we designed a new method called BioNumQA-BERT by introducing a novel numerical encoding scheme into the popular biomedical language model BioBERT to represent the numerical values in the input text. Our experiments show that BioNumQA-BERT significantly outperformed other state-of-the-art models, including DrQA, BERT and BioBERT (39.0% vs 29.5%, 31.3% and 33.2%, respectively, in strict accuracy). To improve the generalization ability of BioNumQA-BERT, we further pretrained it on a large biomedical text corpus and achieved 41.5% strict accuracy. BioNumQA and BioNumQA-BERT establish a new baseline for biomedical QA. The dataset, source code and pretrained model of BioNumQA-BERT are available at https://github.com/LeaveYeah/BioNumQA-BERT. (An illustrative sketch of such a numerical encoding follows this table.) |
Description | BCB Session 6B: Ontologies & Databases |
Persistent Identifier | http://hdl.handle.net/10722/301148 |
ISBN | 9781450384506 |
ISI Accession Number ID | WOS:000722623700070 |
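The abstract above notes that BioNumQA-BERT introduces a numerical encoding scheme into BioBERT to represent numerical values in input text, but the record gives no implementation detail. Purely as an illustration of the general idea, here is a minimal Python sketch of one generic digit-wise encoding (mantissa digits plus a base-10 exponent); the `[NUM]`/`[EXP]`/`[/NUM]` tokens and all function names are hypothetical and are not taken from the paper, whose actual scheme is in the linked repository.

```python
import re

# Hypothetical sketch of a digit-wise numerical encoding for QA input text.
# Each numeric literal is rewritten as explicit digit tokens plus a base-10
# exponent, so a subword tokenizer sees consistent, magnitude-aware pieces
# instead of arbitrary WordPiece splits. This is NOT the paper's scheme.

NUM_RE = re.compile(r"\d+(?:\.\d+)?")

def encode_number(literal: str) -> str:
    """Rewrite e.g. '39.0' as '[NUM] 3 9 0 [EXP] 1 [/NUM]'."""
    digits = literal.replace(".", "")
    int_part = literal.split(".")[0]
    exponent = len(int_part) - 1  # base-10 magnitude of the leading digit
    return f"[NUM] {' '.join(digits)} [EXP] {exponent} [/NUM]"

def encode_text(text: str) -> str:
    """Replace every numeric literal in the text with its encoded form."""
    return NUM_RE.sub(lambda m: encode_number(m.group(0)), text)

if __name__ == "__main__":
    sample = "The model reached 39.0 percent strict accuracy on 1250 questions."
    print(encode_text(sample))
    # The model reached [NUM] 3 9 0 [EXP] 1 [/NUM] percent strict accuracy
    # on [NUM] 1 2 5 0 [EXP] 3 [/NUM] questions.
```

In a real pipeline the special tokens would be added to the tokenizer vocabulary and their embeddings learned during fine-tuning; for the method actually used in the paper, consult the GitHub repository linked above.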
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Wu, Y | - |
dc.contributor.author | Ting, HF | - |
dc.contributor.author | Lam, TW | - |
dc.contributor.author | Luo, R | - |
dc.date.accessioned | 2021-07-27T08:06:50Z | - |
dc.date.available | 2021-07-27T08:06:50Z | - |
dc.date.issued | 2021 | - |
dc.identifier.citation | The 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB 2021). Virtual Conference, 1-4 August 2021 | - |
dc.identifier.isbn | 9781450384506 | - |
dc.identifier.uri | http://hdl.handle.net/10722/301148 | - |
dc.description | BCB Session 6B: Ontologies & Databases | - |
dc.description.abstract | Biomedical question answering (QA) is playing an increasingly significant role in medical knowledge translation. However, current biomedical QA datasets and methods have limited capacity, as they commonly neglect the role of numerical facts in biomedical QA. In this paper, we constructed BioNumQA, a novel biomedical QA dataset that answers research questions using relevant numerical facts for biomedical QA model training and testing. To leverage the new dataset, we designed a new method called BioNumQA-BERT by introducing a novel numerical encoding scheme into the popular biomedical language model BioBERT to represent the numerical values in the input text. Our experiments show that BioNumQA-BERT significantly outperformed other state-of-the-art models, including DrQA, BERT and BioBERT (39.0% vs 29.5%, 31.3% and 33.2%, respectively, in strict accuracy). To improve the generalization ability of BioNumQA-BERT, we further pretrained it on a large biomedical text corpus and achieved 41.5% strict accuracy. BioNumQA and BioNumQA-BERT establish a new baseline for biomedical QA. The dataset, source code and pretrained model of BioNumQA-BERT are available at https://github.com/LeaveYeah/BioNumQA-BERT. | -
dc.language | eng | - |
dc.publisher | Association for Computing Machinery (ACM). | - |
dc.relation.ispartof | The 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB 2021) | -
dc.subject | Text Mining | - |
dc.subject | Biomedical Question Answering | - |
dc.subject | BERT | - |
dc.subject | Numerical encoding | - |
dc.title | BioNumQA-BERT: Answering Biomedical Questions Using Numerical Facts with a Deep Language Representation Model | - |
dc.type | Conference_Paper | - |
dc.identifier.email | Ting, HF: hfting@cs.hku.hk | - |
dc.identifier.email | Lam, TW: twlam@cs.hku.hk | - |
dc.identifier.email | Luo, R: rbluo@cs.hku.hk | - |
dc.identifier.authority | Ting, HF=rp00177 | - |
dc.identifier.authority | Lam, TW=rp00135 | - |
dc.identifier.authority | Luo, R=rp02360 | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1145/3459930.3469557 | - |
dc.identifier.scopus | eid_2-s2.0-85112390963 | - |
dc.identifier.hkuros | 323502 | - |
dc.identifier.isi | WOS:000722623700070 | - |
dc.publisher.place | New York | - |