Best of ASCO - 2014 Annual Meeting

 


Session: Health Services Research, Clinical Informatics, and Quality of Care

Type: Poster Session

Time: Saturday June 1, 1:15 PM to 4:15 PM

Location: Hall A

Use of machine learning to identify relevant research publications in clinical oncology.

Sub-category:
Clinical Informatics

Category:
Health Services Research, Clinical Informatics, and Quality of Care

Meeting:
2019 ASCO Annual Meeting

Abstract No:
6558

Poster Board Number:
Poster Session (Board #249)

Citation:
J Clin Oncol 37, 2019 (suppl; abstr 6558)

Author(s): Fernando Jose Suarez Saiz, Corey Sanders, Rick J Stevens, Robert Nielsen, Michael W Britt, Anita Preininger, Gretchen Jackson; IBM Watson Health, New York, NY; IBM Watson Health, Nashville, TN; IBM Watson Health, Cambridge, MA


Abstract:

Background: Finding high-quality science to support decisions for individual patients is challenging. Common approaches to assessing the quality and relevance of clinical literature rely on bibliometrics or expert knowledge. We describe a method to automatically identify clinically relevant, high-quality scientific citations using abstract content.

Methods: We used machine learning trained on text from PubMed papers cited in 3 expert resources: NCCN, NCI-PDQ, and Hemonc.org. Balanced training data comprised an “on-topic” set (i.e., relevant and high quality) of texts cited in at least two of these sources, and an “off-topic” set of texts not cited in any of the 3 sources. The off-topic texts were published in lower-ranked journals, as ranked by a citation-based score. All articles were part of an Oncology Clinical Trial corpus generated using a standard PubMed query. We used a gradient-boosted tree approach for supervised binary logistic classification. Briefly, 988 texts were processed to produce a term frequency-inverse document frequency (tf-idf) n-gram representation of both the training and test sets (70/30 split). Ideal parameters were determined using 1000-fold cross-validation.

Results: Our model classified papers in the test set with an accuracy of 0.93 (95% CI, 0.09 to 0.96; p ≤ 0.0001), a sensitivity of 0.95, and a specificity of 0.91. Some false positives contained language considered clinically relevant that may have been missed by, or not yet included in, the expert resources. False negatives revealed a potential bias towards chemotherapy-focused research over radiation therapy or surgical approaches.

Conclusions: Machine learning can be used to automatically identify relevant clinical publications from bibliographic databases without relying on expert curation or bibliometric methods. Using machine learning to identify relevant publications may reduce the time clinicians spend finding pertinent evidence for a patient. This approach generalizes to cases in which a corpus of high-quality publications exists to serve as a training set, or in which document metadata is unreliable, as with “grey” literature, both within oncology and in other diseases. Future work will extend this approach and may integrate it into oncology clinical decision-support tools.
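
The abstract gives no implementation details, so the following is only a minimal sketch of two pieces it mentions: a tf-idf n-gram representation of text, and the accuracy, sensitivity, and specificity computed from binary predictions. Everything here is an assumption made for illustration: the function names (`word_ngrams`, `tfidf`, `binary_metrics`), whitespace tokenization, unigrams plus bigrams, an unsmoothed idf of log(N / df), and the toy texts and labels, none of which come from the study. The gradient-boosted tree classifier itself is not shown.

```python
import math
from collections import Counter


def word_ngrams(text, n_max=2):
    """Lowercase and whitespace-tokenize text, then return every word n-gram for n = 1..n_max."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf(docs, n_max=2):
    """Return one {ngram: weight} dict per document.

    tf  = n-gram count / total n-grams in the document
    idf = log(number of documents / documents containing the n-gram), unsmoothed
    """
    counts = [Counter(word_ngrams(doc, n_max)) for doc in docs]
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())  # each document counts once per n-gram
    n_docs = len(docs)
    weights = []
    for c in counts:
        total = sum(c.values())
        weights.append({
            gram: (k / total) * math.log(n_docs / doc_freq[gram])
            for gram, k in c.items()
        })
    return weights


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for 0/1 labels (1 = on-topic).

    Assumes both classes occur in y_true, so neither denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of on-topic texts recovered
        "specificity": tn / (tn + fp),  # share of off-topic texts rejected
    }


# Toy inputs, invented for illustration only.
toy_docs = [
    "phase iii trial of chemotherapy",
    "phase ii trial of radiation",
]
features = tfidf(toy_docs)
scores = binary_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
```

In this toy case, n-grams shared by both texts (such as "trial of") receive a weight of zero, while n-grams unique to one text (such as "chemotherapy") receive a positive weight, which is the property that lets a downstream classifier focus on distinguishing vocabulary.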

 
Other Abstracts in this Sub-Category:

 

1. A predictive model for survival in non-small cell lung cancer (NSCLC) based on electronic health record (EHR) and tumor sequencing data at the Department of Veterans Affairs (VA).

Meeting: 2019 ASCO Annual Meeting Abstract No: 109 First Author: Nathanael Fillmore
Category: Health Services Research, Clinical Informatics, and Quality of Care - Clinical Informatics

 

2. A blinded evaluation of a clinical decision-support system at a regional cancer care center.

Meeting: 2019 ASCO Annual Meeting Abstract No: 6553 First Author: Suthida Suwanvecho
Category: Health Services Research, Clinical Informatics, and Quality of Care - Clinical Informatics

 

3. A framework for building a clinically relevant risk model.

Meeting: 2019 ASCO Annual Meeting Abstract No: 6554 First Author: Robert Michael Daly
Category: Health Services Research, Clinical Informatics, and Quality of Care - Clinical Informatics

 
