Pointers at a Glance
- Physicians often query a patient’s electronic health record (EHR) for data that helps them make treatment decisions.
- Researchers are building a machine-learning model that can help doctors find information in an EHR more efficiently.
Physicians frequently query a patient’s electronic health record (EHR) for data that helps them make treatment decisions, but the cumbersome nature of these records hampers the process. Researchers report that answering a single question takes more than eight minutes on average, even when the doctor has been trained to use an EHR.
The more time physicians must spend navigating clunky EHR interfaces, the less time they have to interact with patients and provide treatment.
Researchers have therefore started developing machine-learning models that can streamline the process by automatically finding the information physicians need in an EHR. However, training effective models requires massive datasets of relevant medical questions. Existing models struggle to generate authentic questions and often fail to find the correct answers.
To overcome the data shortage, MIT researchers collaborated with medical experts to study the questions physicians usually ask while reviewing EHRs. They then built a publicly available dataset of more than 2,000 clinically relevant questions written by these medical experts. They plan to use it to generate many authentic medical questions for training a machine-learning model that would help doctors find sought-after information in a patient’s record more efficiently.
Deficiency of Data
Eric Lehman, lead author and a graduate student in the Computer Science and Artificial Intelligence Laboratory (CSAIL), explains that the few large datasets of clinical questions the researchers found had many issues.
Some were composed of medical questions asked by patients on web forums, which are a far cry from the questions physicians ask. Other datasets contained questions produced from templates, so they are mostly similar in structure, which makes many of them unrealistic.
Lehman says that collecting high-quality data is vital for machine-learning tasks, especially in a health care context, and that the team has shown it can be done.
Cause For Concern
The researchers found that when a model was given trigger text, it could generate a good question 63% of the time, whereas a human physician would ask a good question 80% of the time.
They also trained models to recover answers to clinical questions using the publicly available datasets they had found at the outset of the project. They then tested these trained models to see whether they could find answers to “good” questions asked by human medical experts. The models recovered only about 25% of the answers to physician-generated questions.
Lehman says that the result is really concerning. Models that people thought performed well were, in practice, just awful, because the evaluation questions they were tested on were not good to begin with.
The team is now applying this work toward its primary goal: building a model that can automatically answer physicians’ questions in an EHR. As a next step, they will use their dataset to train a machine-learning model that can automatically generate thousands or millions of good clinical questions, which can then be used to train a new model for automatic question answering.
While there is still much work to do before that model could be a reality, Lehman is encouraged by the team’s strong initial results with this dataset.