A natural language processing algorithm has achieved 90% precision in automatically spotting signs of social isolation in cancer patients by “reading” clinical notes in a hospital’s electronic health record.
The feat was the work of behavioral scientists and biomedical informaticists at the Medical University of South Carolina (MUSC). Their findings were published online in BMC Medical Informatics and Decision Making.
Senior study author Chanita Hughes Halbert, PhD, and colleagues focused on patients with prostate cancer because previous research has found that these patients tend to withdraw from relationships due to the side effects of treatment, which often include incontinence, impotence or both.
The researchers developed a lexicon and NLP pipelines able to extract from digitized clinical notes terms like “lack of social support,” “lonely,” “social isolation,” “no friends” and “loneliness.”
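The core idea behind such a pipeline, matching a curated lexicon of isolation-related terms against free-text notes, can be sketched in a few lines. The snippet below is a minimal illustration of lexicon-based matching, not the study's actual pipeline; the term list is a small sample drawn from the article, and the note text is a placeholder.

```python
import re

# A small sample of the kind of lexicon described in the study
# (the actual MUSC lexicon is larger and prostate-cancer-specific).
LEXICON = [
    "lack of social support",
    "lonely",
    "social isolation",
    "no friends",
    "loneliness",
]

# One case-insensitive pattern per term, bounded at word edges
# so "lonely" does not match inside a longer word.
PATTERNS = [re.compile(r"\b" + re.escape(t) + r"\b", re.IGNORECASE)
            for t in LEXICON]

def find_isolation_terms(note: str) -> list:
    """Return the lexicon terms found in a single clinical note."""
    return [t for t, p in zip(LEXICON, PATTERNS) if p.search(note)]

# Placeholder note text, for illustration only.
note = "Patient reports feeling lonely and has no friends nearby."
print(find_isolation_terms(note))  # → ['lonely', 'no friends']
```

A production system would add negation handling ("denies loneliness") and context checks, which is part of what makes clinical NLP harder than simple keyword search.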
They trained the algorithm on 55,516 clinical notes from a random sample of 3,130 patients and used a dataset of 1,057 other prostate cancer patients as their test set.
The tool identified 35 unique patients, about 1.2% of the experimental cohort, who were likely to be suffering from social isolation.
The smaller test dataset turned up 17 patients (1.6%) with terms indicating social isolation.
Manually reviewing 154 records from 52 randomly selected control patients, the team found that the algorithm returned only four false positives and one false negative for social isolation. That is how the team arrived at the reported 90% precision.
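Precision here is the standard information-retrieval measure: true positives divided by everything the tool flagged as positive. Working backward from the figures in the article (four false positives, 90% precision) implies roughly 36 true positives in the manual review; that count is an inference for illustration, not a number taken from the paper.

```python
def precision(tp: int, fp: int) -> float:
    """Precision = true positives / all predicted positives."""
    return tp / (tp + fp)

# Four false positives were reported; about 36 true positives would
# yield the stated 90% precision (an inferred count, not from the paper).
print(round(precision(36, 4), 2))  # → 0.9
```

Note that the single false negative affects recall, not precision, which is why it does not enter this calculation.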
“Our NLP approach demonstrates a highly accurate approach to identify social isolation when such information is available in clinical notes,” the authors concluded, emphasizing that their lexicon is specific to patients with prostate cancer and so is not likely generalizable to other patient populations.
In coverage of the study by MUSC’s news office, lead study author Vivienne Zhu, MD, points out that patients often tell their doctors about social isolation and other health factors during office visits.
“But you won’t find that in the coded data,” she adds. “You have to look at the clinical notes—that’s where the information is embedded.”
The MUSC news writer points out that the algorithm was able to comb through tens of thousands of clinical notes in only eight seconds. A human performing the same task, she adds, would probably have to work many months to get the job done.