Most of us would agree that extracting clinical data from unstructured physician notes would be great. At present, few organizations have deployed such tools, nor have EMR vendors come to the rescue en masse, and the conventional wisdom holds that text analytics would be crazy expensive. I’ve always suspected that digging out and analyzing this data may be worth the trouble, however.
That’s why I really dug a recent article from HealthCatalyst’s Eric Just, which seemed to offer some worthwhile ideas on how to use text analytics effectively. Just, who is senior vice president of product development, made a good case for giving this approach a try. (Note: HealthCatalyst and partner Regenstrief Institute offer solutions in this area.)
The article includes an interesting case study explaining how healthcare text analytics performed head-to-head against traditional research methods.
It tells the story of a team of analysts in Indiana that set out to identify peripheral artery disease (PAD) patients across two health systems. At first, things weren’t going well. When researchers looked at EMR and claims data, they found that these sources failed to identify over 75% of patients with this condition, but text analytics improved their results dramatically.
Using ICD and CPT codes for PAD, and standard EMR data searches, team members had identified fewer than 10,000 patients with the disorder. However, once they developed a natural language processing tool designed to sift through text-based data, they discovered that there were at least 41,000 PAD patients in the population they were studying.
To get these kinds of results, Just says, there are three key features a medical text analytics tool should have:
- The medical text analytics software should tailor results to a given user’s needs. For example, he notes that if the user doesn’t have permission to view PHI, the analytics tool should display only nonprivate data.
- Medical text analytics tools should integrate medical terminology to improve the scope of searches. For example, when a user searches for the term “diabetes,” the tool should automatically return results for “NIDDM” as well, as this broadens the search to include more relevant content.
- Text analytics algorithms should do more than just find relevant terms; they should provide context as well as content. For example, a search for patients with “pneumonia,” done without considering context, would also bring up phrases like “no history of pneumonia.” A better tool would be able to rule out phrases like “no history of pneumonia” or “family history of pneumonia” from a search for patients who have been treated for this illness.
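
To make the second and third features a bit more concrete, here is a minimal Python sketch of terminology expansion plus context filtering. It is only an illustration, not the approach Just or HealthCatalyst actually uses: the `SYNONYMS` map, the `EXCLUSION_CUES` list, the function names, and the sample notes are all hypothetical, and a real tool would presumably draw on a curated medical vocabulary and far more robust language processing.

```python
import re

# Hypothetical synonym map; a real tool would rely on a curated terminology.
SYNONYMS = {
    "diabetes": ["diabetes", "NIDDM"],
    "pneumonia": ["pneumonia"],
}

# Hypothetical cue phrases suggesting the patient does not have the condition.
EXCLUSION_CUES = ["no history of", "family history of"]


def expand_term(term):
    """Return the search term plus any known synonyms (assumed mapping)."""
    return SYNONYMS.get(term.lower(), [term])


def mentions_condition(note, term):
    """Return True if the note has a mention of the term (or a synonym)
    that is not preceded by an exclusion cue in the same sentence."""
    # Crude sentence split on periods, semicolons, and newlines.
    for sentence in re.split(r"[.;\n]+", note):
        lowered = sentence.lower()
        for variant in expand_term(term):
            pattern = r"\b" + re.escape(variant.lower()) + r"\b"
            for match in re.finditer(pattern, lowered):
                preceding = lowered[:match.start()]
                if not any(cue in preceding for cue in EXCLUSION_CUES):
                    return True
    return False


# Made-up notes for illustration only.
NOTES = {
    "note-1": "Patient was treated for pneumonia last winter.",
    "note-2": "No history of pneumonia.",
    "note-3": "Family history of pneumonia; presents with cough.",
    "note-4": "Longstanding NIDDM, managed with metformin.",
}

pneumonia_hits = [nid for nid, text in NOTES.items()
                  if mentions_condition(text, "pneumonia")]
diabetes_hits = [nid for nid, text in NOTES.items()
                 if mentions_condition(text, "diabetes")]

print(pneumonia_hits)
print(diabetes_hits)
```

A plain keyword search for “pneumonia” would return the first three notes; the context check keeps only the note describing actual treatment, while the synonym map lets a “diabetes” search pick up the note that only says “NIDDM.” Treating any cue that appears earlier in the sentence as an exclusion is deliberately crude, and it will misfire on longer, more complicated sentences.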
The piece goes into far more detail than I can summarize here, so I recommend you read it in full if you’re interested in leveraging text analytics for your organization.
But for what it’s worth, I came away from the piece with the sense that analyzing your clinical textual information is well worth the trouble, particularly if EMR vendors begin to add such tools to their systems. After all, when it comes to improving outcomes, we need all the help we can get.