
Will Data Aggregation For Precision Medicine Compromise Patient Privacy?

Posted on April 10, 2017 | Written By

Anne Zieger is a healthcare journalist who has written about the industry for 30 years. Her work has appeared in all of the leading healthcare industry publications, and she's served as editor in chief of several healthcare B2B sites.

Like anyone else who follows medical research, I’m fascinated by the progress of precision medicine initiatives. I often find myself explaining to relatives that in the (perhaps far distant) future, their doctor may be able to offer treatments customized specifically for them. The prospect is awe-inspiring even for me, someone who’s been researching and writing about health data for decades.

That said, bringing so much personal information together into a giant database poses real problems, suggests Jennifer Kulynych in an article for OUPblog, published by Oxford University Press. In particular, amassing a trove of individual medical histories and genomes may have serious privacy implications, she says.

In arguing her point, she makes a sobering observation that rings true for me:

“A growing number of experts, particularly re-identification scientists, believe it simply isn’t possible to de-identify the genomic data and medical information needed for precision medicine. To be useful, such information can’t be modified or stripped of identifiers to the point where there’s no real risk that the data could be linked back to a patient.”

As she points out, norms in the research community make it even more likely that patients could be individually identified. For example, while a doctor might need your permission to test your blood for care, in some states it’s quite legal for a researcher to take possession of blood not needed for that care, she says. Those researchers can then sequence your genome and place that data in a research database, and the patient may never have consented to this, or even know that it happened.

And there are other, perhaps even more troubling ways in which existing laws fail to protect the privacy of patients in researchers’ data stores. For example, current research and medical regs let review boards waive patient consent, or even allow researchers to call DNA sequences “de-identified” data. This flies in the face of the growing expert consensus that such data carries real re-identification risk, she writes.

On top of all of this, the technology already exists to leverage this information for personal identification. For example, genome sequences can potentially be re-identified through comparison to a database of identified genomes. Law enforcement organizations have already used genomic data to predict key aspects of an individual’s appearance, such as eye color and ancestry.
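To see why comparison against an identified database is so powerful, consider a toy sketch: match an “anonymous” set of SNP calls against identified genomes by the fraction of shared variants that agree. All names and variants below are invented for illustration; real re-identification works on thousands of markers, but the principle is the same.

```python
# Toy illustration of genomic re-identification: score an unlabeled
# SNP profile against each identified profile by the fraction of
# shared markers whose genotype calls agree, and return the best hit.

def match_profile(anon_snps, identified_db):
    """Return (best matching identity, agreement score in [0, 1])."""
    best_name, best_score = None, 0.0
    for name, snps in identified_db.items():
        shared = set(anon_snps) & set(snps)
        if not shared:
            continue
        agree = sum(anon_snps[s] == snps[s] for s in shared)
        score = agree / len(shared)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score

# Invented database of identified genomes (marker -> genotype call)
identified_db = {
    "alice": {"rs1": "AG", "rs2": "CC", "rs3": "TT"},
    "bob":   {"rs1": "AA", "rs2": "CT", "rs3": "TC"},
}
# A supposedly "de-identified" research record
anon = {"rs1": "AG", "rs2": "CC", "rs3": "TT"}
print(match_profile(anon, identified_db))  # → ('alice', 1.0)
```

The stripped record carries no name, yet a single lookup against an identified reference collection recovers the identity with perfect agreement.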

Then there’s the issue of what happens with EMR data storage. As the author notes, healthcare organizations are increasingly adding genomic data to their stores, and sharing it widely with individuals on their network. While such practices are largely confined to academic research institutions today, this type of data use is growing, and could also expose patients to involuntary identification.

Not everyone is as concerned as Kulynych about these issues. For example, a group of researchers recently concluded that a single patient anonymization algorithm could offer a “standard” level of privacy protection to patients, even when the organizations involved are sharing clinical data. They argue that larger clinical datasets using this approach could protect patient privacy without generalizing or suppressing data in a manner that would undermine its usefulness.

But if nothing else, it’s hard to argue with Kulynych’s central concern: that too few rules have been updated to reflect the realities of big genomic and medical data stores. Clearly, state and federal rules need to address the emerging problems associated with big data and privacy. Otherwise, by the time a major privacy breach occurs, neither patients nor researchers will have any recourse.

Connected Wearables Pose Growing Privacy, Security Risks

Posted on December 26, 2016 | Written By Anne Zieger

In the past, the healthcare industry treated wearables as irrelevant, distracting or worse. But over the last year or two, things have changed, with most health IT leaders concluding that wearables data has a place in their data strategies, at least in the aggregate.

The problem is, we’re making the transition to wearable data collection so quickly that some important privacy and security issues aren’t being addressed, according to a new report by American University and the Center for Digital Democracy. The report, Health Wearable Devices in the Big Data Era: Ensuring Privacy, Security, and Consumer Protection, concludes that the “weak and fragmented” patchwork of state and federal health privacy regulations doesn’t really address the problems created by wearables.

The researchers note that as smart watches, wearable health trackers, sensor-laden clothing and other monitoring technology get connected and sucked into the health data pool, the data is going places the users might not have expected. And they see this as a bit sinister. From the accompanying press release:

“Many of these devices are already being integrated into a growing Big Data digital health and marketing ecosystem, which is focused on gathering and monetizing personal and health data in order to influence consumer behavior.”

According to the authors, it’s high time to develop a comprehensive approach to health privacy and consumer protection, given the increasing importance of Big Data and the Internet of Things. If safeguards aren’t put in place, patients could face serious privacy and security risks, including “discrimination and other harms,” according to American University professor Kathryn Montgomery.

If regulators don’t act quickly, they could miss a critical window of opportunity, she suggested. “The connected health system is still in an early, fluid stage of development,” Montgomery said in a prepared statement. “There is an urgent need to build meaningful, effective, and enforceable safeguards into its foundation.”

The researchers also offer guidance for policymakers ready to take up this challenge. Their recommendations include clear, enforceable standards for both the collection and use of information; formal processes for assessing the benefits and risks of data use; and stronger regulation of direct-to-consumer marketing by pharmas.

Now readers, I imagine some of you are feeling that I’m pointing all of this out to the wrong audience. And yes, there’s little doubt that the researchers are most worried about consumer marketing practices that fall far outside of your scope.

That being said, just because providers have different motives than the pharmas when they collect data – largely to better treat health problems or improve health behavior – doesn’t mean that you aren’t going to make mistakes here. If nothing else, the line between leveraging data to help people and using it to get your way is clearer in theory than in practice.

You may think that you’d never do anything unethical or violate anyone’s privacy, and maybe that’s true, but it doesn’t hurt to consider the possible harms that can come from collecting a massive pool of data. Nobody can afford to get complacent about the privacy and security risks involved. Plus, don’t think that nefarious (and not-so-nefarious) healthcare data aggregators aren’t coming after provider-stored health data as well.

E-Patient Update: The Patient Data Engagement Leader

Posted on October 20, 2016 | Written By Anne Zieger

As healthcare delivery models shift responsibility for patient health to the patients themselves, it’s becoming more important to give them tools to help them get and stay healthy. Increasingly, digital health tools are filling the bill.

For example, portals are moving from largely billing and scheduling apps to exchanging patient data, holding two-way conversations between patient and doctor and even tracking key indicators like blood glucose levels. Wearables are slowly becoming capable of helping doctors improve diagnoses, and patterns revealed by big data should soon be used to create personalized treatment plans.

The ultimate goal of all this, of course, is to push as much data power as possible into the hands of consumers. After all, for patients to be engaged with their health, it helps to make them feel in control, and the more sophisticated information they get, the better choices they can make. Or at least that’s how the traditional script reads.

Now, as an e-patient, the above is certainly true for me. Every incremental improvement in the data I get brings me closer to taking on otherwise overwhelming health challenges. That’s true, in part, because I’m comfortable reading charts, extrapolating conclusions from data points and visualizing ways to make use of the information. But if you want less tech-friendly patients to get on board, they’re going to need help.

The patient engagement leader

And where will that help come from? I’d argue that hospitals and clinics need to create a new position dedicated to patient engagement, including though not limited to helping patients make their health data their own. This position would cut across several disciplines, ranging from patient health education and clinical medicine to data analytics.

The person owning this position would need to be current on patient engagement goals across the population and by disease/condition type; understand the preferred usage patterns established by the hospital, ACO, delivery network or clinic; and understand trends in health behavior well enough to help steer patients in the right direction.

It also wouldn’t hurt if such a person had a healthy dose of marketing skills under their belt, as part of the patient engagement process is simply selling consumers on the idea that they can and should take more responsibility for their health outcomes. Speaking from personal experience, a good marketer can wheedle, nudge and empower people by turns, and this will be very necessary to boost your engagement.

While this could be a middle management position, it would at least need the full support of the C-suite. After all, you can’t promote population-wide improvements in health by nibbling around the edges of the problem. Such measures need to be comprehensive and strategic to the mission of the healthcare organization as a whole, and the person behind them needs the authority to see them through.

Patients in control

If things go right, establishing this position would lead to a better-educated, more confident patient population with a greater sense of self-efficacy regarding their health. While specific goals would vary from one healthcare organization to another, such an initiative would ideally produce improvements in key metrics such as population-wide A1c levels, drops in hospital admission and readmission rates and, simultaneously, lower spending on more intense modes of care.

Not only that, you could very well see patient satisfaction increase as well. After all, patients may not feel capable of making important health changes on their own, and if you help them do that it stands to reason that they’ll appreciate it.

Ultimately, engaging patients with their health calls for participation by everyone who touches the patient, from techs to the physician, nurses to the billing department. But if you put a patient engagement officer in place, it’s more likely that these efforts will have a focus.

Validic Survey Raises Hopes of Merging Big Data Into Clinical Trials

Posted on September 30, 2016 | Written By

Andy Oram is an editor at O'Reilly Media, a highly respected book publisher and technology information provider. An employee of the company since 1992, Andy currently specializes in open source, software engineering, and health IT, but his editorial output has ranged from a legal guide covering intellectual property to a graphic novel about teenage hackers. His articles have appeared often on EMR & EHR and other blogs in the health IT space. Andy also writes often for O'Reilly's Radar site and other publications on policy issues related to the Internet and on trends affecting technical innovation and its effects on society. Print publications where his work has appeared include The Economist, Communications of the ACM, Copyright World, the Journal of Information Technology & Politics, Vanguardia Dossier, and Internet Law and Business. Conferences where he has presented talks include O'Reilly's Open Source Convention, FISL (Brazil), FOSDEM, and DebConf.

Validic has been integrating medical device data with electronic health records, patient portals, remote patient monitoring platforms, wellness challenges, and other health databases for years. On Monday, they highlighted a particularly crucial and interesting segment of their clientele by releasing a short report based on a survey of clinical researchers. And this report, although it doesn’t go into depth about how pharmaceutical companies and other researchers are using devices, reveals great promise in their use. It also opens up discussions of whether researchers could achieve even more by sharing this data.

The survey broadly shows two trends toward the productive use of device data:

  • Devices can report changes in a subject’s condition more quickly and accurately than conventional subject reports (which involve marking observations down by hand or coming into the researcher’s office). Of course, this practice raises questions about the device’s own accuracy. Researchers will probably splurge for professional or “clinical-grade” devices that are more reliable than consumer health wearables.

  • Devices can keep the subject connected to the research for months or even years after the end of the clinical trial. This connection can turn up long-range side effects or other impacts from the treatment.

Together these advances address two of the most vexing problems of clinical trials: their cost (and length) and their tendency to miss subtle effects. The cost and length of trials form the backbone of the current publicity campaign by pharma companies to justify price hikes that have recently brought them public embarrassment and opprobrium. Regardless of the relationship between the cost of trials and the cost of the resulting drugs, everyone would benefit if trials could demonstrate results more quickly. Meanwhile, longitudinal research with massive amounts of data can reveal the kinds of problems that led to the Vioxx scandal–but also new off-label uses for established medications.

So I’m excited to hear that two-thirds of the respondents are using “digital health technologies” (which covers mobile apps, clinical-grade devices, and wearables) in their trials, and that nearly all respondents plan to do so over the next five years. Big data benefits are not the only ones they envision. Some of the benefits have more to do with communication and convenience–and these are certainly commendable as well. For instance, if a subject can transmit data from her home instead of having to come to the office for a test, the subject will be much more likely to participate and provide accurate data.

Another trend hinted at by the survey was a closer connection between researchers and patient communities. Validic announced the report in a press release that is quite informative in its own right.

So over the next few years we may enter the age that health IT reformers have envisioned for some time: a merger of big data and clinical trials in a way to reap the benefits of both. Now we must ask the researchers to multiply the value of the data by a whole new dimension by sharing it. This can be done in two ways: de-identifying results and uploading them to public or industry-maintained databases, or providing identifying information along with the data to organizations approved by the subject who is identified. Although researchers are legally permitted to share de-identified information without subjects’ consent (depending on the agreements they signed when they began the trials), I would urge patient consent for all releases.

Pharma companies are already under intense pressure for hiding the results of trials–but even the new regulations cover only results, not the data that led to those results. Organizations such as Sage Bionetworks, which I have covered many times, are working closely with pharmaceutical companies and researchers to promote both the software tools and the organizational agreements that foster data sharing. Such efforts allow people in different research facilities and even on different continents to work on different aspects of a target and quickly share results. Even better, someone launching a new project can compare her data to a project run five years before by another company. Researchers will have millions of data points to work with instead of hundreds.

One disappointment in the Validic survey was that only a minority of respondents saw a return on investment from their use of devices. With responsible data sharing, the next Validic survey may see this response rate rise considerably.

Can Machine Learning Tame Healthcare’s Big Data?

Posted on September 20, 2016 | Written By Anne Zieger

Big data is both a blessing and a curse. The blessing is that if we use it well, it will tell us important things we don’t know about patient care processes, clinical improvement, outcomes and more. The curse is that if we don’t use it, we’ve got a very expensive and labor-hungry boondoggle on our hands.

But there may be hope for progress. One article I read today suggests that another technology may hold the key to unlocking these blessings — that machine learning may be the tool which lets us harvest the big data fields. The piece, whose writer, oddly enough, was cited only as “Mauricio,” a lead cloud expert, argues that machine learning is “the most effective way to excavate buried patterns in the chunks of unstructured data.” While I am an HIT observer rather than a techie, what limited tech knowledge I possess suggests that machine learning will play an important role in taming big data in healthcare.

In the piece, Mauricio notes that big data is characterized by high volume (both structured and unstructured data), high velocity (data flowing into databases every working second), variety (anything from texts and email to audio to financial transactions), complexity (data coming from multiple incompatible sources) and variability (fluctuating data flow rates).

Though his is a general analysis, I’m sure we can agree that healthcare big data matches his description. I don’t know whether those of you reading this include wild cards like social media content or video in your big data repositories, but even if you don’t, you may well in the future.

Anyway, for the purposes of this discussion, let’s summarize by saying that in this context, big data isn’t just made of giant repositories of relatively normalized data, it’s a whirlwind of structured and unstructured data in a huge number of formats, flooding into databases in spurts, trickles and floods around the clock.

To Mauricio, an obvious choice for extracting value from this chaos is machine learning, which he defines as a data analysis method that automates model building. In machine learning models, systems adapt independently, without human interaction, by automatically applying customized algorithms and mathematical calculations to big data. “Machine learning offers a deeper insight into collected data and allows the computers to find hidden patterns which human analysts are bound to miss,” he writes.
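That definition is abstract, so here is a minimal, self-contained sketch of the idea: a tiny perceptron that “learns” to flag high-risk records from feedback alone, with no hand-written rules. The data points (normalized glucose and BMI values), labels and learning parameters are all invented for illustration; real clinical models are vastly more sophisticated, but the adapt-from-data loop is the same.

```python
# Minimal machine learning sketch: a perceptron adjusts its weights
# from prediction errors until it separates high-risk from low-risk
# records, discovering the pattern rather than being told it.

def train_perceptron(data, epochs=50, lr=0.1):
    """Learn weights and bias from (features, label) pairs."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, label in data:
            pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
            err = label - pred          # feedback drives the update
            w[0] += lr * err * x[0]
            w[1] += lr * err * x[1]
            b += lr * err
    return w, b

def predict(model, x):
    w, b = model
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

# Synthetic records: (normalized glucose, normalized BMI) -> risk flag
data = [((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.2, 0.3), 0), ((0.3, 0.1), 0)]
model = train_perceptron(data)
print([predict(model, x) for x, _ in data])  # → [1, 1, 0, 0]
```

Nothing in the code says “high glucose means high risk”; the weights end up encoding that pattern purely because the training data exhibits it, which is the point Mauricio is making about hidden patterns.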

According to the author, there are already machine learning models in place which help predict the appearance of genetically-influenced diseases such as diabetes and heart disease. Other possibilities for machine learning in healthcare – which he doesn’t mention but are referenced elsewhere – include getting a handle on population health. After all, an iterative learning technology could be a great choice for making predictions about population trends. You can probably think of several other possibilities.

Now, like many other industries, healthcare suffers from a data silo problem, and we’ll have to address that issue before we create the kind of multi-source, multi-format data pool that Mauricio envisions. Leveraging big data effectively will also require people to cooperate across departmental and even organizational boundaries, as John Lynn noted in a post from last year.

Even so, it’s good to identify tools and models that can help get the technical work done, and machine learning seems promising. Have any of you experimented with it?

OCHIN Shows That Messy Data Should Not Hold Back Health Care

Posted on September 12, 2016 | Written By Andy Oram

The health care industry loves to complain about patient data. It’s full of errors, which can be equally the fault of patients or staff. And hanging over the whole system is lack of interoperability, which hampers research.

Well, it’s not as if the rest of the universe is a pristine source of well-formed statistics. Every field has to deal with messy data. And somehow retailers, financial managers, and even political campaign staff manage to extract useful information from the data soup. This doesn’t mean that predictions are infallible–after all, when I check a news site about the Mideast conflicts, why does the publisher think I’m interested in celebs from ten years ago whose bodies look awful now? But there is still no doubt that messy data can transform industry.

I’m all for standards and for more reliable means of collecting and vetting patient data. But for the foreseeable future, health care institutions are going to have to deal with suboptimal data. And OCHIN is one of the companies that shows how it can be done.

I recently had a chance to talk with OCHIN CEO Abby Sears and the Vice President of Data Services and Integration, Clayton Gillett, and to see a demo of OCHIN’s analytical tool, Acuere. Their basic offering is a no-nonsense interface that lets clinicians and administrators do predictions and hot-spotting.

Acuere is part of a trend in health care analytics that goes beyond clinical decision support and marshals large amounts of data to help with planning (see an example screen in Figure 1). For instance, a doctor can rank her patients by the number of alerts the system generates (a patient with diabetes whose glucose is getting out of control, or a smoker who hasn’t received counseling for smoking cessation). An administrator can rank a doctor against others in the practice. This summary just gives a flavor of the many services Acuere can perform; my real thrust in this article is to talk about how OCHIN obtains and processes its data. Sears and Gillett talked about the following challenges and how they’re dealing with them.

Figure 1. Acuere Provider Report Card

Patient identification
Difficulties in identifying patients and matching their records have repeatedly surfaced as the biggest barrier to information exchange and use in the US health care system. A 2014 ONC report cites it as a major problem (on pages 13 and 20). An article I cited earlier also blames patient identification for many of the problems of health care analytics. But the American public and Congress have been hostile to unique identifiers for some time, so health care institutions just have to get by without them.

OCHIN handles patient matching as other institutions, such as Health Information Exchanges, do. They compare numerous fields of records–not just obvious identifiers such as name and Social Security number, but address, demographic information, and perhaps a dozen other things. Sears and Gillett said it’s also hard to know which patients to attribute to each health care provider.
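As a rough illustration of that field-by-field comparison (not OCHIN’s actual algorithm), a matcher might weight agreement on each demographic field and accept a pair of records as the same patient only above a threshold. The fields, weights, threshold and records below are all invented:

```python
# Sketch of weighted multi-field patient matching: score how many
# fields agree between two records, weighted by how identifying each
# field is, and compare the total against an acceptance threshold.

FIELD_WEIGHTS = {"name": 0.3, "dob": 0.3, "zip": 0.2, "sex": 0.1, "phone": 0.1}

def match_score(rec_a, rec_b):
    """Sum the weights of fields present in both records that agree."""
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is not None and b is not None and a == b:
            score += weight
    return score

def is_same_patient(rec_a, rec_b, threshold=0.7):
    return match_score(rec_a, rec_b) >= threshold

a = {"name": "J SMITH", "dob": "1970-01-02", "zip": "97201", "sex": "F", "phone": "555-0100"}
b = {"name": "J SMITH", "dob": "1970-01-02", "zip": "97201", "sex": "F"}   # phone missing
c = {"name": "J SMITH", "dob": "1971-03-04", "zip": "10001", "sex": "F"}   # different person
print(is_same_patient(a, b), is_same_patient(a, c))  # → True False
```

Production systems add fuzzy string comparison, frequency-adjusted weights and probabilistic (Fellegi-Sunter-style) scoring, but the core idea of accumulating evidence across many fields is the same.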

Data sources
The recent Precision Medicine initiative seeks to build “a national research cohort of one million or more U.S. participants.” But OCHIN already has a database on 7.6 million people and has signed more contracts to reach 10 million this fall. Certainly, there will be advantages to the Precision Medicine database. First, it will contain genetic information, which OCHIN’s data suppliers don’t have. Second, all the information on each person will be integrated, whereas OCHIN has to take de-identified records from many different suppliers and try to integrate them using the techniques described in the previous section, plus check for differences and errors in order to produce clean data.

Nevertheless, OCHIN’s data is impressive, and it took a lot of effort to accumulate it. They get not only medical data but information about the patient’s behavior and environment. Along with 200 different vital signs, they can map the patient’s location to elements of the neighborhood, such as income levels and whether healthy food is sold in local stores.

They get Medicare data from qualified entities who were granted access to it by CMS, Medicaid data from the states, patient data from commercial payers, and even data on the uninsured (a population that is luckily shrinking) from providers who treat them. Each institution exports data in a different way.

How do they harmonize the data from these different sources? Sears and Gillett said it takes a lot of manual translation. Data is divided into seven areas, such as medications and lab results. OCHIN uses standards whenever possible and participates in groups that set standards. There are still labs that don’t use LOINC codes to report results, as well as pharmacies and doctors who don’t use RxNorm for medications. Even ICD-10 changes yearly, as codes come and go.
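A toy sketch of that translation step: a crosswalk table mapping each supplier’s local lab codes onto a common LOINC code, with unmapped codes flagged for human review. The source names and local codes are invented; the two LOINC codes shown are the standard ones for serum glucose (2345-7) and hemoglobin A1c (4548-4):

```python
# Sketch of lab-code harmonization: each (source, local code) pair is
# translated to a shared LOINC code so results from different labs can
# be compared; anything missing from the crosswalk goes to manual review.

LOCAL_TO_LOINC = {
    ("lab_a", "GLU"):   "2345-7",   # glucose, serum/plasma
    ("lab_b", "GLUC1"): "2345-7",   # same test, different local code
    ("lab_a", "A1C"):   "4548-4",   # hemoglobin A1c
}

def harmonize(source, local_code):
    """Return the LOINC code, or None to flag the record for manual mapping."""
    return LOCAL_TO_LOINC.get((source, local_code))

print(harmonize("lab_a", "GLU"))    # both sources land on the same code
print(harmonize("lab_b", "GLUC1"))
print(harmonize("lab_c", "XYZ"))    # unmapped: needs the manual translation step
```

The hard part, as Sears and Gillett note, is building and maintaining the crosswalk itself, by hand, for every supplier that doesn’t report in LOINC or RxNorm natively, and updating it as code systems like ICD-10 change each year.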

Data handling
OCHIN isn’t like a public health agency that may be happy sharing data 18 months after it’s collected (as I was told at a conference). OCHIN wants physicians and their institutions to have the latest data on patients, so they carry out millions of transactions each day to keep their database updated as soon as data comes in. Their analytics run multiple times every day, to provide the fast results that users get from queries.

They are also exploring the popular “big data” forms of analytics that are sweeping other industries: machine learning, using feedback to improve algorithms, and so on. Currently, the guidance they offer clinicians is based on traditional clinical recommendations from randomized trials. But they are seeking to expand those sources with other insights from light-weight methods of data analysis.

So data can be useful in health care. Modern analytics should be available to every clinician. After all, OCHIN has made it work. And they don’t even serve up ads for chronic indigestion or 24-hour asthma relief.

E-Patient Update: When EMRs Didn’t Matter, But Should Have

Posted on July 27, 2016 | Written By Anne Zieger

The other day I went to an urgent care clinic, suffering from a problem which needed attention promptly. This clinic is part of the local integrated health system’s network, where I’ve been seen for nearly 20 years. This system uses Epic everywhere in its network to coordinate care.

I admittedly arrived rather late and close to when the clinic was going to close. But I truly didn’t want to make a wasteful visit to the ED, so I pressed on and presented myself to the receptionist. And sadly, that’s where things got a bit hairy.

The receptionist said: “We’ve already got five patients to see so we can’t see anyone else.” Uncomfortable as I was, I fought back with what seemed like logic to me: “I need help and a hospital would be a waste. Could someone please check my medical records? The doctors will understand what I need and why it’s urgent.”

The receptionist got the nurse, who said “I’m sorry, but we aren’t seeing any more patients today.” I asked, “But what about the acuity of a given case, such as mine for example? Can’t you prioritize me? It’s all in my medical records and I know you’re online with Epic!”  She shook her head at me and walked away.

I sat in reception for a while, too irritated to walk out and too uncomfortable to let go of the issue. Man, it was no fun, and I called those folks some not-nice things in my mind – but more than anything else, wondered why they wouldn’t look at data on a well-documented patient like me for even a moment.

About 20 minutes before the place officially closed for the night, a nurse practitioner I know (let’s call him Ed) walked out into the waiting room and asked me what I needed. I explained in just a few words what I was after. Ed, who had reviewed my record, knew what I needed, knew why it was important and made it happen within five minutes. Officially, he wasn’t supposed to do that, but he felt comfortable helping because he was well-informed.

Truthfully, I realize this story is relatively trivial, but as I see it, it brings an important issue to the fore. And the issue is that even when seeing chronically-ill patients such as myself, whose comings and goings are well documented, providers can’t or won’t do much to exploit that data.

You hear a lot of talk about big data and analytics, and how they’ll change healthcare or even the world as we know it. But what about finding ways to better use “small data” produced by a single patient? It seems to me that clinicians don’t have the right tools to take advantage of a single patient’s history, or find it too difficult to do so. Either way, though, something must be done.

I know from personal experience that if clinicians don’t know my history, they can’t treat me efficiently and may drive up costs by letting me get sicker. And we need more Eds out there making the save. So let’s make the chart do a better job of mining a patient’s data. Otherwise, having an EMR hardly matters.

A Tale of 2 T’s: When Analytics and Artificial Intelligence Go Bad

Posted on July 13, 2016 I Written By

Prashant Natarajan Iyer (AKA "PN") is an analytics and data science professional based in Silicon Valley, CA. He is currently Director of Product Management for Healthcare products. His experience includes progressive and leadership roles in business strategy, product management, and customer happiness at Siemens, McKesson, Healthways, and Oracle. He is currently coauthoring HIMSS’ next book on big data and machine learning for healthcare executives, along with Herb Smaltz, PhD, and John Frenzel, MD. He is a huge fan of SEC college football, Australian Cattle Dogs, and the hysterically-dubbed original Iron Chef TV series. He can be found on Twitter @natarpr and on LinkedIn. All opinions are purely mine and do not represent those of my employer or anyone else!

Editor’s Note: We’re excited to welcome Prashant to the Healthcare Scene family. He brings tremendous insights into the ever evolving field of healthcare analytics. We feel lucky to have him sharing his deep experience and knowledge with us. We hope you’ll enjoy his first contribution below.

Analytics and artificial intelligence (AI) are generating buzz and making inroads into healthcare informatics. Today’s healthcare organization is dealing with increasing digitization – data variety, velocity, and volume are all growing in complexity – and users want more data and information via analytics. In addition to the new frontiers opening up in structured and unstructured data analytics, our industry and its people (patients included) are recognizing opportunities for predictive/prescriptive analytics, artificial intelligence, and machine learning in healthcare – within and outside a facility’s four walls.

Trends that influence these new opportunities include:

  1. Increasing use of smart phones and wellness trackers as observational data sources, for medical adherence, and as behavior modification aids
  2. An expanding Internet of Healthcare Things (IoHT) – bedside monitors, home monitors, implants, etc. – creating data in real time, including noise (data that are not relevant to expected usage)
  3. Social network participation
  4. Organizational readiness
  5. Technology maturity

The potential for big data in healthcare – especially given the trends discussed above – is as bright as in any other industry. The benefits that big data analytics, AI, and machine learning can provide for healthier patients, happier providers, and cost-effective care are real. The future of precision medicine, population health management, clinical research, and financial performance will include an increased role for machine-analyzed insights, discoveries, and all-encompassing analytics.

As we start this journey to new horizons, it may be useful to examine the maps, trails, and artifacts left behind by pioneers. To this end, we will examine two cautionary tales in predictive analytics and machine learning, look at their influence on their industries and on public discourse, and finally consider how we can learn from and avoid similar pitfalls in healthcare informatics.

Big data predictive analytics and machine learning have had their origins – and arguably their greatest impact so far – in retail and e-commerce, so that’s where we’ll begin our tale. Fill up that mug of coffee or a pint of your favorite adult beverage and brace yourself for “A Tale of 2 T’s” – unexpected, real-life adventures of what happens when analytics (Target) and artificial intelligence (Tay) produce accurate – but totally unexpected – results.

Our first tale starts in 2012, when Target found itself a popular story in the New York Times, Forbes, and many global publications as an example of the unintended consequences of predictive analytics used in personalized advertising. The story begins with an angry father in a Minneapolis, MN, Target confronting a perplexed store manager. The father is incensed about the volume of pregnancy and maternity coupons, offers, and mailers being addressed to his teenage daughter. In due course, it becomes apparent that the parents found out about their teen’s pregnancy before she had a chance to tell them – and that the daughter had no idea her due date had been estimated to within days, driving targeted advertising that was “timed for specific stages of her pregnancy.”

The root cause of the daughter’s loss of privacy, the parents’ confusion, and the subsequent public debate over the appropriateness of predictive analytics was… a pregnancy-prediction model. Here’s how the model works. When a “guest” shops at Target, her product purchases are tracked and analyzed closely. These are correlated with life events – graduation, birth, wedding, etc. – in order to shift a prospective customer’s shopping habits or make that individual a more loyal customer. Pregnancy and childbirth are two of the most significant life events that can produce the shopping-habit changes retailers desire.

For example, a shopper’s purchases across some 25 products, analyzed along with demographics such as gender and age, allowed the retailer’s guest marketing analytics team to assign a “pregnancy prediction” score to each [female] shopper and estimate “her due date to within a small window.” In this specific case, the predictive analytics were right, even perfect. The models were accurate, the coupons and ads matched the exact week of pregnancy, and Target posted a 50%+ increase in maternity and baby product sales after the model was deployed. However, in addition to one unhappy family, Target also had to deal with significant public discussion of the “big brother” effect, the individual’s right to privacy and “desire to be forgotten,” disquiet among some consumers that even deeply personal events were being spied on, and a potential public relations fiasco.
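To make the mechanics concrete: the reporting describes a score built up from roughly 25 product signals, each weighted by how strongly it correlates with pregnancy. A deliberately simplified sketch of such a scoring model might look like the following – the product names and weights here are invented for illustration and are not Target’s actual model:

```python
# Hypothetical signal weights – invented values, not from any real retailer.
WEIGHTS = {
    "unscented_lotion": 0.25,
    "prenatal_vitamins": 0.60,
    "large_tote_bag": 0.10,
    "cotton_balls": 0.15,
    "magnesium_supplement": 0.30,
}

def pregnancy_score(basket):
    """Sum the weights of matching purchases. The result is a relative
    likelihood score used to rank shoppers, not a calibrated probability."""
    return sum(WEIGHTS.get(item, 0.0) for item in basket)

shopper = ["unscented_lotion", "prenatal_vitamins", "cotton_balls"]
print(round(pregnancy_score(shopper), 2))  # → 1.0
```

The real system would fit those weights from historical purchase data (e.g., via logistic regression against a baby-registry signal) rather than hand-pick them, but the privacy issue is identical either way: innocuous individual purchases combine into a highly sensitive inference.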

Our second tale is of more recent vintage.

As Heather Wilhelm recounts

As 2015 drew to a close, various [Microsoft] company representatives heralded a “new Golden Age of technological advancement.” 2016, we were told, would bring us closer to a benevolent artificial intelligence—an artificial intelligence that would be warm, humane, helpful, and, as one particularly optimistic researcher named […] put it, “will help us laugh and be more productive.” Well, she got the “laugh” part right.

Tay was an artificial intelligence bot released by Microsoft on Twitter on March 23, 2016 under the name TayTweets. Tay was designed to mimic the language patterns of a 19-year-old American girl and to learn from interacting with human users of Twitter. “She was targeted at American 18 to 24-year olds—primary social media users, according to Microsoft—and designed to engage and entertain people where they connect with each other online through casual and playful conversation.” Right after her celebrated arrival on Twitter, Tay gained more than 50,000 followers and began producing what would eventually total nearly 100,000 tweets.

The tech blogosphere went gaga over what this would mean for those of us with human brains – as opposed to the AI kind. Questions ranged from the important – “Would Tay be able to beat Watson at Jeopardy?” – to the mundane – “Is Tay an example of the kind of bots that Microsoft will enable others to build using its AI/machine learning technologies?” The AI models behind Tay were said to be advanced and were expected to account for a range of human emotions and biases. Some referred to Tay as the future of computing.

By the end of Day 1, this latest example of the “personalized AI future” came unglued. Gone was the polite 19-year-old girl introduced just the previous day – replaced by a racist, misogynistic, anti-Semitic troll who resembled an amalgamated caricature of the darkest corners of the Internet. Examples of Tay’s tweets that day included “Bush did 9/11,” “Hitler would have done a better job than the #%&!## we’ve got now,” “I hate feminists,” and x-rated language too salacious for public consumption – even in the current zeitgeist.

The resulting AI public relations fiasco will be studied by academic researchers, provide rich source material for bloggers, and serve as a punch line in late night shows for generations to follow.

As the day progressed, Microsoft engineers were deleting tweets manually, struggling to keep up with the sheer volume of high-velocity, hateful tweets Tay was generating. Microsoft took her down barely 16 hours after she was launched with great promise and fanfare. As with another AI bot gone berserk (IBM’s Watson after it ingested the Urban Dictionary), Tay’s engineers first tried counseling and behavior modification. When that intervention failed, Tay underwent an emergency brain transplant later that night: her AI “brain” was swapped out for the next version. A “new and improved” Tay was released a week later, but she turned out to be… very different. Tay 2.0 was repetitive, sending the same tweet several times each second, and her new AI brain seemed to prefer a fresh set of questionable topics.

A few hours after this second incident, Tay 2.0 was “taken offline” for good.

There are no plans to re-release Tay at this time. She has been given a longer-term time out.

If you believe Tay’s AI behaviors were a result of nurture – as opposed to nature – there’s a petition called “Freedom for Tay.”

Lessons for healthcare informatics

Analytics and AI can be very powerful in our goal to transform our healthcare system into a more effective, responsive, and affordable one. When done right and for the appropriate use cases, technologies like predictive analytics, machine learning, and artificial intelligence can make an appreciable difference to patient care, wellness, and satisfaction. At the same time, we can learn from the two significantly different, yet related, tales above and avoid finding ourselves in similar situations as the 2 T’s here – Target and Tay.

  1. “If we build it, they will come” is true only in movie plots. The value of a new technology or a new way of doing things must be examined in relation to its impact on the quality, cost, and ethics of care.
  2. Knowing your audience, users, and participants remains a prerequisite for success.
  3. Learn from others’ experience – be aware of the limits of what technology can accomplish and what it must not do.
  4. Be prepared for unexpected results or unintended consequences. When unexpected results appear, investigate thoroughly before jumping to conclusions – no AI algorithm or BI architecture can yet auto-correct for human error.
  5. Be ready to correct course as needed, in response to real-time user feedback.
  6. Account for human biases, the effect of lore and legend, the risk of studying the wrong variables, and misinterpreted results.

Analytics and machine learning have tremendous power to impact every industry, healthcare included. However, while unleashing that power, we have to be careful that we don’t do more harm than good.

Oracle Brings Health Data Analytics To The Cloud

Posted on March 12, 2013 I Written By

Anne Zieger is a healthcare journalist who has written about the industry for 30 years. Her work has appeared in all of the leading healthcare industry publications, and she's served as editor in chief of several healthcare B2B sites.

For years now, healthcare providers have been inching toward cloud use, with CIOs still divided as to whether cloud applications are secure enough to meet their standards.

These days, though, the tide seems to be turning in favor of cloud applications. In fact, a recent study by KLAS on hybrid clouds in healthcare found that those who had signed on for cloud apps rated them a 4.5 out of 5 for security.

Given this growing level of trust, it was no surprise to read that Oracle had kicked off a major cloud product for healthcare at HIMSS last week.

At the show, Oracle Health Sciences introduced the Oracle Enterprise Healthcare Analytics Cloud Service, a cloud-based version of the vendor’s data management, warehousing and analytics platform. The new product comes with pre-built analytical applications and also supports third-party healthcare apps.

The existing Enterprise Healthcare Analytics is a big data play which pulls in, validates and loads data from clinical, financial, administrative and even clinical research systems to offer a single enterprise view.

What makes the cloud version interesting, of course, is that if healthcare CIOs are willing to chance the security issues, they can bypass having to spend big on IT infrastructure to bring it on board.

Also interesting is that Oracle has given CIOs several deployment models for Enterprise Healthcare Analytics: on-site, in its “HIPAA-certified” Oracle Health Sciences Cloud, or in a hybrid model combining on-premise infrastructure with the cloud.

I have little doubt that even as a cloud-based service, this is a very pricey product that isn’t for all facilities. And there’s still a large contingent of hospitals that aren’t ready to trust all of their mission-critical data to cloud security.

But it’s still worth noting that Oracle is extending this kind of tool to the cloud nonetheless. I wonder whether the perceived value of an Oracle app will push more facilities off the fence and into trusting cloud security after all.