It was an audacious undertaking, even for one of the most storied American companies: With a single machine, IBM would tackle humanity’s most vexing diseases and revolutionize medicine.

Breathlessly promoting its signature brand — Watson — IBM sought to capture the world’s imagination, and it quickly zeroed in on a high-profile target: cancer.

But three years after IBM began selling Watson to recommend the best cancer treatments to doctors around the world, a STAT investigation has found that the supercomputer isn’t living up to the lofty expectations IBM created for it. It is still struggling with the basic step of learning about different forms of cancer. Only a few dozen hospitals have adopted the system, which is a long way from IBM’s goal of establishing dominance in a multibillion-dollar market. And at foreign hospitals, physicians complained its advice is biased toward American patients and methods of care.

STAT examined Watson for Oncology’s use, marketing, and performance in hospitals across the world, from South Korea to Slovakia to South Florida. Reporters interviewed dozens of doctors, IBM executives, artificial intelligence experts, and others familiar with the system’s underlying technology and rollout.

The interviews suggest that IBM, in its rush to bolster flagging revenue, unleashed a product without fully assessing the challenges of deploying it in hospitals globally. While it has emphatically marketed Watson for cancer care, IBM hasn’t published any scientific papers demonstrating how the technology affects physicians and patients. As a result, its flaws are getting exposed on the front lines of care by doctors and researchers who say that the system, while promising in some respects, remains undeveloped.

“Watson for Oncology is in their toddler stage, and we have to wait and actively engage, hopefully to help them grow healthy,” said Dr. Taewoo Kang, a South Korean cancer specialist who has used the product.

At its heart, Watson for Oncology uses the cloud-based supercomputer to digest massive amounts of data — from doctor’s notes to medical studies to clinical guidelines. But its treatment recommendations are not based on its own insights from these data. Instead, they are based exclusively on training by human overseers, who laboriously feed Watson information about how patients with specific characteristics should be treated.

IBM executives acknowledged Watson for Oncology, which has been in development for nearly six years, is in its infancy. But they said it is improving rapidly, noting that by year’s end, the system will offer guidance about treatment for 12 cancers that account for 80 percent of the world’s cases. They said it’s saving doctors time and ensuring that patients get top-quality care.

“We’re seeing stories come in where patients are saying, ‘It gave me peace of mind,’” Watson Health general manager Deborah DiSanzo said. “That makes us feel extraordinarily good that what we’re doing is going to make a difference for patients and their physicians.”

But contrary to IBM’s depiction of Watson as a digital prodigy, the supercomputer’s abilities are limited.

Perhaps the most stunning overreach is in the company’s claim that Watson for Oncology, through artificial intelligence, can sift through reams of data to generate new insights and identify, as an IBM sales rep put it, “even new approaches” to cancer care. STAT found that the system doesn’t create new knowledge and is artificially intelligent only in the most rudimentary sense of the term.

While Watson became a household name by winning the TV game show “Jeopardy!”, its programming is akin to a different game-playing machine: the Mechanical Turk, a chess-playing robot of the 1700s, which dazzled audiences but hid a secret — a human operator shielded inside.

“Jeopardy!” champions Ken Jennings (left) and Brad Rutter watch Watson beat them to the buzzer to answer a question during a practice round in 2011. Seth Wenig/AP

In the case of Watson for Oncology, those human operators are a couple dozen physicians at a single, though highly respected, U.S. hospital: Memorial Sloan Kettering Cancer Center in New York. Doctors there are empowered to input their own recommendations into Watson, even when the evidence supporting those recommendations is thin.

The actual capabilities of Watson for Oncology are not well-understood by the public, and even by some of the hospitals that use it. It’s taken nearly six years of painstaking work by data engineers and doctors to train Watson in just seven types of cancer, and keep the system updated with the latest knowledge.

“It’s been a struggle to update, I’ll be honest,” said Dr. Mark Kris, Memorial Sloan Kettering’s lead Watson trainer. He noted that treatment guidelines for every metastatic lung cancer patient worldwide recently changed in the course of one week after a research presentation at a cancer conference. “Changing the system of cognitive computing doesn’t turn around on a dime like that,” he said. “You have to put in the literature, you have to put in cases.”

Watson grew out of an effort to transform IBM from an old-guard hardware company to one that operates in the cloud and along the cutting edge of artificial intelligence. Despite its use in an array of industries — from banking to manufacturing — it has failed to end a streak of 21 consecutive quarters of declining revenue at IBM. In the most recent quarter, revenue even slid from the same period last year in IBM’s cognitive solutions division — which is built around Watson and is supposed to be the future of its business.

In response to STAT’s questions, IBM said Watson, in health care and otherwise, remains on an upward trajectory and “is already an important part” of its $20 billion analytics business. Health care is a crucial part of the Watson enterprise. IBM employs 7,000 people in its Watson health division and sees the industry as a $200 billion market over the next several years. Only financial services, at $300 billion, is considered a bigger opportunity by the company.

At stake in the supercomputer’s performance is not just the fortunes of a famed global company. In the world of medicine, Watson is also something of a digital canary — the most visible attempt to use artificial intelligence to identify the best ways to prevent and treat disease. The system’s larger goal, IBM executives say, is to democratize medical knowledge so that every patient, no matter the person’s geography or income level, will be able to access the best care.

But in cancer treatment, the pursuit of that utopian ideal has faltered.

STAT’s investigation focused on Watson for Oncology because that product is the furthest along in clinical care, though IBM sells separate packages to analyze genomic information and match patients to clinical trials. The company is also applying Watson to other tasks, including honing preventive medicine practices and reading medical images.

Doctors’ reliance on Watson for Oncology varies among hospitals. While institutions with fewer specialists lean more heavily on its recommendations, others relegate the system to a background role, like a paralegal whose main skill is researching existing knowledge.

Hospitals pay a per-patient fee for Watson for Oncology and other products enabled by the supercomputer. The amount depends on the number of products a hospital buys, and ranges between $200 and $1,000 per patient, according to DiSanzo. The system sometimes comes with consulting costs and is expensive to link with electronic medical records. At hospitals that don’t link it with their medical records, more time must be spent typing in patient information.

At Jupiter Medical Center in Florida, that task falls to nurse Jean Thompson, who spends about 90 minutes a week feeding data into the machine. Once she has completed that work, she clicks the “Ask Watson” button to get the supercomputer’s advice for treating patients.

On a recent morning, the results for a 73-year-old lung cancer patient were underwhelming: Watson recommended a chemotherapy regimen the oncologists had already flagged.

“It’s fine,” Dr. Sujal Shah, a medical oncologist, said of Watson’s treatment suggestion while discussing the case with colleagues.

He said later that the background information Watson provided, including medical journal articles, was helpful, giving him more confidence that using a specific chemotherapy was a sound idea. But the system did not directly help him make that decision, nor did it tell him anything he didn’t already know.

Jupiter is one of two U.S. hospitals that have adopted Watson for Oncology. The system has generated more business in India and Southeast Asia. Many doctors in those countries said Watson is saving time and helping more patients get quality care. But they also said its accuracy and overall value are limited by differing medical practices and economic circumstances.

Despite IBM’s marketing blitz, with years of high-profile Watson commercials featuring celebrities from Serena Williams to Bob Dylan to Jon Hamm, the company’s executives are not always gushing. In interviews with STAT, they acknowledged the system faces challenges and needs better integration with electronic medical records and more data on real patients to find patterns and suggest cutting-edge treatments.

“The goal as Watson gets smarter is for it to make some of those recommendations in a more automated way, to sort of suggest now may be the time and let us flip the switch” when a promising treatment option emerges, said Dr. Andrew Norden, a former IBM deputy health chief who left the company in early August. “As I describe it, you’re probably getting a sense it’s really hard and nuanced.”

Such nuance is absent from the careful narrative IBM has constructed to sell Watson.

It is by design that there is not one independent, third-party study that examines whether Watson for Oncology can deliver. IBM has not exposed the product to critical review by outside scientists or conducted clinical trials to assess its effectiveness.

While it’s not unheard of for companies to avoid external vetting early on, IBM’s circumstances are unusual because Watson for Oncology is not in development — it has already been deployed around the world.

Yoon Sup Choi, a South Korean venture capitalist and researcher who wrote a book about artificial intelligence in health care, said IBM isn’t required by regulatory agencies to do a clinical trial in South Korea or America before selling the system to hospitals. And given that hospitals are already using the system, a clinical trial would be unlikely to improve business prospects.

“It’s too risky, right?” Choi said. “If the result of the clinical trial is not very good — [if] there’s a marginal clinical benefit from Watson — it’s really bad news to the whole IBM.”

Pilar Ossorio, a professor of law and bioethics at University of Wisconsin Law School, said Watson should be subject to tighter regulation because of its role in treating patients. “As an ethical matter, and as a scientific matter, you should have to prove that there’s safety and efficacy before you can just go do this,” she said.

Norden dismissed the suggestion IBM should have been required to conduct a clinical trial before commercializing Watson, noting that many practices in medicine are widely accepted even though they aren’t supported by a randomized controlled trial.

“Has there ever been a randomized trial of parachutes for paratroopers?” Norden asked. “And the answer is, of course not, because there is a very strong intuitive value proposition. … So I believe that bringing the best information to bear on medical decision making is a no-brainer.”

IBM said in its statement that it has collaborated with the research community and presented data on Watson at industry gatherings and in peer-reviewed journals. Some doctors said they didn’t need to see more research to know that the system is valuable. “Artificial intelligence will be adopted in all medical fields in the future,” said Dr. Uhn Lee, who runs the Watson program at Gachon University Gil Medical Center in South Korea. “If that trend, that change is inevitable, then why don’t we just start early?”

So far, the only studies about Watson for Oncology are conference abstracts. The full results haven’t been published in peer-reviewed journals — and every study, save one, was either conducted by a paying customer or included IBM staff on the author list, or both. Most trumpet positive results, showing that Watson saves doctors time and has a high concordance rate with their treatment recommendations.

The “concordance” studies comprise the vast majority of the public research on Watson for Oncology. Doctors will ask Watson for its advice for treating a slew of patients, and then compare its recommendations to those of oncologists. In an unpublished study from Denmark, the rate of agreement was about 33 percent — so the hospital decided not to buy the system. In other countries, the rate can be as high as 96 percent for some cancers. But showing that Watson agrees with the doctors proves only that it is competent in applying existing methods of care, not that it can improve them.

IBM executives said they are pursuing studies to examine the impact on doctors and patients, although none has been completed to date.

Questions about Watson have begun spilling into public view, including in a recent Gizmodo story headlined “Why Everyone is Hating on IBM Watson — Including the People Who Helped Make It.” The most prominent failure occurred last February when MD Anderson Cancer Center, part of the University of Texas, cancelled its partnership with Watson.

The MD Anderson alliance was essentially the early face of Watson in health care. The Houston hospital was among IBM’s first partners, and it was using the system to create its own expert oncology adviser, similar to the one IBM was developing with Memorial Sloan Kettering. But the project disintegrated amid internal allegations of overspending, delays, and mismanagement. In all, MD Anderson spent more than three years and $60 million — much of it on outside consultants — before shelving the effort.

The hospital declined to answer questions. But the project leader, Dr. Lynda Chin, in her first media interview on the subject, told STAT about the challenges she faced. Chin left MD Anderson before the project collapsed; a subsequent audit flagged several violations of procurement rules under her leadership.

Chin said that Watson is a powerful technology, but that it is exceedingly difficult to make functional in health care. She and her team encountered numerous roadblocks, some of which still have not been fully addressed by IBM — at MD Anderson or elsewhere.

The cancer hospital’s first major challenge involved getting the machine to deal with the idiosyncrasies of medical records: the acronyms, human errors, shorthand phrases, and different styles of writing. “Teaching a machine to read a record is a lot harder than anyone thought,” she said. Her team spent countless hours on that problem, trying to get Watson to extract valuable information from medical records so that it could apply them to its recommendations.

Chin said her team also wrestled with deploying the system in clinical practice. Watson, even if guided by doctors, is as close as medicine has ever gotten to allowing a machine to help decide the treatments delivered to human beings. That carries with it thorny questions, such as how to test the safety of a digital treatment adviser, how to ensure its compliance with regulations, and how to incorporate it into the daily work of doctors and nurses.

“Importantly,” Chin said, “how do we create an environment that can ensure the most important tenet in medicine: Do no harm?”

Finally, the project ran into a bigger obstacle: Even if you can get Watson to understand patient variables and make competent treatment recommendations, how do you get it access to enough patient data, from enough different sources, to derive insights that could significantly advance the standard of care?

Chin said that was a showstopper. Watson did not have a connected network of institutions feeding data about specific cohorts of patients. “You may have 10,000 patients for lung cancer. That is still not a very big number when you think about it,” she said.

With data from many more patients, Chin said, you could see patterns — “subsets [of patients] that respond a certain way, subsets that don’t, subsets that have a certain toxicity. That pattern would help with better personalized and precision medicine. But we can’t get there without the ability to actually have a way of aggregating them.”

IBM told STAT that Chin’s work was separate from the effort to create Watson for Oncology, which was validated by cancer specialists at Memorial Sloan Kettering prior to its deployment. The company said that Watson for Oncology can extract and summarize substantial text from patient records, though the information must be verified by a clinician, and that it has made significant progress in obtaining more data to improve Watson’s performance. It pointed to partnerships with the health care publisher Elsevier and the analytics firm Doctor Evidence.

To date, more than 50 hospitals on five continents have agreements with IBM, or intermediary technology companies, to use Watson for Oncology to treat patients, and others are using the genomics and clinical trials products.

But the partnership with Memorial Sloan Kettering, and the product that grew out of it, resulted in complications that IBM has papered over with carefully parsed statements and misleading marketing.

Tae-hyun Cho (right), the first Korean to be treated with assistance from Watson for Oncology, reviews his medical information with oncologists at Gachon University Gil Medical Center. Gachon University Gil Medical Center

In its press releases, IBM celebrates Memorial Sloan Kettering’s role as the only trainer of Watson. After all, who better to educate the system than doctors at one of the world’s most renowned cancer hospitals?

But several doctors said Memorial Sloan Kettering’s training injects bias into the system, because the treatment recommendations it puts into Watson don’t always comport with the practices of doctors elsewhere in the world.

Given the same clinical scenario, doctors can — and often do — disagree about the best course of action, whether to recommend surgery or chemotherapy, or another treatment. Those discrepancies are especially wide for second- and third-line treatments given after an initial therapy fails, where evidence of benefits is slimmer and consensus more elusive.

Rather than acknowledge this dilemma, IBM executives, in marketing materials and interviews, have sought to downplay it. In an interview with STAT, DiSanzo, the head of Watson Health, rejected the idea that Memorial Sloan Kettering’s involvement creates any bias at all.

“The bias is taken out by the sheer amount of data we have,” she said, referring to patient cases and millions of articles and studies fed into Watson.

But that mischaracterizes how Watson for Oncology works. (IBM later claimed that DiSanzo was referring to Watson in general.)

The system is essentially Memorial Sloan Kettering in a portable box. Its treatment recommendations are based entirely on the training provided by doctors, who determine what information Watson needs to devise its guidance as well as what those recommendations should be.

When users ask Watson for advice, the system also searches published literature — some of which is curated by Memorial Sloan Kettering — to provide relevant studies and background information to support its recommendation. But the recommendation itself is derived from the training provided by the hospital’s doctors, not the outside literature.

Doctors at Memorial Sloan Kettering acknowledged their influence on Watson. “We are not at all hesitant about inserting our bias, because I think our bias is based on the next best thing to prospective randomized trials, which is having a vast amount of experience,” said Dr. Andrew Seidman, one of the hospital’s lead trainers of Watson. “So it’s a very unapologetic bias.”

Seidman said the hospital is careful to keep its training grounded in clinical evidence when the evidence exists, but it is not shy about giving its recommendations when it doesn’t. “We want cancer care to be democratized,” he said. “We don’t want doctors who don’t have the thousands and thousands of patients’ experience on a more rare cancer to be handicapped. We want to share that knowledge base.”

At a recent training session of Watson on Manhattan’s Upper East Side, the tensions involved in programming the system were on full display. STAT sat in as Memorial Sloan Kettering doctors, led by Seidman, gathered with IBM engineers to train Watson to treat bladder cancer. Five IBM engineers sat on one side of the table. Across from them were three oncologists — one specializing in surgery, another in radiation, and a third in chemotherapy and targeted medicines.

Several minutes into the discussion, the question arose of which treatment to recommend for patients whose cancers persisted through six rounds of chemotherapy. The options in such cases tend to be as slim as the evidence supporting them. Should Watson recommend a radical surgery to remove the bladder? Dr. Tim Donahue, the surgical oncologist, noted that such surgery seldom cures patients and is not associated with improved survival in his experience.

Then what about another course of chemotherapy combined with radiation?

When Watson gives its recommendations, it puts the top recommendation in green, alternative options in orange, and not recommended options in red.

But in some clinical scenarios, it’s difficult to tell the colors apart.

“This is the hard part of this whole game,” Dr. Marisa Kollmeier, the radiation oncologist, said during the training. “There’s a lack of evidence. And you don’t know if something should be in green without evidence. We don’t have a randomized trial to support every decision.”

But the task in front of them required the doctors to press ahead. And they did, rifling through an array of clinical scenarios. In some cases, a large body of evidence backed up their answers. But many others fell into a gray area or were clouded by the inevitable uncertainty of patient preferences.

The meeting was one of many in a months-long process to bring Watson up to speed in bladder cancer. Subsequent sessions would involve feeding it data on real patient cases at Memorial Sloan Kettering, so doctors could reinforce Watson’s training with repetition.

That training does not teach Watson to base its recommendations on the outcomes of these patients, whether they lived, or died or survived longer than similar patients. Rather, Watson makes its recommendations based on the treatment preferences of Memorial Sloan Kettering physicians.

At some institutions using Watson, IBM’s lack of clarity on the cancer center’s role causes confusion. Some seem to think they are getting advice from doctors around the world.

“As we tell the patients, it’s like another consultation, but it’s a worldwide consultation,” said Dr. K. Adam Lee, medical director of thoracic oncology at Jupiter Medical Center, when STAT visited in June.

“Really worldwide,” added Kerri Ward, an oncology nurse at the hospital. “It pulls from 300 journals, just for oncology, the clinical database, so the national clinical database, journals, textbooks, and then Sloan Kettering is the one that’s feeding in the clinical [information] currently.”

Robert Garrett, the CEO of Hackensack Meridian Health, a group in New Jersey that is using a version of Watson for Oncology, said the information in Watson is “global.”

“If you’re a patient that has colon cancer, they have in their database, as I understand it, how colon cancer is treated around the world, by different clinicians, what’s been the most effective treatment for different phases of colon cancer,” Garrett said. “That’s what IBM Watson brings to the table.”

None of that accurately depicts how Watson for Oncology works.

Several doctors who have examined Watson in other countries told STAT that Memorial Sloan Kettering’s role has given them pause. Researchers in Denmark and the Netherlands said hospitals in their countries have not signed on with Watson because it is too focused on the preferences of a few American doctors.

Martijn van Oijen, an epidemiologist and associate professor at Academic Medical Center in the Netherlands, said Memorial Sloan Kettering is packed with top specialists but doesn’t have a monopoly on cancer expertise. “The bad thing is, it’s a U.S.-based hospital with a different approach than some other hospitals in the world,” said van Oijen, who’s involved in a national initiative to evaluate technologies like Watson and is a strong believer in using artificial intelligence to help cancer doctors.

In Denmark, oncologists at one hospital said they have dropped the project altogether after finding that local doctors agreed with Watson in only about 33 percent of cases.

“We had a discussion with [IBM] that they had a very limited view on the international literature, basically, putting too much stress on American studies, and too little stress on big, international, European, and other-part-of-the-world studies,” said Dr. Leif Jensen, who directs the center at Rigshospitalet in Copenhagen that contains the oncology department.

In countries where doctors were trained in the United States or follow treatment guidelines similar to those used at Memorial Sloan Kettering, Watson for Oncology can be helpful. Taiwan uses the same guidelines as Americans, so Watson’s advice will be useful there, said Dr. Jeng-Fong Chiou, vice superintendent of the Taipei Cancer Center at Taipei Medical University, which started using Watson for Oncology with patients in July.

But he also said there are differences between American and Taiwanese patients — his patients often receive lower doses of drugs to minimize side effects — and that his oncologists will have to make adjustments from Watson’s recommendations.

The generally affluent population treated at Memorial Sloan Kettering doesn’t reflect the diversity of people around the world. The cases used to train Watson therefore don’t take into account the economic and social issues faced by patients in poorer countries, noted Ossorio, the University of Wisconsin law professor.

“What it’s going to be learning is race, gender, and class bias,” she said. “We’re baking those social stratifications in, and we’re making the biases even less apparent and even less easy for people to recognize.”

Sometimes, the recommendations Watson gives diverge sharply from what doctors would say for reasons that have nothing to do with science, such as medical insurance. In a poster presented at the Global Breast Cancer Conference 2017 in South Korea, researchers reported that the treatment Watson most often recommended for breast cancer patients simply wasn’t covered by the national insurance system.

IBM said it has convened an international group of advisers to gather input on Watson’s performance. It also said that the system can be customized to reflect variations in treatment practices, differences in drug availability and financial considerations, and that the company recently introduced tools to reduce the time and cost of adapting Watson.

In a response to STAT’s questions, Memorial Sloan Kettering said international journals are part of the literature it provides to Watson, including the Lancet, the European Journal of Cancer, Annals of Oncology, and the BMJ. “As we do in all areas of cancer research, we will continue to observe and study how Watson for Oncology impacts care internationally, follow the evidence, and work with IBM to optimize the system,” the hospital said.

Some hospitals abroad are customizing the system for their patients, adding information about local treatments. Nan Chen, who manages the Watson for Oncology program at Bumrungrad International Hospital in Thailand, said his oncologists use Japanese guidelines, not American guidelines, for treating gastric cancer.

But he said doctors can find this localization redundant or unnecessary: They are not that interested in being told the same guidance they just taught Watson.

“Our doctors say, this treatment is our own treatment, we know that,” Chen said. “You don’t need to turn around and put those treatments in Watson, and let Watson tell us what kind of treatment that we are using here in the hospital.”

Chen said this modified system is incredibly beneficial, however — to a hospital in the capital of Mongolia that employs zero oncology specialists.

At UB Songdo Hospital, of which Chen’s company is a majority owner, doctors are following Watson’s suggestions nearly 100 percent of the time. Patients who otherwise would have been treated by generalists with little, if any, cancer training are now benefiting from top-level expertise.

“That is the kind of thing that IBM is dreaming about,” Chen said.

In South Korea, Dr. Taewoo Kang, a surgical oncologist at Pusan National University Hospital who specializes in breast cancer, pointed to another important problem that Watson needs to solve. Right now, it provides supporting evidence for the recommendations it makes, but doesn’t actually explain how it came to recommend that particular treatment for that particular patient.

Kang said that, sometimes, he will ask Watson for advice on a patient whose cancer has not spread to the lymph nodes, and Watson will recommend a type of chemotherapy drug called a taxane. But, he said, that therapy is normally used only if the cancer has spread to the lymph nodes. And, to support the recommendation, Watson will show a study demonstrating the effectiveness of the taxane for patients whose cancer did spread to their lymph nodes.

Kang is left confused as to why Watson recommended a drug that he does not normally use for patients like the one in front of him. And Watson can’t tell him why.

Louisa Roberts (left) of IBM Watson Health speaks with Merck executive Oliver Maschinsky in the Watson booth at the 2017 ASCO cancer conference in Chicago. Heather Stone for STAT

For all the concerns, some doctors around the world who use Watson insist that artificial intelligence will one day revolutionize health care. They say that clinicians are realizing concrete benefits — saving doctors valuable time searching for studies, better educating patients, and undercutting hierarchies in the clinic that might interfere with evidence-based treatment.

In Taiwan, Chiou said Watson immediately provides the “best data” from the literature about a treatment — survival rates, for example — relieving doctors of the task of searching the literature to compare each possible treatment.

Watson’s information also empowers patients, said Lee, the doctor who runs the Watson program at Gil Medical Center in South Korea. Previously, doctors verbally explained different treatment options to patients. Now, physicians can give patients a comprehensive packet prepared by Watson, which includes potential treatment plans along with relevant scientific articles. Patients can do their own research about these treatments, and maybe even disagree with the doctor about the right course of action.

“This is one of the most important and significant changes,” Lee said.

Watson also holds senior doctors accountable to the data. At Gil Medical Center, patients sit in a room with five doctors and Watson itself, the interface displayed on a flat-screen television in the so-called “Watson center.” Lee said that Watson’s presence has a huge influence on the doctors’ decision-making process, leveling the hierarchy that traditionally prioritized the opinion of the senior doctor over junior colleagues.

Watson gives the junior physicians quick and easy access to data that might prove their elders wrong, displaying on the screen information such as the survival rate right alongside a recommended treatment. It would be humiliating for senior doctors to continue to push for a different treatment in light of this evidence, Lee said.

At Manipal Hospitals in India, Dr. S.P. Somashekhar said that while there are some regional disparities in Watson’s recommendations for patients with rectal and breast cancer, those cases are outliers: For the vast majority of patients, the program matched the recommendations given to patients by the hospital’s tumor board, a group of 20 physicians who typically study their cases for a week and spend an hour discussing them.

That means that in a handful of seconds, Watson did what it takes 20 doctors over a week to accomplish. “That is so precious and very highly valuable,” Somashekhar said. “Our physicians cannot discuss every case. For every case we discuss in the tumor board, there are five cases which we cannot discuss.”

While those benefits are significant, they fall short of breakthrough discoveries that could predict or eradicate disease.

IBM executives said that doesn’t mean Watson can’t accomplish those feats. Norden, the former deputy health officer for Watson for Oncology and Genomics, said the goal is to ultimately bring together streams of clinical trial data and real-world patient data, so that Watson could begin to pinpoint the best treatments on its own.

“My own belief is that over time we will be better at measuring and reporting outcomes, and that data will be increasingly influential,” he said. “Where cancer care is today, I don’t think that any computing system is ready to be let out into the world without a measure of expert human oversight.”

The bigger question for IBM is not whether health care will see a revolution in artificial intelligence but who will drive it.

One former IBM employee says the company could become a victim of its own marketing success — the unrealistic expectations it set are obscuring real accomplishments.

“IBM ought to quit trying to cure cancer,” said Peter Greulich, a former IBM brand manager who has written several books about IBM’s history and modern challenges. “They turned the marketing engine loose without controlling how to build and construct a product.”

Greulich said IBM needs to invest more money in Watson and hire more people to make it successful. In the 1960s, he said, IBM spent about 11.5 times its annual earnings to develop its mainframe computer, a line of business that still accounts for much of its profitability today.

If it were to make an equivalent investment in Watson, it would need to spend $137 billion. “The only thing it’s spent that much money on is stock buybacks,” Greulich said.

IBM said it created the market for artificial intelligence and is pleased with the pace of Watson’s growth, noting that it and other new business units grew by more than $20 billion in the past three years. “It took Facebook and Amazon more than 13 years to grow $20 billion,” the company said in a statement.

Since Watson’s “Jeopardy!” demonstration in 2011, hundreds of companies have begun developing health care products using artificial intelligence. These include countless startups, but IBM also faces stiff competition from industry titans such as Amazon, Microsoft, Google, and the Optum division of UnitedHealth Group.

Google’s DeepMind, for example, recently displayed its own game-playing prowess, using its AlphaGo program to defeat a world champion in Go, a 3,000-year-old Chinese board game.

DeepMind is working with hospitals in London, where it is learning to detect eye disease and speed up the process of targeting treatments for head and neck cancers, although it has run into privacy concerns.

Meanwhile, Amazon has launched a health care lab, where it is exploring opportunities to mine data from electronic health records and potentially build a virtual doctor’s assistant.

A recent report by the financial firm Jefferies said IBM is quickly losing ground to competitors. “IBM appears outgunned in the war for AI talent and will likely see increasing competition,” the firm concluded.

While not specific to Watson’s health care products, the report said potential clients are backing away from the system because of the significant consulting costs associated with implementing it. It also noted that Amazon has 10 times as many job listings as IBM, which recently declined to renew a small number of contractors who worked for the company following its acquisition of Truven. IBM bought Truven for $2.6 billion last year to gain access to 100 million patient records.

In its statement, IBM said that the workers’ contracts ended and that it is continuing to hire aggressively in the Cambridge, Mass.-based Watson Health and other units, with more than 5,000 positions open in the U.S.

But the outlook for Watson for Oncology is challenging, say those who have worked most closely with it. Kris, the lead trainer at Memorial Sloan Kettering, said the system has the potential to improve care and ensure more patients get expert treatment. But like a medical student, Watson is just learning to perform in the real world.

“Nobody wants to hear this,” Kris said. “All they want to hear is that Watson is the answer. And it always has the right answer, and you get it right away, and it will be cheaper. But like anything else, it’s kind of human.”
