Machine learning used to predict clinical response to anti-cancer drugs

Cancer treatment is a step closer to becoming personalised: Mitra, a biotechnology company in Boston, has released a study on using a machine learning algorithm to predict individual patients’ clinical response to anti-cancer drugs.

Mitra's team of scientists engineered tumour ecosystems from 109 patients and matched their functional response to anti-cancer drugs with the corresponding clinical outcomes. They then trained a machine learning algorithm, a support vector machine that optimises partial area under the curve (SVMpAUC), on the data to create a model.

The model was then tested on 55 patients, achieving 100 per cent sensitivity (true positive rate) while keeping specificity (true negative rate) at 91.67 per cent.

This CANScript technology, as Mitra has called it, was able to make accurate predictions on the effectiveness of targeted and cytotoxic drugs in patients with head and neck squamous cell carcinoma (HNSCC) and colorectal cancer (CRC).

“The ability to predict patient tumour response to cytotoxic or target defined therapeutic agents remains a holy grail. While molecular and genetic profiling is driving the evolution of subtype-specific personalised therapy, the presence of a biomarker often does not translate into a successful clinical outcome,” the Mitra team wrote in its study.

The team first trained the model to classify patients as responders (R) or non-responders (NR), prioritising a high true positive rate. The model maximised partial AUC while achieving at least 75 per cent specificity (that is, a false positive rate of at most 25 per cent) on the training set, and assigned coefficients (the number a variable is multiplied by, e.g. the 4 in 4x) of 0.2977, 0.5562, 0.0073 and 0.1388 to the viability, histology, proliferation and apoptosis read-outs respectively.
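
To make the "partial AUC" idea concrete, the following Python sketch computes the area under a ROC curve only up to a false positive rate of 0.25, the region that corresponds to at least 75 per cent specificity. It illustrates the metric only and is not the SVMpAUC training procedure used in the study; the function names and the toy labels and scores are hypothetical.

```python
def roc_points(labels, scores):
    """Return ROC points (fpr, tpr), sweeping the threshold from high to low.

    labels: 1 for responder, 0 for non-responder (both classes must be present).
    scores: higher means "more likely a responder".
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Cases with tied scores cross the threshold together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1]:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def partial_auc(labels, scores, max_fpr=0.25):
    """Trapezoidal area under the ROC curve for fpr in [0, max_fpr]."""
    area = 0.0
    pts = roc_points(labels, scores)
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Cut the last segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data (hypothetical): two responders, four non-responders.
toy_labels = [1, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
print(partial_auc(toy_labels, toy_scores))
```

The largest value this unnormalised partial AUC can take is `max_fpr` itself, which is reached when every responder is scored above every non-responder.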

The threshold (the cut-off that determines whether a prediction is positive or negative) was 19.1: cases with a weighted score greater than 19.1 were classified as responders.
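
As a rough illustration of how such a weighted score could be computed, the Python sketch below multiplies each read-out by its reported coefficient and compares the sum with the 19.1 threshold. It assumes the score is a plain weighted sum with no intercept, which the article does not spell out; the function names and the example read-out values are hypothetical and not taken from the study.

```python
# Coefficients and threshold as reported in the article.
COEFFICIENTS = {
    "viability": 0.2977,
    "histology": 0.5562,
    "proliferation": 0.0073,
    "apoptosis": 0.1388,
}
RESPONDER_THRESHOLD = 19.1


def weighted_score(readouts):
    """Sum of coefficient * read-out (assumed form: linear, no intercept)."""
    return sum(COEFFICIENTS[name] * readouts[name] for name in COEFFICIENTS)


def is_responder(readouts):
    """A case counts as a responder only if its score is strictly above 19.1."""
    return weighted_score(readouts) > RESPONDER_THRESHOLD


# Hypothetical read-outs for one patient sample (not from the study).
example = {
    "viability": 40.0,
    "histology": 30.0,
    "proliferation": 10.0,
    "apoptosis": 20.0,
}
print(round(weighted_score(example), 3), is_responder(example))
```

Because the four coefficients sum to 1, the score behaves like a weighted average of the read-outs, with histology carrying more than half of the weight.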

“The model achieved 96.77 per cent sensitivity on the training set. We then tested the learned algorithm on a new test group of 55 patients, consisting of 42 HNSCC and 13 CRC patients treated with the same drugs as above, where the model achieved 91.67 per cent specificity and 100 per cent sensitivity.”

The team then refined the model to classify responders into partial responders (PR) and complete responders (CR). A second threshold was chosen to maximise PR versus CR prediction accuracy on the training data. Cases with a weighted score between 19.1 and 55.14 were classified as partial responders, and those with a score greater than 55.14 as complete responders.
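
The two cut-offs turn the weighted score into a three-way prediction. A minimal Python sketch of that mapping follows; the function name is hypothetical, and treating a score of exactly 55.14 as a partial response is an assumption, since the article only says "between" and "greater than".

```python
PR_THRESHOLD = 19.1   # above this: responder (reported in the article)
CR_THRESHOLD = 55.14  # above this: complete responder (reported in the article)


def classify(score):
    """Map a weighted score to NR, PR or CR using the two reported cut-offs."""
    if score > CR_THRESHOLD:
        return "CR"
    if score > PR_THRESHOLD:
        return "PR"
    # Scores at or below 19.1 are treated as non-responders.
    return "NR"


# Hypothetical scores, for illustration only.
for score in (10.0, 31.4, 60.0):
    print(score, classify(score))
```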

“The resulting predictions had 87.27 per cent accuracy on the test set. In particular, among the 55 test cases, there were only seven prediction errors: four PRs were predicted as CR; one CR was predicted as PR; one NR was predicted as PR; and one NR was predicted as CR.

“Biomarker analysis selected all 13 CRC patients in the test set, all of whom were positive for wild-type KRAS, to receive cetuximab. However, as can be seen, only 3 of these 13 wild-type KRAS patients actually responded to the drug (1 exhibited CR and 2 exhibited PR), while the remaining 10 presented with progressive disease.

“The CANScript platform predicted two CRs, two PRs and nine NRs, with only one actual NR case being wrongly predicted as CR… Based on standard practice, all 42 HNSCC patients in the test set received TPF. However, 14 of these patients did not respond to the drug combination. The CANScript platform could identify 13 of these as NRs. Again, importantly, all patients predicted by the platform as NRs were indeed NRs.”
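
The headline percentages can be cross-checked against the counts quoted from the study. The Python sketch below is not from the paper; it simply recombines the reported numbers (10 CRC and 14 HNSCC non-responders, seven three-way prediction errors, and no responder predicted as a non-responder) under hypothetical variable names.

```python
TEST_SIZE = 55

# Actual non-responders in the test set: 10 CRC + 14 HNSCC.
actual_nr = 10 + 14
actual_responders = TEST_SIZE - actual_nr

# Two NRs were predicted as responders (one as PR, one as CR).
nr_predicted_as_responder = 2
# Every case predicted NR was a true NR, and sensitivity was reported as
# 100 per cent, so no responder was predicted as NR.
responder_predicted_as_nr = 0

# Three-way errors: four PR->CR, one CR->PR, one NR->PR, one NR->CR.
three_way_errors = 4 + 1 + 1 + 1

sensitivity = (actual_responders - responder_predicted_as_nr) / actual_responders
specificity = (actual_nr - nr_predicted_as_responder) / actual_nr
accuracy = (TEST_SIZE - three_way_errors) / TEST_SIZE

print(f"sensitivity {sensitivity:.2%}")  # 100.00%
print(f"specificity {specificity:.2%}")  # 91.67%
print(f"accuracy    {accuracy:.2%}")     # 87.27%
```

The results match the reported 100 per cent sensitivity, 91.67 per cent specificity and 87.27 per cent three-way accuracy.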

Thirteen CRC patients and 42 HNSCC patients are small samples, though, and the team noted that larger sample sizes are needed to further test the technology.

“However, based on the observed improvements over the standard/biomarker-based approach, we anticipate that the CANScript platform can emerge as a powerful strategy for predicting chemotherapy outcomes.

“This approach was found to be superior to the performance of a standard, widely used support vector ordinal regression algorithm that directly aims to make predictions in the three categories and does not explicitly incorporate the need for high sensitivity.”

Mitra's study has been published in Nature Communications.

Follow Rebecca Merrett on Twitter: @Rebecca_Merrett

Copyright © 2015 IDG Communications, Inc.
