Abstract
High-throughput drug combination screening provides a systematic strategy to discover unexpected combinatorial synergies in pre-clinical cell models. However, phenotypic combinatorial screening with multi-dose matrix assays is experimentally expensive, especially when the aim is to identify selective combination synergies across a large panel of cell lines or patient samples. Here we implemented DECREASE, an efficient machine learning model that requires only a limited set of pairwise dose-response measurements for accurate prediction of drug combination synergy and antagonism. Using a compendium of 23,595 drug combination matrices tested in various cancer cell lines, and malaria and Ebola infection models, we demonstrate how cost-effective experimental designs with DECREASE capture almost the same degree of information for synergy and antagonism detection as the fully-measured dose-response matrices. Measuring only the diagonal of the matrix provides an accurate and practical option for combinatorial screening. The open-source web-implementation enables applications of DECREASE to both pre-clinical and translational studies.
Keywords: drug combination effects, dose-response landscapes, high-throughput screening, supervised machine learning, open-source software
Introduction
Combination therapies have become a standard clinical management of several complex diseases, including cancer, asthma, diabetes and bacterial infections, since drug combinations can increase therapeutic efficacy and reduce toxic side-effects compared to mono-therapies.1–8 Accordingly, there is an increasing interest in exploring the massive combinatorial spaces spanned by both approved drugs and investigational agents, with the aim to identify effective and safe combination therapies for multiple indications. High-throughput screening (HTS) instruments have made it possible to profile the phenotypic effects of thousands of drug combinations in pre-clinical model systems (e.g. cell lines or primary patient-derived cells), and systematic combinatorial screens are being widely used by pharma industry, medical chemistry community as well as by translational researchers to provide actionable hypotheses for pre-clinical or clinical studies9–11. However, most HTS combinatorial experiments are performed at single concentrations of individual agents, which make it difficult to capture combinatorial dose-response effects. Multi-dose combinatorial response matrices over various concentration ranges provide better insights into the synergistic (or antagonistic) dose windows, and enable the use of more robust surface-based synergy models (e.g. Loewe) for detection of the most potent hits from combinatorial screens12,13. However, due to the massive number of potential drug-dose combinations, large-scale multi-dose combinatorial screening requires extensive resources and instrumentation14, beyond the capability of most academic laboratories15.
To make the HTS combinatorial screening more feasible, especially in studies based on primary patient specimens, one solution is to use the half-maximal inhibitory concentration (IC50) of the individual compounds to pre-select the concentrations or combinations to be profiled16. However, this approach may miss potent synergies beyond the selected sets of doses or combinations. Another solution to the combinatorial explosion of potential dose and compound combinations is to measure only a part of the multi-dose-response matrix when screening for synergistic effects17–19. For instance, the popular Combination Index (CI)20 can be calculated using only the diagonal elements of the full dose-response matrix (i.e. fixed-ratio diagonal design) to identify synergistic compound pairs (Fig. 1, upper panel). Another popular approach to detect synergy with reduced experimental costs is to use a fixed-concentration design, in which various concentrations of one agent are tested with a pre-defined concentration (e.g. IC50) of the second agent21–23 (Fig. 1, upper panel). However, such cost-effective designs have also several shortcomings; for instance, any outlier measurements will have a drastic influence on the synergy detection, and the pre-defined concentration points can easily miss a narrow synergistic dose window of new drug pairs. Furthermore, many synergy scoring software, including Combenefit24, SynergyFinder25, and Chalice (Horizon Discovery Inc.), require the full dose-response matrix to calculate the overall synergy metrics, including those based on Loewe additivity or Bliss independence models26,27.
To address these challenges, we introduce DECREASE (Drug Combination RESponse prEdiction), an efficient machine learning approach to guiding and speeding-up the conduction of unbiased HTS-based screening for synergistic drug combination effects using cost-effective combinatorial screening with a minimal set of experiments. DECREASE implements a two-step procedure for robust synergy prediction and scoring (Fig. 1, second panel). In the first step, outlier measurements inherent in HTS experiments are detected using differences between the observed responses and those expected based on the Bliss approximation (Supplementary Fig. 1), followed by prediction of the complete dose-response combination matrices with our novel, composite Non-negative Matrix Factorization (cNMF) algorithm. Based on the predicted combinatorial dose-response landscapes (Fig. 1, third panel), the overall synergy scores of the drug combinations are calculated using any selected reference model (e.g. Loewe26, Bliss27, HSA28 or ZIP29) to identify the most synergistic hits (Fig. 1, bottom panel). Since combination effects between drugs are known to depend on a variety of factors, including their mode-of-action, tested concentration ranges and biological heterogeneity across model systems30–32, we carried out a systematic evaluation among 23,595 pairwise combinations tested in various cancer cell lines, as well as in malaria and Ebola infection models, to demonstrate the wide applicability of DECREASE. The source code and web-implementation as well as dose-response matrices of 210 novel anti-cancer drug combinations are available at http://decrease.fimm.fi.
Results
We initially developed the DECREASE method based on an in-house compendium of 192 anti-cancer agent combinations among both approved drugs and investigational compounds, tested in 8×8 dose-response matrix designs in 10 breast cancer cell lines (BT-549, CAL-148, CAL-51, CAL-85–1, DU-4475, HCC-1599, MDA-MB-231, MDA-MD-436, MFM-223, and MCF10A). These combinations present with a wide spectrum of both synergistic and antagonistic mechanisms (the unpublished dose-response matrix data are available at http://decrease.fimm.fi/data_availability), and were used to investigate the performance of various machine learning algorithms in the task of predicting drug combination responses (Supplementary Fig. S2). Our comparative analysis indicated that the best prediction accuracy was obtained with the cNMF method, together with regularized boosted regression trees (XGBoost). In applications of DECREASE to new drug combination datasets, the implemented ensemble of cNMF and XGBoost algorithms is trained independently for each user-provided incomplete dose-response matrix using Bayesian optimization with repeated cross validation (see Methods).
Combination synergy prediction using a minimal set of measurements
To test the robustness of the selected machine learning algorithms against technical, biological and pharmacological variability, we first validated the predictive accuracy of the best-performing learning algorithms implemented in DECREASE (an average ensemble of cNMF and XGBoost), using pairwise combinations of ibrutinib with 466 broadly-targeting anti-cancer compounds tested in 6×6 dose-response matrix experiments (the published dose-response matrix data are available at http://tripod.nih.gov/matrix-data/m3-btk-6x6-ls)11. This extensive combinatorial data set covers a variety of agents with distinct mechanisms of action tested in activated B-cell-like subtype (ABC) of diffuse large B-cell lymphoma (DLBCL) cell line TMD8.
The validation results in the external DLBCL dataset revealed that already a limited number of dose-combination response measurements, regardless whether they originated from a single row, column, diagonal, or random points of the dose-response combination matrix, led to a relatively high prediction accuracy for both synergistic and antagonistic effects of drug combinations with the DECREASE method (Pearson correlation rBLISS 0.82–0.91, Fig. 2a–d; Supplementary Fig. S3a). Only the design where the concentration row was selected at IC50 of the single agents led to markedly decreased drug combination synergy prediction (rBLISS=0.58, Fig. 2e), suggesting that IC50 is not an optimal measure for defining the fixed-concentrations for combination assays.
Overall, the top 5% of most synergistic combinations (23 out of 466) identified using the full dose-response matrices were found among the top 7% (32 out of 466) synergistic combinations identified with the DECREASE predictions, when using the fixed-concentration, single row design (P<0.0001, Fisher’s exact text; the expected overlap is 2 pairs based on random sampling). Similarly, the top 5% of most antagonistic combinations were present among the top 9% (43 out of 466) antagonistic pairs predicted with DECREASE (P<0.0001; see Supplementary Fig. S4). For comparison, the recently introduced Dose model42 led to markedly poorer combination effect predictions across the full synergy and antagonism spectrum (rBLISS=0.22, Fig. 2f; Supplementary Fig. S3b).
Accuracy of DECREASE in predicting dose combination surfaces
Even though the principal aim of the DECREASE pipeline is to predict the combination synergies across the tested drug pairs (Fig. 1), we also investigated how accurately the full dose-response matrices predicted by DECREASE capture the combinatorial dose-response patterns across various concentration levels (so-called combination surfaces). For this challenging prediction task, we experimentally measured 18 novel anti-cancer combinations tested in 8×8 matrix designs in three cancer cell lines (HEK293, HeLa and Hep G2). We used the fixed-concentration design (see Fig 2c), where the same middle-concentration row was selected in each of the 18 matrices to predict the full dose-response combination surfaces with both the DECREASE and Dose models. The models were trained independently for each incomplete dose-response matrix, and then tested on the remaining matrix elements.
We found that the Bliss synergy surfaces calculated based on the DECREASE predictions resemble those based on the full dose-response combination matrix (see Supplementary File 1 for all the tested combinations). Notably, even though there existed marked differences (Fig. 3a), the DECREASE-predicted Bliss synergies deviated on average 1.7 units from the measured synergies at the level of dose combinations, showing significantly better predictive accuracy compared to the Dose model (P < 0.0001, Welch’s t-test; Fig. 3b). Furthermore, DECREASE-predicted dose-response matrices were closer to the measured inhibition levels also in terms of the root mean squared error (RMSE), calculated over the full range of concentrations (P = 0.004, Wilcoxon test; Fig. 3b). Similar results were obtained when using the middle-concentration column design (Supplementary Fig. S5; Supplementary File 2).
Effect of the measured sub-matrix on DECREASE prediction accuracy
To find out which concentrations of the dose-response matrix should be measured for maximal predictive performance, we explored various experimental designs in terms of how much they provide information for the synergy prediction. DECREASE showed its best prediction performance when one of the middle rows of the dose-response matrix was selected for the model training (rBLISS of 0.90, Fig. 4a). In our in-house 8×8 dosing regimen, the middle rows corresponded to EC20–30% levels of the agents, but the variability in EC percentages among the combinations was relatively large. Strikingly, adding other concentration rows of the dose-response matrix did not markedly improve the synergy prediction performance beyond that obtained with the middle row alone (Fig. 4b). However, the use of the diagonal dose-combination pairs resulted in a similar accuracy with the same number of measurements (rBLISS = 0.92). As expected, when only the first row of the matrix (i.e., single-agent response profile) was used for training, the prediction accuracy remained very low (rBLISS = 0.22).
We next challenged the DECREASE model using a broader dataset from the study of O’Neil et al., which includes a total of 22,737 anticancer drug combinations (583 drug pairs tested in 39 diverse cancer cell lines with a 4×4 dosing regimen)3. This dataset is more challenging, not only due to various cancer cell lines, but also because the single-compound dose levels that were used to measure the monotherapy responses were not aligned with the concentrations used in the dose-combination matrix. Using the Hill equation for the singe-compound dose-response curves (see Methods), we interpolated the corresponding dose-responses and then applied the DECREASE method with the diagonal design to the interpolated 5×5 dose-response matrices. This rather small combinatorial matrix experiment led to a high correlation between the predicted and measured synergies (average rBLISS = 0.94). Notably, the cell lines originating from six different tissues (breast, colon, lung, melanoma, ovarian and prostate) resulted in similar prediction accuracies (Supplementary Fig. S6).
To explore the optimal use of the measured sub-matrices for the cost-effective high-throughput combinatorial screening, we further investigated several ways how to best utilize the information form the dose-response matrix for the prediction of combination effects (Supplementary Fig. S7a). Using the 6×6 dose-response combination matrix data from the DLBCL experiment, we found out that incorporating both the single agents’ concentrations together with their responses led to improved prediction accuracy of the combination responses (RMSE = 7.7 %inhibition, Supplementary Fig. S7b), compared to making use of merely the single agent concentrations (RMSE = 11.3 %inhibition) or responses (RMSE = 10.7 %inhibition). Taken together, these results demonstrate that carefully-selected single-concentration rows (or the diagonal) provide sufficient training data for accurate prediction of combination synergy. While additional combination measurements increase the accuracy to certain degree, these may not be warranted given the cost of the additional measurements.
Application of DECREASE to non-cancer drug combination datasets
To further illustrate the wide applicability and performance of DECREASE also in non-cancer combinatorial screens, we used a published dataset of 104 anti-malarial agent combinations (available at https://tripod.nih.gov/matrix-client/rest/matrix/blocks/506/table)10, where the combinations were tested in the HB3 strain of Plasmodium falciparum, a unicellular protozoan parasite that causes malaria in humans. DECREASE was again able to pinpoint the most synergistic and antagonistic drug combinations using the fixed-concentration and diagonal designs (rBLISS of 0.78 and 0.80, respectively, Fig. 5). The somewhat lower prediction accuracy indicates that a single row or diagonal cannot capture all the complex synergy patterns in such larger 10×10 dose-response matrices. However, the top 5% of most-synergistic combinations (5 out of 104) identified using the full dose-response matrices were found among the top 7% (7 out of 104) synergistic combinations identified with DECREASE when using the single-row experimental design (P<0.0001, Fisher’s exact test). This is sufficient for practical screening applications, where the aim is to find a few most synergistic combinations for a given cancer patient or malaria strain. For instance, DECREASE identified artesunate-mefloquine as the most synergistic pair in the HB3 strain, which is one the combination therapies recommended by WHO for the treatment of uncomplicated falciparum malaria worldwide46.
In another non-cancer application case, we used DECREASE to prioritize the most potent synergistic antiviral combinations among 78 drug combinations for Ebola treatment47. These combinations were tested in 6×6 dose-response matrix format in the Huh7 liver cell line infected with Ebola virus Makona isolate (data are available at https://matrix.ncats.nih.gov/matrix-client/rest/matrix/blocks/6324/table). Similar to the DLBCL and malaria applications, we observed a high prediction accuracy of the drug combination effects with the DECREASE model, when using both the fixed, single-row (rBLISS = 0.87, Fig. 5c) and the diagonal designs (rBLISS of 0.94, Fig. 5d). In particular, 24 out of 25-top synergistic combinations (Bliss synergy>10, 5% of all pairs) based on the fully-measured dose-response matrix were present among the top-25 synergistic pairs identified using the DECREASE model based on the diagonal design (the one remaining top-25 combination had a rank of 30 in the DECREASE-predicted list). These results indicate that the DECREASE method is highly accurate and applicable also to a wide variety of model systems and experimental designs, other than high-throughput anticancer drug combination screening, where the aim is to prioritize top-synergistic combination effects for further experimentation using a minimal set of combinatorial measurements.
Discussion
With the aim to reduce the cost and time required for high-throughput combinatorial experiments, we introduced an efficient experimental-computational approach to identify most potent synergistic and antagonistic combinations with a minimal set of measurements. We illustrated this approach using a large number of pre-clinical model systems and experimental setups, encompassing a total of 23,595 pairwise drug combinations tested in 53 cancer cell lines and also in a malaria and Ebola infection models. Notably, one can use DECREASE with cost-effective designs and still capture almost the same degree of information content for synergy and antagonism detection as provided by the fully-measured multi-dose combinatorial matrix. DECREASE was also shown to be robust and compatible with various experimental designs, dose ranges and model systems, making it thus widely applicable to various biomedical problems, such as those aiming at identification of anti-cancer, bacterial, fungal or antiviral drug combination synergies. To promote its wide application, alone or together with popular synergy scoring models (e.g., Bliss, Loewe, HSA or ZIP), we have implemented DECREASE as an interactive R Shiny-based web-tool (http://decrease.fimm.fi). The R codes are freely available to enable its further extensions to other similar applications. The web-tool comes with a user documentation and website tour that walks the user through the steps of the DECREASE pipeline. The Methods section and Supplementary provide further details of the implemented methodology.
We have also made available 210 novel anti-cancer combinations tested in-house in 13 cancer cell lines. These preclinical combination effects show rather heterogeneous synergy across the tested cell line panel (Supplementary Fig. S8), suggesting that individualized combination testing is critical for therapy decision. The DECREASE approach is therefore likely to prove useful also for accelerating drug combination testing in primary patient samples13, where the cost-efficient DECREASE design enables screening of a much larger set of combinations that would be possible with the traditional full matrix design. This is especially important when testing patient-derived samples that are scarce and often difficult to obtain. We note that the aim of the DECREASE model is not to completely replace the full combinatorial experiments, rather to prioritize the most potent combinations among the massive number of potential combinations for a given patient that should be focused on in further testing, and to de-prioritize those that are not showing any predicted synergy to avoid unnecessary human or animal studies. For instance, we found that two of the combinations with relatively high synergy in our cell line screens have completed Phase 1 trials and two combinations are currently undergoing Phase 2 clinical trials, suggesting these have already passed the safety and efficacy testing both in animal models and human subjects (Supplementary Fig. S8). The rest of the novel combinations provide a rich resource for the community toward further combination discoveries.
Several computational approaches to predict drug combination effects have been described32,43. However, we still have relatively poor understanding of the mechanisms that may lead to synergistic interactions between drugs, and mechanism-unbiased approaches are therefore important for identifying new synergies. Systematic HTS strategy traditionally involves very extensive experimental efforts. Here, we took a data-driven and a mechanism-independent approach to allow for quick and cost-effective predictions of synergistic interactions using considerably fewer tested data points than the standard strategies. The only input needed for the DECREASE model is a sub-matrix of the drug combination dose-response measurements, without the need to have structural or target information of compounds, nor molecular profiles of the cell models. This enables users to run the DECREASE model for each cell line and drug combination separately, trained on the fly based on user-provided incomplete dose-response matrix, hence enabling real-time personalized medicine applications. Compared to our previous approach to predicting cell-contex44 or patient-specific synergy scores13, we demonstrated here that the single-compound responses alone provide only a sub-optimal accuracy (Fig. 4), rather additional combinatorial measurements are required for obtaining maximal accuracies of the combinatorial patterns across several doses for a given drug pair. However, each additional dose-combination measurement comes with increased experimental costs, in terms of time, reagents, labour and specimens, which often needs to be minimized in practical applications.
Many high-throughput drug combination experiments have used a fixed-concentration design, where a single concentration of one drug (e.g. IC50 or EC50) is combined with various concentrations of the other drug. Therefore, we wanted initially to examine the applicability of DECREASE to the fixed concentration designs. It was found out that the performance of DECREASE was not too sensitive to the selection of the concentration row, provided it is one of the middle rows (Fig. 4). This suggests that the concentration ranges of individual compounds of the combination assays investigated in this work were selected so that the middle rows or columns captured sufficient information for combination effect prediction. Surprisingly, the best concentrations did not originate from around IC50 or EC50 levels, rather the EC percentages of the most informative concentrations were relatively variable across both the combinations and datasets (Supplementary Fig. S9). In larger-scale combination assays, it may be more challenging to select the concentration ranges of the individual compounds so carefully, which may result in cases where the middle concentration rows or columns turn out to be sub-optimal. In those applications, the diagonal design provides an accurate and robust option that avoids the selection of any fixed single concentration, either row or column. The diagonal design is implemented as the default option in the DECREASE tool, since it captures the widest range of different types of synergistic effects, and the diagonal pairs are practically feasible to measure in high-throughput instruments, providing thus a valid compromise between the fixed-concentration (easiest) and the random design (most challenging experimentally).
As also shown in the earlier work42, linear models are not flexible enough for accurate prediction of drug combination effects across various drug classes and model systems (Supplementary Fig. 10). We further showed that our cNMF algorithm, combined with XGBoost, markedly outperformed the Dose model in terms of predicting the spectrum of combination effects (both synergy and antagonism, Fig. 2 and Fig. 3). We note that our outlier detection approach (Supplementary Fig. S1) shares similarities with the Dose model42 in that they are both based on the Bliss approximation for calculating the expected combination effects. However, the Dose model can go also beyond the pairwise combinations to predict the effects of combining three or more antibiotics or anticancer drugs at various doses, based only on measurements of drug pairs at selected doses. The focus of the present work was solely on pairwise combinations, but in the future extensions, we will also consider higher-order combination prediction using the DECREASE model, once there is large enough high-order dose-response tensor data available for testing its predictions. An additional future improvement would be to assess also the potential toxic effects of the drug combinations for clinical applications. In addition to machine learning predictions, the cost- and time-constraints of high-throughput drug combination screens can be reduced also by novel types of experimental assays, such as reagent-saving and cell-sparing microfluidics devices, to support applications of combinatorial screening45.
Methods
In-house combination experiments
Our in-house experiments involved a total of in 210 anti-cancer combinations among 34 distinct compounds tested in 13 cancer cell lines. The list of the tested compounds along with their concentration ranges, target mechanisms and approval statuses are shown in Supplementary Table S1. The compound-specific concentration ranges were selected based on the single-agent activities and earlier dose-response assays in breast cancer cell lines33, which were used for defining a range of active and inactive concentrations for these agents. The 192 anti-cancer combinations used in developing of the DECREASE model were tested in 8×8 matrix designs in 10 breast cancer cell lines (BT-549, CAL-148, CAL-51, CAL-85–1, DU-4475, HCC-1599, MDA-MB-231, MDA-MD-436, MFM-223, and MCF10A), utilizing the drug-combination assay described in the previous study33. The 18 additional anti-cancer combinations used in validating the model predictions were also tested in 8×8 matrix designs using the same assay in HEK293 (embryonic kidney), HeLa (cervical cancer) and Hep G2 (hepatocellular cancer) cell lines. The cell lines were authenticated using microsatellite markers (Promega GenePrint 10 System, date: August 2, 2017). All the cell lines were grown in larger volume, and were regularly tested for mycoplasma each time assay ready cells were prepared using PCR based test kit and frozen in several ampules.
Before the screens, the cell lines were passaged twice after thawing. The cell lines were maintained in their culture media at 37 °C with 5 % CO2 in a humidified incubator (see Supplementary Table S2 for details of the culture conditions). The compounds and their combinations were plated in 7 different concentrations in half-log dilution (approximately 3-fold), covering a 1000-fold concentration range on clear bottom 384-well plates (Corning #3712), using an Echo 550 Liquid Handler (Labcyte). As positive (total killing) and negative (non-effective) controls, we used 100 μM benzethonium chloride and 0.1 % dimethyl sulfoxide (DMSO), respectively, when calculating the relative inhibition %. All the liquid handling was performed with a MultiDrop Combi dispenser (Thermo Scientific). The pre-dispensed compounds were dissolved in 5 μl of culture medium per well for 1 h on an orbital shaker, followed by 20 μl cell suspension per well seeded in the drugged plates (the optimal final cell densities were adjusted for each cell lines). After 72 h incubation, cell viability was measured using CellTiter-Glo (Promega) reagent. The unpublished dose-response matrix data for the 210 in-house combinations has been made available at http://decrease.fimm.fi/data_availability.
Published combination datasets
A total of 466 anti-cancer compounds combined with ibrutinib were tested in 466 6×6 matrix experiments in the ABC DLBCL line TMD8, cultured in triplicate in 96-well plates as described in the original study11. Briefly, 1,000 cells per well in 5 μL of media (RPMI −1640 supplemented with 5% FBS, 100 U/mL penicillin and 100 μg/mL streptomycin, and 2 mM glutamine) were added into pre-plated matrix plates, immediately after compounds were acoustically dispensed using an ATS-100 (EDC Biosystems). The plates incubated at 37 °C with 5% CO2 under 95% humidity for 48 h and then 3 μL of CellTiter Glo luminescent cell viability assay reagent (Promega) was added using a Bioraptor Flying Reagent Dispenser (Aurora DiscoveryBD) for cell proliferation assays. The plates were then incubated for 15 min at room temperature. The cell viability was measured using a 10 s exposure with a ViewLux (Perkin-Elmer) with a luminescent filter. For apoptosis assays, the cells were incubated for 8 or 16 h, followed by adding 3 μL of Caspase Glo 3/7 luminescent apoptosis assay reagent (Promega). Relative luminescence units (RLU) for each well were normalized using the median RLUs from the negative (DMSO) and positive (bortezomib) controls.
The second published dataset of 104 antimalarial combinations among 29 distinct compounds were tested in 10×10 matrix designs in Plasmodium falciparum strain HB3, as described in the original study10. Briefly, Plasmodium falciparum strain was maintained in RPMI 1640 medium containing 5% human O+ erythrocytes (5% hematocrit), 0.5% Albumax (GIBCO), 24 mM sodium bicarbonate, and 10 μg/ml gentamycin at 37°C with 5% CO2, 5% O2, and 90% N2. Compounds were dissolved in either deionized water, 50% EtOH, or DMSO, depending on solubility. Pre-plating of the chemical library and plating of cells were done similar to the 466 anti-cancer drug combinations as described above for the DLBCL TMD8 cell lines. The parasite inhibition was measured using a the SYBR Green viability assay that measures the SYBR Green I fluorescence emission (530 ± 4.5 nm) from each well at an excitation wavelength of 490 nm. The intensity values in each well were normalized based on intensity from negative control (no compounds) and positive controls (chloroquine at high concentrations).
The third published dataset was from the study of O’Neil et al.3, and it included a total of 22,737 experiments of 583 pairwise combinations measured using a 4×4 dosing regimen in 39 diverse cancer cell lines obtained from ATCC or Sigma-Aldrich. Briefly, cells were plated in 1,536-well tissue culture-treated plates (Brooks Automation) at 400cells/well in 10 mL growth media, followed by the addition of 50 nL of compounds in DMSO and incubation at 37 at 400cells/well in10 mL growth media. The total cell viability of each well was measured using CellTiter-Glo cell viability reagent (Promega), according to the manufacturer’s protocol. The high-throughput screen was carried out using a fully auto-mated GNF PolyTarget robotic platform (GNF Systems).
The fourth published dataset of 78 antiviral drug combinations were tested in the Huh7 liver cells infected with Makona isolate, Ebola virus/H.sapiens-tc/GIN/14/WPG-C05, as described in the original study47. The Huh7 cells were maintained at the Integrated Research Facility, National Institutes of Allergy and Infectious Diseases (NIAID), National Institutes of Health (NIH; Frederick, MD), following cell source instructions described in the original manuscript47. Briefly, drugs in 50-μL of Dulbecco’s modified Eagle’s medium were transferred to the Huh7 cells seeded in black, clear-bottomed, 96-well plates 1 hour prior to inoculation with EBOV/Mak. After 48 hours, plates were fixed, and EBOV/Mak was detected with a mouse antiEBOV VP40 antibody. Drug combination efficacy was measured in triplicates with a 6 × 6 matrix design using Cell Titer Glo assays.
The DECREASE workflow
The DECREASE approach is based on a 2-step procedure: (i) detection of outlier measurements in the sparse matrices obtained by limited measurement experimental designs (e.g. fixed-concentration or diagonal designs; see Fig. 1, upper panel), followed by (ii) prediction of complete dose-response matrices using the cNMF algorithm (Fig. 1, second panel). Finally, top-synergistic drug combinations are identified using any of the synergy models (e.g., Bliss, Loewe, HSA or ZIP; Fig. 1, bottom panel).
Outlier detection
Since the drug-combination inhibition responses in the dose-response matrices are usually non-monotonically increasing as a function of concentration levels, and drug synergy or antagonism may occur at any concentration range, none of the standard distribution-based methods is applicable to the outlier detection. This also complicates the discovery of two closely-located outliers, as they may be confused with synergistic or antagonistic areas. To detect outliers both in combination and individual agent response measurements used for model training, we applied a novel strategy based on the Bliss approximation (Supplementary Fig. S1a). More specifically, the Bliss formula calculates the expected combination responses measured at concentrations d1 and d2 of single drugs as g*12(d1, d2) = g1(d1) + g2(d2)-g1(d1)g2(d2). The deviations between the measured and expected combination responses are calculated as gd (d1, d2) = |g12 (d1, d2) - g*12 (d1,d2)|, where g12 (d1,d2) are the experimentally-measured combination responses (Supplementary Fig. S1a). Next, the outlier(s) X(d1,d2) in deviations gd (d1,d2) are defined as those observations that (i) fall below Q1 − 4 × IQR or above Q3 + 4 × IQR, where Q1 and Q3 are the first and third quartiles, respectively, and IQR is the interquartile range (Supplementary Fig. S1b), and (ii) that deviate more than 25% from the measured inhibition level.
Furthermore, it is important to identify whether the detected outlier is due to error in combination response measurement, error in single drug response, or a result of both of these factors. For this purpose, the individual drug dose-response curves are fitted using the 4-parameter nonlinear logistic equation, also called the Hill equation34, 35 (Supplementary Fig. S1c):
where g is the response of single agent at dose d, a is the minimum asymptote (response at d = 0), b is the maximum asymptote (response at infinite d), c is the half-maximal effective concentration (EC50), and s is the slope of the curve. The fitting of the dose-response curves is done using the drc package (version: 3.0–1)36 in R. In case the single-agent response is deviating more than 10% inhibition from the fitted value of the dose-response curve, then both the combination response g12(d1, d2) as well as the single agent response g(d) are removed as outliers, before going into the full matrix prediction.
Full matrix prediction
The outlier removal is followed by prediction of the missing responses in the sparse dose-response matrix (both non-measured and outlier combinations). We used constrained, weighted non-negative Matrix Factorization (NMF) for the prediction of drug combination response matrices, since the response values are always non-negative (% inhibition ranges between 0 and 100). The constrained NMF imposes additional regularization constraints to reduce overfitting and to enhance the uniqueness of the model estimation solutions, while a weighted formulation of NMF allows one to predict elements of sparse dose-response matrix (X) by assigning zero weights to the missing elements of the matrix and ones to the measured combination responses in the matrix decomposition:
where W is the weight matrix (comprising elements of 0’s and 1’s), U is the basis matrix and V is the coefficient matrix (Supplementary Fig. S11). We utilized an alternating non-negative least square (NNLS) algorithm for NMF decomposition implemented in the NNLM R-package (v 0.4.2). The alternating NNLS algorithm starts by random initialization of U and V matrices, which are iteratively updated to minimize
In this objective function, L is a loss (the mean square error) function, and
Where
Here, I is an identity matrix, E is a matrix with all entries equal to 1, and i and j are the matrix row and column indices respectively. J1 is a ridge penalty to control for the magnitudes and smoothness, J2 is used to minimize correlations among columns and J3 is a LASSO like penalty, which controls both for magnitude and sparsity. The loss function is minimized by firstly fixing U and solving for V using NNLS, and then fixing V and solving for U. This procedure is repeated until the change of X–WUV is small enough (relative tolerance between two successive iterations < 0.0001) or maximum number of iterations (here, 500) is reached. Finally, the multiplication of U and V results in a full NMF predicted matrix.
The NMF requires tuning of matrix rank r and regularization parameters (a1, a2, a3, β1, β2, β3) in order to achieve its optimal performance37. To avoid full parameter optimization, which easily leads to model overfitting, as very few combination measurements are given compared to the full matrix measurements, we adopted an approach similar to multiple imputation to maximize the prediction accuracy. Namely, we run NMF multiple times on the sparse dose-response matrix, with n random parameter sets (r ranging between 2 and 3, and regularization parameters between 0 and 1), resulting in n predicted full response matrices. Due to the regularization, m out of n predicted matrices include multiple zero responses and those m matrices were excluded. Finally, each element of the full dose-response matrix was predicted as Venter mode38 of n - m predicted matrices. We call this approach as composite of weighted penalized Non-negative Matrix Factorization (cNMF).
Synergy scoring and detection
The predicted complete dose-response matrix was utilized to calculate combinatorial landscapes over the full concentration ranges using a selected synergy scoring reference model. To allow straightforward synergy analysis and detection, DECREASE enables users to export the predicted combination response matrices in the format compatible with direct uploading to SynergyFinder25 and Combenefit24 software tools. The overall synergy score for a drug pair is calculated as the average score between the predicted and expected synergy over the dose-response combinatorial matrix with SynergyFinder (Supplementary Fig. 12). The positive and negative scores denote synergy and antagonism, respectively. The Bliss independence model was used as the primary method to score drug combination synergy in the datasets (see Supplementary Fig. S13), although the results based on the Loewe additivity, the highest single agent (HSA) and zero interaction potency (ZIP) model29 are additionally shown for providing complementary information from various models39. Since the dilution factors were constant within each study in this work, the unweighted synergy score was calculated for each drug combination (as implemented in SynergyFinder25). The constant dilution factor for the drugs makes the synergy score calculated by the unweighted synergy sum proportional to a synergy score integrated in the logarithmic space, and avoids the need to normalize the overall synergy scores by the dilution range that is required when non-constant dilution factor is used (the normalization would be needed is those applications to avoid the spurious increase of synergy due to a large number of drug concentrations chosen around the maximum synergy).
Comparative evaluation
Using our in-house dataset of 192 anti-cancer combinations, we compared the performance of cNMF against seven state-of-the-art supervised machine learning algorithms in terms of their prediction accuracy at predicting missing values in the sparse dose-response matrices (Supplementary Fig. S2). For each algorithm, we trained 192 matrix-specific models using the single-agent drug responses and given subset of combination responses (e.g. row/column or diagonal elements) from the dose-response matrix as training data. For each algorithm, the best parameter set was identified using Bayesian optimization (mlrMBO R-package, version: 1.1.1)40, with three times repeated 5-fold cross validation. Supplementary Table S3 lists the R-packages used for implementation of the prediction algorithms in the comparative evaluation, along with their optimized parameters. After model training, each optimized matrix-specific model was used to predict the missing responses within the incomplete matrix it was trained on. The performance of each model was calculated as Root Mean Square Error (RMSE) between the predicted and measured values of the matrix (see Statistical analysis). Our comparative analysis indicated that cNMF together with regularized boosted regression trees (XGBoost) implementing extreme gradient boosting in xgboost R-package41 outperformed all the other supervised machine learning algorithms (Supplementary Fig. S2). To further increase the robustness of our prediction, we therefore complemented our cNMF predictions with the XGBoost results using their averaging as the final DECREASE prediction model. In applications of DECREASE to any given drug combination dataset, the implemented ensemble of cNMF and XGBoost algorithms is optimized and tuned independently for each user-provided incomplete dose-response matrix using the Bayesian optimization with three times repeated 5-fold cross validation.
Code and data availability
The source-code of the DECREASE prediction algorithm is freely available at http://decrease.fimm.fi/source_code to allow replication of the results and also comparing or combining the cNMF algorithm with other prediction models. The source-codes of the other algorithms are publicly available (web-links and package versions are listed in Supplementary Table S3). The recent Dose model42 implementation was provided by request from the authors, Drs. Zimmer and Alon. All the data used in developing and testing of the models are openly available. The unpublished in-house anti-cancer dose-response matrix data for all the 210 combinations are available at http://decrease.fimm.fi/data_availability. The published DLBCL anti-cancer dose-response matrix data are available at http://tripod.nih.gov/matrix-data/m3-btk-6x6-ls11. The data for 22,737 anticancer drug combination from the study of O’Neil et al.3 was downloaded from http://mct.aacrjournals.org/content/15/6/1155.figures-only#fragments-additional-data. The published anti-malarial dose-response matrix data are available at https://tripod.nih.gov/matrix-client/rest/matrix/blocks/506/table10. The published antiviral dose-response matrix data for Ebola treatment are available at https://matrix.ncats.nih.gov/matrix-client/rest/matrix/blocks/6324/table.47
Statistical analysis
Prediction accuracy for synergy detection was evaluated with the Pearson correlation coefficients calculated between the predicted and measured synergy scores, using Bliss, Loewe, the highest single agent (HSA) or the zero interaction potency (ZIP) model29 across all the combinations. Prediction accuracy for dose-combination patterns was evaluated with the root mean squared error (RMSE), calculated between the predicted and measured combination effects (% inhibition), over the dose-response submatrix that was predicted and not used by DECREASE or Dose models in their training phase. The effective concentration (EC) values were estimated based on the single-agent dose-response curves fitted using the 4-parameter nonlinear logistic equation. The relative ECx percentage value corresponds to the concentration that causes an inhibition effect that is x% of the maximum effect for the agent. Statistical significance of the overlap in drug combinations detected using the full dose-response matrix and the DECREASE model was assessed with the Fisher’s exact test. Differences in the synergy scores and combination effects were assessed with either the Welch’s t-test or Wilcoxon’s test, depending on normality of the distributions tested with the Shapiro-Wilk test.
Supplementary Material
Acknowledgements
We thank the authors of the original publications (refs. 3,10,11,47) for making their drug combination data matrices publicly available. We thank the authors of the Dose model publication (ref. 42) for providing their MATLAB implementation and guaranteeing its appropriate use in this work; Dr. Daria Bulanova (FIMM) for her valuable comments and discussions; and Olle Hansson (FIMM) for technical support with the web-server. This work was funded by the Academy of Finland (grants 272577 and 277293 to KW; 292611, 279163, 295504, 310507, 313267 and 326238 to TA), European Union’s Horizon 2020 Research and Innovation Programme (ERA PerMed JAKSTAT-TARGET), Cancer Society of Finland (TA and KW), Sigrid Jusélius Foundation (KW and TA), and Novo Nordisk Foundation (NNF17CC0027852 to KW). The FIMM High Throughput Biomedine Unit is financially supported by the University of Helsinki and Biocenter Finland (SP, JS).
References
- 1.Webster RM Combination therapies in oncology. Nat Rev Drug Discov 2016;15:81–82. [DOI] [PubMed] [Google Scholar]
- 2.Lehár J et al. Synergistic drug combinations tend to improve therapeutically relevant selectivity. Nature Biotechnology 2009;27:659–666. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.O’Neil J et al. An Unbiased Oncology Compound Screen to Identify Novel Combination Strategies. Mol Cancer Ther 2016; 15:1155–62. [DOI] [PubMed] [Google Scholar]
- 4.Radic-Sarikas B et al. Combinatorial Drug Screening Identifies Ewing Sarcoma-specific Sensitivities. Mol Cancer Ther 2017;16:88–101. [DOI] [PubMed] [Google Scholar]
- 5.Pelaia G, Vatrella A & Maselli R The potential of biologics for the treatment of asthma. Nat Rev Drug Discov 2012;11:958–972. [DOI] [PubMed] [Google Scholar]
- 6.Chakradhar S All in one: Researchers create combination drugs for diabetes and obesity. Nat Med 2016;22:694–696. [DOI] [PubMed] [Google Scholar]
- 7.Worthington RJ & Melander C Combination approaches to combat multidrug-resistant bacteria. Trends Biotechnol 2013;31:177–184. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Chandrasekaran S et al. Chemogenomics and orthology-based design of antibiotic combination therapies. Mol Syst Biol 2016;12:872. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Borisy AA et al. Systematic discovery of multicomponent therapeutics. Proc Natl Acad Sci U S A 2003;100:7977–82. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Mott BT et al. High-throughput matrix screening identifies synergistic and antagonistic antimalarial drug combinations. Sci Rep 2015;5:13891. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Mathews Griner LA et al. High-throughput combinatorial screening identifies drugs that cooperate with ibrutinib to kill activated B-cell-like diffuse large B-cell lymphoma cells. Proc Natl Acad Sci U S A 2014;111:2349–2354. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Janzen WP Screening technologies for small molecule discovery: the state of the art. Chem Biol 2014;21:1162–1170. [DOI] [PubMed] [Google Scholar]
- 13.He L et al. Patient-customized Drug Combination Prediction and Testing for T-cell Prolymphocytic Leukemia Patients. Cancer Res 2018;78:2407–2418. [DOI] [PubMed] [Google Scholar]
- 14.Holbeck SL et al. The National Cancer Institute ALMANAC: A Comprehensive Screening Resource for the Detection of Anticancer Drug Pairs with Enhanced Therapeutic Activity. Cancer Res 2017;13:3564–3576. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Roy A et al. Open access high throughput drug discovery in the public domain: a Mount Everest in the making. Curr Pharm Biotechnol 2010;11:764–78. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Kurtz SE, Eide CA, Kaempf A et al. Molecularly targeted drug combinations demonstrate selective effectiveness for myeloid- and lymphoid-derived hematologic malignancies. Proc Natl Acad Sci USA 2017;114:7554–7563. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Yu D et al. Identification of Synergistic, Clinically Achievable, Combination Therapies for Osteosarcoma. Sci Rep 2015;5:16991. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Horn T et al. High-Order Drug Combinations Are Required to Effectively Kill Colorectal Cancer Cells. Cancer Res 2016;76:6950–6963. [DOI] [PubMed] [Google Scholar]
- 19.Chou A et al. Tailored first-line and second-line CDK4-targeting treatment combinations in mouse models of pancreatic cancer. Gut, to appear. doi: 10.1136/gutjnl-2017-315144. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Chou Ting-Chao. Theoretical Basis, Experimental Design, and Computerized Simulation of Synergism and Antagonism in Drug Combination Studies. Pharmacological Reviews September 2006;58:621–681. [DOI] [PubMed] [Google Scholar]
- 21.Sun W et al. Rapid antimicrobial susceptibility test for identification of new therapeutics and drug combinations against multidrug-resistant bacteria. Emerg Microbes Infect 2016;5:e116. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Shehata M et al. Reconstitution of PTEN activity by CK2 inhibitors and interference with the PI3-K/Akt cascade counteract the antiapoptotic effect of human stromal cells in chronic lymphocytic leukemia. Blood 2010;116:2513–2521. [DOI] [PubMed] [Google Scholar]
- 23.Tan X et al. Systematic identification of synergistic drug pairs targeting HIV. Nat Biotechnol 2012;30:1125–1130. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Di Veroli GY et al. Combenefit: an interactive platform for the analysis and visualization of drug combinations. Bioinformatics 2016;32:2866–2868. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Ianevski A, He L, Aittokallio T & Tang J SynergyFinder: a web application for analyzing drug combination dose-response matrix data. Bioinformatics 2017;33:2413–2415. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.The Loewe S. problem of synergism and antagonism of combined drugs. ArzneimiettelForschung 1953;3:286–290 [PubMed] [Google Scholar]
- 27.Bliss CI The toxicity of poisons applied jointly. Ann Appl Biol 1939;26:585–615. [Google Scholar]
- 28.Berenbaum MC What is synergy. Pharmacol. Rev 1989;41:93–141. [PubMed] [Google Scholar]
- 29.Yadav B et al. Searching for drug synergy in complex dose–response landscapes using an interaction potency model. Comput. Struct. Biotechnol. J 2015;13:504–505. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Chevereau G & Bollenbach T Systematic discovery of drug interaction mechanisms. Mol Syst Biol 2015;11:807. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Chou TC Drug combination studies and their synergy quantification using the Chou-Talalay method. Cancer Res 2010;70:440–446. [DOI] [PubMed] [Google Scholar]
- 32.Al-Lazikani B et al. Combinatorial drug therapy for cancer in the post-genomic era. Nat Biotechnol 2012;30:679–692. [DOI] [PubMed] [Google Scholar]
- 33.Gautam P et al. Identification of selective cytotoxic and synthetic lethal drug responses in triple negative breast cancer cells. Mol. Cancer 2016;15:34. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Gadagkar SR & Call GB Computational tools for fitting the Hill equation to dose-response curves. J Pharmacol Toxicol Methods 2015;71:68–76. [DOI] [PubMed] [Google Scholar]
- 35.Hill AV The possible effects of the aggregation of the molecules of hemoglobin on its dissociation curves. The Journal of Physiology 1910;40:i–vii. [Google Scholar]
- 36.Ritz C, Baty F, Streibig JC & Gerhard D Dose-Response Analysis Using R. PLoS One 2015;10:e0146021. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Wang YX & Zhang YJ. et al. , Non-negative Matrix Factorization: a Comprehensive Review. IEEE Transactions on Knowledge and Data Engineering 2013;25:1336–1353. [Google Scholar]
- 38.Venter JH On Estimation of the Mode. The Annals of Mathematical Statistics 1967;38:1446–1455. [Google Scholar]
- 39.Tang J, Wennerberg K, and Aittokallio T What is synergy? The Saariselkä agreement revisited. Front Pharmacol 2015;6:181. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Bischl B, Richter J, Bossek J, Horn D, Thomas J & Lang M. mlrMBO: A Modular Framework for Model-Based Optimization of Expensive Black-Box Functions. arXiv:1703.03373. [Google Scholar]
- 41.Chen T & Guestrin C Xgboost: A scalable tree boosting system. SIGKDD 2016;785–794 ACM. DOI: 10.1145/2939672.2939785. [DOI] [Google Scholar]
- 42.Zimmer A et al. Prediction of multidimensional drug dose responses based on measurements of drug pairs. Proc Natl Acad Sci USA 2016;113:10442–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Weinstein ZB et al. Prediction of synergistic drug combinations. Current Opinion in Systems Biology 2015;4:24–28. [Google Scholar]
- 44.Szwajda A et al. Systematic Mapping of Kinase Addiction Combinations in Breast Cancer Cells by Integrating Drug Sensitivity and Selectivity Profiles. Chem Biol 2015;8:22. [DOI] [PubMed] [Google Scholar]
- 45.Mullard A Microfluidics platform lowers barrier to drug combination screening. Nat Rev Drug Discov 2018; 17:691–692. [DOI] [PubMed] [Google Scholar]
- 46.Eziefula AC Artesunate-mefloquine: a malaria treatment for African children? Lancet Infect Dis 2016;16:1086–1087. [DOI] [PubMed] [Google Scholar]
- 47.Dyall J et al. Identification of Combinations of Approved Drugs With Synergistic Activity Against Ebola Virus in Cell Cultures. J Infect Dis 2018; 218(suppl_5):S672–S678. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
The source-code of the DECREASE prediction algorithm is freely available at http://decrease.fimm.fi/source_code to allow replication of the results and also comparing or combining the cNMF algorithm with other prediction models. The source-codes of the other algorithms are publicly available (web-links and package versions are listed in Supplementary Table S3). The recent Dose model42 implementation was provided by request from the authors, Drs. Zimmer and Alon. All the data used in developing and testing of the models are openly available. The unpublished in-house anti-cancer dose-response matrix data for all the 210 combinations are available at http://decrease.fimm.fi/data_availability. The published DLBCL anti-cancer dose-response matrix data are available at http://tripod.nih.gov/matrix-data/m3-btk-6x6-ls11. The data for 22,737 anticancer drug combination from the study of O’Neil et al.3 was downloaded from http://mct.aacrjournals.org/content/15/6/1155.figures-only#fragments-additional-data. The published anti-malarial dose-response matrix data are available at https://tripod.nih.gov/matrix-client/rest/matrix/blocks/506/table10. The published antiviral dose-response matrix data for Ebola treatment are available at https://matrix.ncats.nih.gov/matrix-client/rest/matrix/blocks/6324/table.47