DrugBank Dataset

Dataset Overview

Our drug–drug interaction prediction model is trained on a curated dataset derived from DrugBank. The dataset is split into training, validation, and test sets so that hyperparameter tuning and final evaluation use separate data. It is publicly available on Zenodo.

Total Drug Pairs: 868,069
Unique Drugs: 2,957
Interaction Types: 178
Features (SMILES): 3,780

Download Dataset Splits

Access our preprocessed dataset splits for reproducible research

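| Dataset Split  | Samples | Percentage | Description                  |
|----------------|---------|------------|------------------------------|
| Training Set   | 520,841 | 60%        | Used for model training      |
| Test Set       | 173,614 | 20%        | Used for final evaluation    |
| Validation Set | 173,614 | 20%        | Used for hyperparameter tuning |

Total Samples: 868,069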
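The split sizes listed above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper name `split_sizes` and the rounding rule (floor the training share, then halve the remainder) are assumptions, not the authors' documented splitting procedure. It only shows that a 60/20/20 partition of 868,069 pairs is consistent with the reported counts.

```python
def split_sizes(total: int, train_pct: int = 60) -> tuple[int, int, int]:
    """Return (train, validation, test) sizes for a train_pct / rest / rest split.

    Hypothetical helper: the training share is floored with integer arithmetic,
    and the remaining samples are divided as evenly as possible between
    validation and test.
    """
    n_train = total * train_pct // 100
    remainder = total - n_train
    n_val = remainder // 2
    n_test = remainder - n_val
    return n_train, n_val, n_test


TOTAL_PAIRS = 868_069  # total drug pairs reported for the dataset

n_train, n_val, n_test = split_sizes(TOTAL_PAIRS)

# Print each split with its share of the full dataset.
for name, n in [("train", n_train), ("validation", n_val), ("test", n_test)]:
    print(f"{name:<10} {n:>8,} {n / TOTAL_PAIRS:7.2%}")
```

Under these assumptions the helper yields 520,841 training pairs and 173,614 pairs each for validation and test, matching the reported counts and summing to 868,069.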

Citation

If you use this dataset in your research, please cite our work:

Kha, Q.-H., Nguyen, D.-Q.-A., Pham, V.-H.-P., Huynh, K.-M.-U., Pham, D.-K., Phung, M.-T., Huynh, T.-P. et al. T-DDI: Robust Prediction of Drug Interactions using Chemical Descriptors. npj Digital Medicine (under review).