iBioProVis uses its own in-house bioactivity dataset, curated from ChEMBL (v25), starting from 15,506,670 data points (i.e., bioactivity measurements). Several filtering and pre-processing steps were applied to generate a reliable compound-target dataset:
  • Data points were filtered based on “assay type” (i.e., binding), “target type” (i.e., single protein), “taxonomy” (i.e., mammalian) and “standard type” (i.e., IC50, XC50, EC50, AC50, Ki, Kd and Potency) attributes.
  • Duplicate bioactivities for the same compound-target protein pair were merged into a single data point by taking the median of their bioactivity values.
  • Finally, the bioactivities without an assigned pChEMBL value were removed, since these data points are considered less reliable.
  • After all these preprocessing and filtering steps, the number of bioactivity measurements was reduced to 890,887.
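
The steps above can be sketched with pandas as follows. This is only an illustrative sketch, not the pipeline actually used to build the iBioProVis dataset: it assumes the ChEMBL activities have already been exported to a single flat table, and the column names (`compound_id`, `target_id`, `assay_type`, `target_type`, `tax_class`, `standard_type`, `pchembl_value`), the value encodings (`"B"`, `"SINGLE PROTEIN"`, `"Mammalia"`) and the function name `curate_bioactivities` are all assumptions made for the example. It also assumes the median is taken over pChEMBL values, which the description does not state explicitly.

```python
import pandas as pd

# Standard types kept by the filter, as listed in the description above.
ALLOWED_STANDARD_TYPES = {"IC50", "XC50", "EC50", "AC50", "Ki", "Kd", "Potency"}


def curate_bioactivities(df: pd.DataFrame) -> pd.DataFrame:
    """Filter, de-duplicate and clean a flat table of bioactivity records.

    All column names and value encodings are hypothetical; adapt them to
    the actual export of the ChEMBL tables.
    """
    # Step 1: keep binding assays on single-protein mammalian targets
    # that report one of the allowed standard types.
    keep = (
        (df["assay_type"] == "B")
        & (df["target_type"] == "SINGLE PROTEIN")
        & (df["tax_class"] == "Mammalia")
        & df["standard_type"].isin(ALLOWED_STANDARD_TYPES)
    )
    filtered = df[keep]

    # Step 2: merge duplicate measurements of each compound-target pair
    # into one row holding the median value (missing values are skipped).
    merged = filtered.groupby(
        ["compound_id", "target_id"], as_index=False
    )["pchembl_value"].median()

    # Step 3: drop pairs that have no pChEMBL value at all.
    return merged.dropna(subset=["pchembl_value"]).reset_index(drop=True)
```

Calling `curate_bioactivities` on such a table returns one row per compound-target pair, so its length corresponds to the final count of bioactivity measurements reported above.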

    You can download the dataset from this link.