The library is a collection of 1,400 compounds that are analogues of approved drugs. The compounds span several therapeutic areas and drug categories.
By therapeutic area, the library contains 111 compounds related to antiviral therapies, 90 to pain relievers, 89 to allergy medications, 75 to Duchenne Muscular Dystrophy, 68 to immunosuppressants, 46 to Parkinson’s Disease, 45 to antibiotics, and 30 to diabetes treatments.
The library also includes analogues of specific registered drugs: 78 analogues of Chlorcyclizine, 77 of Daclatasvir, 75 of Ataluren, 43 of Istradefylline, 38 of Nitazoxanide, 27 of Nateglinide, 20 of Cloperastine, and 19 of Sulfamethizole.
Analogues of drugs approved by regulatory agencies such as the FDA, EMA, and PMDA hold potential for drug repositioning and can serve as starting points for developing new compounds with improved pharmacokinetic and pharmacodynamic properties.
The library was designed in three steps. First, public databases such as PubChem were mined to select over 100 approved drugs as references. Second, these reference drugs were used to search the stock collection of 1.6 million compounds for similar structures. Tanimoto (Ti) similarity was calculated using ECFP4 circular fingerprints and the Mol2Vec representation, and compounds with a Ti similarity greater than 0.6 to an approved drug were selected. Third, a structurally diverse subset was picked using hierarchical clustering, the Min-Max algorithm, and Dice similarity on ECFP4 fingerprints.
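The similarity filter and the diversity picking described above can be illustrated with a minimal sketch. It is not the workflow actually used to build the library. It assumes each fingerprint is reduced to the set of its "on" bit indices, uses invented toy data, and covers only the Tanimoto threshold and a greedy Min-Max pick on Dice distance. Hierarchical clustering and Mol2Vec are left out. All function and variable names are hypothetical.

```python
from typing import Dict, FrozenSet, List

# Assumption: a fingerprint is the set of "on" bit indices (as an ECFP4-style
# bit vector would provide). The bit sets below are toy values, not real data.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: |A & B| / |A | B|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def dice(a: Fingerprint, b: Fingerprint) -> float:
    """Dice coefficient: 2 * |A & B| / (|A| + |B|)."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def select_analogues(references: Dict[str, Fingerprint],
                     stock: Dict[str, Fingerprint],
                     threshold: float = 0.6) -> List[str]:
    """Keep stock compounds with Tanimoto similarity > threshold to any reference."""
    return sorted(
        name for name, fp in stock.items()
        if any(tanimoto(fp, ref) > threshold for ref in references.values())
    )


def maxmin_pick(fps: Dict[str, Fingerprint], n_pick: int, seed: str) -> List[str]:
    """Greedy Min-Max picking: repeatedly add the candidate whose minimum
    Dice distance to the already picked compounds is largest."""
    picked = [seed]
    # Minimum distance from every remaining candidate to the picked set.
    min_dist = {name: 1.0 - dice(fp, fps[seed])
                for name, fp in fps.items() if name != seed}
    while min_dist and len(picked) < n_pick:
        # sorted() makes tie-breaking deterministic (alphabetical).
        nxt = max(sorted(min_dist), key=lambda name: min_dist[name])
        picked.append(nxt)
        del min_dist[nxt]
        for name in min_dist:
            min_dist[name] = min(min_dist[name], 1.0 - dice(fps[name], fps[nxt]))
    return picked


# Toy example: one reference drug and four stock compounds.
references = {"drug": frozenset({1, 2, 3, 4, 5})}
stock = {
    "a": frozenset({1, 2, 3, 4, 5, 6}),
    "b": frozenset({1, 2, 3, 4}),
    "c": frozenset({1, 2, 3, 9, 10}),
    "d": frozenset({2, 3, 4, 5, 7}),
}

hits = select_analogues(references, stock, threshold=0.6)
diverse = maxmin_pick({name: stock[name] for name in hits}, n_pick=2, seed="a")
print(hits, diverse)
```

In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets, and the seed compound for the diversity pick would be chosen by some rule (for example, at random or as the compound closest to the reference drug). Both choices are assumptions of this sketch.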