Discovery Studio

senliu

How to cluster molecules in an unsupervised way

I have found that DS can cluster molecules in two supervised ways: to get the library clustered, you must fix either the average number of molecules per cluster or the number of clusters.

So I want to know whether we can cluster the molecules just by setting a Tanimoto coefficient threshold (e.g. 0.8)?

Wolfgang (BIOVIA)
Dear Sen Liu,

The method applied in the Cluster Ligands protocol (under the Design and Analyze Libraries tools) is already "unsupervised": only independent descriptors are taken into account, and the property of interest (e.g. activity) is not used in the process.

What exactly are you trying to achieve? Typically, the protocol is used to select subsets, and the number of compounds is the limiting factor, because you can only synthesize, buy, or screen so many.

When you open the protocol in the Pipeline Pilot client, you will find a MaximumDistance parameter (under Clustering Options), which you could expose for use from Discovery Studio.
Please note, however, that it is applied only after an initial clustering has been performed based on either the Number of Clusters or the Average Number of Molecules per Cluster.
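
For comparison, here is a minimal sketch of what clustering by a Tanimoto threshold alone could look like outside Discovery Studio. It is not the algorithm used by the Cluster Ligands protocol and does not call any BIOVIA API. It assumes each fingerprint is available as a set of on-bit indices and uses a simple "leader" scheme: a molecule joins the first cluster whose leader it matches at or above the threshold, otherwise it starts a new cluster. The names `tanimoto` and `leader_cluster` are made up for this illustration. If distance is taken as 1 - similarity (an assumption here, not a statement about how MaximumDistance is defined), a similarity of 0.8 corresponds to a distance of 0.2.

```python
from typing import FrozenSet, List, Sequence

Fingerprint = FrozenSet[int]  # assumed representation: indices of the on bits


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared on bits / on bits in either fingerprint."""
    if not a and not b:
        return 1.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def leader_cluster(fps: Sequence[Fingerprint], threshold: float = 0.8) -> List[List[int]]:
    """Group fingerprint indices; the number of clusters is an outcome, not an input."""
    leaders: List[int] = []         # index of the first member of each cluster
    clusters: List[List[int]] = []  # member indices per cluster
    for i, fp in enumerate(fps):
        for c, leader in enumerate(leaders):
            if tanimoto(fp, fps[leader]) >= threshold:
                clusters[c].append(i)
                break
        else:
            # No existing leader is similar enough, so start a new cluster.
            leaders.append(i)
            clusters.append([i])
    return clusters


# Toy fingerprints, invented for this example only.
fps = [
    frozenset({1, 2, 3, 4, 5}),
    frozenset({1, 2, 3, 4, 5, 6}),
    frozenset({10, 11, 12}),
    frozenset({10, 11, 12, 13}),
]
clusters = leader_cluster(fps, threshold=0.8)
print(clusters)
```

The result of a leader scheme depends on the input order, and a strict threshold can produce many singleton clusters, which is one reason subset selection is usually driven by the number of clusters instead.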