By: Asked from Israel

What tools are available for clustering small molecules?

What tools are available for clustering small molecules? I have a set of, say, 1000 small molecules, and I want to select a non-redundant, diverse subset.

baoilleach [ Admin ]

To my mind, diverse set selection is best done directly using the distance matrix. The Kennard-Stone algorithm, for example, is easy to understand and easy to implement: start with the two most distant molecules, then repeatedly add the molecule that is farthest from those already selected (that is, the one whose distance to its nearest selected molecule is largest). Stop when you have enough diverse molecules. I don’t know of any available tool for this, though.
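As a rough illustration of that procedure, here is a minimal Python sketch. It assumes you already have a symmetric N x N distance matrix (for example, 1 minus Tanimoto similarity between fingerprints). The function name `kennard_stone` and the toy data are made up for this example and do not come from any particular package.

```python
from typing import List, Sequence


def kennard_stone(dist: Sequence[Sequence[float]], n_select: int) -> List[int]:
    """Pick n_select diverse items from a symmetric N x N distance matrix.

    Hypothetical helper for illustration; returns indices in selection order.
    """
    n = len(dist)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of items")

    # Seed with the most distant pair.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]

    # min_dist[k] = distance from item k to its nearest selected item.
    min_dist = [min(dist[k][first], dist[k][second]) for k in range(n)]

    while len(selected) < n_select:
        chosen = set(selected)
        # Add the unselected item whose nearest selected neighbour is farthest away.
        candidate = max(
            (k for k in range(n) if k not in chosen),
            key=lambda k: min_dist[k],
        )
        selected.append(candidate)
        # Refresh nearest-selected distances to account for the new pick.
        min_dist = [min(min_dist[k], dist[k][candidate]) for k in range(n)]

    return selected


# Toy data: five "molecules" placed on a line, distance = absolute difference.
points = [0.0, 1.0, 2.0, 10.0, 11.0]
dist = [[abs(a - b) for b in points] for a in points]
print(kennard_stone(dist, 3))  # prints [0, 4, 2]
```

Finding the seed pair scans the whole matrix once, and each later pick is a single pass over the remaining items, so a set of 1000 molecules is well within reach even in plain Python.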

By the way, if you are selecting a diverse set for use as a training set, you may want to reconsider – the results on the test set will be overly optimistic if you use anything but a random set.

Comments
chem-bla-ics

Do you have a literature reference for the K-S algorithm?

baoilleach
@chem-bla-ics: No, but you just start with the pair of molecules most distant, and keep selecting additional molecules that are most distant from the selected ones. Stop when you have enough diverse molecules.
