Hi, given a circular fingerprint such as ECFP, what is the accepted way to evaluate the similarity between two such fingerprints? My first thought is that the features (usually unsigned ints derived from atom environments) get mapped into a fixed-length bit string, after which one proceeds as usual.
nina [ Editor ]
One can calculate Tanimoto scores without explicitly mapping to a bit string, because the Tanimoto formula only needs the number of features in each molecule and the number of features they have in common, not necessarily bits.
For example, the usual formula can be used: Tanimoto = common(NA,NB) / (NA + NB - common(NA,NB)), where NA is the number of fragments (circular fingerprint features) in molecule A, NB is the number of fragments in molecule B, and common(NA,NB) is the number of fragments common to the two molecules.
This means that the length of the (implicit) bit string, which is the number of distinct features in either molecule, NA + NB - common(NA,NB), will be different for different pairs of molecules.
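As a minimal sketch of the formula above, here is how it could look in Python if each fingerprint is held as a set of integer feature identifiers. The function name `tanimoto`, the example feature values, and the choice to return 0.0 when both fingerprints are empty are assumptions made for illustration, not part of any particular toolkit. It also assumes each feature is counted once per molecule (presence or absence, not occurrence counts).

```python
def tanimoto(features_a, features_b):
    """Tanimoto similarity between two collections of feature identifiers.

    Each feature is counted once per molecule (set semantics).
    """
    a, b = set(features_a), set(features_b)
    common = len(a & b)                   # common(NA, NB)
    implicit_length = len(a) + len(b) - common  # NA + NB - common(NA, NB)
    if implicit_length == 0:
        # Assumed convention: two empty fingerprints score 0.0.
        return 0.0
    return common / implicit_length


# Hypothetical feature identifiers, standing in for hashed atom environments.
fp_a = {101, 202, 303, 404}
fp_b = {202, 303, 505}

print(tanimoto(fp_a, fp_b))  # 2 / (4 + 3 - 2) = 0.4
```

Here the implicit bit string for this pair has length 5, the size of the union of the two feature sets; a different pair of molecules would generally give a different length.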