Hi, given a circular fingerprint such as ECFP, what is the accepted way to evaluate the similarity between two such fingerprints? My first thought is that the features (usually unsigned ints derived from atom environments) get mapped into a fixed-length bit string, after which one proceeds as usual.
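To make that idea concrete, here is a minimal sketch, assuming the features are plain integer identifiers and that folding is done by taking each identifier modulo the bit-string length (one common scheme, not necessarily what any particular toolkit does). The names `fold_features` and `tanimoto` and the feature values are made up for illustration, and "proceeds as usual" is assumed here to mean the Tanimoto coefficient.

```python
from typing import Iterable


def fold_features(features: Iterable[int], n_bits: int = 1024) -> int:
    """Fold integer feature IDs into an n_bits-wide bit string.

    The bit string is stored as a Python int. Folding is by modulo,
    which is an assumption for this sketch; distinct features can
    land on the same bit (a collision).
    """
    bits = 0
    for feature in features:
        bits |= 1 << (feature % n_bits)
    return bits


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient: shared on-bits / on-bits in either string."""
    common = bin(a & b).count("1")
    union = bin(a | b).count("1")
    return common / union if union else 0.0


# Hypothetical feature identifiers for two molecules (not real ECFP output).
mol_a = [3218693969, 2245384272, 864662311, 98513984]
mol_b = [3218693969, 2245384272, 2976033787]

fp_a = fold_features(mol_a)
fp_b = fold_features(mol_b)
similarity = tanimoto(fp_a, fp_b)
print(f"Tanimoto similarity: {similarity:.3f}")
```

A shorter bit string saves space but raises the chance that two unrelated features set the same bit.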
Noel O'Boyle [ Admin ]
I guess you just need to assume that if two molecules share several bits, it is not by chance (i.e. a collision of two distinct fragments) but due to a real similarity in structure. It makes one think of the E-values that Baldi worked on for the significance of matches as a function of database size (à la BLAST).
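As a rough illustration of the collision concern, the sketch below compares Tanimoto on the raw feature sets with Tanimoto after folding. It assumes modulo folding and uses made-up feature values chosen so that they collide at a deliberately tiny length of 32 bits; `tanimoto_sets` and `fold` are hypothetical helper names.

```python
def tanimoto_sets(a: set, b: set) -> float:
    """Tanimoto coefficient on two sets of features (or of bit positions)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def fold(features: set, n_bits: int) -> set:
    """Return the set of bit positions after modulo folding (an assumption)."""
    return {f % n_bits for f in features}


# Made-up feature IDs: only feature 10 is genuinely shared.
feats_a = {10, 20, 30}
feats_b = {10, 52, 94}  # 52 % 32 == 20 and 94 % 32 == 30, so both collide

exact = tanimoto_sets(feats_a, feats_b)
folded_small = tanimoto_sets(fold(feats_a, 32), fold(feats_b, 32))
folded_large = tanimoto_sets(fold(feats_a, 1024), fold(feats_b, 1024))

print(exact, folded_small, folded_large)
```

With these contrived values the 32-bit folding makes the two molecules look identical, while the 1024-bit folding reproduces the unfolded result. Real fingerprints are far sparser relative to their length, which is why shared bits are normally taken as evidence of shared structure.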