(For background on the title, and use of the term 'bicycle' here, see http://www.straightdope.com/columns/read/1173/why-is-a-raven-like-a-writing-desk .)
Suppose fingerprint fp has no bits set. What should Tanimoto(fp, fp) be? There are three answers I've come across:
1. 0.0, since neither is like a bicycle (this is what I do)
2. 1.0, since both are equally like a bicycle (this is what OpenEye and RDKit do)
3. +infinity (this is what OpenBabel and CDK do)
I justify my answer in the context of search results. If the query has no bits set, then there should be no preference for any target, whereas answers #2 and #3 would always rank the targets that have no bits set above everything else.
However, I'm obviously in the minority here. Does anyone have any practical experience with this? (I already know that a fingerprint with 0 bits set isn't practical.)
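To make the three conventions concrete, here is a minimal Python sketch. It is not the API of OpenEye, RDKit, OpenBabel, or CDK; the names `tanimoto`, `empty_value`, and `rank`, and the toy 4-bit fingerprints stored as plain integers, are all assumptions made up for illustration. The only special case is the one in question: when both fingerprints are empty the ratio is 0/0, so the function returns whichever convention the caller picks.

```python
import math


def tanimoto(a: int, b: int, empty_value: float = 1.0) -> float:
    """Tanimoto similarity of two fingerprints stored as Python ints.

    `empty_value` (hypothetical parameter) is returned when neither
    fingerprint has any bits set, where the ratio would be 0/0.
    """
    union = bin(a | b).count("1")
    if union == 0:
        # Both fingerprints are empty: the ratio is undefined, so use the
        # caller's convention (0.0, 1.0, or +infinity).
        return empty_value
    return bin(a & b).count("1") / union


def rank(query: int, targets: dict, empty_value: float) -> list:
    """Return target names sorted by decreasing similarity to the query."""
    scores = {name: tanimoto(query, fp, empty_value) for name, fp in targets.items()}
    # sorted() is stable, so tied scores keep their original order.
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Toy data: one target with no bits set and two ordinary targets.
    targets = {"A": 0b0110, "B": 0b1111, "empty": 0b0000}
    empty_query = 0b0000
    for label, value in [("0.0", 0.0), ("1.0", 1.0), ("+infinity", math.inf)]:
        print(label, rank(empty_query, targets, value))
```

With an empty query, the 0.0 convention scores every target 0.0, so nothing is preferred and the original order is kept. The 1.0 and +infinity conventions both move the empty target to the top of the hit list.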
Imported from: http://blueobelisk.stackexchange.com/questions/217
Okay, I’ll change my code to make it return 1.0, in agreement with OpenEye, RDKit and the upcoming change to CDK, and everyone’s responses.