For chemfp I’ve implemented interfaces to 8 or so different fingerprint generation methods in several different toolkits, plus come up with my own cross-platform variation of the PubChem/CACTVS fingerprints.
rajarshi guha [ Editor ] from Bethesda, United States of America
I think your approach to measuring effectiveness will be guided by which of two tasks a fingerprint is being used for: finding similar structures (database screening) or finding active compounds given a query compound (virtual screening).
If you’re doing the latter, you could try using the MUV datasets and measuring effectiveness by the ranks of the designated actives when each dataset is ordered by similarity to the query compound. Since each dataset has 15 (or 30?) actives, you can compute some sort of overall score from those ranks. (MUV is probably a tough test case, since the datasets are designed to avoid the problems associated with benchmarking 2D virtual screening methods.) See slides 153-156 in http://www.slideshare.net/rguha/cheminformatics-in-r for an example.
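To make the rank-based idea concrete, here is a minimal sketch in plain Python. It is not taken from the slides or from chemfp. It assumes fingerprints are represented as sets of on-bit indices, uses Tanimoto similarity as the ordering measure, and uses the mean rank of the actives as one possible overall score. The names `tanimoto`, `active_ranks`, and `mean_rank`, and the toy compounds, are all hypothetical; a real test would load fingerprints for a MUV dataset from your toolkit of choice.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def active_ranks(query_fp, library, actives):
    """1-based ranks of the actives when the library is sorted by
    decreasing similarity to the query fingerprint."""
    ordered = sorted(
        library,
        key=lambda cid: tanimoto(query_fp, library[cid]),
        reverse=True,
    )
    return sorted(
        rank for rank, cid in enumerate(ordered, start=1) if cid in actives
    )


def mean_rank(ranks):
    """One simple overall score: lower means actives are retrieved earlier."""
    return sum(ranks) / len(ranks)


# Toy data (made up for illustration): compound id -> set of on bits.
query = {1, 2, 3, 4}
library = {
    "active_1": {1, 2, 3, 5},
    "active_2": {1, 2, 8, 9},
    "decoy_1": {1, 2, 3, 4, 5},
    "decoy_2": {7, 8, 9},
    "decoy_3": {4, 10, 11},
}
actives = {"active_1", "active_2"}

ranks = active_ranks(query, library, actives)
print(ranks, mean_rank(ranks))
```

Repeating this with each active as the query and each fingerprint type in turn gives a table of scores to compare. Mean rank is only one choice; an enrichment factor or ROC AUC computed from the same ranks would serve equally well.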