Describe the bug
In RDKit 2024.09.1 I am seeing different HetAtomTautomerv2 hashes being generated depending on the order of the atoms and/or bonds in the input. It appears to be related to the recent fix #7502 that shrinks tautomeric zones in the v2 tautomer/protomer hash - I don't see the issue if I revert that change.
To Reproduce
>>> from rdkit import Chem
>>> from rdkit.Chem import rdMolHash
>>> # Same molecule with atoms in different order in input SMILES
>>> mol1 = Chem.MolFromSmiles("CNC(=O)N[C@@H](C)c1ccccc1")
>>> mol2 = Chem.MolFromSmiles("C[C@H](NC(=O)NC)c1ccccc1")
>>> Chem.MolToSmiles(mol1) == Chem.MolToSmiles(mol2)
True
>>> hash1 = rdMolHash.MolHash(mol1, rdMolHash.HashFunction.HetAtomTautomerv2)
>>> hash2 = rdMolHash.MolHash(mol2, rdMolHash.HashFunction.HetAtomTautomerv2)
>>> hash1 == hash2
False
>>> hash1
'[CH3]-[N]:[C](:[O]):[N]:[C](-[CH3])-[c]1:[cH]:[cH]:[cH]:[cH]:[cH]:1_3_0'
>>> hash2
'[C]:[N]:[C](:[O]):[N]-[C@@H](-[CH3])-[c]1:[cH]:[cH]:[cH]:[cH]:[cH]:1_5_0'
Interestingly, using Chem.RenumberAtoms with _smilesAtomOutputOrder on both molecules in this example to get a consistent atom order does not seem to fix this - they still produce different hashes. The bonds are still in a different order so I presume that is the issue. In general I don't think it is straightforward to consistently renumber atoms/bonds for all different tautomers/protomers anyway, so I don't think it is possible to fix this just by renumbering atoms/bonds ahead of hash generation.
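For reference, a minimal sketch of the renumbering attempt described above. The helper name `renumber_to_smiles_order` is hypothetical, and reading `_smilesAtomOutputOrder` through `GetPropsAsDict` is an assumption about how the property is accessed. The sketch also prints whether the bond lists of the two renumbered molecules line up, since bond order is the suspected cause.

```python
from rdkit import Chem
from rdkit.Chem import rdMolHash


def renumber_to_smiles_order(mol):
    """Return a copy of mol with atoms in canonical SMILES output order.

    Hypothetical helper for illustration only.
    """
    # MolToSmiles stores the atom output order as a private, computed property.
    Chem.MolToSmiles(mol)
    props = mol.GetPropsAsDict(includePrivate=True, includeComputed=True)
    order = [int(i) for i in props["_smilesAtomOutputOrder"]]
    # RenumberAtoms expects newOrder[i] = old index of the atom placed at i.
    return Chem.RenumberAtoms(mol, order)


mol1 = Chem.MolFromSmiles("CNC(=O)N[C@@H](C)c1ccccc1")
mol2 = Chem.MolFromSmiles("C[C@H](NC(=O)NC)c1ccccc1")
ren1 = renumber_to_smiles_order(mol1)
ren2 = renumber_to_smiles_order(mol2)

# Compare atom order (by element) and bond order (by atom index pairs).
atoms1 = [a.GetSymbol() for a in ren1.GetAtoms()]
atoms2 = [a.GetSymbol() for a in ren2.GetAtoms()]
bonds1 = [(b.GetBeginAtomIdx(), b.GetEndAtomIdx()) for b in ren1.GetBonds()]
bonds2 = [(b.GetBeginAtomIdx(), b.GetEndAtomIdx()) for b in ren2.GetBonds()]
print("atom order matches:", atoms1 == atoms2)
print("bond order matches:", bonds1 == bonds2)

# Hashes after renumbering; no expected result is claimed here.
hash_fn = rdMolHash.HashFunction.HetAtomTautomerv2
print(rdMolHash.MolHash(ren1, hash_fn) == rdMolHash.MolHash(ren2, hash_fn))
```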
Expected behavior
Expect atom order not to affect hashes, i.e. hash1 == hash2 in the above example.
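The expected behavior could be checked more broadly with something like the sketch below, which shuffles the atom order at random and collects the distinct hashes. The function name `hashes_over_random_orders` and its parameters are hypothetical, not part of RDKit; an order-independent hash function should yield a set with exactly one element.

```python
import random

from rdkit import Chem
from rdkit.Chem import rdMolHash


def hashes_over_random_orders(smiles, hash_function, n_trials=20, seed=0):
    """Collect the distinct hashes seen across random atom renumberings.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    mol = Chem.MolFromSmiles(smiles)
    seen = set()
    for _ in range(n_trials):
        # Build a random permutation of the atom indices.
        order = list(range(mol.GetNumAtoms()))
        rng.shuffle(order)
        shuffled = Chem.RenumberAtoms(mol, order)
        seen.add(rdMolHash.MolHash(shuffled, hash_function))
    return seen
```

Calling it with `rdMolHash.HashFunction.HetAtomTautomerv2` on the SMILES from the example above and checking that the returned set has length 1 would make a reasonable regression test.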
Configuration (please complete the following information):
RDKit version: 2024.09.1
Python version (if relevant): 3.10
Are you using conda? No
If you are not using conda: how did you install the RDKit? Source compile