Wrong behaviour of the permutation tag for non-tetrahedral stereochemistry #7948

Open
Strandgaard96 opened this issue Oct 21, 2024 · 0 comments
Describe the bug

Two potentially related issues:

  1. Embedding a transition metal complex (TMC) SMILES with an octahedral chiral tag and permutation number produces an incorrect ligand permutation. When the same SMILES is embedded multiple times, the resulting 3D structure appears to alternate at random between two different ligand permutations.
  2. Canonicalizing some TMC SMILES changes the permutation tag every time the SMILES is canonicalized.

To Reproduce

Embedding

Embed the OH1 example from the "Support for non-tetrahedral atomic stereochemistry" section of the RDKit book multiple times.
The resulting structure alternates between the OH1 and OH2 permutations.

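from rdkit import Chem
from rdkit.Chem import rdDistGeom

mol1 = Chem.MolFromSmiles("O[Co@OH1](Cl)(C)(N)(F)P")
Chem.SanitizeMol(mol1)
mol1 = Chem.AddHs(mol1)
# Embed with ETKDG (no fixed random seed, so repeated runs can differ)
_ = rdDistGeom.EmbedMolecule(mol1)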
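
To put a number on the alternation, one option is to embed with several seeds and record the handedness of each result. The following is a minimal sketch, not part of the original report. The helper names `signed_volume`, `handedness` and `classify_embeddings` are hypothetical, and the default atom indices assume the atom order of the SMILES above (O=0, Co=1, Cl=2, C=3). The sign is only meaningful when the two chosen ligands end up cis to each other and to the axial atom.

```python
def signed_volume(metal, axial, eq_a, eq_b):
    """Triple product ((eq_a - metal) x (eq_b - metal)) . (axial - metal)."""
    ax, ay, az = (eq_a[i] - metal[i] for i in range(3))
    bx, by, bz = (eq_b[i] - metal[i] for i in range(3))
    cx, cy, cz = (axial[i] - metal[i] for i in range(3))
    cross = (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)
    return cross[0] * cx + cross[1] * cy + cross[2] * cz


def handedness(metal, axial, eq_a, eq_b):
    """Return +1 or -1 for the two mirror-image arrangements, 0 if degenerate."""
    v = signed_volume(metal, axial, eq_a, eq_b)
    return (v > 0) - (v < 0)


def classify_embeddings(smiles, seeds, metal=1, axial=0, eq_a=2, eq_b=3):
    """Embed `smiles` once per seed and map each seed to a handedness sign.

    RDKit is imported lazily so the geometry helpers stay dependency-free.
    The default indices are an assumption tied to "O[Co@OH1](Cl)(C)(N)(F)P".
    """
    from rdkit import Chem
    from rdkit.Chem import rdDistGeom

    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    results = {}
    for seed in seeds:
        work = Chem.Mol(mol)  # copy so every seed starts from the same molecule
        if rdDistGeom.EmbedMolecule(work, randomSeed=seed) < 0:
            results[seed] = None  # embedding failed for this seed
            continue
        conf = work.GetConformer()
        pts = []
        for idx in (metal, axial, eq_a, eq_b):
            p = conf.GetAtomPosition(idx)
            pts.append((p.x, p.y, p.z))
        results[seed] = handedness(*pts)
    return results
```

Calling `classify_embeddings("O[Co@OH1](Cl)(C)(N)(F)P", range(20))` returns a seed-to-sign mapping. A mix of +1 and -1 values would be consistent with the alternation described above; the sketch does not claim which sign corresponds to OH1.
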
Canonicalization

We start with a non-canonical SMILES. Canonicalizing it yields one permutation tag. Reading the canonical SMILES into a new mol object and writing that back to SMILES yields a different permutation tag.
None of the canonicalized SMILES corresponds to the starting structure.

from rdkit import Chem

smi = "[O+]#[C-]->[Mo@OH25]1(<-[C-]#[O+])(<-O)(<-O)<-P(CCP->1)"
mol1 = Chem.MolFromSmiles(smi)
print(f"Starting smiles:      {smi}")

# First canonicalization
canon_smi = Chem.CanonSmiles(smi)
print(f"Canonical smiles:     {canon_smi}")
mol2 = Chem.MolFromSmiles(canon_smi)

# Round trip: the output should match canon_smi, but the tag changes
new_canon_smi = Chem.MolToSmiles(mol2)
print(f"New canonical smiles: {new_canon_smi}")
mol3 = Chem.MolFromSmiles(new_canon_smi)
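
The round trip can also be written as a fixed-point check: a canonical form should map to itself when canonicalized again. The sketch below is an illustration under that assumption. `roundtrip_history`, `is_stable` and `rdkit_canonicalize` are hypothetical helper names; only `Chem.MolFromSmiles` and `Chem.MolToSmiles` are RDKit calls.

```python
def roundtrip_history(smiles, canonicalize, rounds=4):
    """Apply `canonicalize` repeatedly and return every intermediate string."""
    history = [smiles]
    for _ in range(rounds):
        history.append(canonicalize(history[-1]))
    return history


def is_stable(history):
    """True if every canonicalized form (all but the input) is identical."""
    return len(set(history[1:])) == 1


def rdkit_canonicalize(smiles):
    """Parse and rewrite a SMILES with RDKit (imported lazily)."""
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return Chem.MolToSmiles(mol)
```

With RDKit installed, `roundtrip_history(smi, rdkit_canonicalize)` lists the successive SMILES strings, and `is_stable` on that list reports whether the permutation tag (and everything else in the string) stops changing after the first canonicalization.
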

Expected behavior

  1. When embedding the same SMILES multiple times, the 3D structure should be identical. Ligands should not be permuted.
  2. The specified stereochemistry of a canonicalized SMILES should correspond to the input SMILES stereochemistry.

Screenshots

Embedding

Images of the two structures (OH1,OH2) obtained when embedding an OH1 SMILES multiple times as shown above:

First possible embedding: [image: first_mol]

Second possible embedding: [image: second_mol]

We observe the same behavior for other pairs, e.g. (OH3/OH16), (OH27/OH28) and (OH21/OH22).

We can see that in NontetrahedralStereo.cpp these pairs have the same entries in octahedral_across, which might be related.

Canonicalization

Example of the canonicalization affecting the embedding of a SMILES with specified stereochemistry.
One axial CO ligand switches to an equatorial position.

[image: first_mol_canon]

[image: second_mol_canon]

Configuration:

  • RDKit version: 2024.03.3 (also tested on 2022.09.5 and 2024.09.1 with the same outcome)
  • OS: Ubuntu 22.04.5
  • Python version: 3.10.14
  • RDKit installed from conda

Additional context

This seems to relate to functionality introduced with commit cd74dc2.

The issue may already be addressed by the recently merged PR #6777.
