10000 Exhaustive fragmenter returns oversaturated fragment containers · Issue #1119 · cdk/cdk · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Exhaustive fragmenter returns oversaturated fragment containers #1119

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
ToLeWeiss opened this issue Oct 20, 2024 · 4 comments
Open

Exhaustive fragmenter returns oversaturated fragment containers #1119

ToLeWeiss opened this issue Oct 20, 2024 · 4 comments

Comments

@ToLeWeiss
Copy link

Dear CDK-Developers,

i encountered oversaturated fragments in the exhaustive fragmenter, when extracting the fragments as atom containers via getFragmentsAsContainers. In contrast, when extracting the fragments via the getFragments method, it returns the expected fragments with the right saturation. So this occurs only when extracting fragments as containers.

Code to replicate

IFragmenter fragmenter = new ExhaustiveFragmenter();
SmilesParser tmpSmiPar = new SmilesParser(SilentChemObjectBuilder.getInstance());
SmilesGenerator tmpSmiGen = new SmilesGenerator((SmiFlavor.UseAromaticSymbols));
IAtomContainer tmpOriginalMolecule = tmpSmiPar.parseSmiles(
    // DL-Phenylalanin CID	994
    "C1=CC=C(C=C1)CC(C(=O)O)N");
fragmenter.generateFragments(tmpOriginalMolecule);
String[] smilesFragments = fragmenter.getFragments();
IAtomContainer[] atomContainerFragments = fragmenter.getFragmentsAsContainers();
System.out.println("Fragments as SMILES:");
for (String fragment : smilesFragments) {
    System.out.println(fragment);
}
System.out.println("Fragments as containers:");
for (IAtomContainer container : atomContainerFragments) {
    System.out.println(tmpSmiGen.create(container));
}

the above, results in the following output:

Fragments as SMILES:
O=C(O)C(N)C
c1ccccc1
NCCc1ccccc1
c1ccc(cc1)C
Fragments as containers:
C[CH2]([CH](=O)O)N
c1ccccc1
C([CH3][cH]1ccccc1)N
C[cH]1ccccc1
@johnmay
Copy link
Member
johnmay commented Oct 20, 2024

Thanks will take a look, probably just the container one is reusing atoms. You can probably fix it by calling atom typing/add hydrogens but should clean it up as we go.

I actually think it makes more sense to leave the atoms undervalent (radicals) but some might not expect that

@johnmay
Copy link 8000
Member
johnmay commented Oct 20, 2024

Yes it is as I expected, basically internally it does this before generating each fragment:

AtomContainerManipulator.clearAtomConfigurations(frag);
for (IAtom atom : frag.atoms())
    atom.setImplicitHydrogenCount(null);
AtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(frag);
CDKHydrogenAdder.getInstance(frag.getBuilder()).addImplicitHydrogens(frag);
Aromaticity.cdkLegacy().apply(frag);

I'll look a bit more if there is an easy fix, like I said you might not like what I think the correct answer is:

C1=CC=C(C=C1)CC(C(=O)O)N:
 [c]1ccccc1
 [CH2]C(C(=O)O)N
 [CH](Cc1ccccc1)N
 [CH2]c1ccccc1

or with attachment points:

 *c1ccccc1 |$_AP1$|
 *CC(C(=O)O)N  |$_AP1$|
 *C(Cc1ccccc1)N  |$_AP1$|
 *Cc1ccccc1  |$_AP1$|

@ToLeWeiss
Copy link
Author

Thank you for your quick response

I actually think it makes more sense to leave the atoms undervalent (radicals) but some might not expect that

I also prefer this variant because it lets the user of the CDK decide whether to saturate or not.
I would be interested in implementing these changes if it would be helpful.

@johnmay
Copy link
Member
johnmay commented Oct 21, 2024

Sure, I think actually you just remove the code I pasted above... perhaps add an option for it though.

4586
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants
0