Added check to keep consistency with alphabet by whitead · Pull Request #143 · ur-whitelab/exmol · GitHub

Added check to keep consistency with alphabet #143


Merged
merged 3 commits into from
Dec 4, 2023

Conversation

@whitead (Contributor) commented Nov 27, 2023

Fixes #139

@whitead requested a review from geemi725 November 27, 2023 18:45
@whitead merged commit 89d7626 into main Dec 4, 2023
@whitead deleted the issue-139 branch December 4, 2023 18:04
@jangerit commented

Thanks for adding the consistency check - I had the same problem as described in #139.

As a follow-up: for SMILES strings that contain explicit hydrogens (the symbol H), the consistency check raises an error (for start_smiles) or discards the molecules proposed by STONED, because H is not explicitly encoded in the SELFIES alphabet.

For instance, I tried to generate counterfactuals for the SMILES C[C@H]1CC[C@H](CC1)C, which raised a ValueError: symbols not in alphabet {'H'} (lines 397 ff. in exmol.py, triggered by the initial check in line 429). Should "H" be excluded from the alphabet consistency check?
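
To make the question concrete, here is a minimal sketch of an element-level consistency check that skips "H". It assumes the check collects element symbols from the SMILES with a simple regular expression; the names `smiles_elements`, `missing_symbols`, and the `ignore` parameter are hypothetical and are not taken from exmol.py.

```python
import re
from typing import Iterable, Set

# Assumption: element symbols are one uppercase letter plus an optional
# lowercase letter. This simple pattern is only suitable for SMILES without
# lowercase aromatic atoms (e.g. "Cc1ccccc1" would yield a bogus "Cc" token).
ELEMENT_PATTERN = re.compile(r"[A-Z][a-z]?")


def smiles_elements(smiles: str) -> Set[str]:
    """Return the set of element-like tokens found in a SMILES string."""
    return set(ELEMENT_PATTERN.findall(smiles))


def missing_symbols(
    smiles: str,
    alphabet_elements: Iterable[str],
    ignore: Iterable[str] = ("H",),
) -> Set[str]:
    """Return elements in the SMILES that are neither in the alphabet nor ignored."""
    return smiles_elements(smiles) - set(alphabet_elements) - set(ignore)


smiles = "C[C@H]1CC[C@H](CC1)C"
alphabet_elements = {"C", "N", "O"}  # illustrative alphabet, not exmol's default

# Without the exclusion, the bracket hydrogen is reported as missing.
strict = missing_symbols(smiles, alphabet_elements, ignore=())
# With "H" ignored, the same SMILES passes the check.
relaxed = missing_symbols(smiles, alphabet_elements)
```

Whether ignoring "H" is the right behavior is a design question for the maintainers; the sketch only shows that the exclusion would let this SMILES through.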

By the way: lines 397-399 initially raised TypeError: can only concatenate str (not "set") to str. I therefore changed them to:

raise ValueError(
    "symbols not in alphabet" + str(smiles_symbols.difference(alphabet_symbols))
)
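
For reference, the TypeError can be reproduced in isolation, since Python does not implicitly convert a set to a string during concatenation. The snippet below is a standalone sketch with a made-up `missing` set, not code from exmol.py; the f-string variant is one possible alternative that also adds a separator before the symbols.

```python
missing = {"H"}  # illustrative value standing in for the set difference

# Concatenating str and set fails, which matches the reported TypeError.
try:
    message = "symbols not in alphabet" + missing
except TypeError as err:
    error_text = str(err)

# Converting explicitly avoids the error; sorted() gives a stable ordering.
fixed_message = f"symbols not in alphabet: {sorted(missing)}"
```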

Thanks a lot for your help in advance.

Development

Successfully merging this pull request may close these issues.

STONED implementation adds nitrogen
3 participants