Local Environment of Fluorine (LEF) Shift Prediction Tools

Getting Started

A PyPI package is available:

pip install lefshift

You can also clone the repo manually and then install the lefshift package:

git clone https://github.com/PatrickPenner/lefshift.git
cd lefshift
pip install .

Be aware that the data/ and models/ directories used in the following commands are part of the lefshift repo, so those commands assume you are working inside a cloned copy.

Running lefshift --help should give you an overview of how to use the lefshift command-line tool. You can quickly train a model using the Enamine data in data/ like this:

lefshift train data/train.csv --model models/my_model --id-column 'Catalog ID' --shift-column 'Shift 1 (ppm)' --smiles-column 'SMILES' --verbose

Training requires three columns: an ID column, a chemical shift column, and a SMILES column. The above command gives the names of those columns explicitly. You can also check the lefshift train --help output to see the default column names that lefshift expects.
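Before training on your own data, it can help to confirm that the three columns are actually present. The following is a minimal sketch, not part of lefshift: it assumes a comma-delimited file with a header row, and the helper name missing_columns is hypothetical. The column names are the ones passed to the train command above.

```python
import csv
import io

# Column names taken from the train command above; pass different names
# if your file uses other headers.
REQUIRED_COLUMNS = ("Catalog ID", "Shift 1 (ppm)", "SMILES")


def missing_columns(csv_file, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    # An empty file has no header, so every required column counts as missing.
    header = next(csv.reader(csv_file), [])
    return [name for name in required if name not in header]


# Tiny in-memory example; the data row is made up for illustration only.
sample = io.StringIO("Catalog ID,SMILES\nEXAMPLE-1,FC(F)(F)c1ccccc1\n")
print(missing_columns(sample))  # ['Shift 1 (ppm)']
```

For a real file, pass an open file object instead, for example open("data/train.csv", newline="").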

You can perform a prediction with the model you just trained with the following command:

lefshift predict data/test.csv --model models/my_model data/test_predicted.csv --smiles-column 'SMILES' --verbose

Prediction only requires a SMILES column. It is once again given explicitly, but you can instead make sure the input has a column named "SMILES".
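If your input uses a different header for the SMILES column and you would rather not pass --smiles-column each time, one option is to rename the header once. This is a rough sketch using only the Python standard library; rename_column and the header "smiles_string" are hypothetical and not part of lefshift, and the file is assumed to be comma-delimited with a header row.

```python
import csv
import io


def rename_column(src, dst, old_name, new_name="SMILES"):
    """Copy CSV rows from src to dst, renaming one header field."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader)
    if old_name not in header:
        raise ValueError(f"column {old_name!r} not found in header {header}")
    # Only the header row changes; data rows are copied through untouched.
    writer.writerow([new_name if name == old_name else name for name in header])
    writer.writerows(reader)


# In-memory demonstration; "smiles_string" is a hypothetical header name
# and the data row is made up for illustration.
source = io.StringIO("id,smiles_string\n1,FC(F)(F)c1ccccc1\n")
target = io.StringIO()
rename_column(source, target, "smiles_string")
print(target.getvalue())
```

With real files, open both with newline="" so the csv module handles line endings itself.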

To split a data set into those samples in the applicability domain of the model and those not in the applicability domain run this command:

lefshift split data/test.csv --model models/my_model data/test_known.csv data/test_unknown.csv --verbose

Splitting also only requires a SMILES column. The SMILES column in the input is already called "SMILES" so we don't have to be explicit.
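After splitting, a quick way to see how much of a data set falls inside the applicability domain is to count the rows in the two output files. The sketch below assumes both outputs are CSV files with a single header row, which the source does not state explicitly; count_rows and known_fraction are hypothetical helper names, not lefshift functions.

```python
import csv
import io


def count_rows(csv_file):
    """Count non-empty data rows in a CSV file object, excluding the header."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header line if there is one
    return sum(1 for row in reader if row)


def known_fraction(n_known, n_unknown):
    """Fraction of samples that landed in the 'known' output file."""
    total = n_known + n_unknown
    return n_known / total if total else 0.0


# In-memory stand-ins for data/test_known.csv and data/test_unknown.csv;
# the rows are made up for illustration.
known = io.StringIO("SMILES\nFC(F)(F)c1ccccc1\nFc1ccccc1\nFCC\n")
unknown = io.StringIO("SMILES\nFC(F)F\n")
n_known, n_unknown = count_rows(known), count_rows(unknown)
print(n_known, n_unknown, known_fraction(n_known, n_unknown))  # 3 1 0.75
```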

For more advanced usage please check the --help output of lefshift and the subtools train, predict, and split. There is also a QM Assisted ML Tutorial.md that describes a workflow to combine lefshift and lefqm.

Development

Formatting pre-commit hook

Pre-commit is used only to enforce consistent formatting, chiefly through the black formatter and its code style.

Install the formatting pre-commit hook with:

pre-commit install

Code Quality

The quality criteria below are targets and are applied with some leniency.

Criteria                 Threshold
pylint                   >9.0
coverage (overall)       >90%
coverage (single file)   >80%
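As an illustration of how these thresholds could be checked together, here is a small sketch. It is not part of the repository: the function quality_failures and its inputs are assumptions, and the scores have to be obtained separately from the pylint and coverage reports. Since the criteria are applied leniently, treat the result as a hint rather than a hard gate.

```python
# Thresholds copied from the criteria listed above.
PYLINT_MIN = 9.0
OVERALL_COVERAGE_MIN = 90.0
FILE_COVERAGE_MIN = 80.0


def quality_failures(pylint_score, overall_coverage, file_coverage):
    """Return human-readable descriptions of every threshold that is not met.

    file_coverage maps a file name to its coverage percentage.
    """
    failures = []
    if pylint_score <= PYLINT_MIN:
        failures.append(f"pylint score {pylint_score} is not above {PYLINT_MIN}")
    if overall_coverage <= OVERALL_COVERAGE_MIN:
        failures.append(
            f"overall coverage {overall_coverage}% is not above {OVERALL_COVERAGE_MIN}%"
        )
    for name, percent in sorted(file_coverage.items()):
        if percent <= FILE_COVERAGE_MIN:
            failures.append(
                f"{name} coverage {percent}% is not above {FILE_COVERAGE_MIN}%"
            )
    return failures


# Made-up numbers and file names, for illustration only.
for line in quality_failures(9.4, 92.0, {"example_a.py": 95.0, "example_b.py": 71.5}):
    print(line)
```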

Utility commands

Pre-commit on one file:

pre-commit run --files <file>

Test command:

python -m unittest tests

PyLint command:

pylint lefshift tests > pylint.out

Coverage commands:

coverage run -m unittest tests
coverage html
