8000 CWL Parsing by illusional · Pull Request #22 · common-workflow-lab/python-cwlgen · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content
This repository was archived by the owner on Jun 9, 2020. It is now read-only.

CWL Parsing #22

Merged
merged 20 commits into from
Jul 5, 2019
Merged

CWL Parsing #22

merged 20 commits into from
Jul 5, 2019

Conversation

illusional
Copy link
Collaborator

Hey all,

This is the start of my parsing a CWL into the workflow classes, with the goal for it to be almost automatic (with a couple of annotated hints).

Most of the work is in utils, based on my previous Serializable submission.

https://github.com/illusional/python-cwlgen/blob/parsing/cwlgen/utils.py#L69-L177

I make two assumptions:

  1. CWL you parse is valid, there's no sanity checking (you can use CWLTool) for that.
  2. The field names in CWL (and the dict) EXACTLY match the attribute names on the class. It will throw a KeyException if this is not the case.

Essentially, it loops through the keys on the CWL Dictionary and sets that attribute (if it exists) on the required class. You can provide an ordered list of "potential" classes that it might be, and it will do its best to go through until it finds one.

You can see how this works on the Workflow (L41-44) class.

I've swapped out the converter and left the CommandLineTool tests (from @khillion) the same (except for secondaryFiles) and it seems to work okay.

@kellrott, do you mind having a look at my code, as both yourself at @khillion are the people who have worked on workflow parsing, and this should just extend it to Workflow (for now).

@codecov
Copy link
codecov bot commented May 28, 2019

Codecov Report

Merging #22 into master will increase coverage by 6.38%.
The diff coverage is 83%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #22      +/-   ##
==========================================
+ Coverage   76.75%   83.13%   +6.38%     
==========================================
  Files           9       10       +1     
  Lines         684      593      -91     
==========================================
- Hits          525      493      -32     
+ Misses        159      100      -59
Impacted Files Coverage Δ
cwlgen/commandlinetool.py 87.34% <100%> (+0.85%) ⬆️
cwlgen/__init__.py 100% <100%> (ø) ⬆️
cwlgen/common.py 79.43% <100%> (+0.59%) ⬆️
cwlgen/workflow.py 91.83% <100%> (+10.35%) ⬆️
cwlgen/workflowdeps.py 79.31% <79.31%> (ø)
cwlgen/utils.py 82.72% <80.24%> (-6.93%) ⬇️
cwlgen/import_cwl.py 94.44% <87.5%> (+28.8%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ca06118...71ff30a. Read the comment docs.

@illusional illusional mentioned this pull request Jul 5, 2019
:type glob: STRING
:param load_contents: For each file matched, read up to the 1st 64 KiB of text and
place it in the contents field
:type load_contents: BOOLEAN

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice this isn't an actual Python type (should be bool). Is this intentional? It doesn't seem ideal

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a legacy thing that I haven't changed. Might be worth changing in #24

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If/once you deprecate Python 2, you can make these all PEP 484 type annotations and then most tooling should work better

@@ -15,13 +16,109 @@
# Class(es) ------------------------------


class CommandOutputBinding(Serializable):
Copy link
8000

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These classes should probably be dataclasses, no? That would save you from having to define the __init__ methods all the time. Admittedly they require Python 3.6 (via backport) or Python 3.7 (natively). Not sure what versions you're supporting.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woah that's sick! I didn't know Dataclasses were a thing! When this project drops EOL support for Python 2.7 I reckon this is definitely the way to go.

@@ -0,0 +1,111 @@
#!/usr/bin/env python

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you think about doing an integration test that does CWL → Python → CWL and comparing the input and output CWL?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be a good test to improve this translation mechanism. For this change I know it won't produce identical CWL because it's opinionated on the dict / array output of inputs / outputs / steps.

It would be good to functionally validate the CWL tbh.

@illusional
Copy link
Collaborator Author

Thanks @TMiguelT, all tests pass so I'm going to merge this!

@illusional illusional merged commit d322635 into common-workflow-lab:master Jul 5, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0