ENH: cogent3 Tree classes now record their source by GavinHuttley · Pull Request #2351 · cogent3/cogent3 · GitHub

ENH: cogent3 Tree classes now record their source #2351

Merged (9 commits) on Jun 12, 2025

Conversation

GavinHuttley (Collaborator) commented Jun 12, 2025

Summary by Sourcery

Record and propagate a tree’s origin throughout the library by adding a source attribute to tree nodes and related data objects.

Enhancements:

  • Add a source property to TreeNode and PhyloNode backed by the node’s params dict
  • Capture and set the source when creating or loading trees via make_tree and load_tree
  • Propagate the source attribute through tree-manipulation apps (e.g., scaling, uniformizing) and distance computations
  • Extend get_data_source to handle tree nodes and alignments for consistent source retrieval
  • Improve type annotations (PEP 604 unions such as str | bytes, and frozenset) and refine copy exclusions

Tests:

  • Add unit tests to verify the source attribute on trees created from strings, files, and datasets
  • Add tests to ensure source is preserved through pipeline apps and I/O operations
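
The first enhancement, a source property backed by the node's params dict, can be pictured with a minimal sketch. `MiniNode` is a hypothetical stand-in written for illustration; it is not the cogent3 `TreeNode` class, and the real implementation may differ in how it handles `None` or non-string values.

```python
from __future__ import annotations


class MiniNode:
    """Hypothetical stand-in for a tree node; not the cogent3 class."""

    def __init__(
        self,
        name: str | None = None,
        params: dict[str, object] | None = None,
    ) -> None:
        self.name = name
        self.params = params or {}

    @property
    def source(self) -> str | None:
        # Read from params so the value travels with the node's metadata.
        return self.params.get("source")

    @source.setter
    def source(self, value: str | None) -> None:
        # Assumption: setting None clears the entry rather than storing it.
        if value is None:
            self.params.pop("source", None)
        else:
            self.params["source"] = str(value)


node = MiniNode(name="root")
node.source = "demo.tree"
```

Keeping the value in `params` rather than in a separate instance attribute means anything that already serializes or copies `params` picks up the source without extra code.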

sourcery-ai bot (Contributor) commented Jun 12, 2025

Reviewer's Guide

This PR introduces a “source” attribute to tree nodes (TreeNode/PhyloNode), ensures it’s set during tree construction and loading, extends the data store to retrieve source metadata from trees, propagates this metadata through tree-transforming apps and distance calculators, and updates tests and type hints to reflect these enhancements.

Sequence Diagram: Tree Initialization and source attribution via load_tree

sequenceDiagram
    participant Client
    participant load_tree_func as "load_tree()"
    participant FileSystem
    participant make_tree_func as "make_tree()"
    participant CreatedTree as "Created Tree Object"

    Client->>load_tree_func: load_tree(filename, ...)
    activate load_tree_func
    load_tree_func->>FileSystem: Read treestring from filename
    FileSystem-->>load_tree_func: treestring
    load_tree_func->>make_tree_func: make_tree(treestring, ..., source=filename)
    activate make_tree_func
    make_tree_func-->>make_tree_func: Internal parsing & tree creation using treestring
    Note right of make_tree_func: make_tree sets tree_object.source = filename
    make_tree_func-->>load_tree_func: tree_object_with_source (as CreatedTree)
    deactivate make_tree_func
    load_tree_func-->>Client: tree_object_with_source
    deactivate load_tree_func
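
The flow in this diagram can be sketched as follows. The names `SketchTree`, `make_tree_sketch`, and `load_tree_sketch` are hypothetical and exist only for this illustration; Newick parsing is deliberately omitted, and the actual `make_tree` and `load_tree` signatures in cogent3 take more arguments than shown here.

```python
from __future__ import annotations

import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class SketchTree:
    """Hypothetical container; real tree parsing is omitted."""

    newick: str
    params: dict[str, object] = field(default_factory=dict)

    @property
    def source(self):
        return self.params.get("source")

    @source.setter
    def source(self, value):
        self.params["source"] = value


def make_tree_sketch(treestring: str, source: str | None = None) -> SketchTree:
    # Build the tree from the string, then record where it came from.
    tree = SketchTree(newick=treestring.strip())
    if source is not None:
        tree.source = source
    return tree


def load_tree_sketch(filename: str | Path) -> SketchTree:
    # Read the file, then delegate to the string-based constructor,
    # passing the filename along as the source.
    path = Path(filename)
    return make_tree_sketch(path.read_text(), source=str(path))


with tempfile.TemporaryDirectory() as tmp:
    tree_path = Path(tmp) / "demo.tree"
    tree_path.write_text("(a:0.1,b:0.2);\n")
    loaded = load_tree_sketch(tree_path)
    loaded_source_name = Path(loaded.source).name
```

A tree built directly from a string with no `source` argument simply reports `None` in this sketch.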

Sequence Diagram: Propagation of source attribute in scale_branches app

sequenceDiagram
    participant Caller
    participant scale_branches_main as "scale_branches.main()"
    participant InputTree as "Input Tree"
    participant OutputTree as "Output Tree (copy)"

    Caller->>scale_branches_main: main(input_tree)
    activate scale_branches_main
    scale_branches_main->>InputTree: Get original_source (input_tree.source)
    activate InputTree
    InputTree-->>scale_branches_main: original_source
    deactivate InputTree
    scale_branches_main->>InputTree: deepcopy()
    activate InputTree
    InputTree-->>scale_branches_main: new_tree_instance (as OutputTree)
    deactivate InputTree
    Note right of scale_branches_main: Perform scaling operations on OutputTree
    scale_branches_main->>OutputTree: Set source = original_source (output_tree.source = original_source)
    activate OutputTree
    deactivate OutputTree
    scale_branches_main-->>Caller: modified_tree (OutputTree)
    deactivate scale_branches_main
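
A minimal sketch of the capture, copy, transform, reapply pattern in this diagram. `ScaledTreeSketch` and `scale_branches_sketch` are hypothetical names. The custom `__deepcopy__` that drops the source is an assumption made purely so the example shows why reapplying matters; it is not a statement about how cogent3 copies trees.

```python
from __future__ import annotations

import copy
from dataclasses import dataclass, field


@dataclass
class ScaledTreeSketch:
    """Hypothetical tree reduced to a mapping of branch name to length."""

    lengths: dict[str, float]
    params: dict[str, object] = field(default_factory=dict)

    @property
    def source(self):
        return self.params.get("source")

    @source.setter
    def source(self, value):
        self.params["source"] = value

    def __deepcopy__(self, memo):
        # Assumption for illustration only: the copy does not carry "source".
        kept = {
            key: copy.deepcopy(value, memo)
            for key, value in self.params.items()
            if key != "source"
        }
        return ScaledTreeSketch(lengths=dict(self.lengths), params=kept)


def scale_branches_sketch(tree: ScaledTreeSketch, scalar: float) -> ScaledTreeSketch:
    original_source = tree.source      # 1. capture before copying
    result = copy.deepcopy(tree)       # 2. work on a copy, not the input
    result.lengths = {
        name: length * scalar for name, length in result.lengths.items()
    }                                  # 3. transform
    result.source = original_source    # 4. reapply the source
    return result


tree = ScaledTreeSketch(lengths={"a": 0.5, "b": 1.5})
tree.source = "demo.tree"
scaled = scale_branches_sketch(tree, 2.0)
```

Reapplying the source explicitly keeps the app correct regardless of what the copy step preserves, at the cost of one extra assignment.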

Sequence Diagram: Retrieving source from TreeNode via get_data_source

sequenceDiagram
    participant Caller
    participant get_data_source_func as "get_data_source()"
    participant TreeNodeInstance as "tree_node : TreeNode"

    Caller->>get_data_source_func: get_data_source(tree_node)
    activate get_data_source_func
    get_data_source_func->>TreeNodeInstance: .source (access property)
    activate TreeNodeInstance
    TreeNodeInstance-->>get_data_source_func: source_value
    deactivate TreeNodeInstance
    get_data_source_func-->>Caller: source_value
    deactivate get_data_source_func
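
The change list below describes "registrations" for tree node types, which suggests type-based dispatch. Assuming that means `functools.singledispatch`, the lookup could be sketched like this. `NodeSketch`, `PhyloNodeSketch`, and `get_source_sketch` are hypothetical names, not the cogent3 API, and the fallback of returning `None` for unknown types is also an assumption.

```python
from __future__ import annotations

from functools import singledispatch
from pathlib import Path


class NodeSketch:
    """Hypothetical node type exposing a .source attribute."""

    def __init__(self, source: str | None = None) -> None:
        self.source = source


class PhyloNodeSketch(NodeSketch):
    """Hypothetical subclass; dispatch reaches it through NodeSketch."""


@singledispatch
def get_source_sketch(data: object) -> str | None:
    # Assumed fallback: types without a registered handler have no source.
    return None


@get_source_sketch.register(NodeSketch)
def _(data: NodeSketch) -> str | None:
    # Tree nodes report whatever was recorded on them.
    return data.source


@get_source_sketch.register(Path)
def _(data: Path) -> str | None:
    # Paths report their final component.
    return data.name


node_source = get_source_sketch(PhyloNodeSketch(source="demo.tree"))
path_source = get_source_sketch(Path("/tmp/data/demo.tree"))
```

Because dispatch follows the method resolution order, registering the base class is enough for subclasses in this sketch; the PR itself is described as registering both `TreeNode` and `PhyloNode`.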

Class Diagram: TreeNode and PhyloNode with new 'source' property

classDiagram
    class TreeNode {
        +name: str | None
        +children: list~TreeNode~ | None
        -_parent: TreeNode | None
        +params: dict~str, object~ | None
        +name_loaded: bool
        +source: str | None
        +__init__(name: str | None, children: list~TreeNode~ | None, parent: TreeNode | None, params: dict~str, object~ | None, name_loaded: bool)
    }
    class PhyloNode {
        %% Inherits members from TreeNode
    }
    TreeNode <|-- PhyloNode

File-Level Changes

Changes, with details and affected files:

TreeNode now tracks a source attribute and refines internal behavior
  • Added @property and setter for .source storing value in params
  • Replaced mutable dict in _exclude_from_copy with frozenset
  • Refined __init__ type annotations and simplified children truth check
  Files: src/cogent3/core/tree.py

make_tree and load_tree record and pass along the tree’s source
  • Added optional source parameter to make_tree signature
  • Assigned source to root nodes in both tip-based and string-based branches
  • Set tree.source in load_tree for JSON and file parsing paths
  Files: src/cogent3/core/tree.py

Extended get_data_source to support tree nodes and return .source
  • Updated StrOrBytes alias to new union syntax
  • Added get_data_source registrations for TreeNode and PhyloNode returning .source
  • Refactored alignment and path handlers to use direct .source or Path.name
  Files: src/cogent3/app/data_store.py

Apps and distance modules now preserve source metadata
  • Stored and reapplied tree.source in scale_branches and uniformize_tree apps
  • Replaced direct info.source assignments with get_data_source calls in fast_slow_dist and jaccard_dist
  • Updated NJ app to set tree.source via get_data_source
  Files: src/cogent3/app/tree.py, src/cogent3/app/dist.py

Test suite updated to validate source propagation and fixtures
  • Switched file-path fixtures to use DATA_DIR and adjusted param names
  • Added tests for .source on TreeNode/PhyloNode and pipelines
  • Modified IO tests to expect ‘source’ or ‘info’ key based on env
  • Expanded evo and align tests to confirm source preservation
  Files: tests/test_core/test_tree.py, tests/test_app/test_tree.py, tests/test_app/test_io.py, tests/test_app/test_evo.py, tests/test_app/test_align.py

sourcery-ai bot (Contributor) left a comment

Hey @GavinHuttley - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟢 General issues: all looks good
  • 🟢 Security: all looks good
  • 🟡 Testing: 3 issues found
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good


GavinHuttley merged commit 01f89cb into cogent3:develop on Jun 12, 2025
15 of 16 checks passed
coveralls (Collaborator) commented Jun 12, 2025

Pull Request Test Coverage Report for Build 15608655109

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 52 of 54 (96.3%) changed or added relevant lines in 5 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.02%) to 90.716%

Changes Missing Coverage        Covered Lines  Changed/Added Lines  %
src/cogent3/app/data_store.py   19             21                   90.48%

Totals
Change from base Build 15572657767: 0.02%
Covered Lines: 30055
Relevant Lines: 33131

💛 - Coveralls
