feat: add tracing spans to ScanExec and TakeExec by wjones127 · Pull Request #3766 · lancedb/lance

feat: add tracing spans to ScanExec and TakeExec #3766


Merged: wjones127 merged 3 commits into lancedb:main on May 5, 2025

wjones127 (Contributor) commented Apr 30, 2025
  • Fix spans to record object_store::path::Path with .as_ref(), so it displays as "/my/path" instead of Path { inner: "/my/path/" }.
  • Add size fields to some object store spans.
  • Pass spans down into the background IO loop of the scheduler, so that all IO calls associated with a query appear in its spans.
  • Change the read_deletion_file span to start only if there is a deletion file. This reduces span spam in traces where there are no deletion files.
  • Change the apply_deletions span to start only if there are deletions. As above, this reduces span clutter in traces.
  • Add spans to MergeInsertJob::execute_uncommitted_impl and CommitBuilder::execute so the write and commit steps are clearly visible in each attempt of MergeInsert job retries.
  • Add spans for LanceScanExec::execute and TakeExec::execute. Each also records the IO metrics as span attributes.
  • perf: In V2 scans, use out-of-order buffering of read futures if the user didn't request ordered output for the scan.
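
To illustrate the first item, here is a minimal, std-only sketch of why recording a path through `.as_ref()` reads better than recording it with its derived `Debug` output. The `Path` struct below is a hypothetical stand-in that mirrors the shape shown in the description; it is not the real `object_store::path::Path`, and `record_debug` / `record_as_ref` are made-up helper names rather than functions from this PR or from the `tracing` crate.

```rust
/// Hypothetical stand-in for a path newtype (NOT the real
/// `object_store::path::Path`); it only mirrors the shape
/// `Path { inner: ... }` quoted in the description.
#[derive(Debug)]
struct Path {
    inner: String,
}

impl AsRef<str> for Path {
    fn as_ref(&self) -> &str {
        &self.inner
    }
}

/// Assumed helper: the text a span field would carry if the path
/// were recorded through its derived `Debug` implementation.
fn record_debug(path: &Path) -> String {
    format!("{:?}", path)
}

/// Assumed helper: the text a span field would carry if the path
/// were first converted to `&str` with `.as_ref()`.
fn record_as_ref(path: &Path) -> String {
    path.as_ref().to_string()
}
```

With a `Path` whose `inner` is `/my/path`, `record_debug` yields the struct dump `Path { inner: "/my/path" }`, while `record_as_ref` yields just `/my/path`. A trace viewer may additionally wrap string values in quotes, which would match the `"/my/path"` form shown in the description.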

github-actions bot added the enhancement label Apr 30, 2025
github-actions bot added the python label May 2, 2025
@codecov-commenter

Codecov Report

Attention: Patch coverage is 88.99083% with 12 lines in your changes missing coverage. Please review.

Project coverage is 78.55%. Comparing base (1f7b270) to head (a29d678).
Report is 5 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| rust/lance-io/src/object_store/tracing.rs | 69.23% | 3 Missing and 1 partial ⚠️ |
| rust/lance-encoding/src/decoder.rs | 88.46% | 3 Missing ⚠️ |
| rust/lance/src/io/exec/scan.rs | 88.46% | 3 Missing ⚠️ |
| rust/lance-table/src/utils/stream.rs | 85.71% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3766      +/-   ##
==========================================
- Coverage   78.60%   78.55%   -0.05%     
==========================================
  Files         272      273       +1     
  Lines      101951   101933      -18     
  Branches   101951   101933      -18     
==========================================
- Hits        80138    80075      -63     
- Misses      18656    18690      +34     
- Partials     3157     3168      +11     
Flag Coverage Δ
unittests 78.55% <88.99%> (-0.05%) ⬇️


wjones127 marked this pull request as ready for review May 5, 2025 21:01
wjones127 merged commit 985a488 into lancedb:main May 5, 2025
28 of 30 checks passed
HaochengLIU pushed a commit to HaochengLIU/lance that referenced this pull request May 7, 2025