-
Notifications
You must be signed in to change notification settings - Fork 96
Automatic tracing::Span
propagation
#266
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
In async Rust without using `ractor`, annotating async fns with `#[tracing::instrument]` will cause an appropriate nesting hierarchy to be created when calling async fns from other async fns. This preserves the callstack and is very useful when leveraging other crates in the tracing ecosystem. When using `ractor`, this automatic span nesting is lost since using `call` and `cast` cause the current span to not be propagated. It is possible to manually propagate the span by having a `tracing::Span` field in your custom message struct for the particular Actor you're impling. However, if you have many enum variants of your message and wish to propagate the span in all of them, it is cumbersome. Additionally, you must manually instrument any futures you invoke within your message handler with the span from the message you're handling. This is error-prone. In this PR I add the automatic inclusion of the current Span when instantiating a `BoxedMessage`. Then when the handler is invoked, I instrument the future with that captured Span. This allows both `cast` and `call` to automatically benefit from tracing span propagation. The only thing a user of `ractor` would need to do to benefit from this is annotate their `handle` fn in their `impl Actor for ActorStruct` logic with `#[tracing::instrument]`. Even if the user did not do this, the current span could still be retrieved since we're propagating it anyways. Test Plan: `cargo bench -p ractor` on main ``` Running benches/actor.rs (target/release/deps/actor-fd9a9f57eedaa4d3) Gnuplot not found, using plotters backend Creation of 100 actors time: [266.39 µs 270.86 µs 275.36 µs] Found 9 outliers among 100 measurements (9.00%) 1 (1.00%) low severe 5 (5.00%) low mild 3 (3.00%) high mild Creation of 10000 actors time: [31.654 ms 32.843 ms 34.145 ms] Found 5 outliers among 100 measurements (5.00%) 3 (3.00%) high mild 2 (2.00%) high severe Benchmarking Waiting on 100 actors to process first message: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.9s, enable flat sampling, or reduce sample count to 50. Waiting on 100 actors to process first message time: [1.0713 ms 1.1400 ms 1.2095 ms] Waiting on 1000 actors to process first message time: [12.655 ms 13.147 ms 13.635 ms] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) low mild Waiting on 100000 messages to be processed time: [14.885 ms 15.123 ms 15.522 ms] Found 14 outliers among 100 measurements (14.00%) 5 (5.00%) low severe 4 (4.00%) low mild 2 (2.00%) high mild 3 (3.00%) high severe Running benches/async_traits.rs (target/release/deps/async_traits-e1754e50a02bb4eb) Gnuplot not found, using plotters backend Waiting on 50 messages with large data in the Future to be processed time: [120.73 µs 122.43 µs 124.51 µs] Found 6 outliers among 100 measurements (6.00%) 3 (3.00%) high mild 3 (3.00%) high severe ``` `cargo bench -p ractor` on these changes ``` Running benches/actor.rs (target/release/deps/actor-fd9a9f57eedaa4d3) Gnuplot not found, using plotters backend Creation of 100 actors time: [268.98 µs 274.91 µs 281.19 µs] change: [+0.4473% +4.3969% +8.4713%] (p = 0.03 < 0.05) Change within noise threshold. Found 9 outliers among 100 measurements (9.00%) 1 (1.00%) low severe 5 (5.00%) low mild 2 (2.00%) high mild 1 (1.00%) high severe Creation of 10000 actors time: [26.730 ms 27.284 ms 27.868 ms] change: [-20.488% -16.925% -13.452%] (p = 0.00 < 0.05) Performance has improved. Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) high mild 1 (1.00%) high severe Benchmarking Waiting on 100 actors to process first message: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.2s, enable flat sampling, or reduce sample count to 60. Waiting on 100 actors to process first message time: [851.50 µs 884.69 µs 928.90 µs] change: [-29.934% -26.442% -22.701%] (p = 0.00 < 0.05) Performance has improved. Found 5 outliers among 100 measurements (5.00%) 1 (1.00%) low mild 2 (2.00%) high mild 2 (2.00%) high severe Waiting on 1000 actors to process first message time: [8.1509 ms 8.3325 ms 8.5340 ms] change: [-39.237% -36.622% -33.672%] (p = 0.00 < 0.05) Performance has improved. Found 8 outliers among 100 measurements (8.00%) 5 (5.00%) high mild 3 (3.00%) high severe Waiting on 100000 messages to be processed time: [17.252 ms 17.303 ms 17.358 ms] change: [+11.411% +14.419% +16.342%] (p = 0.00 < 0.05) Performance has regressed. Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild Running benches/async_traits.rs (target/release/deps/async_traits-e1754e50a02bb4eb) Gnuplot not found, using plotters backend Waiting on 50 messages with large data in the Future to be processed time: [124.63 µs 125.51 µs 126.45 µs] change: [+2.6848% +3.7470% +4.7820%] (p = 0.00 < 0.05) Performance has regressed. Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild ```
ractor
, annotating async fns with #[tracing::instrument]
will cause an appropriate nesting hierarchy to be created when calling async fns from other async fns. This preserves the callstack and is very useful when leveraging other crates in the tracing ecosystem. When using ractor
, this automatic span nesting is lost since using call
and cast
cause the current span to not be propagated. It is possible to manually propagate the span by having a tracing::Span
field in your custom message struct for the particular Actor you're impling. However, if you have many enum variants of your message and wish to propagate the span in all of them, it is cumbersome. Additionally, you must manually instrument any futures you invoke within your message handler with the span from the message you're handling. This is error-prone.tracing::Span
propagation
I did some more benchmarks. I ran
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #266 +/- ##
==========================================
+ Coverage 79.95% 79.98% +0.02%
==========================================
Files 60 60
Lines 10273 10287 +14
==========================================
+ Hits 8214 8228 +14
Misses 2059 2059 ☔ View full report in Codecov by Sentry. |
We do appear to take ~17% hit in message processing performance (benchmark which runs consistently in the 13s range degraded to 15s). As for memory
measured with |
47ac581
to
ec66563
Compare
c1fc1ae
to
7312f84
Compare
7312f84
to
ad9d53f
Compare
I'm disabling the async-std CI suite. There's a flaky test that I can't repro outside of github's CI pipeline and no one uses it anyways (since we have a hard dep on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, let's merge it
This reverts commit d695638.
Context
In async Rust without using
ractor
, annotating async fns with#[tracing::instrument]
will cause an appropriate nesting hierarchy to be created when calling async fns from other async fns. This preserves the callstack and is very useful when leveraging other crates in the tracing ecosystem. When usingractor
, this automatic span nesting is lost since usingcall
andcast
cause the current span to not be propagated. It is possible to manually propagate the span by having atracing::Span
field in your custom message struct for the particular Actor you're impling. However, if you have many enum variants of your message and wish to propagate the span in all of them, it is cumbersome. Additionally, you must manually instrument any futures you invoke within your message handler with the span from the message you're handling. This is error-prone.Solution
In this PR I add the automatic inclusion of the current Span when instantiating a
BoxedMessage
. Then when the handler is invoked, I instrument the future with that captured Span. This allows bothcast
andcall
to automatically benefit from tracing span propagation. The only thing a user ofractor
would need to do to benefit from this is annotate theirhandle
fn in theirimpl Actor for ActorStruct
logic with#[tracing::instrument]
. Even if the user did not do this, the current span could still be retrieved since we're propagating it anyways.Test Plan:
cargo bench -p ractor
on maincargo bench -p ractor
on these changes