Automatic `tracing::Span` propagation by texascloud · Pull Request #266 · slawlor/ractor · GitHub

Automatic tracing::Span propagation #266

Merged: 6 commits into slawlor:main on Oct 4, 2024

Conversation

@texascloud (Contributor) commented Oct 2, 2024

Context

In async Rust without `ractor`, annotating async fns with `#[tracing::instrument]` creates an appropriate nesting hierarchy when async fns call other async fns. This preserves the call stack and is very useful when leveraging other crates in the tracing ecosystem. When using `ractor`, this automatic span nesting is lost because `call` and `cast` do not propagate the current span. It is possible to propagate the span manually by adding a `tracing::Span` field to the custom message struct for the particular `Actor` you're implementing. However, if your message is an enum with many variants and you want to propagate the span in all of them, this is cumbersome. Additionally, you must manually instrument any futures you invoke within your message handler with the span from the message you're handling, which is error-prone.
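For illustration only, a minimal sketch of the manual approach described above; the `MyActor`/`MyMessage`/`do_work` names are hypothetical, and the exact attributes needed on the `Actor` impl depend on your ractor version and features:

```rust
use ractor::{Actor, ActorProcessingErr, ActorRef};
use tracing::Instrument;

// Every variant must carry the caller's span by hand.
enum MyMessage {
    DoWork { payload: String, span: tracing::Span },
}

struct MyActor;

#[cfg_attr(feature = "async-trait", ractor::async_trait)]
impl Actor for MyActor {
    type Msg = MyMessage;
    type State = ();
    type Arguments = ();

    async fn pre_start(
        &self,
        _myself: ActorRef<Self::Msg>,
        _args: Self::Arguments,
    ) -> Result<Self::State, ActorProcessingErr> {
        Ok(())
    }

    async fn handle(
        &self,
        _myself: ActorRef<Self::Msg>,
        message: Self::Msg,
        _state: &mut Self::State,
    ) -> Result<(), ActorProcessingErr> {
        match message {
            MyMessage::DoWork { payload, span } => {
                // Any future awaited here must be instrumented manually with
                // the span carried in the message, or the caller's span
                // hierarchy is lost.
                do_work(payload).instrument(span).await;
            }
        }
        Ok(())
    }
}

#[tracing::instrument]
async fn do_work(payload: String) {
    tracing::info!("working");
}
```

The sender would also have to construct each message with `span: tracing::Span::current()`, which is exactly the per-variant boilerplate this PR removes.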

Solution

In this PR I automatically capture the current `Span` when a `BoxedMessage` is instantiated. Then, when the handler is invoked, I instrument the future with that captured `Span`. This allows both `cast` and `call` to benefit from tracing span propagation automatically. The only thing a user of `ractor` needs to do to benefit from this is annotate the `handle` fn in their `impl Actor for ActorStruct` logic with `#[tracing::instrument]`. Even if the user does not do this, the current span can still be retrieved, since we propagate it anyway.
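For illustration, a rough sketch of what the user-facing side looks like with this change; the `Worker` actor and message names are hypothetical, the span capture and future instrumentation happen inside ractor as described above, and the exact attributes on the `Actor` impl depend on your ractor version and features:

```rust
use ractor::{Actor, ActorProcessingErr, ActorRef};

struct Worker;

enum WorkerMsg {
    DoWork(String),
}

#[cfg_attr(feature = "async-trait", ractor::async_trait)]
impl Actor for Worker {
    type Msg = WorkerMsg;
    type State = ();
    type Arguments = ();

    async fn pre_start(
        &self,
        _myself: ActorRef<Self::Msg>,
        _args: Self::Arguments,
    ) -> Result<Self::State, ActorProcessingErr> {
        Ok(())
    }

    // With this change the handler future is already instrumented with the
    // span that was current at the cast/call site, so this span nests under
    // the caller's span without any manual plumbing.
    #[tracing::instrument(skip_all)]
    async fn handle(
        &self,
        _myself: ActorRef<Self::Msg>,
        message: Self::Msg,
        _state: &mut Self::State,
    ) -> Result<(), ActorProcessingErr> {
        match message {
            WorkerMsg::DoWork(payload) => tracing::info!(%payload, "handling"),
        }
        Ok(())
    }
}

// Caller side: no span field in the message and no manual `.instrument(...)`.
#[tracing::instrument]
async fn caller(worker: ActorRef<WorkerMsg>) {
    let _ = worker.cast(WorkerMsg::DoWork("hello".to_string()));
}
```

Even without the `#[tracing::instrument]` on `handle`, `tracing::Span::current()` inside the handler would reflect the propagated caller span, per the description above.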

Test Plan:
`cargo bench -p ractor` on main
```
     Running benches/actor.rs (target/release/deps/actor-fd9a9f57eedaa4d3)
Gnuplot not found, using plotters backend
Creation of 100 actors  time:   [266.39 µs 270.86 µs 275.36 µs]
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low severe
  5 (5.00%) low mild
  3 (3.00%) high mild

Creation of 10000 actors
                        time:   [31.654 ms 32.843 ms 34.145 ms]
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

Benchmarking Waiting on 100 actors to process first message: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.9s, enable flat sampling, or reduce sample count to 50.
Waiting on 100 actors to process first message
                        time:   [1.0713 ms 1.1400 ms 1.2095 ms]

Waiting on 1000 actors to process first message
                        time:   [12.655 ms 13.147 ms 13.635 ms]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) low mild

Waiting on 100000 messages to be processed
                        time:   [14.885 ms 15.123 ms 15.522 ms]
Found 14 outliers among 100 measurements (14.00%)
  5 (5.00%) low severe
  4 (4.00%) low mild
  2 (2.00%) high mild
  3 (3.00%) high severe

     Running benches/async_traits.rs (target/release/deps/async_traits-e1754e50a02bb4eb)
Gnuplot not found, using plotters backend
Waiting on 50 messages with large data in the Future to be processed
                        time:   [120.73 µs 122.43 µs 124.51 µs]
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe
```

`cargo bench -p ractor` on these changes
```
     Running benches/actor.rs (target/release/deps/actor-fd9a9f57eedaa4d3)
Gnuplot not found, using plotters backend
Creation of 100 actors  time:   [268.98 µs 274.91 µs 281.19 µs]
                        change: [+0.4473% +4.3969% +8.4713%] (p = 0.03 < 0.05)
                        Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low severe
  5 (5.00%) low mild
  2 (2.00%) high mild
  1 (1.00%) high severe

Creation of 10000 actors
                        time:   [26.730 ms 27.284 ms 27.868 ms]
                        change: [-20.488% -16.925% -13.452%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

Benchmarking Waiting on 100 actors to process first message: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.2s, enable flat sampling, or reduce sample count to 60.
Waiting on 100 actors to process first message
                        time:   [851.50 µs 884.69 µs 928.90 µs]
                        change: [-29.934% -26.442% -22.701%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  2 (2.00%) high severe

Waiting on 1000 actors to process first message
                        time:   [8.1509 ms 8.3325 ms 8.5340 ms]
                        change: [-39.237% -36.622% -33.672%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

Waiting on 100000 messages to be processed
                        time:   [17.252 ms 17.303 ms 17.358 ms]
                        change: [+11.411% +14.419% +16.342%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

     Running benches/async_traits.rs (target/release/deps/async_traits-e1754e50a02bb4eb)
Gnuplot not found, using plotters backend
Waiting on 50 messages with large data in the Future to be processed
                        time:   [124.63 µs 125.51 µs 126.45 µs]
                        change: [+2.6848% +3.7470% +4.7820%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
```
@texascloud changed the title to Automatic tracing::Span propagation on Oct 2, 2024
@texascloud (Contributor, Author):
I did some more benchmarks. I ran `cargo bench -p ractor` on main, on this PR, and on this PR with the spans not boxed. Each run is the second run, to allow for any "warming". I am ignoring criterion's "change" numbers because I noticed they fluctuated quite a bit even when I made no changes to the logic being benchmarked.

| Benchmark | main | pr_boxed | pr_unboxed |
|---|---|---|---|
| Creation of 100 actors | [279.44 µs 285.23 µs 290.89 µs] | [286.42 µs 296.90 µs 312.37 µs] | [344.79 µs 361.02 µs 381.86 µs] |
| Creation of 10000 actors | [26.891 ms 27.396 ms 27.937 ms] | [27.969 ms 29.289 ms 31.113 ms] | [31.933 ms 32.978 ms 34.170 ms] |
| Waiting on 100 actors to process first message | [753.90 µs 787.61 µs 842.21 µs] | [835.32 µs 858.61 µs 897.29 µs] | [889.10 µs 913.09 µs 938.30 µs] |
| Waiting on 1000 actors to process first message | [7.5806 ms 7.7571 ms 7.9715 ms] | [8.2428 ms 8.3540 ms 8.4691 ms] | [9.2888 ms 9.8780 ms 10.574 ms] |
| Waiting on 100000 messages to be processed | [12.822 ms 12.865 ms 12.916 ms] | [17.474 ms 17.527 ms 17.586 ms] | [15.876 ms 15.964 ms 16.110 ms] |
| Waiting on 50 messages with large data in the Future to be processed | [125.25 µs 126.03 µs 126.80 µs] | [119.46 µs 120.47 µs 121.67 µs] | [124.89 µs 126.23 µs 127.84 µs] |

codecov bot commented Oct 3, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.98%. Comparing base (3d6f629) to head (8a61dc1).
Report is 1 commit behind head on main.

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main     #266      +/-   ##
==========================================
+ Coverage   79.95%   79.98%   +0.02%
==========================================
  Files          60       60
  Lines       10273    10287      +14
==========================================
+ Hits         8214     8228      +14
  Misses       2059     2059
```


@slawlor (Owner) commented Oct 3, 2024

We do appear to take a ~17% hit in message-processing performance (the benchmark which runs consistently in the ~13 ms range degraded to ~15 ms).

As for memory:

  1. 197951488 bytes with the span propagated
  2. 197427200 bytes with no span (a value of `None`)

measured with `/usr/bin/time -l cargo run --example a_whole_lotta_messages -r` in both configurations (span set and not set).

@slawlor force-pushed the tracing_span_propagation branch from 47ac581 to ec66563 on October 3, 2024 21:00
@slawlor force-pushed the tracing_span_propagation branch 2 times, most recently from c1fc1ae to 7312f84 on October 3, 2024 23:39
@slawlor force-pushed the tracing_span_propagation branch from 7312f84 to ad9d53f on October 3, 2024 23:49
@slawlor (Owner) commented Oct 3, 2024

I'm disabling the async-std CI suite. There's a flaky test that I can't repro outside of GitHub's CI pipeline, and no one uses it anyway (since we have a hard dependency on tokio with at least the `sync` feature).

@slawlor (Owner) left a comment


Looks good, let's merge it

@slawlor merged commit d695638 into slawlor:main on Oct 4, 2024
11 checks passed
@slawlor mentioned this pull request on Dec 6, 2024
FancyQian added a commit to Righ-inc/ractor that referenced this pull request Dec 9, 2024