8000 Handle disconnects better by grcevski · Pull Request #1851 · grafana/beyla · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Handle disconnects better #1851

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 22, 2025
Merged

Conversation

grcevski
Copy link
Contributor

For a while we've had the logic that completes artificially requests that have been left in limbo as timeouts with code 408. This was initially done to catch Beyla bugs, but in production systems it can cause confusion. With this change I'm doing the following:

  1. I'm changing the default behaviour for this 408s to be optional, that is we can enable it specifically for customers if there are suspicions that there are missing requests, or customers can enable automatic timeouts if they want to. Us forcing a value will always have negative side-effects. One example is slow APIs, we'd time them out instead of reporting the actual time. Another example is disconnects, which could be immediate and we'd report 30seconds timeout, which incorrectly increases the latency numbers.
  2. I've added code to detect disconnects, e.g. a request has started and then the client dropped the connection. In this case previously we'd report 408 timeout, now we report 400 (bad request) immediately as the tcp connection drops. I'm not sure if there's a better code to report here, I know NGINX uses 499 sometimes, but it's non-standard afaik.

@grcevski grcevski requested review from grafsean and a team as code owners April 21, 2025 19:34
@grcevski
Copy link
Contributor Author

I was able to reproduce the bad behaviour with wrk2 running against SSL ngix with a Ruby upstream. Once wkr2 stops, it simply drops the connection threads causing incomplete transactions.

Copy link
codecov bot commented Apr 21, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 73.90%. Comparing base (c22f792) to head (c944b7b).
Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1851      +/-   ##
==========================================
+ Coverage   73.89%   73.90%   +0.01%     
==========================================
  Files         177      177              
  Lines       19225    19225              
==========================================
+ Hits        14207    14209       +2     
+ Misses       4287     4285       -2     
  Partials      731      731              
Flag Coverage Δ
integration-test 57.50% <ø> (-0.15%) ⬇️
k8s-integration-test 55.03% <ø> (+<0.01%) ⬆️
oats-test 35.53% <ø> (ø)
unittests 46.03% <ø> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor
@mariomac mariomac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@grcevski grcevski merged commit 31bb0e3 into grafana:main Apr 22, 2025
16 checks passed
@grcevski grcevski deleted the handle_disconnects_better branch April 22, 2025 14:24
@grcevski grcevski added the port-to-otel-ebpf-inst PRs that need to be ported to otel-ebpf-instrumentation label Apr 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
port-to-otel-ebpf-inst PRs that need to be ported to otel-ebpf-instrumentation
Projects
< 41E5 div aria-live="polite">
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0