8000 fix: close loop from run_async by jacobowitz · Pull Request #4734 · jina-ai/serve · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

fix: close loop from run_async #4734

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
May 5, 2022
Merged

fix: close loop from run_async #4734

merged 9 commits into from
May 5, 2022

Conversation

jacobowitz
Copy link
Contributor
@jacobowitz jacobowitz commented May 2, 2022

We are leaking asyncio loops in run_ascynio. We use get_or_reuse_loop() there to create a new asyncio loop in each call. This loop is then used in run_until_complete. This completes the coroutine, but does NOT close the loop.

To fix the issue this PR now gets every existing loop. Not only the running loop as before. This will correctly reuse previously created loops. If there is no loop or of it is closed already, we create a new loop.

I also add psutil as a new dependency for testing. I could also do its purpose manually, but it seems more convenient to use this library here and I dont think we need to be too restrictive for CI purposes?

Fixes: #4725

@jacobowitz jacobowitz requested a review from JoanFM May 2, 2022 08:43
@github-actions github-actions bot added size/S area/core This issue/PR affects the core codebase area/helper This issue/PR affects the helper functionality area/setup This issue/PR affects setting up Jina area/testing This issue/PR affects testing labels May 2, 2022
@jacobowitz jacobowitz closed this May 2, 2022
@jacobowitz jacobowitz reopened this May 2, 2022
jina/helper.py Outdated
return get_or_reuse_loop().run_until_complete(func(*args, **kwargs))
loop = get_or_reuse_loop()
result = loop.run_until_complete(func(*args, **kwargs))
loop.close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can only close a loop, if we created it. If it was already running, we shouldn't.

Copy link
Contributor

we should use some context manager here

Copy link
Contributor Author
@jacobowitz jacobowitz May 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could, its a bit tricky though. The underlying design issue is that the caller does not necessarily own the loop.
We could potentially make the new get_or_reuse_loop_with_info return a context manager instead of the tuple?

Copy link
Contributor Author
@jacobowitz jacobowitz May 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I implemented this now in the helper and used here

@github-actions
Copy link
github-actions bot commented May 2, 2022

Latency summary

Current PR yields:

  • 🐢🐢 index QPS at 1085, delta to last 2 avg.: -13%
  • 🐢🐢 query QPS at 55, delta to last 2 avg.: -19%
  • 🐎🐎🐎🐎 avg flow time within 1.9819 seconds, delta to last 2 avg.: +15%
  • 🐎🐎🐎🐎 import jina within 0.5928 seconds, delta to last 2 avg.: +19%

Breakdown

Version Index QPS Query QPS Avg Flow Time (s) Import Time (s)
current 1085 55 1.9819 0.5928
3.3.24 1142 56 1.9302 0.549
3.3.23 1372 80 1.4976 0.4393

Backed by latency-tracking. Further commits will update this comment.

@codecov
Copy link
codecov bot commented May 2, 2022

Codecov Report

Merging #4734 (5e789f9) into master (944afbc) will decrease coverage by 0.06%.
The diff coverage is 75.00%.

@@            Coverage Diff             @@
##           master    #4734      +/-   ##
==========================================
- Coverage   87.68%   87.62%   -0.07%     
==========================================
  Files         119      119              
  Lines        8632     8670      +38     
==========================================
+ Hits         7569     7597      +28     
- Misses       1063     1073      +10     
Flag Coverage Δ
jina 87.62% <75.00%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
jina/helper.py 81.72% <75.00%> (-0.06%) ⬇️
...deployments/config/k8slib/kubernetes_deployment.py 52.08% <0.00%> (-6.25%) ⬇️
...a/orchestrate/deployments/config/docker_compose.py 99.00% <0.00%> (-1.00%) ⬇️
jina/orchestrate/flow/base.py 89.57% <0.00%> (-0.61%) ⬇️
jina/logging/logger.py 90.90% <0.00%> (+1.58%) ⬆️
jina/orchestrate/deployments/config/helper.py 98.24% <0.00%> (+3.50%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 944afbc...5e789f9. Read the comment docs.

jina/helper.py Outdated
return get_or_reuse_loop().run_until_complete(func(*args, **kwargs))
loop = get_or_reuse_loop()
result = loop.run_until_complete(func(*args, **kwargs))
loop.close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should use some context manager here

@jacobowitz jacobowitz marked this pull request as draft May 2, 2022 09:01
JoanFM
JoanFM previously requested changes May 2, 2022
@jacobowitz
Copy link
Contributor Author

After discussion with @deepankarm we came up with a better solution which fixes the underlying issue: Not reusing existing loops. Lets see if it shows any undesired side effects in the tests. If not, lets use this solution.

@jacobowitz jacobowitz force-pushed the fix-event-loop-leaking branch from 8f00d50 to ede303d Compare May 2, 2022 15:19
@jacobowitz
Copy link
Contributor Author

actually there are some cases where we need to create a loop. We should not try to retrieve a running loop though. Any non closed loop is fine.
Also rebased against master

try:
loop = asyncio.get_running_loop()
loop = asyncio.get_event_loop()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the relevant change. We should use any loop (unless closed as checked in the next line).

@jacobowitz
Copy link
Contributor Author

I reset the loop now on forking due to this upstream issue

@jacobowitz jacobowitz marked this pull request as ready for review May 5, 2022 13:52
@jacobowitz jacobowitz requested review from JoanFM and deepankarm May 5, 2022 14:04
@jacobowitz jacobowitz dismissed JoanFM’s stale review May 5, 2022 14:35

Comment is outdated

@jacobowitz jacobowitz merged commit c1f0ae5 into master May 5, 2022
@jacobowitz jacobowitz deleted the fix-event-loop-leaking branch May 5, 2022 15:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/core This issue/PR affects the core codebase area/helper This issue/PR affects the helper functionality area/setup This issue/PR affects setting up Jina area/testing This issue/PR affects testing component/resource size/S
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Client.encode() does not close file descriptors used.
4 participants
0