8000 fix(core): fix request/response missmatch while prefetching by jacobowitz · Pull Request #2843 · jina-ai/serve · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

fix(core): fix request/response missmatch while prefetching #2843

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 19 commits into from
Jul 7, 2021

Conversation

jacobowitz
Copy link
Contributor
@jacobowitz jacobowitz commented Jul 2, 2021

This fixes the issue surfaced here: #2837

I added a message buffer during prefetching, which is used for matching the incoming responses from the flow to the correct requests from the clients. The clients are still waiting for their messages to be received from the Flow, which is done by waiting for the futures in the message buffer.
As prefetching now is stateful, I removed the mixin design of the prefetcher.

I also added a very simple unit test. The test sends an unexpected response (simulating other clients running concurrently) before sending the expected response. In current master this test fails, as we would incorrectly return the unexpected message. With the new functionality this message is dropped and the correct message is returned later.

@jina-bot jina-bot added size/S area/testing This issue/PR affects testing labels Jul 2, 2021
@codecov
Copy link
codecov bot commented Jul 2, 2021

Codecov Report

Merging #2843 (6d90e34) into master (c9ee572) will increase coverage by 1.38%.
The diff coverage is 97.64%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2843      +/-   ##
==========================================
+ Coverage   86.51%   87.89%   +1.38%     
==========================================
  Files         138      138              
  Lines        9365     9402      +37     
==========================================
+ Hits         8102     8264     +162     
+ Misses       1263     1138     -125     
Flag Coverage Δ
daemon 43.51% <77.64%> (-0.27%) ⬇️
jina 87.81% <97.64%> (+1.35%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
jina/peapods/runtimes/zmq/base.py 100.00% <ø> (ø)
jina/peapods/runtimes/gateway/prefetch.py 93.05% <94.28%> (-1.07%) ⬇️
jina/peapods/peas/__init__.py 95.59% <100.00%> (+2.69%) ⬆️
jina/peapods/runtimes/container/__init__.py 85.18% <100.00%> (+1.02%) ⬆️
jina/peapods/runtimes/gateway/grpc/__init__.py 100.00% <100.00%> (ø)
jina/peapods/runtimes/gateway/http/app.py 93.15% <100.00%> (+0.09%) ⬆️
jina/peapods/runtimes/gateway/websocket/app.py 92.50% <100.00%> (+0.19%) ⬆️
jina/peapods/runtimes/jinad/__init__.py 95.58% <100.00%> (+47.31%) ⬆️
jina/peapods/runtimes/zmq/asyncio.py 92.30% <100.00%> (+2.56%) ⬆️
jina/peapods/runtimes/zmq/zed.py 92.44% <100.00%> (ø)
... and 23 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 34c17cd...6d90e34. Read the comment docs.

@jina-bot jina-bot added area/core This issue/PR affects the core codebase area/network This issue/PR affects network functionality component/peapod labels Jul 3, 2021
@jacobowitz jacobowitz changed the title fix(core): add unit test for concurrent prefetching fix(core): fix request/response missmatch while prefetching Jul 3, 2021
@jina-bot jina-bot added size/M and removed size/S labels Jul 5, 2021
@jacobowitz jacobowitz force-pushed the fix-concurrent-client-requests branch from de2e96a to 37da579 Compare July 5, 2021 12:22
@jacobowitz jacobowitz force-pushed the fix-concurrent-client-requests branch from 37da579 to 066880a Compare July 5, 2021 12:32
@jacobowitz jacobowitz marked this pull request as ready for review July 5, 2021 12:42
@jacobowitz jacobowitz requested a review from a team as a code owner July 5, 2021 12:42
@jacobowitz jacobowitz requested review from davidbp, JoanFM and maximilianwerk and removed request for davidbp July 5, 2021 12:42
Copy link
Member
@maximilianwerk maximilianwerk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to see the original test from @numb3r3 included in our CI/CD pipeline.

@jacobowitz
Copy link
Contributor Author

@jacobowitz
Copy link
Contributor Author

@CatStark CatStark added the area/jinaD Related JinaD label Jul 6, 2021
@numb3r3
Copy link
Member
numb3r3 commented Jul 7, 2021

Storm test works now: https://github.com/jina-ai/jina/pull/2843/checks?check_run_id=2997162540
It failed before: https://github.com/jina-ai/jina/pull/2837/checks?check_run_id=2986555682

@jacobowitz wired, some errors are not captured by pytest, the following is the error log for grpc protocol
Screen Shot 2021-07-07 at 2 30 05 PM

or maybe, this error log is normal one we expected. I think you should double check them.

@jacobowitz
Copy link
Contributor Author

Storm test works now: https://github.com/jina-ai/jina/pull/2843/checks?check_run_id=2997162540
It failed before: https://github.com/jina-ai/jina/pull/2837/checks?check_run_id=2986555682

@jacobowitz wired, some errors are not captured by pytest, the following is the error log for grpc protocol
Screen Shot 2021-07-07 at 2 30 05 PM

or maybe, this error log is normal one we expected. I think you should double check them.

@numb3r3
I think this error is a multi threading issue of the GRPC client, maybe one could work around it in the test somehow but I dont think its worth it: grpc/grpc#25364

@JoanFM
Copy link
Contributor
JoanFM commented Jul 7, 2021

Storm test works now: https://github.com/jina-ai/jina/pull/2843/checks?check_run_id=2997162540
It failed before: https://github.com/jina-ai/jina/pull/2837/checks?check_run_id=2986555682

@jacobowitz wired, some errors are not captured by pytest, the following is the error log for grpc protocol
Screen Shot 2021-07-07 at 2 30 05 PM
or maybe, this error log is normal one we expected. I think you should double check them.

@numb3r3
I think this error is a multi threading issue of the GRPC client, maybe one could work around it in the test somehow but I dont think its worth it: grpc/grpc#25364

It is indeed an upstream error seen in many tests. Would not pay attention in this context

@jacobowitz jacobowitz dismissed stale reviews from deepankarm and maximilianwerk July 7, 2021 09:40

addressed all comments

Copy link
Member
@maximilianwerk maximilianwerk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I wondered, whether the PrefetchCaller should be a parent class of the GRPCPrefectchCall, but I think I like the current implementation more.

@jacobowitz jacobowitz merged commit 7cbf37e into master Jul 7, 2021
@jacobowitz jacobowitz deleted the fix-concurrent-client-requests branch July 7, 2021 10:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/core This issue/PR affects the core codebase area/jinaD Related JinaD area/network This issue/PR affects network functionality area/testing This issue/PR affects testing component/peapod size/M
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants
0