8000 feat: make docs behavior consistent :boom: by maximilianwerk · Pull Request #3227 · jina-ai/serve · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

feat: make docs behavior consistent 💥 #3227

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Aug 24, 2021

Conversation

maximilianwerk
Copy link
Member
@maximilianwerk maximilianwerk commented Aug 20, 2021

The expected behavior from Executor side should be:

if Documents are returned:
    forward them to next Pod
else:
    forward whatever the docs object contains after Executor execution to next Pod

Subtasks

  • repair all tests (this will test the new behavior sufficiently!)
  • adopt cookbooks

@maximilianwerk maximilianwerk requested a review from a team as a code owner August 20, 2021 07:43
@github-actions github-actions bot added size/S area/core This issue/PR affects the core codebase area/network This issue/PR affects network functionality component/peapod labels Aug 20, 2021
@maximilianwerk maximilianwerk changed the title feat: make docs behavior consistent feat: make docs behavior consistent 💥 Aug 20, 2021
@maximilianwerk maximilianwerk force-pushed the feat-consistent-executor-behavior branch from 0357f56 to 2e01ef9 Compare August 20, 2021 07:45
@codecov
Copy link
codecov bot commented Aug 20, 2021

Codecov Report

Merging #3227 (23b3f6a) into master (d8d8070) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #3227   +/-   ##
=======================================
  Coverage   89.11%   89.12%           
=======================================
  Files         148      148           
  Lines       10392    10398    +6     
=======================================
+ Hits         9261     9267    +6     
  Misses       1131     1131           
Flag Coverage Δ
daemon 44.90% <30.00%> (-0.02%) ⬇️
jina 88.94% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
jina/clients/mixin.py 100.00% <ø> (ø)
jina/clients/request/__init__.py 100.00% <ø> (ø)
jina/clients/request/asyncio.py 88.88% <ø> (ø)
jina/peapods/pods/compound.py 90.34% <100.00%> (-0.07%) ⬇️
.../runtimes/request_handlers/data_request_handler.py 90.66% <100.00%> (+0.96%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d8d8070...23b3f6a. Read the comment docs.

@github-actions
Copy link
github-actions bot commented Aug 20, 2021

Latency summary

Current PR yields:

  • 🐎🐎🐎🐎 index QPS at 1385, delta to last 2 avg.: +23%
  • 🐎🐎🐎🐎 query QPS at 60, delta to last 2 avg.: +24%
  • 🐎🐎🐎🐎 dam extend QPS at 64515, delta to last 2 avg.: +30%
  • 🐎🐎🐎🐎 avg flow time within 1.1413 seconds, delta to last 2 avg.: -34%
  • 🐢🐢 import jina within 0.3444 seconds, delta to last 2 avg.: -11%

Breakdown

Version Index QPS Query QPS DAM Extend QPS Avg Flow Time (s) Import Time (s)
current 1385 60 64515 1.1413 0.3444
2.0.20 1151 49 54296 1.7248 0.3979
2.0.19 1095 46 44210 1.776 0.3803

Backed by latency-tracking. Further commits will update this comment.

JoanFM
JoanFM previously requested changes Aug 20, 2021
Copy link
Contributor
@JoanFM JoanFM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May u add a test proving ur point? What is the problem trying to be solved here?

@maximilianwerk
Copy link
Member Author

May u add a test proving ur point? What is the problem trying to be solved here?

@jacobowitz can you add the test that shows your problem? Thanks!

@maximilianwerk
Copy link
Member Author

The failing test is actually quite interesting. Imagine:

Flow()
.add(name='foo').add(
    **external_args,
    name='external_fake',
    external=True,
    needs=['gateway', 'foo'],
)

Without a merging strategy, the second Pod will forward all Documents duplicated. Once from gateway and once from foo. I would argue, this is to-be-expected. Before, it would forward one RANDOM set of Documents. You don't even have deterministic behavior which one will be forwarded.

This makes the PR even more valid in my eyes.

Copy link
Contributor
@JoanFM JoanFM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure cookbook is updated and tests added

@maximilianwerk maximilianwerk marked this pull request as draft August 20, 2021 11:05
@maximilianwerk
Copy link
Member Author

Following example

from jina import Document, Executor, Flow, requests
import random
import time


class MyExecutor(Executor):
    def __init__(self, value, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.value = value

    @requests
    def any(self, docs, *args, **kwargs):
        for doc in docs:
            doc.text = self.value
        sleep_interval = random.random()
        print(sleep_interval)
        time.sleep(sleep_interval)


f = (
    Flow()
    .add(uses='MyExecutor', uses_with={'value': 'a'})
    .add(uses='MyExecutor', uses_with={'value': 'b'}, needs=['gateway'])
    .needs_all()
)

with f:
    result = f.post(on='/search', inputs=[Document(text='0')], return_results=True)
    print({doc.text for doc in result[0].data.docs})

produces randomly {'a'} or {'b'}. I would prefer {'a', 'b'}.

@github-actions github-actions bot added the area/testing This issue/PR affects testing label Aug 20, 2021
@maximilianwerk maximilianwerk force-pushed the feat-consistent-executor-behavior branch from 5b5d7a7 to eadd508 Compare August 20, 2021 14:10
@@ -119,7 +118,6 @@ def start(self) -> 'CompoundPod':
self.head_pea = Pea(head_args)
self._enter_pea(self.head_pea)
for _args in self.replicas_args:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is important to be kept. For the inner replica pod right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is kept. I just moved it to the place, where all args are set. Inside _set_replica_args. I prefer to have only one place in the code, where args are set.

@@ -132,18 +132,27 @@ def handle(
self.logger.debug(
f'skip executor: mismatch request, exec_endpoint: {msg.envelope.header.exec_endpoint}, requests: {self._executor.requests}'
)
if partial_requests:
DataRequestHandler.replace_docs(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here u should do nothing right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarified offline. For the record:

The reason I added this: When you do a Flow.add(name='1').add(name='2', needs['gateway', '1']) by default it uses the BaseExecutor in the HeadPea of 2. But the BaseExecutor has no endpoint defined. Anyhow, it should merge the the docs from both pathways to one big DocumentArray.

@maximilianwerk maximilianwerk force-pushed the feat-consistent-executor-behavior branch from eadd508 to 958f0db Compare August 23, 2021 07:33
@github-actions github-actions bot added size/M area/housekeeping This issue/PR is housekeeping component/client and removed size/S labels Aug 23, 2021
@maximilianwerk
Copy link
Member Author

Make sure cookbook is updated and tests added

Actually the Cookbook was already stating the new behavior. See https://github.com/jina-ai/jina/blob/master/.github/2.0/cookbooks/Flow.md#joiningmerging.

@maximilianwerk maximilianwerk force-pushed the feat-consistent-executor-behavior branch from 1be9136 to 23b3f6a Compare August 24, 2021 11:34
@maximilianwerk maximilianwerk marked this pull request as ready for review August 24, 2021 13:25
@maximilianwerk maximilianwerk dismissed JoanFM’s stale review August 24, 2021 13:26

tests are adopted

Copy link
Contributor
@jacobowitz jacobowitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, thank you very much for this improvement 👍

@maximilianwerk maximilianwerk merged commit e46d974 into master Aug 24, 2021
@maximilianwerk maximilianwerk deleted the feat-consistent-executor-behavior branch August 24, 2021 13:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/core This issue/PR affects the core codebase area/housekeeping This issue/PR is housekeeping area/network This issue/PR affects network functionality area/testing This issue/PR affects testing component/client component/peapod size/M
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0