docs: indexers benchmark by cristianmtr · Pull Request #4003 · jina-ai/serve · GitHub

docs: indexers benchmark #4003

Merged
cristianmtr merged 7 commits into master from docs-indexers-benchmark on Dec 1, 2021

Conversation

@cristianmtr (Contributor) commented Nov 26, 2021

TODO

  • find the right place in the docs. Maybe on the same page as Indexers. Then wait for "docs: refactor indexers page" #4002 to be merged
  • include table with results
  • include plots. As iframe? Can we include html files somehow?

@github-actions bot added the size/S and area/docs labels Nov 26, 2021
@cristianmtr requested a review from hanxiao November 26, 2021 16:12
@github-actions bot commented Nov 26, 2021

Latency summary

Current PR yields:

  • 🐢🐢 index QPS at 1195, delta to last 2 avg.: -11%
  • 🐢🐢 query QPS at 22, delta to last 2 avg.: -14%
  • 🐢🐢 dam extend QPS at 34515, delta to last 2 avg.: -13%
  • 🐢🐢 avg flow time within 1.1538 seconds, delta to last 2 avg.: +1%
  • 😶 import jina within 0.4128 seconds, delta to last 2 avg.: +3%

Breakdown

| Version | Index QPS | Query QPS | DAM Extend QPS | Avg Flow Time (s) | Import Time (s) |
|---|---|---|---|---|---|
| current | 1195 | 22 | 34515 | 1.1538 | 0.4128 |
| 2.5.0 | 1544 | 30 | 47720 | 1.0776 | 0.3579 |
| 2.4.10 | 1158 | 20 | 31807 | 1.2048 | 0.441 |

Backed by latency-tracking. Further commits will update this comment.
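
For reference, the "delta to last 2 avg." figures in this summary can be approximately recomputed from the breakdown values. The sketch below assumes the delta is the relative change of the current value against the mean of the two previous releases; `delta_to_avg` is a hypothetical helper, not the latency-tracking implementation. Because the published values are rounded, a recomputed delta may differ from the reported one by a point or two.

```python
from statistics import mean


def delta_to_avg(current: float, previous: list[float]) -> float:
    """Relative change of `current` against the mean of `previous`."""
    baseline = mean(previous)
    return (current - baseline) / baseline


# Values copied from the breakdown (current vs. 2.5.0 and 2.4.10).
index_delta = delta_to_avg(1195, [1544, 1158])
flow_delta = delta_to_avg(1.1538, [1.0776, 1.2048])

print(f"index QPS delta: {index_delta:+.1%}")
print(f"avg flow time delta: {flow_delta:+.1%}")
```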

@hanxiao (Member) commented Nov 30, 2021

let me restructure the TOC a bit

@hanxiao (Member) commented Nov 30, 2021

nested
image

@davidbp (Contributor) left a comment

When you talk about 1k query vectors, it is not clear whether this is done as a batch (one function call that processes 1k query vectors) or whether the query vectors are processed one at a time. If I were a user, I would probably want to know:

  • What is the expected time between calling .search and getting a result for a single query?
  • What query throughput can I expect using X resources in a given amount of time?

The document seems to answer the first question, since it states: "Then we search with the respective search set, using a batch size of `1`, to mimic single query operations." The second question is less relevant but still interesting for users with really high traffic.
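
The two questions above correspond to two different measurements. A minimal sketch, assuming a hypothetical `search(queries)` callable that accepts a list of query vectors (here a pure-Python brute-force stand-in, not a Jina indexer): per-query latency is measured with a batch size of 1, and throughput by timing the whole query set sent in larger batches.

```python
import random
import time


def make_bruteforce_search(index):
    """Return a stand-in search function over an in-memory list of vectors."""
    def search(queries):
        results = []
        for q in queries:
            # Nearest neighbour by squared Euclidean distance.
            best = min(
                range(len(index)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(index[i], q)),
            )
            results.append(best)
        return results
    return search


def single_query_latencies(search, queries):
    """Latency in seconds of each query when sent alone (batch size 1)."""
    latencies = []
    for q in queries:
        start = time.perf_counter()
        search([q])
        latencies.append(time.perf_counter() - start)
    return latencies


def throughput_qps(search, queries, batch_size):
    """Queries per second when the query set is sent in batches."""
    start = time.perf_counter()
    for i in range(0, len(queries), batch_size):
        search(queries[i:i + batch_size])
    elapsed = time.perf_counter() - start
    return len(queries) / elapsed


# Made-up data: 500 indexed vectors and 50 queries of dimension 8.
random.seed(0)
dim = 8
index = [[random.random() for _ in range(dim)] for _ in range(500)]
queries = [[random.random() for _ in range(dim)] for _ in range(50)]
search = make_bruteforce_search(index)

latencies = single_query_latencies(search, queries)
qps = throughput_qps(search, queries, batch_size=10)
print(f"mean single-query latency: {sum(latencies) / len(latencies):.6f} s")
print(f"throughput at batch size 10: {qps:.1f} queries/s")
```

Reporting both numbers side by side would answer the single-query question and the high-traffic question separately.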

@maximilianwerk (Member) left a comment

I like it so far. Obviously the results data is needed.

In the final version, I'd put the results at the top and the methodology after them. Most views will be for the results, so they should be at the top and the most easily accessible.

@cristianmtr (Contributor, Author) commented

@maximilianwerk moved results to top.

@hanxiao @davidbp added results now too. Can you check again?

@cristianmtr marked this pull request as ready for review November 30, 2021 11:29

@cristianmtr (Contributor, Author) commented

Not sure why the deployment doesn't include the subpage. Locally it is there

@davidbp previously approved these changes Nov 30, 2021

@davidbp (Contributor) left a comment

I see now the benchmarks.

@cristianmtr (Contributor, Author) commented

> I see now the benchmarks.

In the netlify deployment? I still don't. Tried a hard refresh but nothing.

@maximilianwerk (Member) commented

> > I see now the benchmarks.
>
> In the netlify deployment? I still don't. Tried a hard refresh but nothing.

me neither. weird.

@maximilianwerk (Member) left a comment

I find it confusing that the x-axis scale varies across the plots. This makes it much harder to draw quick conclusions. Having a changing y-axis is OK. Opinions?

@cristianmtr (Contributor, Author) commented

> I find it confusing that the x-axis scale varies across the plots. This makes it much harder to draw quick conclusions. Having a changing y-axis is OK. Opinions?

I'd need to recreate them and the code from @Hippopotamus0308 is not yet in the benchmarks repo. We can either wait and recreate them with fixed axes, or merge it as it is now.

@Hippopotamus0308 (Contributor) commented

> > I find it confusing that the x-axis scale varies across the plots. This makes it much harder to draw quick conclusions. Having a changing y-axis is OK. Opinions?
>
> I'd need to recreate them and the code from @Hippopotamus0308 is not yet in the benchmarks repo. We can either wait and recreate them with fixed axes, or merge it as it is now.

@cristianmtr I've added it in the PR at https://github.com/jina-ai/benchmark_indexers/pull/33. Please check it.

@cristianmtr (Contributor, Author) commented

@maximilianwerk

I tried to render them with a fixed x_range, but not all of them are readable.

ex.

image
image
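
One way to see why a fixed x_range hurts readability: with a shared range, a series whose values span only a small part of that range is squeezed into a sliver of the plot. The sketch below is plotting-library-agnostic; the series names and values are made up for illustration, and `shared_range` and `occupied_fraction` are hypothetical helpers, not code from the benchmarks repo.

```python
def shared_range(series, padding=0.05):
    """Common (low, high) x-range covering every series, with relative padding."""
    low = min(min(xs) for xs in series.values())
    high = max(max(xs) for xs in series.values())
    pad = (high - low) * padding
    return low - pad, high + pad


def occupied_fraction(xs, x_range):
    """Fraction of the axis width that one series actually spans."""
    low, high = x_range
    return (max(xs) - min(xs)) / (high - low)


# Made-up x values for two plots with very different scales.
series = {
    "plot_a": [0.5, 1.0, 2.0],
    "plot_b": [50.0, 200.0, 900.0],
}

x_range = shared_range(series)
for name, xs in series.items():
    print(f"{name}: uses {occupied_fraction(xs, x_range):.1%} of the shared x-axis")
```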

@maximilianwerk (Member) commented

> @maximilianwerk
>
> I tried to render them with a fixed x_range, but not all of them are readable.
>
> ex.

OK, then don't do that and keep them as they are :)

@maximilianwerk (Member) left a comment

apart from this and the precision/recall, I am good with merging.

@maximilianwerk (Member) left a comment

LGTM

@github-actions bot commented Dec 1, 2021

📝 Docs are deployed on https://docs-indexers-benchmark--jina-docs.netlify.app 🎉

@cristianmtr merged commit faaacf1 into master Dec 1, 2021
@cristianmtr deleted the docs-indexers-benchmark branch December 1, 2021 14:21
Labels: area/docs, size/S, size/XL
5 participants