8000 feat: unify the fields for sparse and dense array by nan-wang · Pull Request #2578 · jina-ai/serve · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

feat: unify the fields for sparse and dense array #2578

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Jun 15, 2021

Conversation

nan-wang
Copy link
Member
@nan-wang nan-wang commented Jun 7, 2021

resolved #2438

@nan-wang nan-wang requested a review from a team as a code owner June 7, 2021 16:01
@nan-wang nan-wang requested review from imsergiy and alanthssss June 7, 2021 16:01
@github-actions
Copy link
github-actions bot commented Jun 7, 2021

This PR closes: #2438

@jina-bot jina-bot added size/S area/core This issue/PR affects the core codebase area/network This issue/PR affects network functionality component/proto labels Jun 7, 2021
alanthssss
alanthssss previously approved these changes Jun 9, 2021
Copy link
Contributor
@alanthssss alanthssss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


@nan-wang plz pull the latest change on the master to the feature branch

@alanthssss
Copy link
Contributor

LGTM

@nan-wang plz pull the latest change on the master to the feature branch

Done

@alanthssss alanthssss force-pushed the feat-unify-sparse-array-2438 branch from 65fafbe to c36eeef Compare June 9, 2021 08:30
@alanthssss
Copy link
Contributor

@nan-wang Something new is happening.

@bwanglzu
Copy link
Member

this change will affect some of our core code:

9F83E460-8368-44BC-BFAC-4E0604F16668

@bwanglzu bwanglzu removed the request for review from imsergiy June 12, 2021 17:14
@jina-bot jina-bot added area/testing This issue/PR affects testing component/type labels Jun 12, 2021
@codecov
Copy link
codecov bot commented Jun 12, 2021

Codecov Report

Merging #2578 (8ccb8d5) into master (6346e84) will increase coverage by 2.79%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2578      +/-   ##
==========================================
+ Coverage   86.58%   89.37%   +2.79%     
==========================================
  Files         148      146       -2     
  Lines        8943     8969      +26     
==========================================
+ Hits         7743     8016     +273     
+ Misses       1200      953     -247     
Flag Coverage Δ
daemon 48.31% <40.00%> (-0.12%) ⬇️
jina 89.59% <100.00%> (+2.99%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
jina/proto/jina_pb2.py 100.00% <ø> (ø)
jina/__init__.py 70.76% <100.00%> (ø)
jina/types/ndarray/sparse/__init__.py 100.00% <100.00%> (ø)
jina/parsers/__init__.py 100.00% <0.00%> (ø)
jina/jaml/parsers/executor/legacy.py 91.93% <0.00%> (ø)
jina/helloworld/multimodal/my_executors.py 89.64% <0.00%> (ø)
jina/parsers/hub/__init__.py
jina/parsers/hub/list.py
jina/parsers/hub/new.py
... and 19 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6346e84...8ccb8d5. Read the comment docs.

@jina-bot jina-bot added size/M and removed size/S labels Jun 12, 2021
@bwanglzu bwanglzu marked this pull request as draft June 12, 2021 19:32
@bwanglzu bwanglzu marked this pull request as ready for review June 13, 2021 11:48
@bwanglzu
Copy link
Member

@nan-wang should be ready

0969397110
0969397110 previously approved these changes Jun 13, 2021
@nan-wang
Copy link
Member Author

@bwanglzu Have you rerun the docker run -v $(pwd)/jina/proto:/jina/proto jinaai/protogen and bumped the __proto_version__ in jina/__init__.py?

@bwanglzu
Copy link
Member
bwanglzu commented Jun 14, 2021

@nan-wang I ran the docker command as was specified in build-proto.sh, I'm using mac, do I need to manually update the version number __proto_version__?

@nan-wang nan-wang requested a review from hanxiao as a code owner June 14, 2021 15:51
@jina-bot jina-bot added area/entrypoint This issue/PR affects the entrypoint codebase area/helper This issue/PR affects the helper functionality labels Jun 14, 2021
@@ -253,7 +253,7 @@ def test_jina_document_to_pydantic_document_sparse():
== pydantic_doc.embedding.sparse.values.buffer.decode()
)
assert (
jina_doc['embedding']['sparse']['values']['shape']
list(map(int, jina_doc['embedding']['sparse']['values']['shape']))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this needed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JoanFM after type changed, the shape is str, while the shape passed to pandantic is int

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why shape is in str?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, why is that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the shape in proto is a repeated int64

Copy link
Member
@bwanglzu bwanglzu Jun 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JoanFM @hanxiao

As per proto3 JSON spec, uint64/int64 fields should be printed as decimal strings. See:
https://developers.google.com/protocol-buffers/docs/proto3#json
The reason is that uint64/int64 is not part of JSON spec and many JSON libraries only support double precision. To prevent precision loss our proto3 JSON spec requires serializers to put int64/uint64 values in strings.

checkout here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another option is to use int32 to represent the shape, how do you think @nan-wang ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uint32 it is

@bwanglzu bwanglzu self-assigned this Jun 15, 2021
@github-actions
Copy link

Latency summary

Current PR yields:

  • 😶 index QPS at 1598, delta to last 1 avg.: -1%
  • 😶 query QPS at 69, delta to last 1 avg.: +0%

Breakdown

Version Index QPS Query QPS
current 1598 69
2.0.0rc6 1615 69

Backed by latency-tracking. Further commits will update this comment.

@bwanglzu
Copy link
Member

unified all shape to uint32

@hanxiao hanxiao merged commit 4419d07 into master Jun 15, 2021
@hanxiao hanxiao deleted the feat-unify-sparse-array-2438 branch June 15, 2021 16:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/core This issue/PR affects the core codebase area/entrypoint This issue/PR affects the entrypoint codebase area/helper This issue/PR affects the helper functionality area/network This issue/PR affects network functionality area/testing This issue/PR affects testing component/proto component/type size/M
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unify the naming and type for the shape field in DenseNdArray and SparseNdArray
7 participants
0