-
Notifications
You must be signed in to change notification settings - Fork 2.2k
feat: unify the fields for sparse and dense array #2578
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This PR closes: #2438 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done |
65fafbe
to
c36eeef
Compare
@nan-wang Something new is happening. |
Codecov Report
@@ Coverage Diff @@
## master #2578 +/- ##
==========================================
+ Coverage 86.58% 89.37% +2.79%
==========================================
Files 148 146 -2
Lines 8943 8969 +26
==========================================
+ Hits 7743 8016 +273
+ Misses 1200 953 -247
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
@nan-wang should be ready |
@bwanglzu Have you rerun the |
@nan-wang I ran the docker command as was specified in |
@@ -253,7 +253,7 @@ def test_jina_document_to_pydantic_document_sparse(): | |||
== pydantic_doc.embedding.sparse.values.buffer.decode() | |||
) | |||
assert ( | |||
jina_doc['embedding']['sparse']['values']['shape'] | |||
list(map(int, jina_doc['embedding']['sparse']['values']['shape'])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JoanFM after type changed, the shape
is str, while the shape passed to pandantic is int
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why shape
is in str?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, why is that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the shape in proto is a repeated int64
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per proto3 JSON spec, uint64/int64 fields should be printed as decimal strings. See:
https://developers.google.com/protocol-buffers/docs/proto3#json
The reason is that uint64/int64 is not part of JSON spec and many JSON libraries only support double precision. To prevent precision loss our proto3 JSON spec requires serializers to put int64/uint64 values in strings.
checkout here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another option is to use int32
to represent the shape
, how do you think @nan-wang ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uint32 it is
Latency summaryCurrent PR yields:
Breakdown
Backed by latency-tracking. Further commits will update this comment. |
unified all |
resolved #2438