8000 Define msgpack MP_INT type behavior · Issue #10132 · tarantool/tarantool · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Define msgpack MP_INT type behavior #10132

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ImeevMA opened this issue Jun 14, 2024 · 5 comments
Closed

Define msgpack MP_INT type behavior #10132

ImeevMA opened this issue Jun 14, 2024 · 5 comments
Assignees
Labels
bug Something isn't working sql

Comments

@ImeevMA
Copy link
Collaborator
ImeevMA commented Jun 14, 2024

Currently Tarantool assumes in some places that MP_INT only contains negative integers, but this is different from the msgpack definition of MP_INT, where MP_INT can contain both negative and positive integers. It seems like we should either mention in the documentation that we don't support non-negative MP_INT values, or add appropriate support for them.

@oleg-jukovec
Copy link
Contributor
oleg-jukovec commented Jun 14, 2024

It seems like we should either mention in the documentation that we don't support non-negative MP_INT values

Then we will suddenly lose compatibility with MessagePack for no good reason. At least we could cast MP_INT > 0 -> MP_UINT at the encoding/decoding stage.

@sergos
Copy link
Contributor
sergos commented Jun 14, 2024

It seems like we should either mention in the documentation that we don't support non-negative MP_INT values

Then we will suddenly lose compatibility with MessagePack for no good reason. At least we could cast MP_INT > 0 -> MP_UINT at the encoding/decoding stage.

This will impose additional overhead on the precious TX, especially since it should be applied to all fields in the msgpack: one can insert a lot of data and thereafter try to introduce an index (the really broken part) on an arbitrary field.

So I would not name it as 'lost of compatibility' since it was not there from day one (or so) rather introduce a dedicated rules to use the storage in the only correct way.

@oleg-jukovec
Copy link
Contributor
oleg-jukovec commented Jun 14, 2024

This will impose additional overhead on the precious TX, especially since it should be applied to all fields in the msgpack: one can insert a lot of data and thereafter try to introduce an index (the really broken part) on an arbitrary field.

I didn't mean to check all incoming data and transform it, but just make an internal (from the Tarantool point of view) typecast MP_INT > 0 to MP_UINT in internal C structs on the decoding stage, if needed. This should help to avoid the problem and should not produce too much overhead.

So I would not name it as 'lost of compatibility' since it was not there from day one (or so) rather introduce a dedicated rules to use the storage in the only correct way.

But the compatibility is already here. A user can create tuples with fields MP_INT > 0. A user can insert the tuples into spaces. A user can select the tuples. But it just doesn’t work in some scenarios. It will be easier to fix these scenarios than to suggest patching a MessagePack library for each language/connector.

Anyway, if we want to make this rule, how will we enforce it without:

This will impose additional overhead

?

Allowing a user to insert tuples with MP_INT > 0, to work with the tuples, but throw an error with Debug build or return an invalid data with Release build in some cases don't look like a nice solution.

@ahmadhabibi14
Copy link

Any news for update ?

@Gerold103
Copy link
Collaborator

In scope of this ticket must be resolved some specific places, which rely on mp_decode_int() returning negative values:

  • mem_cmp_msgpack() saves mp_decode_int() result as MEM_TYPE_INT which is strictly negative. It breaks in many places, from what it looks like.
  • mem_from_mp_ephemeral() - same.
  • port_c_get_vdbemem() - same.

Gerold103 added a commit that referenced this issue Dec 18, 2024
mem_set_int() -> mem_set_int_with_sign()
mem_set_int64() -> mem_set_int()

The motivation is that recently MP_INT was noticed to be able to
have positive values. Which means there are going to be places
where 'int' really means just an 'int', not necessarily a negative
int.

Given those news, it looks a bit strange that mem_set_int() used
to require a negative integer, and adding the suffix 64 was
allowing to pass an int of any sign.

Lets rename those functions to emphasize which of them relies on
the sign being given explicitly.

Part of #10132

NO_DOC=refactoring
NO_CHANGELOG=refactoring
NO_TEST=already covered
Gerold103 added a commit that referenced this issue Dec 18, 2024
The SQL bind parameters of type MP_INT were not correctly assigned
when the MP_INT contained a positive number.

Positive MP_INT is a valid msgpack, which must be supported. Lets
check the value's sign when assigning it then.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later
Gerold103 added a commit that referenced this issue Dec 18, 2024
The function treated MP_INT as an always negative value. Mems
having a positive MP_INT + the by-design-negative MEM_TYPE_INT
type would always be considered <= 0 in comparison with raw
msgpack.

Lets check the MP_INT sign before assigning the mem type.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later
Gerold103 added a commit that referenced this issue Dec 18, 2024
The function is used to build a mem from a column. To use in
some operation, like a comparison or arithmetics.

It used to treat MP_INT as an always negative value, which
could result into those operations having invalid results or even
failing assertions in debug build.

Lets check the MP_INT sign before assigning the mem type.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later
Gerold103 added a commit that referenced this issue Dec 18, 2024
The function is used to build a mem from msgpack returned by a C
function via port_c. To use in some operation, like a comparison
or arithmetics.

It used to treat MP_INT as an always negative value, which
could result into those operations having invalid results or even
failing assertions in debug build.

Lets check the MP_INT sign before assigning the mem type.

Closes #10132

NO_DOC=bugfix
@Gerold103 Gerold103 added bug Something isn't working sql labels Feb 14, 2025
Gerold103 added a commit that referenced this issue Feb 14, 2025
mem_set_int() -> mem_set_int_with_sign()
mem_set_int64() -> mem_set_int()

The motivation is that recently MP_INT was noticed to be able to
have positive values. Which means there are going to be places
where 'int' really means just an 'int', not necessarily a negative
int.

Given those news, it looks a bit strange that mem_set_int() used
to require a negative integer, and adding the suffix 64 was
allowing to pass an int of any sign.

Lets rename those functions to emphasize which of them relies on
the sign being given explicitly.

Part of #10132

NO_DOC=refactoring
NO_CHANGELOG=refactoring
NO_TEST=already covered
Gerold103 added a commit that referenced this issue Feb 14, 2025
The SQL bind parameters of type MP_INT were not correctly assigned
when the MP_INT contained a positive number.

Positive MP_INT is a valid msgpack, which must be supported. Lets
check the value's sign when assigning it then.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later
Gerold103 added a commit that referenced this issue Feb 14, 2025
The function treated MP_INT as an always negative value. Mems
having a positive MP_INT + the by-design-negative MEM_TYPE_INT
type would always be considered <= 0 in comparison with raw
msgpack.

Lets check the MP_INT sign before assigning the mem type.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later
Gerold103 added a commit that referenced this issue Feb 14, 2025
The function is used to build a mem from a column. To use in
some operation, like a comparison or arithmetics.

It used to treat MP_INT as an always negative value, which
could result into those operations having invalid results or even
failing assertions in debug build.

Lets check the MP_INT sign before assigning the mem type.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later
Gerold103 added a commit that referenced this issue Feb 14, 2025
The function is used to build a mem from msgpack returned by a C
function via port_c. To use in some operation, like a comparison
or arithmetics.

It used to treat MP_INT as an always negative value, which
could result into those operations having invalid results or even
failing assertions in debug build.

Lets check the MP_INT sign before assigning the mem type.

Closes #10132

NO_DOC=bugfix
Gerold103 added a commit that referenced this issue Feb 18, 2025
mem_set_int() -> mem_set_int_with_sign()
mem_set_int64() -> mem_set_int()

The motivation is that recently MP_INT was noticed to be able to
have positive values. Which means there are going to be places
where 'int' really means just an 'int', not necessarily a negative
int.

Given those news, it looks a bit strange that mem_set_int() used
to require a negative integer, and adding the suffix 64 was
allowing to pass an int of any sign.

Lets rename those functions to emphasize which of them relies on
the sign being given explicitly.

Part of #10132

NO_DOC=refactoring
NO_CHANGELOG=refactoring
NO_TEST=already covered
Gerold103 added a commit that referenced this issue Feb 18, 2025
The SQL bind parameters of type MP_INT were not correctly assigned
when the MP_INT contained a positive number.

Positive MP_INT is a valid msgpack, which must be supported. Lets
check the value's sign when assigning it then.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later
Gerold103 added a commit that referenced this issue Feb 18, 2025
* mem_get_uint(), mem_get_uint_unsafe(), mem_get_bool(),
    mem_get_bin(), mem_len(), mem_get_bool_unsafe(),
    mem_len_unsafe() were unused.

* mem_get_int() was only used by mem_get_int_unsafe(), which made
    impossible to test if mem_get_int() returns the correct sign
    of the resulting number.

Lets drop the uint getter, and make the int getter not return the
sign at all then.

While being in scope of #10132, this patch is not really needed to
fix the bug. Was done while trying some approaches, and now it
looks useful on its own.

NO_DOC=bugfix
NO_CHANGELOG=later
NO_TEST=already covered
Gerold103 added a commit that referenced this issue Feb 18, 2025
The function treated MP_INT as an always negative value. Mems
having a positive MP_INT + the by-design-negative MEM_TYPE_INT
type would always be considered <= 0 in comparison with raw
msgpack.

Lets check the MP_INT sign before assigning the mem type.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later
Gerold103 added a commit that referenced this issue Feb 18, 2025
The function is used to build a mem from a column. To use in
some operation, like a comparison or arithmetics.

It used to treat MP_INT as an always negative value, which
could result into those operations having invalid results or even
failing assertions in debug build.

Lets check the MP_INT sign before assigning the mem type.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later
Gerold103 added a commit that referenced this issue Feb 18, 2025
The function is used to build a mem from msgpack returned by a C
function via port_c. To use in some operation, like a comparison
or arithmetics.

It used to treat MP_INT as an always negative value, which
could result into those operations having invalid results or even
failing assertions in debug build.

Lets check the MP_INT sign before assigning the mem type.

Closes #10132

NO_DOC=bugfix
Gerold103 added a commit that referenced this issue Feb 24, 2025
mem_set_int() -> mem_set_int_with_sign()
mem_set_int64() -> mem_set_int()

The motivation is that recently MP_INT was noticed to be able to
have positive values. Which means there are going to be places
where 'int' really means just an 'int', not necessarily a negative
int.

Given those news, it looks a bit strange that mem_set_int() used
to require a negative integer, and adding the suffix 64 was
allowing to pass an int of any sign.

Lets rename those functions to emphasize which of them relies on
the sign being given explicitly.

Part of #10132

NO_DOC=refactoring
NO_CHANGELOG=refactoring
NO_TEST=already covered
Gerold103 added a commit that referenced this issue Feb 24, 2025
The SQL bind parameters of type MP_INT were not correctly assigned
when the MP_INT contained a positive number.

Positive MP_INT is a valid msgpack, which must be supported. Lets
check the value's sign when assigning it then.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later
Gerold103 added a commit that referenced this issue Feb 24, 2025
The function treated MP_INT as an always negative value. Mems
having a positive MP_INT + the by-design-negative MEM_TYPE_INT
type would always be considered <= 0 in comparison with raw
msgpack.

Lets check the MP_INT sign before assigning the mem type.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later

(cherry picked from commit 16c6c26)
Gerold103 added a commit that referenced this issue Feb 24, 2025
The function is used to build a mem from a column. To use in
some operation, like a comparison or arithmetics.

It used to treat MP_INT as an always negative value, which
could result into those operations having invalid results or even
failing assertions in debug build.

Lets check the MP_INT sign before assigning the mem type.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later

(cherry picked from commit 67ed4cd)
Gerold103 added a commit that referenced this issue Feb 24, 2025
The function is used to build a mem from msgpack returned by a C
function via port_c. To use in some operation, like a comparison
or arithmetics.

It used to treat MP_INT as an always negative value, which
could result into those operations having invalid results or even
failing assertions in debug build.

Lets check the MP_INT sign before assigning the mem type.

Closes #10132

NO_DOC=bugfix

(cherry picked from commit 0c7469b)
Gerold103 added a commit that referenced this issue Feb 24, 2025
The function treated MP_INT as an always negative value. Mems
having a positive MP_INT + the by-design-negative MEM_TYPE_INT
type would always be considered <= 0 in comparison with raw
msgpack.

Lets check the MP_INT sign before assigning the mem type.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later

(cherry picked from commit 16c6c26)
Gerold103 added a commit that referenced this issue Feb 24, 2025
The function is used to build a mem from a column. To use in
some operation, like a comparison or arithmetics.

It used to treat MP_INT as an always negative value, which
could result into those operations having invalid results or even
failing assertions in debug build.

Lets check the MP_INT sign before assigning the mem type.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later

(cherry picked from commit 67ed4cd)
Gerold103 added a commit that referenced this issue Feb 24, 2025
The function is used to build a mem from msgpack returned by a C
function via port_c. To use in some operation, like a comparison
or arithmetics.

It used to treat MP_INT as an always negative value, which
could result into those operations having invalid results or even
failing assertions in debug build.

Lets check the MP_INT sign before assigning the mem type.

Closes #10132

NO_DOC=bugfix

(cherry picked from commit 0c7469b)
Gerold103 added a commit that referenced this issue Feb 25, 2025
The function is used to build a mem from a column. To use in
some operation, like a comparison or arithmetics.

It used to treat MP_INT as an always negative value, which
could result into those operations having invalid results or even
failing assertions in debug build.

Lets check the MP_INT sign before assigning the mem type.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later

(cherry picked from commit 67ed4cd)
Gerold103 added a commit that referenced this issue Feb 25, 2025
The function is used to build a mem from msgpack returned by a C
function via port_c. To use in some operation, like a comparison
or arithmetics.

It used to treat MP_INT as an always negative value, which
could result into those operations having invalid results or even
failing assertions in debug build.

Lets check the MP_INT sign before assigning the mem type.

Closes #10132

NO_DOC=bugfix

(cherry picked from commit 0c7469b)
locker pushed a commit that referenced this issue Feb 26, 2025
mem_set_int() -> mem_set_int_with_sign()
mem_set_int64() -> mem_set_int()

The motivation is that recently MP_INT was noticed to be able to
have positive values. Which means there are going to be places
where 'int' really means just an 'int', not necessarily a negative
int.

Given those news, it looks a bit strange that mem_set_int() used
to require a negative integer, and adding the suffix 64 was
allowing to pass an int of any sign.

Lets rename those functions to emphasize which of them relies on
the sign being given explicitly.

Part of #10132

NO_DOC=refactoring
NO_CHANGELOG=refactoring
NO_TEST=already covered

(cherry picked from commit 36821d6)
locker pushed a commit that referenced this issue Feb 26, 2025
The SQL bind parameters of type MP_INT were not correctly assigned
when the MP_INT contained a positive number.

Positive MP_INT is a valid msgpack, which must be supported. Lets
check the value's sign when assigning it then.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later

(cherry picked from commit 517529b)
locker pushed a commit that referenced this issue Feb 26, 2025
* mem_get_uint(), mem_get_uint_unsafe(), mem_get_bool(),
    mem_get_bin(), mem_len(), mem_get_bool_unsafe(),
    mem_len_unsafe() were unused.

* mem_get_int() was only used by mem_get_int_unsafe(), which made
    impossible to test if mem_get_int() returns the correct sign
    of the resulting number.

Lets drop the uint getter, and make the int getter not return the
sign at all then.

While being in scope of #10132, this patch is not really needed to
fix the bug. Was done while trying some approaches, and now it
looks useful on its own.

NO_DOC=bugfix
NO_CHANGELOG=later
NO_TEST=already covered

(cherry picked from commit 61efdac)
locker pushed a commit that referenced this issue Feb 26, 2025
The function treated MP_INT as an always negative value. Mems
having a positive MP_INT + the by-design-negative MEM_TYPE_INT
type would always be considered <= 0 in comparison with raw
msgpack.

Lets check the MP_INT sign before assigning the mem type.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later

(cherry picked from commit 16c6c26)
locker pushed a commit that referenced this issue Feb 26, 2025
The function is used to build a mem from a column. To use in
some operation, like a comparison or arithmetics.

It used to treat MP_INT as an always negative value, which
could result into those operations having invalid results or even
failing assertions in debug build.

Lets check the MP_INT sign before assigning the mem type.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later

(cherry picked from commit 67ed4cd)
locker pushed a commit that referenced this issue Feb 26, 2025
The function is used to build a mem from msgpack returned by a C
function via port_c. To use in some operation, like a comparison
or arithmetics.

It used to treat MP_INT as an always negative value, which
could result into those operations having invalid results or even
failing assertions in debug build.

Lets check the MP_INT sign before assigning the mem type.

Closes #10132

NO_DOC=bugfix

(cherry picked from commit 0c7469b)
locker pushed a commit that referenced this issue Feb 26, 2025
mem_set_int() -> mem_set_int_with_sign()
mem_set_int64() -> mem_set_int()

The motivation is that recently MP_INT was noticed to be able to
have positive values. Which means there are going to be places
where 'int' really means just an 'int', not necessarily a negative
int.

Given those news, it looks a bit strange that mem_set_int() used
to require a negative integer, and adding the suffix 64 was
allowing to pass an int of any sign.

Lets rename those functions to emphasize which of them relies on
the sign being given explicitly.

Part of #10132

NO_DOC=refactoring
NO_CHANGELOG=refactoring
NO_TEST=already covered

(cherry picked from commit 36821d6)
locker pushed a commit that referenced this issue Feb 26, 2025
The SQL bind parameters of type MP_INT were not correctly assigned
when the MP_INT contained a positive number.

Positive MP_INT is a valid msgpack, which must be supported. Lets
check the value's sign when assigning it then.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later

(cherry picked from commit 517529b)
locker pushed a commit that referenced this issue Feb 26, 2025
* mem_get_uint(), mem_get_uint_unsafe(), mem_get_bool(),
    mem_get_bin(), mem_len(), mem_get_bool_unsafe(),
    mem_len_unsafe() were unused.

* mem_get_int() was only used by mem_get_int_unsafe(), which made
    impossible to test if mem_get_int() returns the correct sign
    of the resulting number.

Lets drop the uint getter, and make the int getter not return the
sign at all then.

While being in scope of #10132, this patch is not really needed to
fix the bug. Was done while trying some approaches, and now it
looks useful on its own.

NO_DOC=bugfix
NO_CHANGELOG=later
NO_TEST=already covered

(cherry picked from commit 61efdac)
locker pushed a commit that referenced this issue Feb 26, 2025
The function treated MP_INT as an always negative value. Mems
having a positive MP_INT + the by-design-negative MEM_TYPE_INT
type would always be considered <= 0 in comparison with raw
msgpack.

Lets check the MP_INT sign before assigning the mem type.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later

(cherry picked from commit 16c6c26)
locker pushed a commit that referenced this issue Feb 26, 2025
The function is used to build a mem from a column. To use in
some operation, like a comparison or arithmetics.

It used to treat MP_INT as an always negative value, which
could result into those operations having invalid results or even
failing assertions in debug build.

Lets check the MP_INT sign before assigning the mem type.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later

(cherry picked from commit 67ed4cd)
locker pushed a commit that referenced this issue Feb 26, 2025
The function is used to build a mem from msgpack returned by a C
function via port_c. To use in some operation, like a comparison
or arithmetics.

It used to treat MP_INT as an always negative value, which
could result into those operations having invalid results or even
failing assertions in debug build.

Lets check the MP_INT sign before assigning the mem type.

Closes #10132

NO_DOC=bugfix

(cherry picked from commit 0c7469b)
locker pushed a commit that referenced this issue Feb 26, 2025
mem_set_int() -> mem_set_int_with_sign()
mem_set_int64() -> mem_set_int()

The motivation is that recently MP_INT was noticed to be able to
have positive values. Which means there are going to be places
where 'int' really means just an 'int', not necessarily a negative
int.

Given those news, it looks a bit strange that mem_set_int() used
to require a negative integer, and adding the suffix 64 was
allowing to pass an int of any sign.

Lets rename those functions to emphasize which of them relies on
the sign being given explicitly.

Part of #10132

NO_DOC=refactoring
NO_CHANGELOG=refactoring
NO_TEST=already covered

(cherry picked from commit 36821d6)
locker pushed a commit that referenced this issue Feb 26, 2025
The SQL bind parameters of type MP_INT were not correctly assigned
when the MP_INT contained a positive number.

Positive MP_INT is a valid msgpack, which must be supported. Lets
check the value's sign when assigning it then.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later

(cherry picked from commit 517529b)
locker pushed a commit that referenced this issue Feb 26, 2025
* mem_get_uint(), mem_get_uint_unsafe(), mem_get_bool(),
    mem_get_bin(), mem_len(), mem_get_bool_unsafe(),
    mem_len_unsafe() were unused.

* mem_get_int() was only used by mem_get_int_unsafe(), which made
    impossible to test if mem_get_int() returns the correct sign
    of the resulting number.

Lets drop the uint getter, and make the int getter not return the
sign at all then.

While being in scope of #10132, this patch is not really needed to
fix the bug. Was done while trying some approaches, and now it
looks useful on its own.

NO_DOC=bugfix
NO_CHANGELOG=later
NO_TEST=already covered

(cherry picked from commit 61efdac)
locker pushed a commit that referenced this issue Feb 26, 2025
The function treated MP_INT as an always negative value. Mems
having a positive MP_INT + the by-design-negative MEM_TYPE_INT
type would always be considered <= 0 in comparison with raw
msgpack.

Lets check the MP_INT sign before assigning the mem type.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later

(cherry picked from commit 16c6c26)
locker pushed a commit that referenced this issue Feb 26, 2025
The function is used to build a mem from a column. To use in
some operation, like a comparison or arithmetics.

It used to treat MP_INT as an always negative value, which
could result into those operations having invalid results or even
failing assertions in debug build.

Lets check the MP_INT sign before assigning the mem type.

Part of #10132

NO_DOC=bugfix
NO_CHANGELOG=later

(cherry picked from commit 67ed4cd)
locker pushed a commit that referenced this issue Feb 26, 2025
The function is used to build a mem from msgpack returned by a C
function via port_c. To use in some operation, like a comparison
or arithmetics.

It used to treat MP_INT as an always negative value, which
could result into those operations having invalid results or even
failing assertions in debug build.

Lets check the MP_INT sign before assigning the mem type.

Closes #10132

NO_DOC=bugfix

(cherry picked from commit 0c7469b)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working sql
Projects
None yet
Development

No branches or pull requests

5 participants
0