8000 Efficient way to find approximate median in data range · Issue #11111 · tarantool/tarantool · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Efficient way to find approximate median in data range #11111

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
locker opened this issue Feb 7, 2025 · 1 comment · Fixed by #11226
Closed

Efficient way to find approximate median in data range #11111

locker opened this issue Feb 7, 2025 · 1 comment · Fixed by #11226
Assignees
Labels
feature A new functionality vinyl

Comments

@locker
Copy link
Member
locker commented Feb 7, 2025

The range-based sharding strategy requires the ability to quickly find a median in a data range so that it can split ranges for load balancing. With the memtx storage engine, this can be done efficiently using index:count(), which was recently optimized to have the logarithmic complexity, see #8204. In vinyl, however, index:count() is still prohibitively slow and it's unclear whether it's possible to optimize it at all.

The suggestion is to add a new API function that could be used for finding a median with a reasonable approximation. For example, we could probably take the deepest run file and find the median key there, assuming that updates done in more recent run files are unlikely to shift the median substantially.

@locker locker added feature A new functionality vinyl labels Feb 7, 2025
@locker locker self-assigned this Feb 17, 2025
@locker
Copy link
Member Author
locker commented Feb 25, 2025

locker added a commit to locker/tarantool that referenced this issue Mar 7, 2025
The new method will be used by the quantile() public index API function.
The stub raises "method unsupported" error.

Part of tarantool#11111

NO_DOC=internal
NO_TEST=stub
NO_CHANGELOG=internal
locker added a commit to locker/tarantool that referenced this issue Mar 7, 2025
The new function takes three arguments: level, begin_key, and end_key.
It returns the quantile point key from the target range corresponding to
the given level. It may return nil if the range is too small, in
particular, if there are no tuples in it. The new function works only
with TREE indexes. Currently, it isn't implemented by any engine.

Part of tarantool#11111

NO_DOC=added later
NO_CHANGELOG=added later
locker added a commit to locker/tarantool that referenced this issue Mar 7, 2025
The implementation is trivial: we simply return the key at the offset
equal to level * (end - begin) using the offsets built into the tree.
The only case when we return nil is when the target range is empty.

Part of tarantool#11111

NO_DOC=added later
NO_CHANGELOG=added later
locker added a commit to locker/tarantool that referenced this issue Mar 7, 2025
We find the quantile key assuming that all keys are stored at the
deepest disk layer while newer layers contain incremental updates.
First, we walk over all index ranges overlapping with the target range
and estimate the number of tuples in them. Then we compute the offset of
the quantile key and walk over the index ranges again to find the key.

To achieve that, we need two helper functions:

 1. vy_run_estimate_stmt_count() estimates the number of tuples in
    the given run within the specified range. Basically, it finds
    the number of pages falling in the range, then scales down
    the total number of rows stored in the run proportionally,
    assuming that each page stores approximately the same number
    of rows.

 2. vy_run_estimate_key_at() finds a key at the given offset.
    Like vy_run_estimate_stmt_count(), this function assumes that
    each page stores approximately the same number of rows to
    find the page where the target key is located, then it returns
    the page min key, which should be a good enough approximation.

None of the functions accesses disk (yields).

Note that the target key may happen to be outside the target range, for
example, if the range is so small that there's no page fitting in it.
To follow the index:quantile() function protocol, we ignore such keys
(return NULL).

Also note that runs store extended keys so we need to strip extra parts
before returning the found key to the user.

Closes tarantool#11111

@TarantoolBot document
Title: Document `index:quantile()`

The new index method finds a quantile point in an indexed data range.
It takes the quantile level (0 < L < 1) and optionally the target range
boundaries and returns such a key that the ratio of tuples less than
the key equals L.

Example:

```lua
local s = box.schema.space.create('test')
s:create_index('pk')

for i = 1, 100 do
    s:insert({i * i, i})
end

-- Find the median in the whole index.
s.index.pk:quantile(0.5) -- returns {2601}

-- Find the 90th percentile among all keys < 1000
s.index.pk:quantile(0.9, {}, {1000}) -- returns {784}
```

The target range is defined as the intersection of the following read
requests:

```lua
s:select(begin_key, {iterator = 'ge'})
s:select(end_key, {iterator = 'lt'})
```

This means that using the empty key `{}` or `nil` for both `begin_key`
and `end_key` finds the quantile in the whole index. The function raises
an error if there can't possibly be any tuples in the target range, i.e.
if `key_begin` >= `key_end`.

The function returns `nil` if there are no tuples in the target
range or if it failed to find the quantile due to the limitations of
the storage engine, which are described below.

The function is implemented by memtx and vinyl tree indexes only.
In memtx the function returns the quantile point exactly. It has
the logarithmic complexity. It may return `nil` only if there are
no tuples in the target range. There's guaranteed to be a tuple
in the target range corresponding to the returned key.

The vinyl implementation returns the quantile point with a reasonable
degree of error for performance considerations. It has the logarithmic
complexity as well. There may or may not be a tuple in the target
range corresponding to the returned key. The function may return nil
even if the target range is not empty, in particular, if there are
no disk layers or if the range is smaller than the vinyl page size.

The function doesn't yield.

The function doesn't participate in transactions.

Like other index methods, `index:quantile()` can be used through
a space object as `space:quantile()`, in which case it works as
a shortcut for the primary index.

For more details, see the RFC document: https://github.com/orgs/tarantool/discussions/11194
locker added a commit to locker/tarantool that referenced this issue Mar 7, 2025
The new function takes three arguments: level, begin_key, and end_key.
It returns the quantile point key from the target range corresponding to
the given level. It may return nil if the range is too small, in
particular, if there are no tuples in it. The new function works only
with TREE indexes. Currently, it isn't implemented by any engine.

Part of tarantool#11111

NO_DOC=added later
NO_CHANGELOG=added later
locker added a commit to locker/tarantool that referenced this issue Mar 7, 2025
The implementation is trivial: we simply return the key at the offset
equal to level * (end - begin) using the offsets built into the tree.
The only case when we return nil is when the target range is empty.

Part of tarantool#11111

NO_DOC=added later
NO_CHANGELOG=added later
locker added a commit to locker/tarantool that referenced this issue Mar 7, 2025
We find the quantile key assuming that all keys are stored at the
deepest disk layer while newer layers contain incremental updates.
First, we walk over all index ranges overlapping with the target range
and estimate the number of tuples in them. Then we compute the offset of
the quantile key and walk over the index ranges again to find the key.

To achieve that, we need two helper functions:

 1. vy_run_estimate_stmt_count() estimates the number of tuples in
    the given run within the specified range. Basically, it finds
    the number of pages falling in the range, then scales down
    the total number of rows stored in the run proportionally,
    assuming that each page stores approximately the same number
    of rows.

 2. vy_run_estimate_key_at() finds a key at the given offset.
    Like vy_run_estimate_stmt_count(), this function assumes that
    each page stores approximately the same number of rows to
    find the page where the target key is located, then it returns
    the page min key, which should be a good enough approximation.

None of the functions accesses disk (yields).

Note that the target key may happen to be outside the target range, for
example, if the range is so small that there's no page fitting in it.
To follow the index:quantile() function protocol, we ignore such keys
(return NULL).

Also note that runs store extended keys so we need to strip extra parts
before returning the found key to the user.

Closes tarantool#11111

@TarantoolBot document
Title: Document `index:quantile()`

The new index method finds a quantile point in an indexed data range.
It takes the quantile level (0 < L < 1) and optionally the target range
boundaries and returns such a key that the ratio of tuples less than
the key equals L.

Example:

```lua
local s = box.schema.space.create('test')
s:create_index('pk')

for i = 1, 100 do
    s:insert({i * i, i})
end

-- Find the median in the whole index.
s.index.pk:quantile(0.5) -- returns {2601}

-- Find the 90th percentile among all keys < 1000
s.index.pk:quantile(0.9, {}, {1000}) -- returns {784}
```

The target range is defined as the intersection of the following read
requests:

```lua
s:select(begin_key, {iterator = 'ge'})
s:select(end_key, {iterator = 'lt'})
```

This means that using the empty key `{}` or `nil` for both `begin_key`
and `end_key` finds the quantile in the whole index. The function raises
an error if there can't possibly be any tuples in the target range, i.e.
if `key_begin` >= `key_end`.

The function returns `nil` if there are no tuples in the target
range or if it failed to find the quantile due to the limitations of
the storage engine, which are described below.

The function is implemented by memtx and vinyl tree indexes only.
In memtx the function returns the quantile point exactly. It has
the logarithmic complexity. It may return `nil` only if there are
no tuples in the target range. There's guaranteed to be a tuple
in the target range corresponding to the returned key.

The vinyl implementation returns the quantile point with a reasonable
degree of error for performance considerations. It has the logarithmic
complexity as well. There may or may not be a tuple in the target
range corresponding to the returned key. The function may return nil
even if the target range is not empty, in particular, if there are
no disk layers or if the range is smaller than the vinyl page size.

The function doesn't yield.

The function doesn't participate in transactions.

Like other index methods, `index:quantile()` can be used through
a space object as `space:quantile()`, in which case it works as
a shortcut for the primary index.

For more details, see the RFC document: https://github.com/orgs/tarantool/discussions/11194
locker added a commit that referenced this issue Mar 10, 2025
The new method will be used by the quantile() public index API function.
The stub raises "method unsupported" error.

Part of #11111

NO_DOC=internal
NO_TEST=stub
NO_CHANGELOG=internal
locker added a commit that referenced this issue Mar 10, 2025
The new function takes three arguments: level, begin_key, and end_key.
It returns the quantile point key from the target range corresponding to
the given level. It may return nil if the range is too small, in
particular, if there are no tuples in it. The new function works only
with TREE indexes. Currently, it isn't implemented by any engine.

Part of #11111

NO_DOC=added later
NO_CHANGELOG=added later
locker added a commit that referenced this issue Mar 10, 2025
The implementation is trivial: we simply return the key at the offset
equal to level * (end - begin) using the offsets built into the tree.
The only case when we return nil is when the target range is empty.

Part of #11111

NO_DOC=added later
NO_CHANGELOG=added later
locker added a commit that referenced this issue Mar 10, 2025
We find the quantile key assuming that all keys are stored at the
deepest disk layer while newer layers contain incremental updates.
First, we walk over all index ranges overlapping with the target range
and estimate the number of tuples in them. Then we compute the offset of
the quantile key and walk over the index ranges again to find the key.

To achieve that, we need two helper functions:

 1. vy_run_estimate_stmt_count() estimates the number of tuples in
    the given run within the specified range. Basically, it finds
    the number of pages falling in the range, then scales down
    the total number of rows stored in the run proportionally,
    assuming that each page stores approximately the same number
    of rows.

 2. vy_run_estimate_key_at() finds a key at the given offset.
    Like vy_run_estimate_stmt_count(), this function assumes that
    each page stores approximately the same number of rows to
    find the page where the target key is located, then it returns
    the page min key, which should be a good enough approximation.

None of the functions accesses disk (yields).

Note that the target key may happen to be outside the target range, for
example, if the range is so small that there's no page fitting in it.
To follow the index:quantile() function protocol, we ignore such keys
(return NULL).

Also note that runs store extended keys so we need to strip extra parts
before returning the found key to the user.

Closes #11111

@TarantoolBot document
Title: Document `index:quantile()`

The new index method finds a quantile point in an indexed data range.
It takes the quantile level (0 < L < 1) and optionally the target range
boundaries and returns such a key that the ratio of tuples less than
the key equals L.

Example:

```lua
local s = box.schema.space.create('test')
s:create_index('pk')

for i = 1, 100 do
    s:insert({i * i, i})
end

-- Find the median in the whole index.
s.index.pk:quantile(0.5) -- returns {2601}

-- Find the 90th percentile among all keys < 1000
s.index.pk:quantile(0.9, {}, {1000}) -- returns {784}
```

The target range is defined as the intersection of the following read
requests:

```lua
s:select(begin_key, {iterator = 'ge'})
s:select(end_key, {iterator = 'lt'})
```

This means that using the empty key `{}` or `nil` for both `begin_key`
and `end_key` finds the quantile in the whole index. The function raises
an error if there can't possibly be any tuples in the target range, i.e.
if `key_begin` >= `key_end`.

The function returns `nil` if there are no tuples in the target
range or if it failed to find the quantile due to the limitations of
the storage engine, which are described below.

The function is implemented by memtx and vinyl tree indexes only.
In memtx the function returns the quantile point exactly. It has
the logarithmic complexity. It may return `nil` only if there are
no tuples in the target range. There's guaranteed to be a tuple
in the target range corresponding to the returned key.

The vinyl implementation returns the quantile point with a reasonable
degree of error for performance considerations. It has the logarithmic
complexity as well. There may or may not be a tuple in the target
range corresponding to the returned key. The function may return nil
even if the target range is not empty, in particular, if there are
no disk layers or if the range is smaller than the vinyl page size.

The function doesn't yield.

The function doesn't participate in transactions.

Like other index methods, `index:quantile()` can be used through
a space object as `space:quantile()`, in which case it works as
a shortcut for the primary index.

For more details, see the RFC document: https://github.com/orgs/tarantool/discussions/11194
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature A new functionality vinyl
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant
0