remove override range_search from hnsw, use iterator-based instead by alwayslove2013 · Pull Request #1199 · zilliztech/knowhere · GitHub

remove override range_search from hnsw, use iterator-based instead #1199


Open
wants to merge 1 commit into
base: main
from


alwayslove2013
Collaborator

No description provided.

@sre-ci-robot sre-ci-robot requested a review from chasingegg May 21, 2025 10:16
@sre-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: alwayslove2013
To complete the pull request process, please assign zhengbuqian after the PR has been reviewed.
You can assign the PR to them by writing /assign @zhengbuqian in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

mergify bot commented May 21, 2025

@alwayslove2013 🔍 Important: PR Classification Needed!

For efficient project management and a seamless review process, it's essential to classify your PR correctly. Here's how:

  1. If you're fixing a bug, label it as kind/bug.
  2. For small tweaks (less than 20 lines without altering any functionality), please use kind/improvement.
  3. Significant changes that don't modify existing functionalities should be tagged as kind/enhancement.
  4. Adjusting APIs or changing functionality? Go with kind/feature.

For any PR outside the kind/improvement category, ensure you link to the associated issue using the format: “issue: #”.

Thanks for your efforts and contribution to the community!

@alwayslove2013
Collaborator Author

/kind improvement

Signed-off-by: min.tian <min.tian.cn@gmail.com>
@alwayslove2013 alwayslove2013 force-pushed the remove_hnsw_ivf_range_search branch from 092fe99 to 3828aa5 Compare May 21, 2025 10:42
@alwayslove2013 alwayslove2013 changed the title remove override range_search from hnsw / ivf, use iterator-based instead remove override range_search from hnsw, use iterator-based instead May 21, 2025
@alexanderguzhva
Collaborator

@alwayslove2013 Do I understand correctly that we're deprecating the range search completely?

@alwayslove2013
Collaborator Author

@alexanderguzhva In knowhere, there are two approaches to range_search. One is a range_search implemented inside a specific index as an override (like ivf / faiss_hnsw). The other is a unified method in the parent class (index_node) that is built on an iterator: it repeatedly calls next() on the iterator and collects the closest results that fall within the specified range.

/**
 * @brief Performs a range search operation on the index.
 *
 * This method provides a default implementation of range search based on the `AnnIterator`, assuming the iterator
 * will buffer an expanded range and return the closest elements on each Next() call. It can be overridden by
 * derived classes for more efficient implementations.
 *
 * @param dataset Query vectors.
 * @param cfg
 * @param bitset A BitsetView object for filtering results.
 * @return An expected<> object containing the range search results or an error.
 * @note Since the config object needs to be held in a future or lambda function, a smart pointer is required to
 * delay its release.
 */
virtual expected<DataSetPtr>
RangeSearch(const DataSetPtr dataset, std::unique_ptr<Config> cfg,
            const BitsetView& bitset) const { // TODO: @alwayslove2013 test with mock AnnIterator after we

It's a historical misalignment between Milvus and knowhere. Previously, knowhere was defined to return all results that fall within the range. However, Milvus does not need all results; it operates under a k limit (which we can call range_search_k). If the specified range is too large while range_search_k is relatively small, returning everything is very inefficient. The iterator-based range_search can stop as soon as enough results are collected.
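
To illustrate the early-stop behavior, here is a minimal sketch. It is not knowhere's actual code: `MockAnnIterator`, `IteratorRangeSearch`, `radius` and the `Consumed()` counter are hypothetical names made up for this example, and it assumes a smaller-is-closer metric (such as L2) plus an iterator that returns the closest remaining element on each `Next()` call.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// (id, distance) pair; an illustrative type, not a knowhere type.
using Hit = std::pair<int64_t, float>;

// Hypothetical stand-in for an ANN iterator. It yields hits in ascending
// distance order, which is an assumption of this sketch.
class MockAnnIterator {
 public:
    explicit MockAnnIterator(std::vector<Hit> hits) : hits_(std::move(hits)) {
        std::sort(hits_.begin(), hits_.end(),
                  [](const Hit& a, const Hit& b) { return a.second < b.second; });
    }
    bool HasNext() const { return pos_ < hits_.size(); }
    Hit Next() { return hits_[pos_++]; }
    // Number of Next() calls made so far (a proxy for search cost).
    std::size_t Consumed() const { return pos_; }

 private:
    std::vector<Hit> hits_;
    std::size_t pos_ = 0;
};

// Collect hits with distance < radius, stopping as soon as either
// range_search_k results are gathered or the iterator leaves the range.
std::vector<Hit>
IteratorRangeSearch(MockAnnIterator& it, float radius, std::size_t range_search_k) {
    std::vector<Hit> out;
    while (out.size() < range_search_k && it.HasNext()) {
        Hit hit = it.Next();
        if (hit.second >= radius) {
            break;  // sorted order: every later hit is also out of range
        }
        out.push_back(hit);
    }
    return out;
}
```

With five candidates of which four lie inside the radius, calling `IteratorRangeSearch(it, 1.0f, 2)` in this sketch consumes only two elements from the iterator, whereas a large `range_search_k` walks until the first out-of-range hit.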


3 participants