Release v1.31.0 - Backward compatible named vectors, MUVERA, HNSW Snapshotting, BM25 AND/OR · weaviate/weaviate · GitHub

v1.31.0 - Backward compatible named vectors, MUVERA, HNSW Snapshotting, BM25 AND/OR

@parkerduckworth released this 30 May 21:46
· 365 commits to main since this release

Breaking Changes

None

New Features

Backward Compatible Named Vectors

  • feat(named-vectors): change auto-schema to create a named vector instead of legacy one by @faustuzas in #7678
  • feat(named-vectors): forward 'default' vector to legacy vector in case of mixed vector by @faustuzas in #7711
  • feat: allow to refer to legacy vector as default named vector in mixed collections by @faustuzas in #7749
  • Revert "feat: change autoschema to create a named vector (#7678)" by @faustuzas in #7764
  • Enable by default adding of new named vectors to existing collections by @antas-marcin in #8122

MUVERA

HNSW Snapshotting

Keyword Search AND/OR Operators

  • feat(bm25_block): ✨ support to minimum should match and AND by @amourao in #8124
  • Refactor/bm25 and args by @amourao in #8229
  • refactor(bm25_block): ♻️ change to lowercase minimumOrTokensMatch by @amourao in #8292
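
The difference between the AND operator and an OR operator with a minimum-match threshold can be illustrated with a small, self-contained sketch. This is a toy model of the general technique, not Weaviate's implementation or client API: the function names `matches` and `filter_docs`, the `operator` argument, and the snake_case parameter `minimum_or_tokens_match` are all hypothetical and only loosely mirror the `minimumOrTokensMatch` name from the PR title above. It also assumes tokens are compared as exact, already-normalized strings.

```python
def matches(query_tokens, doc_tokens, operator="or", minimum_or_tokens_match=1):
    """Decide whether a document passes the keyword operator (toy model).

    All names here are hypothetical and chosen for illustration only.
    """
    query = set(query_tokens)
    # Number of distinct query tokens that occur in the document.
    hits = len(query & set(doc_tokens))
    if operator == "and":
        # AND: every query token must be present.
        return hits == len(query)
    # OR: at least N distinct query tokens must be present.
    # The threshold is clamped to the range [1, number of query tokens].
    required = min(max(minimum_or_tokens_match, 1), len(query))
    return hits >= required


def filter_docs(docs, query_tokens, **kwargs):
    """Return the sorted ids of documents that pass the operator."""
    return sorted(
        doc_id
        for doc_id, tokens in docs.items()
        if matches(query_tokens, tokens, **kwargs)
    )


docs = {
    "a": ["fast", "vector", "search"],
    "b": ["vector", "database"],
    "c": ["keyword", "search"],
}
query = ["vector", "search"]

print(filter_docs(docs, query))                             # plain OR
print(filter_docs(docs, query, minimum_or_tokens_match=2))  # OR, at least 2 tokens
print(filter_docs(docs, query, operator="and"))             # AND
```

In this sketch, an OR query whose threshold equals the number of query tokens selects the same documents as AND; lower thresholds admit partial matches.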

Replica Movement

BlockMax WAND Migration

  • feature (blockmax migrator): configurable collections/properties/tenants to migrate by @aliszka in #7750
  • fix (blockmax migrator): move started marking to sync phase by @aliszka in #7773
  • Blockmax migrator rollback select by @aliszka in #7780
  • feat(bm25_block): Trigger reindexing using REST call by triggering shard reinit by @amourao in #7766
  • feature (blockmax migrator): reload shards after reindex phase by @aliszka in #7767

Modules

Fixes

  • hnsw: check nodes slice as part of tombstone cleanup by @trengrj in #7683
  • queue: recover corrupt chunks by @asdine in #7729
  • Add missing checks if dynamic DB user management is enabled by @dirkkul in #7744
  • Change user id when rotating key by @dirkkul in #7755
  • Disable default dimensions in Azure by @antas-marcin in #7768
  • RBAC: Handle downgrades by @dirkkul in #7719
  • Add first 3 letters of api key to return values by @dirkkul in #7762
  • adapters/handlers/rest: Fix nil pointer dereference in /authz/roles and /users/db endpoints by @mohamedawnallah in #7779
  • fix: erase empty wal files by @jeroiraz in #7793
  • Rename key to baseURL in default class config map by @tsmith023 in #7783
  • fix: async replication fetch all local digest in range by @jeroiraz in #7751
  • fix(bm25_block): 🐛 Fix condition for filter matching no docs by @amourao in #7804
  • Batch vectorization: Cache tokenizer by @dirkkul in #7818
  • fix call errors.As with a nil value error err by @alingse in #7682
  • DB Users: Fix concurrency issues by @dirkkul in #7851
  • Remove symbols added during merge by @dirkkul in #7880
  • fix: flush and lock when listing shard files by @jeroiraz in #7876
  • RBAC: Fix upgrading from version without rbac snapshots to rbac snapshots by @dirkkul in #7891
  • RBAC: Add upgrade path 1.29=>1.30 to RAFT snapshots by @dirkkul in #7886
  • fix: guard raft snapshot distributedTasks restore by @moogacs in #7898
  • RBAC: Add downgrade path 1.30=>1.29 to RAFT snapshots by @dirkkul in #7888
  • fix: set distributed task scheduler in app state by @faustuzas in #7913
  • DB Users: Fix updating first letters of api key when rotating a key by @dirkkul in #7914
  • avoid nil logger in pv-pair by @etiennedi in #7940
  • fix: crash-tolerant memtable flushing by @jeroiraz in #7938
  • Add named Vectors to GroupHitAdditional struct by @tsmith023 in #7933
  • chore: Handle file handler cleanup on newSegment by @kavirajk in #7945
  • chore: Handle file descriptors cleanup in few places by @kavirajk in #7943
  • fix: 🐛 fix segment load order on recovery by @amourao in #7978
  • Fix: pass missing memwatch from segment group -> segment by @etiennedi in #7994
  • bugfix: AutoSchema and Asyncreplication configs fix in Dynamic configs by @kavirajk in #7993
  • Remove workers for dynamic index by @trengrj in #7836
  • dynamic index: unblock index during upgrade by @asdine in #8001
  • bug(raft): fix lastAppliedIndex to be updated if there was no error by @moogacs in #8008
  • fix(bm25_block): 🐛 fix reading BMW data with offset.end != 0 by @amourao in #8023
  • fix(raft-snapshot): backward compatibility downgrade path for new snapshots structure by @moogacs in #8092
  • fix(db-internal): phantom tenants as leftover on UpdateIndex from RAFT to COLD by @moogacs in #8019
  • chore: include md5 header in S3 requests by @antas-marcin in #8150
  • fix(raft): make sure store is open no matter the error status on catchup by @moogacs in #8163
  • fix nil in results error by @donomii in #8179
  • Fix flat filtered search with multivector by @robbespo00 in #8200
  • fix(shard shutdown): make sure shard is shut down when it's marked for shutdown by @donomii in #8089
  • Distinguish between err != nil and shard == nil in index.GetShard error returns by @tsmith023 in #8223
  • fix: flush write buffer before syncing by @jeroiraz in #8256
  • Fix filtered search with multivec by @robbespo00 in #8238
  • DB Users: Add support for RAFT snapshots to db users by @dirkkul in #8164
  • Fix parsing the azure openai response by @dirkkul in #8272
  • chore: pass dynUserManager to single node recovery by @moogacs in #8280
  • refact(cluster server): gracefully shutdown internal REST server by @moogacs in #8257
  • fix(memberlist): rejoin list on single node split brain by @andrewisplinghoff and @moogacs in #8246

Performance Improvements

  • chore: refactor vector index and queue access to be thread-safe by @faustuzas in #7606
  • Optimized stand-alone k-means clustering by @tobias-weaviate in #7556
  • feat(bm25_block): Allow setting a higher segment inspection limit by env var by @amourao in #7813
  • feat(bm25_block): Skip search if allowList is empty by @amourao in #7814
  • Handle concurrency in the cache by @dirkkul in #7828
  • Optimize Segmentindex header parsing by @dirkkul in #7837
  • Optimize commitlogger writes by @dirkkul in #7841
  • Optimize reading of bloom filter by @dirkkul in #7839
  • Optimize creating bytes out of Mappair by @dirkkul in #7844
  • Feature: rangeable index in memory by @aliszka in #7801
  • Feature: buf pool for rangeable segment-in-memory by @aliszka in #7817
  • feat: add distributed tasks management by @faustuzas in #7878
  • feat: include distributed tasks to raft snapshots by @faustuzas in #7887
  • fix: sort string in /tasks to make it more deterministic by @faustuzas in #7919
  • feat: introduce optimized mmap pkg by @faustuzas in #7929
  • chore: optimize reading headers to precompute compaction by @dirkkul in #7934
  • Migrate more mmap uses to optimized package by @dirkkul in #7946
  • chore: convert shardsStatus lock to RWMutex to be used in GetStatus() by @moogacs in #7956
  • chore: create new fsm inside recoverSingleNode to avoid panics with new fields by @moogacs in #7954
  • refact:(shutdown) improve db and shard dropping shutdown performance by @moogacs in #7571
  • Read small segments to disk 2 by @dirkkul in #7964
  • Fix: change default full read mmap and add mem watcher by @amourao in #7972
  • fix: flush buf writer when not including checksum by @jeroiraz in #8016
  • feat(bm25_block): ✨ use better average prop length for max impact by @amourao in #8018
  • Reduce amount of writes when creating segments by @dirkkul in #7971
  • Simplify writing indices if no secondary indices are present by @dirkkul in #8045
  • chore: inactivity timeout to ensure maintenance tasks are resumed by @jeroiraz in #8021
  • Change default min mmap size to 8kb by @dirkkul in #8067
  • Optimize writing bloom filters + net additions by @dirkkul in #8051
  • feat(bm25_block): ✨ use per prop length from segments by @amourao in #8038
  • refact: convert TenantResponse to models.Tenant only by @moogacs in #8098
  • Remove introduction of getNoInitLocalShard by @tsmith023 in #8128
  • Clamping negative distances to zero by @abdelr in #8133
  • optimization: wal reuse upon restart by @dirkkul in #8126
  • refact(raft-config): introduce raft timeouts multiplier and adjust Query() and Apply() retries by @moogacs in #8194
  • Fix overzealous cyclemanager with wal reuse by @etiennedi in #8235
  • Configure more buckets for wal reuse by @dirkkul in #8231

Observability Improvements

  • chore: make RAFT TrailingLogs configurable by @moogacs in #7791
  • DB Users: add last used time by @dirkkul in #7786
  • Adding Kubernetes Grafana dashboard by @Dabz in #7790
  • chore: add metric for db internal shard status by @moogacs in #7850
  • Adds metrics for OpenAI operations by @donomii in #7843
  • bugfix: Handle shards count metrics correctly for StartUnloadingShard by @kavirajk in #7901
  • feat: add an HTTP endpoint to list active distributed tasks by @faustuzas in #7902
  • chore: add metric for auto tenant operations by @moogacs in #7855
  • Fix vector index tombstones metric on restart by @trengrj in #7909
  • chore: add shard shutdown as valid status in the db layer for metric purposes by @moogacs in #7969
  • chore: set NoLegacyTelemetry flag on raft config. by @kavirajk in #8004
  • Metrics for every write by @etiennedi in #8022
  • Better metrics for Mmap usage by @etiennedi in #8104
  • chore: Add raft FSM index metrics to cluster store by @kavirajk in #8007
  • chore(raft): pass metric for single node recovery by @moogacs in #8132
  • chore: Add metric to track last applied index on startup. by @kavirajk in #8120
  • More fine-grained tenant analysis + runtime config to skip revectorize check by @etiennedi in #8302

Testing Improvements

  • fix index testing on replica movement poc branch by @reyreaud-l in #7590
  • Add tests for POST, PUT, DELETE ops involving references by @tsmith023 in #7742
  • skip flaky test_ref_with_multiple_cycle test by @faustuzas in #7748
  • fix: cherry-pick a commit to fix multi-vector validation test by @faustuzas in #7759
  • chore: add test for adding multi-vector index to an existing legacy collection by @faustuzas in #7758
  • Add tests for batching with auto-tenancy by @dirkkul in #7798
  • chore(tests): add possibility to pass a prebuilt MockOIDC image to tests by @antas-marcin in #7824
  • Feature: rangeable segment-in-memory tests by @aliszka in #7838
  • test: add Snapshot test to verify FSM snapshots e.g. the RBAC was restored correctly from a snapshot by @moogacs in #7803
  • chore: TestRBACSnapshotRecovery sort files content before md5sum checks by @moogacs in #7861
  • Improve backup e2e tests by @donomii in #7785
  • chore(ci): move less relevant module tests to be run only during release by @antas-marcin in #7870
  • Adds additional OpenAI tests and bug fixes by @donomii in #7869
  • chore(ci): reduce retries on e2e tests from 3 to 2 by @antas-marcin in #7884
  • chore(tests): stabilize flaky ColBERT e2e test by @antas-marcin in #7866
  • chore(ci): split integration tests into two pipelines by @antas-marcin in #7873
  • chore(tests): replace gemini-1.0 model with gemini-1.5 in generative-google e2e test by @antas-marcin in #7924
  • simplify consume resuming test by @reyreaud-l in #7991
  • Make mock call .Maybe rather than relying on timings by @tsmith023 in #8036
  • fix: Google module e2e tests by @antas-marcin in #8062
  • Make improvements to attempt flake fixes by @tsmith023 in #8070
  • chore: fix backup tests pass the correct response by @moogacs in #8121
  • chore(test): stabilize backup tests by assigning unique bucket names by @moogacs in #6999
  • chore(backup-gcs): handle overrides in test by @moogacs in #8131
  • Improve waiting for node shutdown in test by @dirkkul in #8146
  • Fix flaky test condense loop with alloc checker by @trengrj in #8156
  • chore(test): add acceptance test to make sure DB open after faulty schema update followed by restart by @moogacs in #8167
  • chore(gcs-test): use STORAGE_EMULATOR_HOST for gcs backup test clients by @moogacs in #8175
  • Fix flaky test due to un-normalized vectors by @trengrj in #8248
  • Removing knnSearchByVector from tests by @abdelr in #8250
  • chore(runtimeconfig): Add test to lock lower_snake_case in config by @kavirajk in #8275
  • chore(test) memberlist single node split brain test in case of network interrupt by @moogacs in #8262
  • chore: Fix panic in test when introducing new runtime config by @kavirajk in #8316
  • chore: skip temporarily TestNetworkIsolationSplitBrain test by @antas-marcin in #8321

Chores and Docs

Security Updates

New Contributors

Full Changelog: v1.30.0...v1.31.0
