Release 3.28 · NVIDIA/aistore · GitHub

3.28

@alex-aizman alex-aizman released this 10 May 18:05
· 197 commits to main since this release

The latest AIStore release, version 3.28, arrives nearly three months after the previous release. As always, v3.28 maintains compatibility with the previous version. We fully expect it to upgrade cleanly from earlier versions.

This release delivers significantly improved ETL offload with a newly added WebSocket communicator and optimized data flows between ETL pods and AIS targets in a Kubernetes cluster.

For Python users, we added resilient retrying logic that maintains seamless connectivity during lifecycle events, a capability that can be critical when running multi-hour training workloads. We've also improved the JobStats and JobSnapshot models, added MultiJobSnapshot, extended and fixed URL encoding, and added a props accessor method to the Object class.

Python SDK's ETL has been extended with a new ETL server framework that provides three Python-based web server implementations: FastAPI, Flask, and HTTPMultiThreadedServer.

Separately, 3.28 adds a dual-layer rate-limiting capability with configurable support for both frontend (client-facing) and backend (cloud-facing, adaptive) operation.

On the CLI side, there are multiple usability improvements listed below. The list-objects (ais ls) operation has been further improved, and inline help and CLI documentation have been amended. The ais show job command now displays cluster-wide objects and bytes totals for distributed jobs.

Enhancements to observability are also detailed below and include new metrics to track rate-limited operations and extended job statistics. Most of the supported jobs will now report a j-w-f metric: number of mountpath joggers, number of (user-specified) workers, and a work-channel-full count.

Other improvements include a new (and faster) content checksum, fast URL parsing (for Go API), optimized buffer allocation for multi-object operations and ETL, and support for Unicode and special characters in object names. We've refactored and micro-optimized numerous components, and amended numerous docs, including the main readme and overview.

Last but not least, for better networking parallelism, we now support multiple long-lived peer-to-peer connections. The number of connections is configurable, and the supported batch jobs include distributed sort, erasure coding, multi-object and bucket-to-bucket copy, ETL, global rebalance, and more.

Assorted commits for each section are also included below with detailed changelog available at this link.

Configuration Changes

We made several additions to global (cluster-wide) and bucket configuration settings.

Multiple xactions (jobs) now universally include a standard configuration triplet that provides for:

  • In-flight compression
  • Minimum size of work channel(s)
  • Number of peer-to-peer TCP connections (referred to as stream bundle multiplier)

The following jobs are now separately configurable at the cluster level:

  • EC (Erasure Coding)
  • Dsort (Distributed Shuffle)
  • Rebalance (Global Rebalance)
  • TCB (Bucket-to-Bucket Copy/Transform)
  • TCO (Multi-Object Copy/Transform)
  • Archive (Multi-Object Archiving or Sharding)

In addition, EC is also configurable on a per-bucket basis, allowing for further fine-tuning.

Commit Highlights

  • 15cf1ca: Add backward compatible config and BMD changes.
  • fc3d8f3: Add cluster config (tco, arch) sections and tcb burst.
  • 7b46f0c: Update configuration part two.
  • 8c49b6b: Add rate-limiting sections to global configuration and BMD (bucket metadata).
  • 15d4ed5: [config change] and [BMD change]: the following jobs now universally support XactConf triplet

New Default Checksum

AIStore 3.28 adds a new default content checksum. While still using xxhash, it uses a different implementation that delivers better performance in large-size streaming scenarios.

The system now makes a clear internal delineation between classic xxhash for system metadata (node IDs, bucket names, object metadata, etc.) and cespare/xxhash (designated as "xxhash2" in configuration) for user data. All newly created buckets now use "xxhash2" by default.

Benchmark tests show improved performance with the new implementation, especially for large objects and streaming operations.

Commit Highlights

  • a045b21: Implement new content checksum.
  • 7b69dc5: Update modules for new content checksum.
  • 9fa9265: Refine cespare vs one-of-one implementation.
  • d630c1f: Add cespare to hash micro-benchmark.

Rate Limiting

Version 3.28 introduces rate-limiting capability that operates at both frontend (client-facing) and backend (cloud-facing) layers.

On the frontend, each AIS proxy enforces configurable limits with burst capacity allowance. You can set different limits for each bucket based on its usage patterns, with separate configurations for GET, PUT, and DELETE operations.
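
The frontend behavior described above resembles a classic token bucket. The following is a minimal sketch of that general technique, not the AIStore implementation: the class name, parameters, bucket name, and numbers are all assumptions made for illustration.

```python
import time


class TokenBucket:
    """Allow `rate` requests per `interval` seconds, plus a burst allowance."""

    def __init__(self, rate, interval=1.0, burst=0, clock=time.monotonic):
        self.capacity = rate + burst            # most tokens that can accumulate
        self.refill_per_sec = rate / interval   # steady-state refill speed
        self.tokens = float(self.capacity)      # start full so a burst is allowed
        self.clock = clock
        self.last = clock()

    def allow(self):
        """Consume one token and return True if the request may proceed."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical per-bucket, per-verb limits (illustrative names and values only).
limits = {
    ("my-bucket", "GET"): TokenBucket(rate=100, burst=20),
    ("my-bucket", "PUT"): TokenBucket(rate=10, burst=2),
    ("my-bucket", "DELETE"): TokenBucket(rate=5),
}
```

Keeping one limiter per (bucket, verb) pair mirrors the separate GET, PUT, and DELETE settings mentioned above; a request that gets False from allow() would be rejected or delayed by the caller.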

For backend operations, the system implements an adaptive rate shaping mechanism that dynamically adjusts request rates based on cloud provider responses. This approach prevents hitting cloud provider limits proactively and implements exponential backoff when 429/503 responses are received. The implementation ensures zero overhead when rate limiting is disabled.
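
To illustrate only the exponential-backoff part of this paragraph, here is a hedged sketch. The function name, its parameters, the jitter range, and the convention that send() returns an HTTP status code are assumptions; the adaptive rate shaping that AIStore performs proactively is not modeled here.

```python
import random
import time

RETRIABLE = {429, 503}  # throttling responses named in the release notes


def call_with_backoff(send, max_retries=5, base_delay=0.1, max_delay=5.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-throttled status or retries run out.

    send() is any callable returning an HTTP status code (hypothetical interface).
    Returns (final_status, total_delay_seconds).
    """
    total_delay = 0.0
    for attempt in range(max_retries + 1):
        status = send()
        if status not in RETRIABLE or attempt == max_retries:
            return status, total_delay
        # Double the delay after every throttled attempt, cap it, then add
        # jitter so that many clients do not retry in lockstep.
        delay = min(max_delay, base_delay * (2 ** attempt))
        delay *= random.uniform(0.5, 1.0)
        sleep(delay)
        total_delay += delay
```

Returning the accumulated delay alongside the status makes it easy for a caller to feed counters such as a retry count or total time spent waiting.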

Configuration follows a hierarchical model with cluster-wide defaults that can be overridden per bucket. You can adjust intervals, token counts, burst sizes, and retry policies without service disruption.
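
The override model can be pictured as a simple merge of two dictionaries. This is a sketch under stated assumptions: the key names and default values below are hypothetical and do not reproduce the actual AIStore configuration schema.

```python
# Hypothetical cluster-wide defaults (key names are illustrative only).
CLUSTER_DEFAULTS = {
    "enabled": False,
    "interval_sec": 10,
    "max_tokens": 1000,
    "burst_size": 100,
    "num_retries": 3,
}


def effective_rate_limit(cluster_defaults, bucket_overrides=None):
    """Return cluster defaults with any per-bucket overrides applied on top."""
    merged = dict(cluster_defaults)  # copy so the defaults stay untouched
    for key, value in (bucket_overrides or {}).items():
        if key not in cluster_defaults:
            raise KeyError(f"unknown rate-limit setting: {key}")
        merged[key] = value
    return merged
```

For example, effective_rate_limit(CLUSTER_DEFAULTS, {"enabled": True, "max_tokens": 50}) enables limiting for one bucket with a smaller token count while every other setting falls back to the cluster value.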

Commit Highlights

  • e71c2b1: Implemented frontend/backend dual-layer rate limiting system.
  • 9f4d321: Added per-bucket overrides and exponential backoff for cloud 429 errors.
  • 12e5787: Not rate-limiting remote bucket with no props.
  • be309fd: docs: add rate limiting readme and blog.
  • b011945: Rate-limited backend: complete transition.
  • fcba62b: Rate-limit: add stats; prometheus metrics.
  • 8ee8b44: Rate-limited backend; context to propagate vlabs; prefetch.
  • c4b796a: Enable/disable rate-limited backends.
  • 666796f: Core: rate-limited backends (major update).

ETL

ETL (Extract, Transform, Load) is a cornerstone feature designed to execute transformations close to the data with an extremely high level of node-level parallelism across all nodes in the AIS cluster.

WebSocket

Version 3.28 adds WebSocket (ws://) as yet another fundamental communication mechanism between AIS nodes and ETL containers, complementing the existing HTTP and IO (STDIN/STDOUT) communications.

The WebSocket implementation supports multiple concurrent connections per transform session, preserves message order and boundaries for reliable communication, and provides stateful session management for long-lived, per-xaction sessions.

Direct PUT

The release implements a new direct PUT capability for ETL transformations that optimizes the data flow between components. Traditionally, data would flow from a source AIS target to an ETL container, back to the source AIS target, and finally to the destination target. With direct PUT, data flows directly from the ETL container to the destination AIS target.

Stress tests show a 3x to 5x performance improvement with direct PUT enabled. This capability is available across all communication mechanisms (HTTP, IO, WebSocket), and ETL containers can detect direct PUT capability through environment variables.

Pod Lifecycle

The ETL framework now implements structured lifecycle transitions between Initializing, Running, and Stopped states with automated cleanup of Kubernetes resources when ETL enters the Stopped state. We've enhanced error capture and reporting from pods, including initialization failures, and added the ability to restart previously stopped ETL instances without recreating them.
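
The three states and the cleanup-on-stop behavior can be sketched as a small state machine. The transition table below is an assumption (in particular, restart is modeled as Stopped to Initializing), and the cleanup callback merely stands in for removing Kubernetes resources.

```python
from enum import Enum


class EtlState(Enum):
    INITIALIZING = "Initializing"
    RUNNING = "Running"
    STOPPED = "Stopped"


# Assumed transition table; a restart is modeled as Stopped -> Initializing.
ALLOWED = {
    EtlState.INITIALIZING: {EtlState.RUNNING, EtlState.STOPPED},
    EtlState.RUNNING: {EtlState.STOPPED},
    EtlState.STOPPED: {EtlState.INITIALIZING},
}


class EtlInstance:
    """Hypothetical ETL instance that tracks its lifecycle state."""

    def __init__(self, name, cleanup=lambda name: None):
        self.name = name
        self.state = EtlState.INITIALIZING
        self._cleanup = cleanup  # stand-in for deleting Kubernetes resources

    def transition(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"{self.state.value} -> {new_state.value} is not allowed")
        self.state = new_state
        if new_state is EtlState.STOPPED:
            self._cleanup(self.name)  # automated cleanup on entering Stopped
```

Because the instance object survives the Stopped state, restarting it is just another transition rather than a re-creation.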

ETL Framework

Version 3.28 introduces a reusable server interface for implementing transformations in Go and adds extensible base classes for developing ETL servers in Python. These include a multi-threaded server based on BaseHTTPRequestHandler, a Flask-based implementation for synchronous processing, and a FastAPI-based implementation for asynchronous processing.
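
To give a feel for the multi-threaded variant, here is a standard-library sketch of a transform server built on BaseHTTPRequestHandler. It is not the SDK's HTTPMultiThreadedServer class: the handler, the PUT-in/bytes-out convention, and the upper-casing transform are assumptions chosen to keep the example self-contained.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def transform(data: bytes) -> bytes:
    """Example transformation: upper-case the payload."""
    return data.upper()


class TransformHandler(BaseHTTPRequestHandler):
    def do_PUT(self):
        # Read the object bytes, transform them, and send the result back.
        length = int(self.headers.get("Content-Length", 0))
        result = transform(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Length", str(len(result)))
        self.end_headers()
        self.wfile.write(result)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def start_server(host="127.0.0.1", port=0):
    """Start a thread-per-request server on a free port and return it."""
    server = ThreadingHTTPServer((host, port), TransformHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def put(url: str, data: bytes) -> bytes:
    """Send a PUT through a proxy-free opener and return the response body."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    req = urllib.request.Request(url, data=data, method="PUT")
    with opener.open(req, timeout=5) as resp:
        return resp.read()
```

A Flask or FastAPI version would keep the same transform() function and swap only the serving layer, which is the point of having a shared base class.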

The release refactors ETL runtime logic from transaction handlers to xactions for better state control, adds detailed error information gathering from pod logs during failures, and includes configurable timeout options for ETL operations.

Commit Highlights

  • 7c42a3d: Support inline transform for websocket communicator.
  • d85f26c: Introduce generic Go ETL webserver framework.
  • de0df5c: Support multiple websocket connections per tcb/tco job.
  • e1e4935: Implement FastAPIServer base class for async ETL processing.
  • ab99308: Implement FlaskServer for ETL transformations.
  • af4718e: Enhance error handling for unexpected pod failure.
  • fa04c09: ETL pod lifecycle: basic state transitions.
  • e176c5e: Add stats for offline transform; refactor GET/PUT request flows

API Enhancements; Batch Jobs

AIStore 3.28 adds non-recursive (batch) operation capability and introduces a num-workers parallelism parameter for copy/transform/evict/delete bucket and multi-object operations. We've also added support for graceful error handling during batch operations with the continue-on-error parameter.

The release implements common logic to compute optimal worker counts based on current load and improves progress reporting with sentinel opcodes for intra-cluster synchronization during progress, finish, and abort operations. These enhancements have been standardized across all multi-object operations, including copy, prefetch, transform, delete, evict, and archive.

Commit Highlights

  • 72c0b8c: Add non-recursive option to multi-object archive operations.
  • fd3d6d8: Unify delete/evict with other multi-object APIs; support non-recursion.
  • 657f42d: Add 'num-workers' parallelism to copy/transform bucket operations.
  • 6c171cd: Add 'omitempty' to API JSON structures; bump versions.
  • 750eb3a: Go-based API: add common 'get-node' function, reduce code.
  • 373445c: Go-based API: add common 'membership' function, reduce code.
  • a592f87: Multi-object copy/transform: when targets run different UUIDs (corner).
  • 0b3b34a: [API change] copy, transform, prefetch jobs: non-recursive operation.
  • a419869: Copy/transform bucket: add 'channel full' count and log.
  • e5ec51a: Multi-object copy/transform: op-code 'abort'; refactoring.
  • fa04c09: Copy/transform bucket: 'num-workers' vs number of mountpaths (disks).
  • 717fd3a: Multi-object: archive, copy, and transform (major update).
  • 35e5bbc: Copy/transform bucket: add 'num-workers' parallelism.
  • 657f42d: [API change] copy/transform bucket: add 'num-workers' parallelism.
  • 8434a74: Copy/transform: refactor control msg-s; num-workers, continue-on-err.
  • ad14a2a: Multi-object copy/transform: more reasons to abort.
  • 71becff: Copy/transform: sentinels to synchronize finishing and aborting.

AWS S3

The AWS S3 backend has been enhanced with a configurable multipart-upload threshold. The new multipart_size bucket property allows users to set custom thresholds for when to use multipart uploads. The property supports a value of -1 to completely disable multipart uploads and accepts human-readable size formats (KB, MB, GB).
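
The threshold semantics can be sketched as follows. This is an illustration under assumptions: the helper names are hypothetical, units are treated as powers of 1024, and the comparison is taken to be "at or above the threshold"; the actual AIStore parsing rules may differ.

```python
import re

# Assumed multipliers (powers of 1024); AIStore's own rules may differ.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_size(text):
    """Parse values such as '-1', '4096', '5MB', or '1 GB' into bytes."""
    text = str(text).strip().upper()
    if text == "-1":
        return -1  # sentinel: multipart uploads disabled
    match = re.fullmatch(r"(\d+)\s*(B|KB|MB|GB)?", text)
    if not match:
        raise ValueError(f"invalid size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2) or "B"]


def use_multipart(object_size, multipart_size):
    """Decide whether an upload of object_size bytes should be multipart."""
    threshold = parse_size(multipart_size)
    if threshold == -1:
        return False
    return object_size >= threshold
```

With these assumptions, use_multipart(10 * 1024 ** 2, "5MB") is True, while any object size combined with "-1" yields False.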

We've extended the AWS S3 bucket configuration options to include:

  • extra.aws.cloud_region
  • extra.aws.endpoint
  • extra.aws.max_pagesize
  • extra.aws.multipart_size
  • extra.aws.profile

The release adds support for the S3 list-object-versions feature and enhances classification and handling of S3-specific errors with the introduction of an err-remote-retriable type for better AWS error management. We've also optimized retry logic for transient AWS issues.

Performance optimizations include improved S3 connectivity and transfer efficiency, enhanced presigned HEAD request optimization in GET context, and improved AWS request checksum validation handling.

Commit Highlights

  • 4f30cde: Enable custom multipart threshold via multipart_size bucket property.
  • e6a1bd9: Introduce err-remote-retriable and optimized 503 retry logic.
  • 91c86b6: S3 compatibility: add missing XML tags (for consistency).

CLI

The command-line interface has been enhanced in v3.28 with numerous usability improvements.

The ais show job command now displays cluster-wide objects and bytes totals for distributed jobs, including summaries with checkmarks (✓) for prefetch, copy, transform, EC, and mirror operations. We've added a command to list all supported job types and improved the display of joggers, workers, and parallelism in job outputs.

The ais ls command, when run with the --paged option, now displays page numbers and shows in-cluster ("cached") object counts.

In-cluster vs remote content comparison has been enhanced with improved ais ls --diff.

Administrative commands have also been improved and fixed, including log, cluster download-logs, and cluster set-primary.

We've also added a new admin API and CLI command to drop (discard) the in-memory object metadata cache.

Commit Highlights

  • 0433042: Enhance 'ais show job' to display cluster-wide totals.
  • 31a869c: Update list-objects to show page number and cached object count.
  • a58881a: Improve 'ais ls --help' documentation.
  • 8b41992: Enhance 'log get' and 'download-log' commands with better help.
  • 1170fd3: Improve 'ais scrub' to generate detailed logs with relevant columns.
  • ee754f5: Add admin API and CLI to drop in-memory object metadata cache
  • a7354f0: Fix 'set-primary NODE_NAME' (inconsistency with 'NODE_ID').
  • fb07d4a: Fix listing EC (erasure-coding) jobs.
  • 108930e: Command 'ais show job' to list all supported jobs (names).

Observability

AIStore 3.28 adds new metrics to track rate-limited operations, including err.rate.retry.n to count rate-limited errors and rate.retry.ns.total to track total delay time due to rate limiting.

We've enhanced monitoring of interactions with remote storage by adding metrics for remote GET operations including count, size, and latency.

Enhancements also include a new j-w-f metric (number of mountpath joggers, number of (user-specified) workers, and a work-channel-full count) reported by running batch jobs.

We've improved performance monitoring with enhanced ais performance latency and ais performance throughput commands.

Commit Highlights

  • 5c6b4c7: Running jobs to report j-w-f: number of mountpath joggers, number of workers, and channel-full counter.
  • 1b0c39c: Implement j-w-f parallelism.
  • 37c2434: Add reporting of number of joggers, workers, and channel-full metrics.
  • dde4e8b: New feature flag to include (bucket, xaction) Prometheus variable labels with every GET and PUT transaction.
  • 9794f4f: Track all remote read operations with improved metrics.
  • 72275f7: Add 'xkind' variable label to remote GET metrics.
  • 118a821: Remove 'xid' (job ID) from Prometheus labels.

Python SDK

AIStore 3.28 includes Python SDK v1.13.7, a substantial update with numerous enhancements.

The SDK now implements unified retry configuration for both HTTP and network failures with separate handling for HTTP status-based retries vs. connection failures. We've updated the retry strategy for ConnectTimeout, RequestsConnectionError, ReadTimeout and implemented graceful recovery from transient network issues.
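
The idea of separate retry handling for HTTP statuses and connection failures can be sketched with the standard library alone. Nothing below is the SDK's API: the function, the HttpStatusError class, the retriable status list, and the budgets are assumptions, and the built-in ConnectionError and TimeoutError stand in for the requests exceptions named above.

```python
import time

STATUS_RETRY = {429, 500, 502, 503, 504}  # assumed retriable statuses


class HttpStatusError(Exception):
    """Hypothetical error carrying an HTTP status code."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def request_with_retries(send, status_retries=3, network_retries=5,
                         backoff=0.5, sleep=time.sleep):
    """Run send() with separate budgets for status and connection failures."""
    status_left, network_left = status_retries, network_retries
    attempt = 0
    while True:
        try:
            return send()
        except HttpStatusError as err:
            # Non-retriable statuses (for example 404) fail immediately.
            if err.status not in STATUS_RETRY or status_left == 0:
                raise
            status_left -= 1
        except (ConnectionError, TimeoutError):
            # Connection-level failures draw from their own budget.
            if network_left == 0:
                raise
            network_left -= 1
        sleep(backoff * (2 ** attempt))  # exponential backoff between tries
        attempt += 1
```

Keeping two budgets lets a client ride out a longer node restart (many connection errors) without also tolerating an unbounded stream of 5xx responses.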

We've improved JobStats and JobSnapshot models with additional fields, including a glob_id field and an unpack() method to decode worker metrics, and introduced a MultiJobSnapshot model for detailed job information.

URL handling has been fixed with object name URL encoding in request URLs. Object properties have been enhanced with props_cached and props accessor methods to the Object class, where props forces object properties refresh via HEAD request on every access and props_cached returns cached properties without network calls.
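
Both behaviors can be sketched with the standard library. The URL layout and the ObjectHandle class below are hypothetical stand-ins, not the SDK's Object class; only urllib.parse.quote is a real library call.

```python
from urllib.parse import quote


def object_url(endpoint, bucket, name):
    """Build a request URL, percent-encoding the object name but keeping '/'."""
    # The path layout here is illustrative only.
    return f"{endpoint}/v1/objects/{bucket}/{quote(name, safe='/')}"


class ObjectHandle:
    """Hypothetical object handle showing refreshed vs. cached properties."""

    def __init__(self, head):
        self._head = head    # callable that performs the HEAD request
        self._props = None   # nothing cached until the first refresh

    @property
    def props(self):
        """Refresh properties via HEAD on every access."""
        self._props = self._head()
        return self._props

    @property
    def props_cached(self):
        """Return the last fetched properties without any network call."""
        return self._props
```

For instance, object_url("http://ais", "b", "a b/c#d?.txt") yields "http://ais/v1/objects/b/a%20b/c%23d%3F.txt", so characters such as spaces, '#', and '?' no longer break the request path.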

The ETL support has been extended with an extensible ETL server framework offering multiple server implementations, direct PUT support in HTTP and WebSocket communicators, serialization of ETL arguments as query parameters, the ability to override configured timeouts for ETL requests, and improved WebSocket communication for high-throughput processing.

Commit Highlights

  • 30cdc3f: Release Python SDK v1.13.6.
  • 568e6f5: Implement unified retry configuration for HTTP and network failures.
  • 2df6f4d: Improve RequestClient connection retry strategy.
  • b1a8e29: Add props_cached and props accessor methods to Object class.
  • 5b46ac3: Add URL encoding support for object names in request URLs.
  • 55dd49a: Implement details method to retrieve detailed job information.
  • ff78492: Add integration tests for 'Streaming-Cold-GET' feature
  • bc6931f: Support 'num_workers' in bucket-to-bucket copy and transform.
  • b4eed70: Support Direct PUT in etl http webserver.
  • aaa45f1: Support Direct PUT in etl flask webserver.
  • be8bc92: Update ObjectFile stress tests and examples.
  • 950eede: Amend error handling (in re: for updated object naming).
  • 1cc3df4: Support URL encoding of object names in request URLs.

Benchmarking Tools

AIStore 3.28 includes enhancements to both aisloader (Go-based) and pyaisloader (Python-based) benchmarking tools.

The aisloader now supports a --pctupdate option to simulate workloads with object updates by issuing a GET followed by a PUT on the same object. This allows testing write-update patterns common in certain ML workflows. Operation with remote buckets via the --s3endpoint parameter has been improved, and latency is now measured for combo operations. Performance monitoring benefits from better tracking of individual operation latencies, a transition to monotonic time for more accurate measurements, and enhanced validation and error reporting.

The pyaisloader has been updated to utilize pickle-safe multiprocessing concurrency, enhanced parallelism with improved process safety, and better compatibility with Python's multiprocessing constraints. We've improved performance metrics gathering and reporting and enhanced integration with Python SDK testing infrastructure.

Commit Highlights

  • 066fe04: Update pyaisloader benchmark classes for pickle-safe multiprocessing.
  • ace8608: Add --pctupdate option to aisloader.
  • 216d4ad: Implement new update mode for aisloader to test versioning.
  • 554f95a: Optimize aisloader work requests and field alignment.
  • be8bc92: Update objectfile stress tests and examples for Python SDK.

Build and CI/CD

AIStore 3.28 has transitioned to Go 1.24 with various optimizations and upgraded multiple open-source dependencies for security, performance, and bug fixes. Specific upgrades include the JWT-Go package (to address security vulnerabilities), the LZ4 package (to v4), and Prometheus client library.

To improve GitLab and GitHub CI, we added Kustomize-based deployment configurations with development-focused overlays for easier setup.

We've added ais-fs volume hostpath support in Kubernetes configurations and enhanced the container image build process with cached dependencies for faster builds and improved dependency management. Deployment scripts have been streamlined for various environments.

Commit Highlights

  • f30d327: Restore CI image tag to latest (after temporary revert).
  • e917cb0: Cache KinD dependencies in image.
  • da887f6: Include podman-docker install in CI image.
  • 00dfd23: Refactor rules for k8s jobs.
  • d10f6ea: Transition to rules + refactoring.
  • 04c6fe1: Run csp tests via labels in MR.
  • 645e8ee: Upgrade OSS packages.
  • bf64360: Upgrade all OSS packages except AWS.
  • cce5f41: Upgrade aws-sdk-go-v2 with disabled checksum validation warning.
  • 0949d46: Transition to Go 1.24.

Miscellaneous Improvements

Some of the other notable improvements include:

  • Optimized multi-object copying and transforming, new default content checksum, micro-optimized mountpath traversal, and improved logic to handle memory pressure.
  • New build tag (stdlibwalk) to switch between standard library's filepath.WalkDir and the default implementation.
  • Fast URL parsing (Go API).
  • Optimized buffer allocation for multi-object operations and ETL.

Batch jobs now support a num-workers parameter that allows users to further control (via API or CLI) parallelism when running IO and network-intensive workflows. Worker count is dynamically scaled based on system load and is never less than the number of mountpaths (disks).
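
The lower bound on the worker count can be expressed as a simple clamp. This sketch is an assumption throughout: the function name, the load_factor input, and the linear scaling are invented for illustration, and only the "never less than the number of mountpaths" rule comes from the text above.

```python
def resolve_num_workers(requested, num_mountpaths, load_factor=1.0):
    """Scale the requested worker count by load, floored at the mountpath count.

    load_factor is a hypothetical value in [0, 1]: 1.0 means the node is idle
    and 0.0 means it is fully loaded.
    """
    if not 0.0 <= load_factor <= 1.0:
        raise ValueError("load_factor must be within [0, 1]")
    scaled = int(requested * load_factor)
    return max(num_mountpaths, scaled)
```

Under these assumptions, a request for 16 workers on a node with 4 mountpaths yields 16 when idle, 8 at half load, and 4 under heavy load.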

The release also adds support for Unicode and special characters in object names. Error classification and management have been improved with a new err-rate-limit type for better handling of rate-limited operations, and network failures are now handled with better retry logic.

Code quality has been generally improved:

  • Transitioned to golangci-lint v2
  • Enabled additional linters
  • Fixed spelling throughout the entire codebase and documentation
  • Amended and improved numerous docs, including the main readme and overview
  • Refactored and micro-optimized numerous components

Commit Highlights

  • 12ea066: Replace FS Walk with WalkDir.
  • 0b3b34a: Add 'num-workers' parallelism to copy/transform operations.
  • 9749b4d: Improve memory pressure handling; increase size-to-GC (tunable and configurable).
  • 7026201: Go API: fast URL parsing; cache.
  • 2e38d81: Fix spelling across the board.
  • a2f0dd2: FS Walk: inline visiting callback (micro-optimize).
  • ad69848: Pkg 'keepalive': add common initialization, reduce code, refactor.
  • a250bfd: Micro-optimize prefetch; simplify cold-get; add stats-updater (i/f); align fields.
  • 5cf0b07: OCI backend: amend error handling (assorted status codes; formatting).