Releases: dask/dask
2025.5.1
2025.5.0
Changes
- Speed up slicing graph generation @fjetter (#11945)
- Fixed Array.setitem when both the array and the indexer have unknown shape @TomAugspurger (#11943)
- Optimize dask order for worst case of get_target @fjetter (#11935)
- Raise on local executor if tasks are missing dependency @fjetter (#11944)
- Fix `to_dask_array` for single partition @jrbourbeau (#11931)
- Ensure parquet plan is fully cached during optimization @fjetter (#11933)
- Better documentation for expression system @fjetter (#11915)
- Simplify (and speed up) culling @fjetter (#11899)
- Update pre-commit @fjetter (#11926)
- Map_partitions again accepts delayed objects @fjetter (#11907)
- Fix delayed parsing for futures @fjetter (#11917)
- Don't run post `setup-miniconda` step in CI @jrbourbeau (#11925)
- Try to pin pip for readthedocs @fjetter (#11923)
- Fix windows CI @fjetter (#11919)
See the Changelog for more information.
2025.4.1
Changes
- Ensure only HLGs are prohibited from reuse @fjetter (#11906)
- Ensure xarray objects can continue sharing dependencies @fjetter (#11904)
- Ensure culling changes layer names @fjetter (#11903)
- Ensure FusedIO does not break Blockwise alignment Assign @fjetter (#11898)
- Implement ufuncs and gufunc for array-expr @phofl (#11818)
- Implement map_overlap for array-expr @phofl (#11822)
See the Changelog for more information.
2025.4.0
Changes
- Ensure Future value is in da.from_delayed task graph @TomAugspurger (#11896)
- Fix annotations passed to delayed @fjetter (#11893)
- migrate delayed unpack_collections @fjetter (#11881)
- Remove Pub/Sub references from docs @jrbourbeau (#11891)
- Ensure only classes without custom init are singletons @fjetter (#11886)
- Remove custom initializers for delayed expressions @fjetter (#11888)
- Fix persisting multiple DFs at the same time @fjetter (#11887)
- Avoid always parsing list inputs to `DataFrame.isin` as object type numpy arrays @mroeschke (#11869)
- Unskip pandas-dev cov/corr tests @TomAugspurger (#11873)
- Hlg blockwise fix @fjetter (#11871)
- Ensure annotations for HLG objects are properly generated @fjetter (#11866)
- Factor out singleton logic from base Expr class @fjetter (#11868)
- Ensure HLGs are using dependencies properly in optimization @fjetter (#11859)
- Ensure dictionaries tokenize deterministically @fjetter (#11867)
- Ensure default dask scheduler only computes what's needed @fjetter (#11861)
- Faster tokenization of pd.RangeIndex @fjetter (#11863)
- Update link to Quansight in community doc @pavithraes (#11860)
- Relax tolerance in autocorr test @TomAugspurger (#11857)
- Use `map_blocks` in `array.store` to avoid materialization and dropping of annotations @fjetter (#11844)
- Ensure repartition does not trigger memory size computation during lowering (i.e. on the scheduler) @fjetter (#11855)
- Support args and kwargs for rolling aggregations @fjetter (#11856)
- Remove nightly `h5py` from `upstream` CI job @jrbourbeau (#11847)
- Ensure HLGExpr tokenize uniquely @fjetter (#11849)
- Do not inject median in describe for pandas 3 @fjetter (#11846)
- Fixed `Expr.__setattr__` for subclasses @TomAugspurger (#11845)
- Wrap HLGs in an Expr to avoid Client side materialization @fjetter (#11736)
See the Changelog for more information.
2025.3.0
Changes
- Fix dataset info cache assignment @fjetter (#11840)
- Expr setattr @fjetter (#11836)
- Follow up to expression tokenization caching @fjetter (#11837)
- Consolidate getattr for expr classes @fjetter (#11835)
- Reduce pickle size of ReadParquet expression @fjetter (#11797)
- arange loses precision on ~2**63 @crusaderky (#11801)
- Remove numbagg from upstream build @phofl (#11821)
- Dispatch to numbagg for nanmedian and nanquantile @phofl (#11817)
- Make missing meta warning more ergonomic @phofl (#11814)
- Remove name doc from from_pandas @phofl (#11812)
- Implement an Array Scalar @phofl (#11810)
- Added `to_orc` to DataFrame API @TomAugspurger (#11807)
- Implement reverse indexing for DataFrames @phofl (#11803)
- Add lazy `to_pandas_dispatch` registration for cudf @rjzamora (#11799)
- Fix missing imports in array-expr @fjetter (#11796)
- Cache tokens on expressions and restore after pickle roundtrip @fjetter (#11791)
- Use random dashboard ports for LocalCluster in distributed tests @fjetter (#11795)
- Implement slicing for array-expr @phofl (#11783)
- Never use an asynchronous Client when calling top level compute function @fjetter (#11790)
- Refactor import tests @fjetter (#11794)
- Migrate base.unpack_collections to Task class @fjetter (#11793)
- Ensure map_blocks generates unique tokens @fjetter (#11792)
- Speed up normalize_pickle by 50 percent @fjetter (#11788)
- Fix divisions calculation with duplicates @phofl (#11787)
- Fix assign align for duplicated divisions @phofl (#11786)
- Ensure concat optimize project does not raise @fjetter (#11784)
- Add array-expr from_array @phofl (#11772)
- Keep chunksizes consistent in `apply_gufunc` @phofl (#11683)
- Test `dask.dataframe.__all__` @flying-sheep (#11782)
- Add __all__ to `dask.bag` @flying-sheep (#11781)
- add test for `dask.array.__all__` @flying-sheep (#11780)
- Bump JamesIves/github-pages-deploy-action from 4.7.2 to 4.7.3 @dependabot[bot] (#11777)
- Export dask.array members @flying-sheep (#11779)
- Fix sorted_divisions_locations with duplicates @TomAugspurger (#11773)
- Small typo in best-practices.rst @SCORE1387 (#11775)
- Allow unknown chunks in blockwise adjust_chunks @lgray (#11769)
- Fix crash in `asarray(..., like=...)` vs. scipy.sparse objects @crusaderky (#11755)
- Remove flaky optional dependency @TomAugspurger (#11771)
- Add support for scipy sparray @flying-sheep (#11750)
- Added `flaky` to tests extra @TomAugspurger (#11770)
- Ensure divisions are plain scalars @TomAugspurger (#11767)
- Remove divisions code duplication @fjetter (#11764)
- Ensure divisions not diverging from npartitions in Merge @fjetter (#11762)
- skip test_visualize_int_overflow on windows @fjetter (#11761)
- Reduce pickle size for tasks @fjetter (#11687)
- Implement unify_chunks and Rechunk @phofl (#11692)
- Fix expression getitem to avoid alignment @phofl (#11760)
- `arange(..., like=x)` embeds the graph of x @crusaderky (#11754)
- Simplify assert_divisions @fjetter (#11745)
- Fix Projection logic for Series objects @phofl (#11747)
- Remove bytes as keys @fjetter (#11757)
- Ensure `map_partitions` returns Series object if function returns scalar @fjetter (#11756)
- Don't upload env twice @phofl (#11748)
See the Changelog for more information.
2025.2.0
Changes
- Add big array example @jrbourbeau (#11744)
- Fix exploding chunksizes in pad for constant padding @phofl (#11743)
- Move optimize method to base class @fjetter (#11742)
- Add changelog entry for fixed deadlock @hendrikmakait (#11741)
- Fix graph creation in dask-expr to_delayed @phofl (#11739)
- Remove culling from delayed optimisation @phofl (#11737)
- Compute meta for from_map on the cluster @phofl (#11738)
- Bugs in `__setitem__` with dask bool mask @crusaderky (#11728)
- Implement infrastructure, random, blockwise and Elemwise @phofl (#11689)
- array/asarray with both like= and dtype= @crusaderky (#11733)
- Fix annotations warnings test @phofl (#11734)
- Catch warnings when writing to remote storage with to_parquet @phofl (#11731)
- Remove LocalCluster from tests @phofl (#11729)
- Fix partition pruning when using from_array @phofl (#11725)
- Fix concatenation with mixed dtype columns @phofl (#11727)
- `arange`: fix extreme values @crusaderky (#11707)
- Graph corruption on scalar getitem->setitem @crusaderky (#11723)
- array: Never share buffers after compute() @crusaderky (#11697)
- Extract Dask Array from xarray DataArray in from_array @phofl (#11712)
- `arange`: support kwargs @crusaderky (#11710)
- Ensure `normalize_token` is threadsafe @fjetter (#11709)
- Expand advice for instance types and processes @fjetter (#11705)
- Drop legacy timeseries implementation @fjetter (#11704)
- Update Dask Cloud Provider documentation to include Nebius as a supported cloud option @SalikovAlex (#11703)
- Fix normalize_chunks when squashing into a single chunk @phofl (#11702)
- Fix positional indexing with newaxis @phofl (#11699)
- Set array backend in scipy-sparse-indexing @TomAugspurger (#11700)
- Fix value_counts shuffling strategy @phofl (#11698)
- Disentangle core expression class from dataframe specific code @phofl (#11688)
- Bump conda-incubator/setup-miniconda from 3.1.0 to 3.1.1 @dependabot[bot] (#11685)
- Fixup dataframe conversion from array methods @phofl (#11684)
- Remove remaining artifacts of fastparquet @phofl (#11682)
- Updated docs on local file system @TomAugspurger (#11677)
- Expose TaskSpec objects for downstream projects @phofl (#11675)
- Rename optimize_slices function @phofl (#11673)
- Fixup changelog entry @phofl (#11674)
- Add changelog for dataframe removal @phofl (#11654)
- Pass read_only properly to zarr stores @phofl (#11668)
- Revert "Revert "Add `scikit-image` nightly back to upstream CI"" @phofl (#11667)
- Avoid Dict in fused tasks @hendrikmakait (#11657)
- Fix filtering on parquet file containing a struct column @rjzamora (#11665)
- Fix merge asof simplify after lowering @phofl (#11658)
- Add redirects for groupby docs after dask-expr merge @phofl (#11661)
- Rename remote store to FsspecStore @phofl (#11660)
- Reintroduce slice fusion @phofl (#11638)
- Add `__all__` to init @phofl (#11664)
- Expose downstream utilities for dask.dataframe @phofl (#11662)
- Remove IO wrapper functions @phofl (#11649)
- Reroute source of docs for dataframe methods @phofl (#11645)
- Fix projection when columns are numpy scalars @rjzamora (#11656)
- Let vindex accept a Dask Array indexer under certain conditions @phofl (#11635)
- Simplify `_execute_subgraph` @hendrikmakait (#11655)
- Rename data_producer and add flag to dataframe io stuff @phofl (#11653)
- Remove subgraph callable @fjetter (#11575)
- Add cached version for `normalize_chunks` @phofl (#11650)
- Fixed mypy config @TomAugspurger (#11651)
- Fixup pickle size test @phofl (#11647)
- Remove unnecessary compat code @phofl (#11644)
- Remove pyarrow installation by default in imports check @phofl (#11646)
- Add data-producer-task property to replace rootish detection mechanism @phofl (#11558)
- Add cupy support for indexed assignment @rjzamora (#11421)
- Merge dask-expr repository into dask @phofl (#11623)
- Avoid rechunking 1D-arrays in cumreduction @Illviljan (#11446)
- Ensure that alias key is not a TaskRef object @phofl (#11639)
- Avoid Tuple in Dict @hendrikmakait (#11634)
- Migrate vindex to TaskSpec @phofl (#11633)
- Reduce graph size for vindex @phofl (#11632)
- Fix Array binary operator priority delegation @j2bbayle (#11611)
- Fix auto-rechunking in einsum @dcherian (#11628)
- Optimize vindex @dcherian (#11625)
- Fix example Actors @isidroas (#11624)
- Avoid concatenate3 in array slicing @hendrikmakait (#11631)
- Avoid `concatenate3` in overlap and rechunking graphs @hendrikmakait (#11621)
- Avoid using `concrete` in task graph @hendrikmakait (#11620)
- Avoid producing chunks of size 0 when using `dask.array.rechunk` with `chunks='auto'` @schlunma (#11622)
- Clean up tests after legacy removal @phofl (#11617)
- Remove legacy DataFrame implementation @phofl (#11606)
- Revert "Add `scikit-image` nightly back to upstream CI" @phofl (#11616)
- Fix increased memory usage when converting xarray to dataframe @phofl (#11609)
- Avoid overflowing when downcasting shuffle arrays @phofl (#11615)
See the Changelog for more information.
2024.12.1
Changes
- Fix map_overlap bug where rechunking and trim=False caused inconsistent chunkings @phofl (#11605)
- Avoid reference to bound method in NestedContainer @hendrikmakait (#11608)
- Avoid constructing `NestedContainer`s in case of trivial inputs @hendrikmakait (#11600)
- Avoid legacy implementation in read-csv @phofl (#11603)
- Remove legacy DataFrame import @phofl (#11604)
- asarray ignores dtype for array inputs @crusaderky (#11586)
- Add back LLM chatbot to Dask docs @dchudz (#11594)
- Avoid creating trivial DataNodes in graph conversion @hendrikmakait (#11598)
- Don't wrap keys in `TaskRef` in `Alias` @hendrikmakait (#11597)
- Bump JamesIves/github-pages-deploy-action from 4.6.9 to 4.7.2 @dependabot (#11593)
- Migrate dask array creation routines to task spec @jrbourbeau (#11582)
- Migrate most of dask array random to task spec @jrbourbeau (#11581)
- Do not use local function in `array.push` @fjetter (#11576)
See the Changelog for more information.
2024.12.0
Changes
- Revert "Add LLM chatbot to Dask docs (#11556)" @dchudz (#11577)
- Automatically rechunk if array in to_zarr has irregular chunks @phofl (#11553)
- Blockwise uses `Task` class @fjetter (#11568)
- Migrate rechunk and reshape to task spec @phofl (#11555)
- Cache svg-representation for arrays @dcherian (#11560)
- Fix empty input for containers @fjetter (#11571)
- Convert `Bag` graphs to TaskSpec graphs during optimization @fjetter (#11569)
- add LLM chatbot to Dask docs @dchudz (#11556)
- Add support for Python 3.13 @phofl (#11456)
- Fuse data nodes in linear fusion too @phofl (#11549)
- Migrate slicing code to task spec @phofl (#11548)
- Speed up ArraySliceDep tokenization @phofl (#11551)
- Fix fusing of p2p barrier tasks @phofl (#11543)
- Remove infra/mentions of GPU CI @charlesbluca (#11546)
- Temporarily disable gpuCI update CI job @jrbourbeau (#11545)
- Use BlockwiseDep to implement map_blocks keywords @phofl (#11542)
- Remove optimize_slices @phofl (#11538)
- Make reshape_blockwise a noop if shape is the same @phofl (#11541)
- Remove read-only flag from open_array in open_zarr @phofl (#11539)
- Implement linear_fusion for task spec class @phofl (#11525)
- Remove recursion from TaskSpec @fjetter (#11477)
- Fixup test after dask-expr change @phofl (#11536)
- Bump codecov/codecov-action from 3 to 5 @dependabot (#11532)
- Create dask-expr frame directly without roundtripping @phofl (#11529)
- Add `scikit-image` nightly back to upstream CI @jrbourbeau (#11530)
- Remove `from_dask_dataframe` import @phofl (#11528)
- Ensure that from_array creates a copy @phofl (#11524)
- Simplify and improve performance of normalize chunks @phofl (#11521)
- Fix flaky nanquantile test @phofl (#11518)
- Fix tests for new `read_only` kwarg in `zarr=3` @phofl (#11516)
See the Changelog for more information.
2024.11.2
Changes
- Remove only_refs parsing option for TaskSpec @fjetter (#11511)
- Fix upstream ci pandas Series repr error @phofl (#11514)
- Implement `nanpercentile` for dask arrays @phofl (#11505)
- Bump JamesIves/github-pages-deploy-action from 4.6.8 to 4.6.9 @dependabot (#11512)
- Add fuse method for TaskSpec @fjetter (#11509)
See the Changelog for more information.