Tags: modin-project/modin

0.33.2

Verified

This tag was signed with the committer’s verified signature.
sfc-gh-mvashishtha Mahesh Vashishtha
Modin 0.33.2

This patch release includes some bug fixes.

Key Features and Updates Since 0.33.1
-------------------------------------
* Stability and Bugfixes
  * FIX-#5961: Preserve dtypes when inserting column to empty frame. (#7601)
  * FIX-#7551: Fix name ambiguity for `value_counts()` on Pandas backend (#7585)
  * FIX-#7595: Log backend switching information with the modin logger. (#7597)
* Update testing suite
  * TEST-#7598: Allow xgboost to log to root. (#7599)
  * TEST-#7602: Fix test_pickle by correctly using fixtures. (#7603)
* Uncategorized improvements
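
Every entry in these notes follows the same visible pattern: a category prefix, the issue number, a title, and the pull request number (or, in a few entries, a short commit hash) in parentheses. The sketch below parses one such line with the standard library. The regular expression and the names `Entry` and `parse_entry` are illustrative assumptions, not part of Modin's tooling.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape of an entry, inferred from the lines in these notes:
#   "  * FIX-#5961: Some title. (#7601)"  or  "  * FIX-#7302: Pin numpy<2 (072453b)"
ENTRY = re.compile(
    r"^\s*\*\s*(?P<kind>[A-Z]+)-#(?P<issue>\d+):\s*(?P<title>.*?)\s*"
    r"\((?P<ref>#\d+|[0-9a-f]{7,40})\)\s*$"
)


class Entry(NamedTuple):
    kind: str   # e.g. FIX, FEAT, TEST, DOCS, PERF, REFACTOR
    issue: int  # issue number after "-#"
    title: str  # free-text title
    ref: str    # "#<PR number>" or a commit hash


def parse_entry(line: str) -> Optional[Entry]:
    """Return the parsed entry, or None for headings and other lines."""
    match = ENTRY.match(line)
    if match is None:
        return None
    return Entry(match["kind"], int(match["issue"]), match["title"], match["ref"])


entry = parse_entry(
    "  * FIX-#5961: Preserve dtypes when inserting column to empty frame. (#7601)"
)
heading = parse_entry("* Stability and Bugfixes")
```

Category headings such as `* Stability and Bugfixes` do not match the pattern, so `parse_entry` returns `None` for them.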

Contributors
------------
@sfc-gh-vrpatel
@sfc-gh-mvashishtha

0.33.1

Verified

This tag was signed with the committer’s verified signature.
sfc-gh-mvashishtha Mahesh Vashishtha
Modin 0.33.1

This patch release fixes a regression introduced in Modin 0.33.0.

Key Features and Updates Since 0.33.0
-------------------------------------
* Stability and Bugfixes
  * FIX-#7582: Add copy parameter to __array__ methods. (#7584)

Contributors
------------
@sfc-gh-mvashishtha

0.33.0

Verified

This tag was signed with the committer’s verified signature.
sfc-gh-mvashishtha Mahesh Vashishtha
Modin 0.33.0

This release introduces a set of features for switching Modin execution between
multiple backends (e.g. Ray and local Pandas) manually or automatically. It also
includes several bug fixes.
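
One way to select a backend manually is through an environment variable set before Modin is imported (see FIX-#7461 and FEAT-#7455 below). The following is a minimal sketch under stated assumptions: the variable name `MODIN_BACKEND` is inferred from Modin's `MODIN_*` naming convention, the list of backend names is a guess based on the engines mentioned in these notes, and `select_backend` is a hypothetical helper.

```python
import os

# Assumption: backend names; these notes only mention Ray, Dask, Unidist
# and local pandas, so treat this tuple as illustrative.
KNOWN_BACKENDS = ("Ray", "Dask", "Unidist", "Pandas")


def select_backend(name: str) -> None:
    """Hypothetical helper: validate a backend name and export it."""
    if name not in KNOWN_BACKENDS:
        choices = ", ".join(KNOWN_BACKENDS)
        raise ValueError(f"Unknown backend {name!r}; choices: {choices}")
    # Assumption: Modin reads MODIN_BACKEND when it is first imported.
    os.environ["MODIN_BACKEND"] = name


select_backend("Pandas")
# import modin.pandas as pd  # import only after the variable is set
```

Listing the valid choices in the error message mirrors the intent of FIX-#7532 below, but the message format here is not Modin's.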

Key Features and Updates Since 0.32.0
-------------------------------------
* Stability and Bugfixes
  * FIX-#7327: Use sort parameter of DataFrame.stack (#7396)
  * FIX-#7346: Handle execution on Dask workers to avoid creating conflicting clients (#7347)
  * FIX-#7375: Fix Series.duplicated dropping name (#7395)
  * FIX-#7381: Fix Series binary operators ignoring fill_value (#7394)
  * FIX-#7383: Avoid broadcast issue in partition manager with custom NPartitions (#7399)
  * FIX-#7404: Implement interchange protocol for datetime columns (#7434)
  * FIX-#7405: Internally sort indices for loc/iloc set (#7440)
  * FIX-#7413: Always use positional index before computing argmin/argmax (#7463)
  * FIX-#7461: Set backend correctly with environment variables. (#7462)
  * FIX-#7465: Properly implement Series.rename_axis (#7466)
  * FIX-#7486: Add support for `.astype(pandas.CategoricalDtype(…))` (#7487)
  * FIX-#7490: Exclude move_to and _update_inplace from casting. (#7491)
  * FIX-#7495: Separate extensions for aliases. (#7496)
  * FIX-#7521: Fix wrong extension being used when backend is pinned (#7546)
  * FIX-#7528: Dispatch module-level extensions to the correct backend (#7529)
  * FIX-#7532: Display choices in error message of environment vars (#7533)
  * FIX-#7536: setuptools / ray version conflict in pkg_resources._vendor (#7537)
  * FIX-#7538: set_backend should exit early if there is nothing to do (#7539)
  * FIX-#7547: native qc move_to_me_cost does not work with non-subclasses (#7548)
  * FIX-#7553: Fix groupby when AutoSwitchBackend is disabled. (#7554)
  * FIX-#7555: Get the correct extension when AutoSwitchBackend is False. (#7556)
  * FIX-#7559: Create the dummy query compiler just once per backend. (#7560)
  * FIX-#7562: Raise AttributeError for missing extension properties. (#7563)
  * FIX-#7569: Fix handling of pyarrow dtype and empty dataframes (#7570)
  * FIX-#7576: Fix ambiguous AttributeError message (#7577)
  * FIX-#7578: Change groupby extension allow list and fix cached_property extensions (#7579)
* Performance enhancements
  * PERF-#7397: Avoid materializing index/columns in shape checks (#7398)
* Refactor Codebase
  * REFACTOR-#7315: Refactor axis checks in squeeze (#7400)
  * REFACTOR-#7418: Rename internal interchange protocol methods. (#7422)
  * REFACTOR-#7427: Require query compilers to expose engine and storage format. (#7430)
  * REFACTOR-#7470: Combine backend casting and extension code at the API layer. (#7485)
  * REFACTOR-#7493: Improve the clarity of the costing functions (#7494)
  * REFACTOR-#7527: Add more costing logic to the base query compiler. (#7530)
  * REFACTOR-#7534: Provide internal, overridable method for max_shape (#7535)
  * REFACTOR-#7564: Fix docstrings for transfer thresholds. (#7565)
* Update testing suite
  * TEST-#7419: Fix a few errors in CI (#7420)
  * TEST-#7421: Fix unidist with APT-installed MPI (#7423)
  * TEST-#7431: Fix formatting for isort 6 and black 25 (#7432)
  * TEST-#7437: Check execution-filter outputs correctly in CI. (#7438)
  * TEST-#7441: Correctly skip sanity tests if we don't need them. (#7442)
  * TEST-#7457: Fix SSL certificate error in notebooks by using http. (#7458)
  * TEST-#7497: Skip tests requiring lxml on windows. (#7500)
  * TEST-#7571: xfail test_read_csv_s3_issue4658 due to missing s3 bucket (#7572)
* Documentation improvements
  * DOCS-#7566: Add pandas on snowflake + backend pinning to documentation page (#7567)
* New Features
  * FEAT-#7433: Replace NativeDataFrameMode with a complete "native" execution. (#7436)
  * FEAT-#7445: Add metrics interface so third-parties can collect metrics from the modin frontend (#7444)
  * FEAT-#7448: Allow QueryCompilerCaster to apply cost-optimization on automatic casting (#7464)
  * FEAT-#7455: Add Backend config variable as an alias for execution. (#7456)
  * FEAT-#7459: Add methods to get and set backend. (#7460)
  * FEAT-#7468: Add progress bar for engine switch (#7469)
  * FEAT-#7472: Add an option to register dataframe and series accessors with a particular backend. (#7473)
  * FEAT-#7474: Register general functions with a particular backend. (#7489)
  * FEAT-#7475: Choose the correct __init__ method from extensions and apply casting to __init__. (#7488)
  * FEAT-#7477: Move the query compiler calculator so it can be used in more places (#7478)
  * FEAT-#7480: Implement max_cost interface (#7481)
  * FEAT-#7482: Add "from_qc" API to QueryCompiler and BackendCostCalculator to handle asymmetric information scenarios (#7483)
  * FEAT-#7492: Allow I/O function accessors. (#7502)
  * FEAT-#7505: Support post-operation automatic backend switch. (#7506)
  * FEAT-#7507: Support pre-operation automatic backend switch. (#7512)
  * FEAT-#7509: Add AutoSwitchBackend configuration variable (#7510)
  * FEAT-#7511: Support pre-operation switch for init by passing arguments to cost functions. (#7531)
  * FEAT-#7521: Support pinning objects to a backend (#7522)
  * FEAT-#7523: Improve formal definition of the automatic switching algorithm (#7524)
  * FEAT-#7540: Ability to configure NativeQueryCompiler AutoSwitch Settings (#7561)
  * FEAT-#7542: Support post-operation backend switch for groupby. (#7545)
  * FEAT-#7543: Let plugins register groupby accessors. (#7575)
  * FEAT-#7549: Emit metrics on auto-switch and casting behavior (#7550)
  * FEAT-#7557: Add operation and size information to backend switch progress (#7558)
  * FEAT-#7573: Dispatch __array_ufunc__ to query compilers (#7574)
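
Several entries above (FEAT-#7448, FEAT-#7480, FEAT-#7482, FIX-#7547) refer to cost functions that guide automatic backend switching. These notes do not describe the algorithm, so the sketch below is only a toy model of the general idea: estimate, for each candidate backend, the cost of moving the data there plus the cost of running the operation, and pick the cheapest. All names (`BackendProfile`, `total_cost`, `choose_backend`) and all numbers are hypothetical and are not Modin's API or its actual costing logic.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BackendProfile:
    """Hypothetical per-backend cost parameters (arbitrary units)."""
    name: str
    move_in_cost_per_row: float  # cost to transfer one row to this backend
    run_cost_per_row: float      # cost to process one row on this backend
    fixed_overhead: float        # startup/scheduling cost per operation


def total_cost(profile: BackendProfile, rows: int, current: str) -> float:
    # Data already on the backend does not need to be moved.
    move = 0.0 if profile.name == current else profile.move_in_cost_per_row * rows
    return move + profile.fixed_overhead + profile.run_cost_per_row * rows


def choose_backend(profiles: Sequence[BackendProfile], rows: int, current: str) -> str:
    """Return the name of the backend with the lowest estimated cost."""
    return min(profiles, key=lambda p: total_cost(p, rows, current)).name


profiles = [
    BackendProfile("local", move_in_cost_per_row=0.001, run_cost_per_row=1.0, fixed_overhead=0.0),
    BackendProfile("distributed", move_in_cost_per_row=0.01, run_cost_per_row=0.1, fixed_overhead=1000.0),
]

small = choose_backend(profiles, rows=100, current="local")
large = choose_backend(profiles, rows=1_000_000, current="local")
```

With these made-up numbers, small data stays on the local backend because the fixed overhead dominates, while large data is moved because the per-row savings outweigh the transfer cost.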

Contributors
------------
@CRiddler
@YarShev
@anmyachev
@data-makerman
@devin-petersohn
@emmanuel-ferdman
@mpeleshenko
@noloerino
@sfc-gh-dpetersohn
@sfc-gh-jkew
@sfc-gh-joshi
@sfc-gh-mvashishtha

0.32.0

Verified

This tag was signed with the committer’s verified signature.
anmyachev Anatoly Myachev
Modin 0.32.0

This release introduces support for the Polars API, a new query compiler for small data,
more functions that can use dynamic partitioning, and several bug fixes.

Key Features and Updates Since 0.31.0
-------------------------------------
* Stability and Bugfixes
  * FIX-#0000: Fix type hint (#7343)
  * FIX-#7113: Fix docstring overrides for subclasses. (#7354)
  * FIX-#7134: Use a separate docstring class for BasePandasDataset. (#7353)
  * FIX-#7329: Do not sort columns on df.update (#7330)
  * FIX-#7351: Add ipython method calls to non-lookup list (#7352)
  * FIX-#7355: Cpu count would be set incorrectly on a cluster (#7356)
  * FIX-#7357: Fix `NoAttributeError` on `DataFrame.copy` (#7358)
  * FIX-#7371: Fix inserting datelike values into a DataFrame (#7372)
  * FIX-#7373: Try a previous version of `motoserver/moto` service, pin to 5.0.13 (#7374)
  * FIX-#7379: Fix __imul__ performing addition instead of multiplication (#7380)
  * FIX-#7387: Limit the number of pytest workers for tests with Ray engine on Windows (#7388)
  * FIX-#7389: Fix uploading artifacts (#7390)
* Refactor Codebase
  * REFACTOR-#0000: Update copyright date (#7333)
* Documentation improvements
  * DOCS-#0000: Update RunLLM Ask AI widget script path (#7345)
  * DOCS-#7335: Fix broken links in Modin Usage Examples page (#7336)
  * DOCS-#7382: Add documentation on how to use Modin Native query compiler (#7386)
* New Features
  * FEAT-#4605: Add native query compiler (#7259)
  * FEAT-#7308: Interoperability between query compilers (#7376)
  * FEAT-#7331: Initial Polars API (#7332)
  * FEAT-#7337: Using dynamic partitioning in `broadcast_apply` (#7338)
  * FEAT-#7340: Add more granular lazy flags to query compiler (#7348)
  * FEAT-#7368: Add a new environment variable for using dynamic partitioning (#7369)
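
FIX-#7379 above concerns an in-place operator that dispatched to the wrong arithmetic operation. The toy class below illustrates that class of bug and a simple regression check for it. `Scaled` is a hypothetical example and is unrelated to Modin's actual implementation.

```python
class Scaled:
    """Toy container with in-place arithmetic operators."""

    def __init__(self, value: float) -> None:
        self.value = value

    def __iadd__(self, other: float) -> "Scaled":
        self.value += other
        return self

    def __imul__(self, other: float) -> "Scaled":
        # The bug class in question: writing `self.value += other` here
        # (for example after copying __iadd__) makes `x *= n` add instead
        # of multiply. The correct operation is multiplication.
        self.value *= other
        return self


x = Scaled(3)
x *= 4   # calls Scaled.__imul__
y = Scaled(3)
y += 4   # calls Scaled.__iadd__
```

Choosing operands for which the two operations give different results (3 and 4 here) is what lets a test distinguish them; a check with 2 and 2 would pass even with the bug.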

Contributors
------------
@MortalHappiness
@Retribution98
@YarShev
@ZhipengXue97
@anmyachev
@arunjose696
@devin-petersohn
@likawind
@sfc-gh-joshi
@sfc-gh-mvashishtha

0.31.0

Verified

This tag was signed with the committer’s verified signature.
anmyachev Anatoly Myachev
Modin 0.31.0

First release compatible with NumPy 2.0.

Key Features and Updates Since 0.30.0
-------------------------------------
* Stability and Bugfixes
  * FIX-#7138: Stop reloading modules for custom docstrings. (#7307)
  * FIX-#7263: Empty docstrings should not be inherited (#7264)
  * FIX-#7272: Remove HDK engine (#7275)
  * FIX-#7277: Remove Cudf storage format as unmaintained (#7290)
  * FIX-#7278: Make sure `enable_logging` decorator preserves type hints (#7279)
  * FIX-#7292: Prepare Modin code to NumPy 2.0 (#7293)
  * FIX-#7295: Unpin numexpr to allow versions >= 2.8.4 to match pandas (#7296)
  * FIX-#7309: Update versioneer with `versioneer install --vendor` (#7311)
  * FIX-#7320: Bump the github-actions group with 3 updates (#7319)
  * FIX-#7321: Using 'C' engine instead of 'pyarrow' for getting metadata in 'read_csv' (#7322)
* Performance enhancements
  * PERF-#7299: Avoid using `synchronize_labels` for `combine` function (#7300)
* Refactor Codebase
  * REFACTOR-#7271: Remove `instance_type` attribute of axis partitions (#7268)
  * REFACTOR-#7273: Remove deprecated functions from utils.py, accessor.py and io.py (#7274)
  * REFACTOR-#7285: Remove deprecated configs (#7286)
  * REFACTOR-#7294: Reduce access to `_modin_frame` methods from `_query_compiler` (#7297)
  * REFACTOR-#7313: Add similar methods as in #7294 for operating on columns (#7314)
* Update testing suite
  * TEST-#0000: Add a Dependabot config to auto-update GitHub action versions (#7318)
  * TEST-#7316: Run a subset of CI tests with python 3.10 and 3.11 on a scheduled basis (#7289)
* Documentation improvements
  * DOCS-#0000: Adds RunLLM widget to docs (#7326)
  * DOCS-#7287: Update Modin on Dask documentation (#7288)
* New Features
  * FEAT-#6574: UserWarning no longer displayed when Series/DataFrames are small (#7323)
  * FEAT-#7249: Add `reload_modin` feature (#7280)
  * FEAT-#7265: Automatic publication of Modin wheel to PyPI (#7262)
  * FEAT-#7283: Introduce MinRowPartitionSize and MinColumnPartitionSize (#7284)
  * FEAT-#7310: NumPy 2.0 support (#7312)
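
FIX-#7278 above is about a logging decorator that should not hide the type hints of the function it wraps. A common way to achieve this in Python is to combine `functools.wraps` with a `TypeVar` bound to `Callable`, as sketched below. The decorator `log_calls` and the `calls` list are hypothetical stand-ins and do not reproduce Modin's `enable_logging`.

```python
import functools
import inspect
from typing import Any, Callable, List, TypeVar, cast

F = TypeVar("F", bound=Callable[..., Any])

calls: List[str] = []  # records the names of decorated functions as they run


def log_calls(func: F) -> F:
    """Record each call while keeping the wrapped function's signature."""

    @functools.wraps(func)  # copies name, docstring, annotations, __wrapped__
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        calls.append(func.__name__)
        return func(*args, **kwargs)

    # Tell static type checkers that the decorated function has the same
    # type as the original one.
    return cast(F, wrapper)


@log_calls
def add(a: int, b: int) -> int:
    return a + b


result = add(1, 2)
signature = str(inspect.signature(add))
```

`inspect.signature` follows the `__wrapped__` attribute set by `functools.wraps`, so runtime introspection reports the original parameters, while the `F -> F` annotation covers static checkers.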

Contributors
------------
@Jayson729
@Retribution98
@YarShev
@anmyachev
@arunjose696
@kurtmckee
@sfc-gh-dpetersohn
@vsreekanti

0.30.1

Verified

This tag was signed with the committer’s verified signature.
anmyachev Anatoly Myachev
Modin 0.30.1

This release pins numpy<2.

Key Features and Updates Since 0.30.0
-------------------------------------
* Stability and Bugfixes
  * FIX-#7302: Pin numpy<2 (072453b)

Contributors
------------

@anmyachev

0.29.1

Verified

This tag was signed with the committer’s verified signature.
anmyachev Anatoly Myachev
Modin 0.29.1

This release pins numpy<2.

Key Features and Updates Since 0.29.0
-------------------------------------
* Stability and Bugfixes
  * FIX-#7302: Pin numpy<2 (072453b)
* New Features
  * FEAT-#7265: Automatic publication of Modin wheel to PyPI (#7262)

Contributors
------------

@anmyachev
@sfc-gh-dpetersohn

0.28.3

Verified

This tag was signed with the committer’s verified signature.
anmyachev Anatoly Myachev
Modin 0.28.3

This release pins numpy<2.

Key Features and Updates Since 0.28.2
-------------------------------------
* Stability and Bugfixes
  * FIX-#7302: Pin numpy<2 (072453b)
* New Features
  * FEAT-#7265: Automatic publication of Modin wheel to PyPI (#7262)

Contributors
------------

@anmyachev
@sfc-gh-dpetersohn

0.27.1

Verified

This tag was signed with the committer’s verified signature.
anmyachev Anatoly Myachev
Modin 0.27.1

This release pins numpy<2.

Key Features and Updates Since 0.27.0
-------------------------------------
* Stability and Bugfixes
  * FIX-#6968: Align API with pandas (#6969)
  * FIX-#7302: Pin numpy<2 (072453b)
* New Features
  * FEAT-#7265: Automatic publication of Modin wheel to PyPI (#7262)

Contributors
------------

@anmyachev
@dchigarev
@sfc-gh-dpetersohn

0.30.0

Verified

This tag was signed with the committer’s verified signature.
anmyachev Anatoly Myachev
Modin 0.30.0

This release introduces support for the DataFrame API standard, a distributed implementation of right merge/join,
a more efficient implementation of internal operators that gives a performance boost to almost all distributed Modin functions,
improved compatibility with pandas on the pyarrow backend, and type hints for the pandas API to improve UX.

Key Features and Updates Since 0.29.0
-------------------------------------
* Stability and Bugfixes
  * FIX-#0000: Fix badge in README.md (#7213)
  * FIX-#0000: Make merge tests more stable by sorting results (#7266)
  * FIX-#6967: Remove read_pickle_distributed/to_pickle_distributed functions as deprecated (#7258)
  * FIX-#7093: Make sure 'idxmax' and 'idxmin' can work with string columns (#7193)
  * FIX-#7102: Remove `enable_api_only` mode in modin logging (#7194)
  * FIX-#7103: Move lower-level functionality logging to debug (#7184)
  * FIX-#7143: Constructing a DataFrame from a Modin Series with tuple name should produce MultiIndex columns (#7214)
  * FIX-#7185: Add extra check for some config classes (#7189)
  * FIX-#7201: Update docs on how to enable Modin logs for high-level API and low-level API (#7209)
  * FIX-#7206: Make sure df.melt handles duplicate value_vars correctly (#7208)
  * FIX-#7219: Pin dataframe-api-compat>=0.2.7 (#7220)
  * FIX-#7221: Don't use 'use_legacy_dataset=False' for 'ParquetDataset' (#7222)
  * FIX-#7224: Importing modin.pandas.api.extensions overwrites re-export of pandas.api submodules (#7225)
  * FIX-#7233: Display property name in default_to_pandas error messages (#7269)
  * FIX-#7234: Deprecate HDK engine (#7235)
  * FIX-#7238: Fix docstring inheritance for `cached_property` and use it (#7239)
  * FIX-#7240: Allow `doc_checker.py` to work with `functools.cached_property` (#7241)
  * FIX-#7246: Pin pyarrow>=10.0.1 as pandas 2.2.* does (#7247)
  * FIX-#7248: Make sure '_validate_dtypes_sum_prod_mean' works correctly with datetime types (#7237)
  * FIX-#7250: Revert "PERF-#6666: Avoid internal reset_index for left merge" (#7251)
* Performance enhancements
  * PERF-#7227: Call 'modin_frame.combine()' for merge and join only when necessary (#7228)
  * PERF-#7230: Don't preserve bad partition for 'merge' (#7229)
* Refactor Codebase
  * REFACTOR-#7242: Add type hints for `modin/core/dataframe/algebra/` (#7243)
  * REFACTOR-#7260: Use `extract_dtype` internal function in more places (#7261)
* Update testing suite
  * TEST-#7049: Add some sanity tests with pyarrow-backed pandas dataframes (#7199)
  * TEST-#7191: Fix ASV after changing default branch (#7190)
* Documentation improvements
  * DOCS-#0000: Fix a typo with MODIN_CPUS number (#7198)
  * DOCS-#0000: Supplement Optimization Notes with a link to configs (#7197)
  * DOCS-#7217: Update docs as to when Modin operators work best (#7218)
  * DOCS-#7255: Update docs as to from_* functions (#7256)
* New Features
  * FEAT-#5394: Reduce amount of remote calls for Map operator (#7136)
  * FEAT-#5394: Reduce amount of remote calls for TreeReduce and GroupByReduce operators (#7245)
  * FEAT-#6492: Add `from_map` feature to create dataframe (#7215)
  * FEAT-#6498: Make Fold operator more flexible (#7257)
  * FEAT-#6808: Implement '__arrow_array__' for Series (#7200)
  * FEAT-#6890: Modin implementation of DataFrame API standard (#7216)
  * FEAT-#7139: Use ray-core instead of ray-default (#6955)
  * FEAT-#7187: Change "master" branch to "main" (#7188)
  * FEAT-#7202: Use custom resources for Ray (#7205)
  * FEAT-#7203: Make sure Modin works correctly with pandas, which uses pyarrow as a backend (#7204)
  * FEAT-#7207: Add the ability to assign a df to a columns selection without d2p (#7210)
  * FEAT-#7252: Add type hints for `base.py` (#7253)
  * FEAT-#7254: Support right merge/join (#7226)
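
The FEAT-#5394 entries above mention the Map and TreeReduce operators without describing them. As a rough illustration of the general pattern that the name suggests, the sketch below applies a map function to each partition and then combines the partial results pairwise until one value remains. It runs on plain Python lists; `tree_reduce` and the partition layout are hypothetical and do not reflect how Modin schedules remote calls.

```python
import operator
from functools import reduce
from typing import Any, Callable, List, Sequence


def tree_reduce(
    partitions: Sequence[Sequence[Any]],
    map_func: Callable[[Sequence[Any]], Any],
    reduce_func: Callable[[Any, Any], Any],
) -> Any:
    """Map each partition to a partial result, then merge partials pairwise."""
    if not partitions:
        raise ValueError("at least one partition is required")
    # Map phase: one partial result per partition.
    partials: List[Any] = [map_func(part) for part in partitions]
    # Reduce phase: merge neighbours level by level, like a binary tree.
    while len(partials) > 1:
        partials = [
            reduce(reduce_func, partials[i:i + 2])
            for i in range(0, len(partials), 2)
        ]
    return partials[0]


partitions = [[1, 2], [3, 4], [5]]
total = tree_reduce(partitions, sum, operator.add)
largest = tree_reduce(partitions, max, max)
```

In a distributed setting each map call and each pairwise merge would be a candidate for a remote task, which is why reducing the number of such calls can matter for performance.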

Contributors
------------
@Retribution98
@YarShev
@anmyachev
@arunjose696
@noloerino
@sfc-gh-jkew