8000 Fix Series construction from numpy array with non-native byte order by mroeschke · Pull Request #18151 · rapidsai/cudf · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Fix Series construction from numpy array with non-native byte order #18151

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Mar 4, 2025

Conversation

mroeschke
Copy link
Contributor

Description

closes #18149

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@mroeschke mroeschke added bug Something isn't working Python Affects Python cuDF API. non-breaking Non-breaking change labels Mar 3, 2025
@mroeschke mroeschke self-assigned this Mar 3, 2025
@mroeschke mroeschke requested a review from a team as a code owner March 3, 2025 23:57
@mroeschke mroeschke requested review from vyasr and Matt711 March 3, 2025 23:57
@@ -2719,8 +2719,16 @@ def as_column(
return as_column(arbitrary, dtype=dtype, nan_as_null=nan_as_null)
elif arbitrary.dtype.kind in "biuf":
from_pandas = nan_as_null is None or nan_as_null
try:
pa_array = pa.array(arbitrary, from_pandas=from_pandas)
except pa.ArrowNotImplementedError:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we check if an array is byte swapped? Instead of relying on generic not implemented error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, switch to check if the byte order is non native.

Copy link
Member
@jcrist jcrist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for starting on this so quickly! Left two small comments, feel free to ignore.

Note that this doesn't fully resolve #18149 since cupy can also have byteswapped arrays and those still error (though in a different location, as documented in the original issue).

@@ -2719,8 +2731,15 @@ def as_column(
return as_column(arbitrary, dtype=dtype, nan_as_null=nan_as_null)
elif arbitrary.dtype.kind in "biuf":
from_pandas = nan_as_null is None or nan_as_null
return as_column(
if not is_np_native_byteorder(arbitrary.dtype):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just use dtype.isnative here I believe, no need for a helper function: https://numpy.org/doc/2.2/reference/generated/numpy.dtype.isnative.html

Suggested change
if not is_np_native_byteorder(arbitrary.dtype):
if not arbitrary.dtype.isnative:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah great, didn't know about this API. Thanks!

return as_column(
if not is_np_native_byteorder(arbitrary.dtype):
# Not supported by pyarrow
arbitrary = arbitrary.astype(arbitrary.dtype.newbyteorder())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's not native byteorder then swapping byteorder will get you native byteorder - I'd find specifying native byteorder explicitly would be a bit more explicit though:

Suggested change
arbitrary = arbitrary.astype(arbitrary.dtype.newbyteorder())
arbitrary = arbitrary.astype(arbitrary.dtype.newbyteorder("="))

@galipremsagar
Copy link
Contributor

/merge

@rapids-bot rapids-bot bot merged commit 8ff194d into rapidsai:branch-25.04 Mar 4, 2025
106 checks passed
@jcrist
Copy link
Member
jcrist commented Mar 4, 2025

Can someone reopen #18149 (I lack permissions to reopen it)? This PR didn't fully resolve the issue (it fixed it for numpy, but not cupy).

@Matt711
Copy link
Contributor
Matt711 commented Mar 4, 2025

Can someone reopen #18149 (I lack permissions to reopen it)? This PR didn't fully resolve the issue (it fixed it for numpy, but not cupy).

Done.

@mroeschke mroeschke deleted the bug/np/byteswap branch March 4, 2025 19:49
@mroeschke
Copy link
Contributor Author

Oops sorry, followed up with #18164

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

[BUG] Cannot create cudf.Series from array with non-native byteorder
4 participants
0