-
Notifications
You must be signed in to change notification settings - Fork 952
Move drop_duplicates
, drop_na
, _gather
, take
to IndexFrame and create their _base_index
counterparts
#9807
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move drop_duplicates
, drop_na
, _gather
, take
to IndexFrame and create their _base_index
counterparts
#9807
Conversation
…dropna_list_interface
Co-authored-by: Bradley Dice <bdice@bradleydice.com>
…ropna_list_interface
Co-authored-by: Bradley Dice <bdice@bradleydice.com>
Codecov Report
@@ Coverage Diff @@
## branch-22.02 #9807 +/- ##
================================================
- Coverage 10.49% 10.40% -0.09%
================================================
Files 119 119
Lines 20305 20556 +251
================================================
+ Hits 2130 2139 +9
- Misses 18175 18417 +242
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't check the code in IndexedFrame
too carefully
Co-authored-by: Vyas Ramasubramani <vyas.ramasubramani@gmail.com>
…eate _copy_typemetadata for MultiIndex
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few small changes left.
Co-authored-by: Vyas Ramasubramani <vyas.ramasubramani@gmail.com>
…ent/remove_keep_index
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
connected offline with @isVoid and chatted about this a bit. This LGTM.
…ent/remove_keep_index
…ent/remove_keep_index
…ent/remove_keep_index
…ent/remove_keep_index
…ent/remove_keep_index
…f into improvement/remove_keep_index
@vyasr @brandon-b-miller 2 changes since your approval:
|
Gained approval from both reviewers offline. Rerun tests. |
rerun tests |
1 similar comment
rerun tests |
@gpucibot merge |
This PR is a follow up of #9558 (Part 1 of 3)
One remaining problem from #9558 is that
Frame
is index agnostic, however the above functions needs to perform index-aware operations when building the list of columns to pass to libcudf. For example, to remove duplicates ofBaseIndex
, it should only construct the list with all its columns. But in a dataframe, it would need to pass in all data columns plus the index columns, while specifying the indices of the data columns to consider duplicates. This complicates for_gather
which supportskeep_index
argument. This PR moves aforementioned functions toIndexedFrames
, and create its counterparts in_base_index
.A couple noteworthy changes:
l(r)hs_is_index
keep_index
argument is removed. For internal usage it's more advised to use_gather
. (And thus this PR is labeled breaking)