Implement `lists::stable_sort_lists` for stable sorting of elements within each row of lists column #9425
`stable_segmented_sorted_order` and `stable_segmented_sort_by_key`
From looking at the tests, it looks like we only support stable-sorting of single-level lists (i.e., nested types aren't supported). Is that correct?
lgtm, just one small comment
The enum, the renamed common method, and the reordering of the parameters make this an easier read. Thanks, @ttnghia. LGTM.
Rerun tests.
Codecov Report
@@               Coverage Diff                @@
##           branch-21.12    #9425      +/-   ##
================================================
+ Coverage         10.79%   10.83%   +0.04%
================================================
  Files               116      117       +1
  Lines             18869    19442     +573
================================================
+ Hits               2036     2106      +70
- Misses            16833    17336     +503
Continue to review full report at Codecov.
@gpucibot merge
…ns (#9345)

This PR changes the interface of `lists::drop_list_duplicates` such that it may accept a second (optional) input `values` lists column, and returns a pair of lists columns containing the results of copying the input column without duplicate entries. If the optional `values` column is given, users are responsible for ensuring that the keys and values columns have the same number of entries in each row; otherwise, the results are undefined. When the key entries are copied, the corresponding value entries are copied at the same time. The `duplicate_keep_option` parameter, reused from stream compaction, specifies which of the duplicate keys will be copied.

This closes #9124, and is blocked by #9425.

Authors:
- Nghia Truong (https://github.com/ttnghia)

Approvers:
- Jake Hemstad (https://github.com/jrhemstad)
- https://github.com/nvdbaranec

URL: #9345
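To illustrate why the keys-values form of `lists::drop_list_duplicates` depends on a stable sort, here is a minimal host-side sketch for a single list row. It is not the libcudf implementation: `keep_sketch` and `drop_row_duplicates_sketch` are hypothetical names, the row is modeled as plain `std::vector`s, and the sorted output order is an artifact of the sketch rather than a documented guarantee. Because `std::stable_sort` preserves the input order of equal keys, the first and last entry of each run of equal keys correspond to the first and last occurrence in the original row.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for the keep option mentioned above.
enum class keep_sketch { KEEP_FIRST, KEEP_LAST };

// Hypothetical helper: removes duplicate keys from one row and copies the
// value that belongs to the kept key. Assumes keys.size() == values.size().
std::pair<std::vector<int>, std::vector<char>> drop_row_duplicates_sketch(
  const std::vector<int>& keys, const std::vector<char>& values, keep_sketch keep)
{
  // Stable sorted order of the row: equal keys keep their original order.
  std::vector<std::size_t> order(keys.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::stable_sort(order.begin(), order.end(), [&keys](std::size_t a, std::size_t b) {
    return keys[a] < keys[b];
  });

  std::vector<int> out_keys;
  std::vector<char> out_values;
  for (std::size_t i = 0; i < order.size(); ++i) {
    // A run is a maximal group of adjacent equal keys in sorted order.
    bool const first_of_run = (i == 0) || keys[order[i - 1]] != keys[order[i]];
    bool const last_of_run =
      (i + 1 == order.size()) || keys[order[i + 1]] != keys[order[i]];
    bool const keep_this = (keep == keep_sketch::KEEP_FIRST) ? first_of_run : last_of_run;
    if (keep_this) {
      out_keys.push_back(keys[order[i]]);
      out_values.push_back(values[order[i]]);
    }
  }
  return {out_keys, out_values};
}
```

For keys `{2, 1, 2, 1}` and values `{'a', 'b', 'c', 'd'}`, the sketch returns keys `{1, 2}` with values `{'b', 'a'}` under `KEEP_FIRST` and values `{'d', 'c'}` under `KEEP_LAST`. With an unstable sort, which value survives for each key would not be well defined.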
This PR adds `lists::stable_sort_lists` that can sort elements within rows of lists column using stable sort. This is necessary for implementing `lists::drop_list_duplicates` that operates on keys-values columns input when we want to remove the values corresponding to duplicate keys with `KEEP_FIRST` or `KEEP_LAST` option.

In order to implement `lists::stable_sort_lists`, stable sort versions for the `segmented_sorted_order` and `segmented_sort_by_key` have also been implemented, which can maintain the order of equally-compared elements within segments.

This PR blocks #9345.
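The idea of a stable segmented sort described above can be sketched on the host as follows. This is an illustration only, not the GPU code in this PR: the function names with the `_sketch` suffix are hypothetical, segments are described by an assumed offsets array (segment `i` covers `[offsets[i], offsets[i + 1])`), and keys are plain `int`s. Each segment is sorted independently with `std::stable_sort`, so equally-compared elements keep their original relative order within the segment.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: returns, for every segment, the indices that would
// stably sort that segment's keys in ascending order.
std::vector<std::size_t> stable_segmented_sorted_order_sketch(
  const std::vector<int>& keys, const std::vector<std::size_t>& offsets)
{
  std::vector<std::size_t> order(keys.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  for (std::size_t i = 0; i + 1 < offsets.size(); ++i) {
    auto const first = order.begin() + static_cast<std::ptrdiff_t>(offsets[i]);
    auto const last  = order.begin() + static_cast<std::ptrdiff_t>(offsets[i + 1]);
    // Indices never leave their segment; ties keep their input order.
    std::stable_sort(first, last, [&keys](std::size_t a, std::size_t b) {
      return keys[a] < keys[b];
    });
  }
  return order;
}

// Hypothetical helper: reorders `values` by the stable segmented order of `keys`.
std::vector<char> stable_segmented_sort_by_key_sketch(
  const std::vector<int>& keys,
  const std::vector<char>& values,
  const std::vector<std::size_t>& offsets)
{
  auto const order = stable_segmented_sorted_order_sketch(keys, offsets);
  std::vector<char> out(values.size());
  for (std::size_t i = 0; i < order.size(); ++i) { out[i] = values[order[i]]; }
  return out;
}
```

With keys `{3, 1, 3, 1, 2, 2, 1}` and offsets `{0, 4, 7}`, the sorted order is `{1, 3, 0, 2, 6, 4, 5}`: within the first segment the two `1`s (indices 1 and 3) and the two `3`s (indices 0 and 2) stay in input order, which is the property the keys-values duplicate removal relies on.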