8000 sort/ls: implement version cmp matching GNU spec by miDeb · Pull Request #2462 · uutils/coreutils · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

sort/ls: implement version cmp matching GNU spec #2462

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 1, 2021

Conversation

miDeb
Copy link
Contributor
@miDeb miDeb commented Jun 27, 2021

This reimplements version_cmp, which is used in sort and ls to sort
according to versions.
However, it is not bug-for-bug identical with GNU's implementation.
I reported a bug with GNU here:
https://lists.gnu.org/archive/html/bug-coreutils/2021-06/msg00045.html
This implementation does not contain the bugs regarding the handling of
file extensions and null bytes.

This reimplements version_cmp, which is used in sort and ls to sort
according to versions.
However, it is not bug-for-bug identical with GNU's implementation.
I reported a bug with GNU here:
https://lists.gnu.org/archive/html/bug-coreutils/2021-06/msg00045.html
This implementation does not contain the bugs regarding the handling of
file extensions and null bytes.
@tertsdiepraam
Copy link
Member
tertsdiepraam commented Jun 28, 2021

Awesome! Could you explain what the difference is with the current ls implementation? I thought I was pretty close to getting it correct there. I think I missed the handling of ~ and the file extensions? Pretty cool that you were able to do this without any allocations too! Semi-related: should we be moving to bstr for this to avoid lossy conversions?

@miDeb
Copy link
Contributor Author
miDeb commented Jun 28, 2021

I also think the current implementation is pretty close, but I wasn't aware of all the differences at the beginning so I rewrote it in a way that is close to what GNU documents. Now that I think about it, it would probably have been possible to add the special cases to the old implementation.
Differences are (afaik):

  • In the lexical comparison:
    • Handling of ~
    • Letters are in front of non-letters
  • Handling of file extensions
  • Handling of "", "." and ".."
  • Handling of files with a leading "."
  • There's also this interesting one: the current implementation falls back (when it can't find a difference) to comparing the amount of leading zeros. This implementation (and GNU's) falls back to the result of a normal string comparison of the whole string, but I think that might actually be the same thing at the end?

should we be moving to bstr for this to avoid lossy conversions

Not sure, it will not avoid lossy conversions on windows. (btw, sort will panic if you input is not utf8, maybe also something i have to change)

@tertsdiepraam
Copy link
Member

Now that I think about it, it would probably have been possible to add the special cases to the old implementation.

I like your implementation better, so no problem there!

There's also this interesting one: the current implementation falls back (when it can't find a difference) to comparing the amount of leading zeros. This implementation (and GNU's) falls back to the result of a normal string comparison of the whole string, but I think that might actually be the same thing at the end?

Interesting, I think you're right. All the tests pass so I think it works :)

@miDeb
Copy link
Contributor Author
miDeb commented Jun 28, 2021

All the tests pass so I think it works :)

I forgot to mention that I used a fuzzer to check if there are differences between GNU's implementation and this one, and the only things it found were the bugs in GNU I reported :)

@tertsdiepraam tertsdiepraam merged commit 6213a2a into uutils:master Jul 1, 2021
@tertsdiepraam
Copy link
Member

I was proud of my initial implementation, but it seems so convoluted now in comparison to this one :) Great work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0