Return full license string instead of SHA256 hash when license string exceeds 64 characters. #3780
Comments
github/go-spdx#8 |
Hi @VictorHuu, thanks for your response. You are correct that go-spdx fails hard if the string is not a recognized SPDX ID. On this image there are many packages similar to your example, such as:
$ rpm -qa --qf '%{NAME}: %{LICENSE}\n' | grep -i 'LGPLv2+ and GPLv3+'
libassuan: LGPLv2+ and GPLv3+
gpgme: LGPLv2+ and GPLv3+
That syft then returns as:
I am particularly curious about Syft's decision to hash unrecognized license strings longer than 64 characters. Is it purely a design choice on Syft's part, keeping the value short as a best practice for clarity? go-spdx does not mandate this limit. Also, could Syft offer an option or flag to output full license strings longer than 64 characters? Like: instead of hashing it like: I need the full license strings in my workflow even if they are not SPDX compliant. |
Hi @Funsho-Agboola, thanks for your insights. For traceability, you could refer to #2724 (comment) and #3450, which are about how the full text of a license is represented. And when it comes to the
Sorry, I'm almost a newbie, so my suggestions might need further consultation with the core dev team :(. |
Thanks, @VictorHuu, Making the limit of the |
Thanks @Funsho-Agboola! Working on this now that the full-text pr has merged. |
I see why this isn't getting sent directly to the
The above value isn't coming back as a valid spdx license expression. I used the following program as a quick validator:
I think with the PR referenced on this issue we might be at a place where this gets easier. Before We chose |
Hi @spiffcs, confirmed on my side: long license texts now come through intact. Thanks for jumping on this. |
What would you like to be added:
I would like Syft to add a feature that returns the full license string, even when it exceeds 64 characters, instead of hashing it and returning a
LicenseRef-<hash>
This could be done by:
Why is this needed:
Currently, Syft hashes license strings longer than 64 characters using SHA256, replacing license strings with:
This limits traceability during license scans and compliance checks. The
redislabs/k8s-controller:7.8.2-6
image is a good example. The following license strings were found in the RPM DB on this image using:
rpm -qa --qf '%{NAME}: %{LICENSE}\n'
glibc-minimal-langpack: LGPLv2+ and LGPLv2+ with exceptions and GPLv2+ and GPLv2+ with exceptions and BSD and Inner-Net and ISC and Public Domain and GFDL
glibc-common: LGPLv2+ and LGPLv2+ with exceptions and GPLv2+ and GPLv2+ with exceptions and BSD and Inner-Net and ISC and Public Domain and GFDL
glibc: LGPLv2+ and LGPLv2+ with exceptions and GPLv2+ and GPLv2+ with exceptions and BSD and Inner-Net and ISC and Public Domain and GFDL
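To see which packages on an image would be affected, the output of the rpm query can be filtered by license length. The following is only a rough sketch: `long_licenses` and `LIMIT` are hypothetical names chosen for illustration, and the code assumes each line has the `NAME: LICENSE` shape produced by the query format used in this issue.

```python
# Hypothetical helper, not part of Syft: list packages whose license
# string is longer than the 64-character threshold discussed here.
LIMIT = 64


def long_licenses(rpm_output: str, limit: int = LIMIT) -> dict:
    """Map package name -> license string for licenses longer than `limit`."""
    result = {}
    for line in rpm_output.splitlines():
        # Split on the first ": " only, since license text may contain colons.
        name, sep, license_text = line.partition(": ")
        if sep and len(license_text) > limit:
            result[name] = license_text
    return result


GLIBC_LICENSE = (
    "LGPLv2+ and LGPLv2+ with exceptions and GPLv2+ and GPLv2+ with "
    "exceptions and BSD and Inner-Net and ISC and Public Domain and GFDL"
)
sample = "libassuan: LGPLv2+ and GPLv3+\n" + "glibc: " + GLIBC_LICENSE + "\n"

# Only glibc exceeds the limit in this sample; libassuan is short enough.
print(sorted(long_licenses(sample)))
```

In practice the `sample` string would be replaced by the captured output of the rpm command shown in this issue.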
Syft hashes the string and returns:
LicenseRef-cedbc2fa4301332b3d3569627696d986a63b3f3a293a2759a611c7c3deebd428
Which I verified in Python:
import hashlib
print(hashlib.sha256(b"LGPLv2+ and LGPLv2+ with exceptions and GPLv2+ and GPLv2+ with exceptions and BSD and Inner-Net and ISC and Public Domain and GFDL").hexdigest())
cedbc2fa4301332b3d3569627696d986a63b3f3a293a2759a611c7c3deebd428
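The rule described in this issue can be modelled in a few lines. This is a minimal sketch of the behaviour as reported here, not Syft's actual Go implementation: `license_ref` and `max_len` are hypothetical names, and the handling of short strings (replacing characters outside the SPDX idstring set of letters, digits, `.` and `-` with hyphens) is an assumption made only so the example is complete.

```python
import hashlib
import re


def license_ref(value: str, max_len: int = 64) -> str:
    """Model of the reported behaviour: long strings become a SHA256 reference."""
    if len(value) > max_len:
        # Reported in this issue: strings over 64 characters are replaced
        # by the hex SHA256 digest of the original text.
        digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
        return "LicenseRef-" + digest
    # Assumption for illustration: shorter strings are kept, with characters
    # that are not valid in an SPDX idstring replaced by "-".
    return "LicenseRef-" + re.sub(r"[^A-Za-z0-9.-]", "-", value)


glibc = (
    "LGPLv2+ and LGPLv2+ with exceptions and GPLv2+ and GPLv2+ with "
    "exceptions and BSD and Inner-Net and ISC and Public Domain and GFDL"
)
print(license_ref("LGPLv2+ and GPLv3+"))  # short: stays readable
print(license_ref(glibc))                 # long: only a digest remains
```

The second call shows the traceability problem raised here: the digest cannot be turned back into the original license text, so a reader of the SBOM only recovers it if the full string is stored somewhere else.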
Additional context:
The behaviour is defined here: https://github.com/anchore/syft/blob/main/syft/format/internal/spdxutil/helpers/license.go
Particularly:
Environment:
Syft version:
syft 1.20.0