8000 New option hexpm-mirrors, to allow specifying several mirrors and change default builds.hex.pm by pguyot · Pull Request #156 · erlef/setup-beam · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

New option hexpm-mirrors, to allow specifying several mirrors and change default builds.hex.pm #156

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 14 commits into from

Conversation

pguyot
Copy link
Contributor
@pguyot pguyot commented Nov 14, 2022

New option hexpm-mirrors to specify the mirror(s) to use. This is particularly useful in cases builds.hex.pm is not reliable as it proved to be recently (see #194 for example)

By default, the list of mirrors is limited to builds.hex.pm, which means this PR does not introduce any change to existing users.

The PR comes with a test of the new option. However, some tests fail, but it looks like CI is broken (several tests are flappy because of recent issues with builds.hex.pm, but test "OTP 20, Elixir v1.4, Gleam , rebar3 3.6.0" looks broken).

Signed-off-by: Paul Guyot pguyot@kallisys.net

@paulo-ferraz-oliveira
Copy link
Collaborator

I'm Ok-ish with this change, though I feel that the action shouldn't care about cache. Specifically, you're doing this for Hex (Elixir), but e.g. not rebar3 (Erlang), and if we go down that road we might end up with something outside of our original scope. I believe we've discussed this with other maintainers, in the past (@starbelly, @ericmj, @wojtekmach) and ended up not including it. I'll search the issues and pull requests.

@paulo-ferraz-oliveira
Copy link
Collaborator

From #45 I can find https://gist.github.com/eproxus/d74315864fd6897bb47741e8de5bc980; and we ended up not including cache-related instructions in the README.

@paulo-ferraz-oliveira
Copy link
Collaborator

I'm also thinking that somebody comes along and wants to add cache for rebar3's PLT management; then somebody feels we should have cache for rustler... you can see where this is going 😄

@pguyot
Copy link
Contributor Author
pguyot commented Nov 14, 2022

The main issue here is that erlef/setup-beam action uses hex.pm a lot (for erlang, elixir and mix packages), and when a matrix strategy is big enough, network errors occur, making the action totally unusable. It's not about performance, it's about getting a consistent CI. There is no point in caching rebar as it's provided by GitHub/AWS S3.

These failures represent most if not all recent failures on the master branch for AtomVM project and a lot of failures on PRs
https://github.com/atomvm/AtomVM/actions/workflows/build-and-test.yaml?query=branch%3Amaster+is%3Afailure
https://github.com/atomvm/AtomVM/actions/workflows/build-and-test-other.yaml?query=branch%3Amaster+is%3Afailure

OTOH, setup-* actions definitely should handle cache, the official ones are automated cache providers for packages.
https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows

So maybe here the name cache-hexpm is misleading as it's about caching downloads, not built items (although arguably, it caches built mix archives for rebar & hex).

@paulo-ferraz-oliveira
Copy link
Collaborator

making the action totally unusable

Are these CDN -related? What types of issues are we talking about? (because the assets we use are behind a CDN, already)

There is no point in caching rebar as it's provided by GitHub/AWS S3

I meant "rebar3 dep.s". Though many are provided (and potentially used from Hex) not all are. But apparently the scope of the pull request is different from what the title an option name suggest.

...

OTOH if what you're providing is caching of the action elements itself, I'm more receptive. I didn't look at the details of the implementation (it's late; this wasn't discussed before, ...) but maybe I was mislead exactly because of the option name. In that case, why would we need an option for this? If cache is working and makes building faster and more consistent, shouldn't it simply be integrated into the action?

@paulo-ferraz-oliveira
Copy link
Collaborator

(although arguably, it caches built mix archives for rebar & hex).

If it does, could we separate these from the previous ones? The action's elements themselves (?)

@pguyot pguyot force-pushed the w46/add-cache-for-hex.pm branch from a64088a to b908607 Compare November 15, 2022 00:03
@pguyot
Copy link
Contributor Author
pguyot commented Nov 15, 2022

I double-checked and actually, this caches the unzipped hex.ez.
Previous code didn't cache properly rebar from hex.pm as downloaded by mix, it now does.

hex.pm is supposedly a CDN, but there have been a lot of issues recently.

@paulo-ferraz-oliveira
Copy link
Collaborator

Great, thanks. I'll wait to hear from others until we decide how to move forward (and then do a proper review).

@pguyot pguyot force-pushed the w46/add-cache-for-hex.pm branch from b908607 to 2791107 Compare November 15, 2022 01:11
@pguyot pguyot changed the title Use cache for hex.pm Allow mirrors for hex.pm and cache downloads Nov 15, 2022
@pguyot
Copy link
Contributor Author
pguyot commented Nov 15, 2022

Drawing from this discussion, let's consider an alternative implementation.
A new option to choose the mirror to use to download the action's assets (erlang for Linux, elixir, rebar) and cache this on GitHub (by default).

@paulo-ferraz-oliveira
Copy link
Collaborator

My question still holds: "... why would we need an option for this? If cache is working [from your PR onward] and makes building faster and more consistent, shouldn't it simply be integrated into the action?" meaning... why does your new behaviour simply not become the default? What are we expected to solve with an option?

@ericmj
Copy link
Collaborator
ericmj commented Nov 15, 2022

The main issue here is that erlef/setup-beam action uses hex.pm a lot (for erlang, elixir and mix packages), and when a matrix strategy is big enough, network errors occur, making the action totally unusable. It's not about performance, it's about getting a consistent CI. There is no point in caching rebar as it's provided by GitHub/AWS S3.

This is strange to me because repo.hex.pm and github assets use the same CDN and will hit the same POPs/data centers.

Are you using github runners or your own? I haven't heard any other reports of github runners having network issues to repo.hex.pm. I also cannot see any substantial amount of errors in our metrics. Are you having more success using mirrors?

@pguyot
Copy link
Contributor Author
pguyot commented Nov 15, 2022

The main issue here is that erlef/setup-beam action uses hex.pm a lot (for erlang, elixir and mix packages), and when a matrix strategy is big enough, network errors occur, making the action totally unusable. It's not about performance, it's about getting a consistent CI. There is no point in caching rebar as it's provided by GitHub/AWS S3.

This is strange to me because repo.hex.pm and github assets use the same CDN and will hit the same POPs/data centers.

Yet it does happen when the action matrix is large enough.

Are you using github runners or your own? I haven't heard any other reports of github runners having network issues to repo.hex.pm. I also cannot see any substantial amount of errors in our metrics. Are you having more success using mirrors?

This is GitHub runners.

I even had the problem running the CI to test my changes. See for example:
https://github.com/pguyot/setup-beam/actions/runs/3465494731
Or just the test that was run for this very PR:
https://github.com/erlef/setup-beam/actions/runs/3466560021/jobs/5796596347

I didn't have any problem with a mirror, but I didn't test it sufficiently. My plan was to use default CDN with GH cache.

@paulo-ferraz-oliveira
Copy link
Collaborator

@pguyot, have you tested the result of this change, with your (currently failing) workflow, to make sure it solves the issues you want it to? (you can replace erlef/setup-beam with pguyot:w46/add-cache-for-hex.pm)

Can you maybe post the results here, so we can check the difference?

Thanks.

@pguyot pguyot force-pushed the w46/add-cache-for-hex.pm branch 2 times, most recently from 5812838 to 7214a8a Compare November 16, 2022 06:48
@pguyot
Copy link
Contributor Author
pguyot commented Nov 16, 2022

FYI, errors also occur with https://cdn.jsdelivr.net/hex without cache.

@pguyot
Copy link
Contributor Author
pguyot commented Nov 16, 2022

My question still holds: "... why would we need an option for this? If cache is working [from your PR onward] and makes building faster and more consistent, shouldn't it simply be integrated into the action?" meaning... why does your new behaviour simply not become the default? What are we expected to solve with an option?

I am confused. The PR code now makes caching of repo.hex.pm the default, I did amend the code from your review.

@paulo-ferraz-oliveira
Copy link
Collaborator

I am confused. The PR code now makes caching of repo.hex.pm the default, I did amend the code from your review.

You're right; my text was probably not super explicit, but I did ask:

  • ... why would we need an option for this?
  • What are we expected to solve with an option?

i.e. why would https://github.com/erlef/setup-beam/pull/156/files#diff-1243c5424efaaa19bd8e813c5e6f6da46316e63761421b3e5f5c8ced9a36e6b6R36 be required as an option. Do you foresee a case where not using the cache, as you propose, would be better?

@pguyot
Copy link
Contributor Author
pguyot commented Nov 16, 2022

i.e. why would https://github.com/erlef/setup-beam/pull/156/files#diff-1243c5424efaaa19bd8e813c5e6f6da46316e63761421b3e5f5c8ced9a36e6b6R36 be required as an option. Do you foresee a case where not using the cache, as you propose, would be better?

This uses some space in the 10GB that GitHub provides for caching. About 100 MB. That's the reason I considered parsing ";cache" at the end of the URL. But this could be removed and the logic simplified if you think it's better, the complexity might indeed not be worth it.

@paulo-ferraz-oliveira
Copy link
Collaborator

Yeah, I believe if we move forward (still waiting on @starbelly and @wojtekmach to say something - hopefully they will, if they have time; don't know if @ericmj has anything else to say) we should remove the option and simplify the change as much as possible. 100 MB, out of the 10 GB, is acceptable, I guess.

@wojtekmach
Copy link
Collaborator

I don't recall previous conversations around this but in my opinion built-in support for caching seems like a good idea, I'd expect most people would like to use it most of the time. I believe I'm caching deps on most if not all of my projects. I'd even say cache "on" by default is a good idea but we need to consider drawbacks. Are there any?

(...) add cache for rebar3's PLT management; then somebody feels we should have cache for rustler.

This is a very fair concern. I'd say the action should be concerned with caching only the things it downloads by itself. So I think we'd cache rebar but not rebar PLTs, one would need to do that themselves, which is not complicated.

I'm not sure about coupling caching with mirroring though, I think these are orthogonal concerns. For example, if we'd ever want to cache the gleam executable, we'd most likely download it from somewhere else than repo.hex.pm.

@paulo-ferraz-oliveira
Copy link
Collaborator

I think we all agree that the action could/should cache the elements it depends on, but stuff like PLTs, consumer-specific dependencies, etc. should not be included (which is what I thought this pull request was for, initially). Is it possible to separate these, @pguyot? Or is it a consequence of your implementation that the Hex cache folder is consumed wholly.

@pguyot pguyot force-pushed the w46/add-cache-for-hex.pm branch from 7214a8a to 443b20a Compare November 17, 2022 21:50
@pguyot
Copy link
Contributor Author
pguyot commented Nov 17, 2022

I think we all agree that the action could/should cache the elements it depends on, but stuff like PLTs, consumer-specific dependencies, etc. should not be included (which is what I thought this pull request was for, initially). Is it possible to separate these, @pguyot? Or is it a consequence of your implementation that the Hex cache folder is consumed wholly.

It does not and never did cache stuff like PLTs and consumer-specific dependencies.
The latest amended version removes the cache option and simplifies as discussed here.

@paulo-ferraz-oliveira
Copy link
Collaborator
paulo-ferraz-oliveira commented Dec 1, 2022

@pguyot, I think this is getting close to ready.

We're just missing some small things:

pguyot and others added 3 commits December 4, 2022 11:12
Caching was only meant to cope with poor performance of `repo.hex.pm`.
Users can simply use the mirrors feature and put `cdn.jsdelivr.net/hex` as a
fallback to get reliable builds.

Also change default mirrors list to only include `repo.hex.pm` as requested
by @ericmj, so default behavior is unaffected by this PR.

Signed-off-by: Paul Guyot <pguyot@kallisys.net>
Signed-off-by: Paul Guyot <pguyot@kallisys.net>
Co-authored-by: Paulo F. Oliveira <paulo.ferraz.oliveira@gmail.com>
@pguyot
Copy link
Contributor Author
pguyot commented Dec 4, 2022

@pguyot, I think this is getting close to ready.

We're just missing some small things:

Sorry, this PR is not about retries. We shouldn't retry mirrors.
In fact, I even removed the caching feature as it doesn't make sense for mirrors as well.

  • update suggestion
    Because cache feature has been removed, it is no longer necessary to modify the semantics of this function and therefore I removed the documentation I wrote.
  • comment
    This is how options work with actions. The API provides a multiline parsing. I believe lists are not supported.
  • finally, there's now merge conflicts
    My approach is to rebase and force push, but this is not what you asked for. Please advise.

@pguyot pguyot changed the title Cache downloaded objects + new option hexpm-mirror, to allow changing from default hex.pm New option hexpm-mirrors, to allow specifying several mirrors and change default repo.hex.pm Dec 4, 2022
@paulo-ferraz-oliveira
Copy link
Collaborator
paulo-ferraz-oliveira commented Dec 4, 2022

My approach is to rebase and force push, but this is not what you asked for. Please advise.

We will rebase+force-push once everything's dealt with. I added it to my list of TO-DO because it's something that's missing.

FWIW, this pull request is changing context and scope so often is difficult to keep track of what's requested. I'd advise to have a discussion with the maintainers if you intend to change something else (that's not requested by us) in order to ease the review process.

Hopefully, and if nothing further changes I will have a review result ready soon.

@paulo-ferraz-oliveira
Copy link
Collaborator
paulo-ferraz-oliveira commented Dec 4, 2022

I'd prefer repo.hex.pm to not only be the default, but the fallback in any case - except if it's already part of the chosen mirrors, especially because it's the official place we get the builds from, at the moment. This way we don't need to fail for an empty mirror list, and by renaming the option as alternative-... we make it clear what we're offering.

Edit: if the consumer passes no mirror, but has the option present, add the hex.pm one; if the consumer passes a list of mirrors that does not contain hex.pm, add it to the tail of the mirror list. In both cases, log an info. for what is being done.

@paulo-ferraz-oliveira
Copy link
Collaborator

My approach is to rebase and force push, but this is not what you asked for. Please advise.

We'll do the rebase, resolve conflict, and force push at the end.

pguyot and others added 6 commits December 7, 2022 22:35
Co-authored-by: Paulo F. Oliveira <paulo.ferraz.oliveira@gmail.com>
Co-authored-by: Paulo F. Oliveira <paulo.ferraz.oliveira@gmail.com>
Co-authored-by: Paulo F. Oliveira <paulo.ferraz.oliveira@gmail.com>
Co-authored-by: Paulo F. Oliveira <paulo.ferraz.oliveira@gmail.com>
Signed-off-by: Paul Guyot <pguyot@kallisys.net>
Signed-off-by: Paul Guyot <pguyot@kallisys.net>
pguyot and others added 2 commits December 13, 2022 08:49
Co-authored-by: Paulo F. Oliveira <paulo.ferraz.oliveira@gmail.com>
Co-authored-by: Paulo F. Oliveira <paulo.ferraz.oliveira@gmail.com>
@paulo-ferraz-oliveira
C95D
Copy link
Collaborator

@pguyot, do you intend to continue working on this?

@pguyot
Copy link
Contributor Author
pguyot commented Jan 7, 2023

@pguyot, do you intend to continue working on this?

I can rebase and fix conflicts to clarify the intent of this PR. It really is about mirroring (pun intended) the behavior of mix's HEX_MIRROR's environment variable to specify a mirror to use instead of default repo.hex.pm.

@paulo-ferraz-oliveira
Copy link
Collaborator

Closing as stale. Feel free to re-open when you want to get back to it.

@ericmj ericmj mentioned this pull request May 1, 2023
@pguyot pguyot changed the title New option hexpm-mirrors, to allow specifying several mirrors and change default repo.hex.pm New option hexpm-mirrors, to allow specifying several mirrors and change default builds.hex.pm May 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0