Enable parallel build #2040

Merged Apr 4, 2019 (2 commits)


ferd (Collaborator) commented Mar 29, 2019

This PR supersedes #2039, which had the following description:

Support for parallel compilation of *.erl files was dropped before the 3.0 release.
However, our tests on a project containing ~500 source files show a substantial gain, lowering compilation time from 58 seconds to 18 on a MacBook Pro 15" (4 cores, 8 threads), and to just 10 seconds on a Xeon-D machine.

This patch does two things on top of @max-au's PR:

  1. It broadens the interface for the compiler module so that non-first-file modules can possibly be parallelized. This is done by dynamically switching on either [ListOfFiles], which remains sequential as before, or {[SeqPriority], [Parallel]}, which divides regular files between higher-priority ones and those that can be compiled in parallel.
  2. It implements this mechanism in the rebar compiler, based on the erl file digraph. If a file has an in-neighbour, it is depended on by another file. The mechanism therefore compiles all files that have dependants first, in their strict relative sequential order, and then compiles the undepended-on files together in parallel.
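
As a rough illustration of the split described in item 2, here is a minimal sketch in Python rather than Erlang. It is not the code from this patch: `split_for_compile` and the `deps` mapping are hypothetical names, and the real implementation works on the erl file digraph inside the rebar compiler.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def split_for_compile(files, deps):
    """Split files into (seq_priority, parallel).

    `deps` is a hypothetical mapping: file -> set of files it depends on.
    """
    file_set = set(files)
    # A file is "depended on" if it appears in another file's dependency set.
    depended_on = set()
    for f in files:
        depended_on |= deps.get(f, set()) & file_set
    # static_order() yields every dependency before the files that need it.
    graph = {f: deps.get(f, set()) & file_set for f in files}
    order = TopologicalSorter(graph).static_order()
    # Files with dependants keep their strict relative order.
    seq_priority = [f for f in order if f in depended_on]
    # Files nobody depends on can be compiled together afterwards.
    parallel = [f for f in files if f not in depended_on]
    return seq_priority, parallel


files = ["a.erl", "b.erl", "c.erl", "d.erl"]
deps = {"a.erl": {"b.erl"}, "b.erl": {"c.erl"}}
seq, par = split_for_compile(files, deps)
```

In this example `c.erl` and `b.erl` land in the sequential list, in that order, while `a.erl` and `d.erl` are left for the parallel phase. `a.erl` is safe to compile there because everything it depends on was already built in the sequential phase.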

By running:

  ./rebar3 ct --suite test/rebar_compile_SUITE.erl --case recompile_when_parse_transform_inline_changes --repeat 50

the previous iteration of this PR would rapidly fail, and this one succeeds every time.

ferd force-pushed the max-au-rebar_compiler_parallel branch from 94201c8 to 9f81a57 on March 29, 2019 13:03
ferd (Collaborator, Author) commented Mar 29, 2019

Running this branch on Rebar3 itself had low impact on compile performance from scratch (deps cached, but not in _build):

  • this branch: rebar3 escriptize 10.73s user 2.45s system 181% cpu 7.275 total
  • master: rebar3 escriptize 10.48s user 2.53s system 116% cpu 11.171 total

A slight slowdown on a small-enough project would certainly be worth the gain on a large one.

max-au (Contributor) commented Mar 29, 2019

I like this one. Performance results:

  time rebar3 compile
  real 0m14.887s

Before the patch:

  time rebar3 compile
  real 0m45.576s

Thank you for fixing!

PS: for comparison, my naive version is real 0m14.290s.

ferd (Collaborator, Author) commented Mar 29, 2019

@tsloughter re: your comment on the other thread. I think things are not too bad considering there's a check for min(length(Files), NumScheduler). The old version always defaulted to 3 workers, regardless of the scheduler count or the file count: https://github.com/erlang/rebar3/pull/265/files#diff-2015be9e4c4af12364f637580b5450bdL52

I saw only limited slowdowns here as well, compared to the massive speedup Max has reported, but we might want to check with smaller projects to find a cutoff point.
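
The worker cap mentioned above can be sketched as follows. This is a hedged Python illustration, not the Erlang code from the patch: `worker_count`, `compile_parallel` and `compile_one` are hypothetical names, and a thread pool stands in for spawned Erlang processes.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def worker_count(num_files, num_schedulers):
    # Never start more workers than there are files or schedulers.
    return min(num_files, num_schedulers)


def compile_parallel(files, compile_one, num_schedulers=None):
    """Apply `compile_one` to each file using a capped worker pool."""
    if not files:
        return []
    if num_schedulers is None:
        # Assumption: the CPU count stands in for the scheduler count.
        num_schedulers = os.cpu_count() or 1
    workers = worker_count(len(files), num_schedulers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as `files`.
        return list(pool.map(compile_one, files))
```

With 3 files and 8 schedulers this starts 3 workers. With 500 files and 8 schedulers it starts 8, unlike a fixed default of 3.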

tsloughter (Collaborator) commented

Maybe it should not spawn any workers if the min is < NumSchedulers?

max-au (Contributor) commented Mar 29, 2019

Even cheap desktops now have 16+ threads, and a parallel build with 8 files (< NumSchedulers) benefits from using 8 workers. In fact, spawning a worker does not cost much; a spawn rate of 8,000 per second is fine on a laptop.

max-au (Contributor) commented Mar 30, 2019

After testing a bit more, I realised that there is also a bottleneck in init_dag, which is single-threaded, does some file i/o, and is quite slow as it performs full-scale *.erl file parsing (rebar_compiler_erl, line 242).
That one would be a bit harder to parallelise, but it's still worth trying.

==
(Unrelated: the diagnostic message produced when the myapp.app file is broken, from line 346 of rebar_app_discover, create_app_info, is {badmatch, ...}, which is quite misleading.)

ferd (Collaborator, Author) commented Apr 4, 2019

Got the OK from Tristan on IRC.

ferd merged commit 0843c19 into erlang:master on Apr 4, 2019