Enable parallel build #2039
Support for parallel compilation of *.erl files was dropped before the 3.0 release. However, our tests on a project containing ~500 source files show a substantial gain, lowering compilation time from 58 seconds to 18 seconds on a MacBook Pro 15" (4 cores, 8 threads), and to just 10 seconds on a Xeon-D machine.
Interesting. The build fails on OTP-17 and OTP-18 for the test detecting changes in inline declaration of parse transforms (rebar3/test/rebar_compile_SUITE.erl, lines 2141 to 2188 in cc788f1).
I've tried to restart the job on one of the two runs, and it still failed. There's no major reason why that should be, but my guess (and this is only a guess) is that there is an order in which some modules need to be compiled (behaviours and parse transforms before regular modules), and the parallel compilation here is a bit too naive: in some cases, an essential module like a behaviour or a parse transform may finish compiling after a module that depends on it is popped off the queue. Let me explain.
To be reliable, this PR would need to be able to define priority steps. For example, it is possible that we have a dependency chain like
The problem is that the current structure you've used is the one used by the sequential compiler, which splits things into 'first files' and 'rest of files'. The gotcha is that the 'first files' section you currently use sequentially contains only parse transforms that are defined in compiler options (
Basically, the "first files" is the override to mandate a high priority for files that must be declared first because of compiler options invisible to analysis of .erl files, but there is still an important sequential ordering that can exist in the rest of the files when you analyze them on their own. Parallelizing module compilation properly would require not just using a topological sort as in rebar3/src/rebar_compiler_erl.erl Line 57 in 369ff85
Until then, this patch (and this is still just a guess) has a high chance of breaking random builds.
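One way to picture the "priority steps" idea from the comment above is to group modules into dependency levels, so that everything within a level can be compiled in parallel once all earlier levels are done. The following is a minimal, language-neutral sketch in Python, not rebar3 code: the function name `parallel_levels`, the dependency map, and the module names are all hypothetical and only illustrate the ordering constraint.

```python
def parallel_levels(deps):
    """Group modules into levels; each module's dependencies sit in an earlier level.

    deps maps a module name to the set of modules that must be compiled
    before it (for example a behaviour or a parse transform).
    """
    modules = set(deps)
    for needed in deps.values():
        modules |= set(needed)
    # Copy the sets so the caller's mapping is not mutated.
    remaining = {m: set(deps.get(m, ())) for m in modules}
    levels = []
    while remaining:
        # Modules with no unmet dependencies are safe to build together.
        ready = sorted(m for m, needed in remaining.items() if not needed)
        if not ready:
            raise ValueError("dependency cycle among: %s" % sorted(remaining))
        levels.append(ready)
        for m in ready:
            del remaining[m]
        for needed in remaining.values():
            needed.difference_update(ready)
    return levels


# Hypothetical project: a parse transform, a behaviour, and modules using them.
example = {
    "my_transform": set(),
    "my_behaviour": {"my_transform"},
    "worker_a": {"my_behaviour"},
    "worker_b": {"my_transform"},
    "util": set(),
}
levels = parallel_levels(example)
print(levels)
```

In this sketch, each inner list is one priority step: a build tool would finish compiling a whole level before starting the next, which avoids the race described above where a dependent module is picked up before its behaviour or parse transform is ready.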
Oh and the real sucky part: it seems like the current flat list approach is part of the new compiler interface, which makes it really annoying to just create more parallel groups.
If my hunch on the error is right, it might make sense to instead take the topological sort, and force all the depended-on modules into the first files to turn on parallelism safely. This could be more conservative, with more sequential files than the approach currently proposed in this PR, but otherwise safe. I'm thinking that the files in the topological sort with no out-neighbours (iirc, the outgoing edges represent "is depended on", but we should double-check) should be safe to parallelize, while the others need to keep their relative order.
EDIT: that split approach wouldn't work because the compiler options for files built in
Because there is a slowdown when there aren't a lot of files to compile, this needs to skip the parallel compile if there are fewer than N files to compile. Not sure what a good N is :)
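The threshold idea can be sketched as a simple switch on the number of files. This is an illustrative Python sketch, not rebar3 code: `compile_files`, `fake_compile`, the default of 20 for N, and the worker count are all assumptions made for the example, since the thread does not settle on a value.

```python
from concurrent.futures import ThreadPoolExecutor

MIN_PARALLEL_FILES = 20  # hypothetical N; no value is agreed on in the thread


def compile_files(files, compile_one, min_parallel=MIN_PARALLEL_FILES, workers=4):
    """Compile sequentially below the threshold, in parallel at or above it."""
    if len(files) < min_parallel:
        # Too few files: avoid the overhead of starting workers.
        return "sequential", [compile_one(f) for f in files]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input files.
        return "parallel", list(pool.map(compile_one, files))


def fake_compile(path):
    # Stand-in for a real compiler call; only renames the extension.
    return path.replace(".erl", ".beam")


mode_small, out_small = compile_files(["a.erl", "b.erl"], fake_compile, min_parallel=3)
mode_large, out_large = compile_files(
    ["a.erl", "b.erl", "c.erl"], fake_compile, min_parallel=3
)
print(mode_small, mode_large)
```

A sensible value for the threshold would have to come from measurements on real projects, which is the open question raised in the comment.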
We have projects with >500 files. I am doing a performance run on a fixed (#2040) version, and it seems to be just perfect. And I can't see a visible slowdown on smaller projects.