8000 New Build System by pmocz · Pull Request #791 · MESAHub/mesa · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

New Build System #791

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 67 commits into
base: main
Choose a base branch
from
Open

New Build System #791

wants to merge 67 commits into from

Conversation

pmocz
Copy link
Member
@pmocz pmocz commented Mar 2, 2025

by Vincent Vanlaer

@warrickball
Copy link
Contributor

I decided that I'll fight with ifort after the build system is refreshed, so I configured our University cluster to run using the SDK. It submitted test results over the course of last night. There are 25 failures but I want to get some big picture input on that result.

tl;dr: We have two options:

  1. We could probably go back to whatever compiler options meant that stuff runs under the current build system.
  2. We can try to fix a lot of issues that have been raised first.

I'm strongly in favour of 2. I think these runs have flagged some genuine issues that are worth fixing now rather than sweeping back under a rug of compiler options. Or maybe some of these are fixed on main or just missing one or two reasonable compiler flags?

Regarding those issues, they actually fall into some large categories.

  • double_bh, 15M_dynamo, accreted_material_j, high_rot_darkening and ppisn fail because of a checksum difference in the final model when it's computed from a restart. This is presumably a symptom of something not being correctly passed into and out of the photo, though I can't tell at a glance if it's the same thing.
  • 1M_thermohaline, 20M_z2m2_high_rotation, conserve_angular_momentum and magnetic_braking hit a shared issue, where it looks like s% D_ST(2) is NaN on restart. I guess this is also something not being restored from a photo correctly.
  • The RSP tests all fail because of the same energy conservation test.
  • The astero failures are all caused by index out of bounds errors in ADIPLS. I'll make investigating these my next priority.

The remaining tests (carbon_kh, relax_composition_j_entropy and simplex_solar_calibration) have unique issues. simplex_solar_calibration has an unusual structure that includes a bit of astero that BillP isolated. I think the build system just needs updating here.

@warrickball
Copy link
Contributor

Also probably worth mentioning that this is a continuation of this original external PR (#724).

VincentVanlaer and others added 16 commits April 16, 2025 16:26
This commit introduces a new build system (still based on make files).
Reasons for replacing the previous build system were:

- the previous build system had a lot of duplication between the
  modules, both in terms of make files and shell scripts.
- parallelisation was limited. The new build system parallelises better
  within a module, and between modules.
- build scripts and output from the build ended up in the same folder.
  With the new system, all build output files are stored in the folder
  ``build`` in the repository root, making cleaning build files
  equivalent to deleting this folder

For the details of the implementation, see the newly added documentation
(Developing -> Build system).

Some further notable changes:

- makedepf90 has been replaced by a perl script that scans the fortran
  source files. This allows for better flexibility in construction the
  make files.
- Currently, `pymesa` does not work with the new build system, as the
  new system does not provide a way to build shared libraries for every
  module.
- Dependencies are being discovered using pkg-config. This change
  makes it easier to build MESA without the SDK, as linux distributions
  typically ship with pkg-config files.
This commit adds the necessary pkg-config files until the SDK ships
them.
Plotter does not work with default inlist (and encounters NaNs in eos)
Work directories require a bit more massaging to work:

- The compilation of the work directory needs to be able to find the
  compiled artefacts in the main mesa build directory, but store its own
  output in the build directory in the work directory. This is
  accomplished by keeping the main build directory pointed to the MESA
  directory, and only set BUILD_DIRECTORY_MODULE to the work dir build
  directory, as only this one is used for the generated files.
- The binary module needs to refer to sources in the MESA directory.
  this is because these sources cannot be compiled in during the
  installation, as they depend on functions in the work directory
  sources.

From a user perspective, the main changes are:

- The removal of the `clean` and `mk` scripts. These are replaced by
  `make clean` and `make` respectively.
- The star executable will always be built before it is ran. This will
  eliminate issues where the one forgets to rebuild the executable.
Otherwise one could get warnings from the compiler
How to actually use the preprocessor is left as an exercise to the
reader. This is to make the bit-rot not be worse than it already is.
@VincentVanlaer VincentVanlaer force-pushed the VincentVanlaer/build-system branch from fecf573 to 971fa3c Compare April 16, 2025 18:42
@VincentVanlaer
Copy link
Collaborator

I have rebased on main cleaned up some things (e.g. extracted #808).

There are some test cases which run into issues. From what I could see running tests locally (i.e. without checksums), the ccsn, all rsp tests, and all adipls tests fail with FPEs. While these can easily be fixed by initializing certain variables to zero at somewhat arbitrary points, it doesn't feel like a clean solution. Moreover, adipls is not failing on some uninitialized variable, but on an actual divide by zero.

I actually prefer going for option one, I have added a commit that disables FPEs by default. Syncing this PR with main was rather painful, so I would rather see this merged sooner than later (as long as there are no outstanding issues of course). Since the option to enabled FPEs still exists, getting the tests/other parts of MESA to cooperate can still be done in the future.

I am running the test suite on the local cluster with the latest changes, and I'll see whether any other issues crop up. This is without the checksums, since they are not in the repository.

@VincentVanlaer
Copy link
Collaborator

Something I forgot to add is that the default insertion of NaNs could have also been a reason for the difference in the photos.

@VincentVanlaer VincentVanlaer force-pushed the VincentVanlaer/build-system branch from ccbe028 to d248833 Compare April 17, 2025 22:12
This suppresses uninitialized warnings, and in some cases, variables
containing garbage instead of zero is load bearing.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0