Population solver dynamic ion range by fionntancallan · Pull Request #187 · artis-mcrt/artis · GitHub

Population solver dynamic ion range #187


Open: wants to merge 50 commits into base: develop

Conversation

fionntancallan
Member

Add functionality that allows the range of ions used in the NLTE solution to be dynamically changed if the NLTE solution fails due to either negative populations or large population inversions:

-Previously, when the NLTE population for a level came out negative, it was directly replaced with the LTE level population. In some cases the negative excited-level population was replaced with a value much greater than the NLTE ground population, producing extremely large partition function values that could overflow the float limit and crash the simulation. Negative NLTE ground populations were likewise simply replaced with the LTE value, even though a negative ground population means the entire solution for that ion is unlikely to be sensible.

-There were previously no checks on population inversions. Large population inversions can also lead to partition function values that overflow the float limit, causing the simulation to fail.

The functionality added in this pull request should resolve cases where the NLTE solver gives negative populations or large population inversions for ions with very low populations (e.g. neutral states when the dominant ionisation state is highly ionised, and vice versa). When the solver fails, the NLTE solution can now be recalculated with the individual ion causing the problem excluded from the solution for that element. Previously the NLTE solution could only be calculated with all of an element's ions included, and the only fallback was to set the populations of the entire element to LTE values if the NLTE solver failed.

@fionntancallan
Member Author

@lukeshingles, any ideas on why this is failing with a "terminate called after throwing an instance of 'std::bad_array_new_length'"? All I have done is add the nions_used and first_ion_used variables and set them equal to what was previously used in the get_element_nlte_dimension and get_max_nlte_dimension functions.

Is there a way to run the tests locally so that I can get more detailed information to diagnose this?

@lukeshingles
Member

@fionntancallan the most common cause of that would be trying to resize the vector to a negative length. You can look in the ci.yml workflow and run those same commands locally.

@fionntancallan
Member Author

@lukeshingles in "solve_nlte_pops_element", when allocating memory for the rate matrices, e.g.:

resize_exactly(vec_rate_matrix_rad_bb, max_nlte_dimension * max_nlte_dimension);
rate_matrix_rad_bb = gsl_matrix_view_array(vec_rate_matrix_rad_bb.data(), nlte_dimension, nlte_dimension).matrix;
gsl_matrix_set_all(&rate_matrix_rad_bb, 0.);

Why is max_nlte_dimension used for the resize? Since solve_nlte_pops_element is only called for one element at a time, couldn't you just use nlte_dimension for the resize, making max_nlte_dimension redundant? Or am I missing something?

@fionntancallan
Member Author

Is it because the expensive operation is resizing to a matrix bigger than the memory allocated for the previous iteration of the matrix, so it's generally better to allocate the maximum memory and then reduce what is actually utilised with the gsl_matrix_view_array?

@lukeshingles
Member

Is it because the expensive operation is resizing to a matrix bigger than the memory allocated for the previous iteration of the matrix, so it's generally better to allocate the maximum memory and then reduce what is actually utilised with the gsl_matrix_view_array?

Yes, memory allocations and deallocations are slow, and they cause memory fragmentation. It's better to allocate just once if it's possible to know ahead of time how much memory is required. I don't see why you would have to touch that code though: since you're only potentially reducing the size of the matrix, max_nlte_dimension should not change.

@fionntancallan
Member Author
fionntancallan commented May 2, 2025

Is it because the expensive operation is resizing to a matrix bigger than the memory allocated for the previous iteration of the matrix, so it's generally better to allocate the maximum memory and then reduce what is actually utilised with the gsl_matrix_view_array?

Yes, memory allocations and deallocations are slow, and they cause memory fragmentation. It's better to allocate just once if it's possible to know ahead of time how much memory is required. I don't see why you would have to touch that code though: since you're only potentially reducing the size of the matrix, max_nlte_dimension should not change.

Ok thanks. The output of get_max_nlte_dimension shouldn't change, but it works by looping through all the elements and calling get_element_nlte_dimension for each one, which now needs two extra arguments that apply if the number of ions has been reduced. This change in dimension shouldn't affect the call to get_element_nlte_dimension from get_max_nlte_dimension, provided the arguments passed are consistent with all ions being treated in NLTE.

@fionntancallan
Member Author

@lukeshingles do you prefer that this branch be brought up to date with the current develop branch via "update with rebase" or "update with merge commit"?

@lukeshingles
Member

@lukeshingles do you prefer this branch is brought up to date with the current develop branch with update with rebase or update with merge commit?

Either one is good since it will look the same once it is merged anyway.

fionntancallan force-pushed the population_solver_dynamic_ion_range branch from 56dfb96 to 1edf903 (May 8, 2025 10:33)
@fionntancallan
Member Author

@lukeshingles the new functionality seems to be working for single-zone test simulations. What is the best way to proceed from here? Are there larger-scale tests you run to make sure everything is behaving? The new functionality changes the results of the automated tests, so it's probably a good idea to do some more comprehensive testing before we update the tests.

@lukeshingles
Member
lukeshingles commented May 14, 2025

@lukeshingles the new functionality seems to be working for single zone test simulations. What is the best way to proceed from here. Are there larger scale tests you run to make sure everything is behaving? The new functionality changes the results of the automated tests so its probably a good idea to do some more comprehensive testing before we would update the tests.

Please add some description of your new options to artisoptions_doc.md. Currently, our two NLTE tests do not pass through the NLTE error condition and recovery steps. I suggest duplicating the nebular_1d_3dgrid test and modifying its input model/time range so that it gets into this kind of state and safeguards the functionality of your new options.

fionntancallan marked this pull request as ready for review (June 4, 2025 10:01)
@fionntancallan
Member Author

@lukeshingles I want to run some full simulations to test the new functionality in more detail and see if it changes anything in terms of the synthetic observables predicted. Are there specific runs on the ARTIS google drive which you generally rerun after an update to the code?

@lukeshingles
Member
lukeshingles commented Jun 4, 2025

@lukeshingles I want to run some full simulations to test the new functionality in more detail and see if it changes anything in terms of the synthetic observables predicted. Are there specific runs on the ARTIS google drive which you generally rerun after an update to the code?

I think we discussed re-running the W7 and a Shen2018 with your new options enabled to see if it affects the nebular spectra (hopefully not). Google Drive path: ARTIS/artis_runs_published/shinglesetal2022

But you don't necessarily need to do that before we merge the PR. You can always follow up with another PR if you want to make further changes to your new experimental options.

@fionntancallan
Member Author
fionntancallan commented Jun 4, 2025

@lukeshingles I want to run some full simulations to test the new functionality in more detail and see if it changes anything in terms of the synthetic observables predicted. Are there specific runs on the ARTIS google drive which you generally rerun after an update to the code?

I think we discussed re-running the W7 and a Shen2018 with your new options enabled to see if it affects the nebular spectra (hopefully not). Google Drive path: ARTIS/artis_runs_published/shinglesetal2022

But you don't necessarily need to do that before we merge the PR. You can always follow up with another PR if you want to make further changes to your new experimental options.

Yes, I can run those after we merge the PR. I was just wondering what the exact versions to compare to are, since there are a few different versions of the W7 and Shen models in that folder. I guess it's the subchdet_shen2018_1.00_5050_20211111_150_410d_ntfixnewphixsbfest_2e9pkt_virgo and w7_outercut_20211111_150_410d_ntfixnewphixsbfest_ntworkfn_2e9pkt_virgo runs I should be comparing to, but I wanted to double check.

@lukeshingles
Member
lukeshingles commented Jun 4, 2025

@lukeshingles I want to run some full simulations to test the new functionality in more detail and see if it changes anything in terms of the synthetic observables predicted. Are there specific runs on the ARTIS google drive which you generally rerun after an update to the code?

I think we discussed re-running the W7 and a Shen2018 with your new options enabled to see if it affects the nebular spectra (hopefully not). Google Drive path: ARTIS/artis_runs_published/shinglesetal2022
But you don't necessarily need to do that before we merge the PR. You can always follow up with another PR if you want to make further changes to your new experimental options.

Yes, I can run those after we merge the PR. I was just wondering what the exact versions to compare to are, since there are a few different versions of the W7 and Shen models in that folder. I guess it's the subchdet_shen2018_1.00_5050_20211111_150_410d_ntfixnewphixsbfest_2e9pkt_virgo and w7_outercut_20211111_150_410d_ntfixnewphixsbfest_ntworkfn_2e9pkt_virgo runs I should be comparing to, but I wanted to double check.

Yes for the subchdet, but for w7 use w7_outercut_20210331_150_410d_2e9pkt_fixntfrac_juwels.

The artis options have evolved a bit since then, but you should be able to simply take the latest artisoptions_nltenebular.h and edit the packet count. 2e9 packets is probably overkill; 1e9 packets should look nearly identical anyway. So for 960 cores, set MPKTS to 1042000.
