#import "helpers.typ": *
= Implementation of changes <implementation>
This section describes the adaptations that were necessary to utilize the individual treatment of halo mass accretion histories in #beorn. We distinguish between necessary changes that were required to implement the underlying model and secondary changes that indirectly affect the quality of the simulation outputs.
== Profile generation taking into account halo mass history
For each halo we require a flux profile that matches the halo properties, which now include the accretion rate in addition to the mass and the redshift. The profiles are generated in a preprocessing step following the redshifts of the snapshots and the mass and accretion bins defined in the configuration.
Since the dynamic range of accretion rates is large, the resulting parameter space expands rapidly. The computation of the profiles therefore utilizes vectorized operations to achieve reasonable runtimes.
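As a sketch of how vectorized profile generation can avoid nested loops over the $(z, M, alpha)$ bins, the following toy example evaluates all bin combinations at once via broadcasting. The bin values and the profile formula are purely illustrative placeholders, not #beorn's actual profile model:

```python
import numpy as np

# Hypothetical binning, mirroring the (redshift, mass, accretion-rate) grid
# defined in the configuration; all values are illustrative.
z_bins = np.array([10.0, 8.0, 6.0])           # snapshot redshifts
m_bins = np.logspace(8, 12, 40)               # halo mass bins [M_sun]
alpha_bins = np.linspace(0.0, 5.0, 20)        # accretion-rate bins
r = np.logspace(-2, 1, 100)                   # radial sampling [Mpc]

def profile_table(z, m, alpha, r):
    """Toy flux profile rho(r | z, M, alpha), evaluated for all bin
    combinations at once via broadcasting instead of nested loops."""
    z = z[:, None, None, None]
    m = m[None, :, None, None]
    a = alpha[None, None, :, None]
    # Placeholder physics: luminosity ~ M * (1 + alpha), geometric dilution.
    return m * (1.0 + a) / (1.0 + z) / (4.0 * np.pi * r**2)

table = profile_table(z_bins, m_bins, alpha_bins, r)
print(table.shape)  # (3, 40, 20, 100): one profile per (z, M, alpha) bin
```

The broadcasting pattern is what keeps the runtime manageable as the parameter space grows: one array expression replaces a triple loop over all bins.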
Note that this introduces another second-order inconsistency: the flux profile attributes a radiative behavior to the halo that is motivated by its history, and this attribution is repeated for each snapshot, creating possibly conflicting histories. In the case of stable halo growth this is not a problem, since consecutive snapshots imply nearly identical histories; in the case of erratic growth (e.g. major mergers), however, it can lead to unphysical behavior. A more consistent approach would be to assume a more flexible mass growth model that distinguishes different growth modes or regimes.
== Parallel binned painting
Similarly to the computation of profiles, the painting step is affected by the enlarged parameter space. #beorn's fast simulation times rest on a crucial simplification of the halo model: halos with the same core properties are treated identically and can be mapped onto the grid in a single operation. Adding the accretion rate as a parameter reduces this degeneracy, so the number of halos that can be treated simultaneously decreases even when their masses are identical. To mitigate this effect we implement a parallelized version of the painting step that distributes the workload to multiple processes
#footnote[
A rudimentary parallel implementation using `MPI` already exists. It leverages the fact that each snapshot can be processed independently and distributes the snapshots to multiple processes.
].
Our implementation instead follows a shared-memory approach: processes on a single node share a common memory space that holds the grid. This allows for a more efficient use of node resources, since the memory overhead of duplicating the grid for each process is avoided.
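A minimal sketch of this shared-memory approach, using Python's `multiprocessing.shared_memory`: the function names, the flat painting "kernel", and the per-bin locking are illustrative simplifications on our part, not #beorn's actual implementation.

```python
import numpy as np
from multiprocessing import get_context, shared_memory

GRID_N = 16  # illustrative grid size; real grids are much larger

def _paint_bin(shm_name, lock, cells, kernel_value):
    # Worker: add one (mass, alpha) bin's contribution to the shared grid.
    # 'cells' and the flat 'kernel_value' stand in for real binned profiles.
    shm = shared_memory.SharedMemory(name=shm_name)
    grid = np.ndarray((GRID_N,) * 3, dtype=np.float64, buffer=shm.buf)
    with lock:  # serialize writes, since bins may touch the same cells
        for c in cells:
            grid[tuple(c)] += kernel_value
    shm.close()

def parallel_paint(bins):
    """Paint all (mass, alpha) bins onto one grid held in shared memory,
    avoiding a per-process copy of the grid."""
    ctx = get_context("fork")  # fork keeps this sketch self-contained
    shm = shared_memory.SharedMemory(create=True, size=8 * GRID_N**3)
    grid = np.ndarray((GRID_N,) * 3, dtype=np.float64, buffer=shm.buf)
    grid[:] = 0.0
    lock = ctx.Lock()
    procs = [ctx.Process(target=_paint_bin, args=(shm.name, lock, c, v))
             for c, v in bins]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    result = grid.copy()
    shm.close()
    shm.unlink()
    return result
```

All workers write into the single shared buffer, so the memory footprint stays at one grid per node rather than one grid per process.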
Part of the painting procedure remains inherently sequential: the final ionization map requires conservation of the total photon count, which is achieved by distributing duplicate ionizations to neighboring cells. Since a parallel approach cannot guarantee perfect consistency for this step, we keep the single-process computations to a minimum.
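The sequential redistribution can be sketched as follows; pushing the excess of over-ionized cells evenly onto their six nearest neighbours is a simplified stand-in we chose for illustration, not the exact scheme used in the code:

```python
import numpy as np

def conserve_photons(xion):
    """Iteratively push the excess (x > 1) of over-ionized cells onto their
    six nearest neighbours (periodic box), so the total photon count is
    conserved while no cell exceeds full ionization."""
    x = xion.copy()
    shifts = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    for _ in range(100):  # safety cap on the number of sweeps
        excess = np.clip(x - 1.0, 0.0, None)
        if not excess.any():
            break
        x -= excess
        for s in shifts:  # spread the excess evenly over the 6 neighbours
            x += np.roll(excess, s, axis=(0, 1, 2)) / 6.0
    return x
```

Because each sweep depends on the result of the previous one, this step is hard to parallelize without giving up exact photon conservation, which is why it stays on a single process.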
== Merger tree processing
The central improvement of the simulation procedure is that the painting considers each halo's individual mass accretion history instead of assuming a predefined value. As described in @halo_mass_history we utilize the merger trees provided by the #thesan simulation. The inference of the accretion rate is performed at runtime, so no further preprocessing of the simulation is required beyond a single step that merges the individual tree files into one file.
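The merging step amounts to concatenating the per-file tree arrays into one file. The sketch below illustrates the pattern with `.npz` archives and made-up field names; the actual tree files are `HDF5` and carry more fields:

```python
import numpy as np

def merge_tree_files(paths, out_path):
    """Concatenate the per-file merger-tree arrays into a single file.
    Field names are illustrative placeholders, not the real tree schema."""
    fields = ("halo_id", "mass", "descendant")
    merged = {f: [] for f in fields}
    for p in paths:
        with np.load(p) as d:
            for f in fields:
                merged[f].append(d[f])
    np.savez(out_path, **{f: np.concatenate(v) for f, v in merged.items()})
```

Doing this once up front means the runtime accretion-rate inference only ever opens a single file.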
The inferred accretion rates $alpha$ are binned as part of the painting procedure, and the permitted range is restricted as specified in the configuration. For our runs we find that an upper limit of $alpha = 5$ affects only a sub-percent fraction of halos. Many of these halos exhibit erratic growth, suggesting that allowing for very high accretion rates is not physical.
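The clipping and binning might look as follows; the bin edges and counts are illustrative, the actual values come from the configuration (here, out-of-range values are simply clipped, which is one possible choice):

```python
import numpy as np

def bin_alphas(alpha, alpha_max=5.0, n_bins=20):
    """Clip inferred accretion rates to the configured range [0, alpha_max]
    and assign each halo to its alpha bin. Edges/counts are illustrative."""
    edges = np.linspace(0.0, alpha_max, n_bins + 1)
    clipped = np.clip(alpha, 0.0, alpha_max)
    # digitize returns 1-based interval indices; shift and clamp to [0, n_bins-1]
    idx = np.clip(np.digitize(clipped, edges) - 1, 0, n_bins - 1)
    return idx, clipped
```

Halos in the same $(M, alpha)$ bin can then still be painted in a single operation, which is what limits the cost of the enlarged parameter space.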
The #thesan data provides a convenient way to iterate on and refine the above procedure, but it is not without shortcomings. The merger trees are constructed in post-processing and do not guarantee self-consistency of halo properties across multiple snapshots. This manifests as negative growth rates, which cannot be represented in the current model. Furthermore, the mass resolution of the #thesandark simulations appears to be too coarse to accurately resolve halos down to the atomic cooling limit of $M_"h" = 10^8 M_dot.circle$. The issue becomes apparent in @validation, where we compare the impact of the different mass resolutions. To account for this we follow the description of star formation efficiency employed by @Schaeffer_2023, picking a "boosted" model for the description of our halos. The resulting parameters for @eq:star_formation_efficiency are $f_(star,0) = 0.1$, $M_p = 2.8 times 10^(10) M_dot.circle$, $g_1 = 0.49$ and $g_2 = -0.61$.
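Assuming @eq:star_formation_efficiency takes the double power-law form commonly used with these parameters (an assumption on our part, since the equation is defined elsewhere in this thesis), the "boosted" values translate to:

```python
import numpy as np

# "Boosted" parameters quoted in the text; the double power-law form
# f_star(M) = 2 f_star0 / ((M/Mp)^g1 + (M/Mp)^g2) is assumed here.
F_STAR0, M_P, G1, G2 = 0.1, 2.8e10, 0.49, -0.61

def f_star(m_halo):
    """Star formation efficiency as a function of halo mass [M_sun]."""
    x = m_halo / M_P
    return 2.0 * F_STAR0 / (x**G1 + x**G2)

print(f_star(2.8e10))  # peak efficiency f_star0 = 0.1 at the pivot mass
```

With $g_2 < 0$ the efficiency is suppressed on both sides of the pivot mass $M_p$, peaking at $f_(star,0)$ there.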
== Secondary changes
In addition to the changes directly linked to the new accretion model, we implement several improvements that allow for better usability and reproducibility of the simulation outputs.
We improve the input/output handling by implementing proper `HDF5` support and caching of intermediate results. This allows for more efficient use of disk space and faster loading times, and it enables the resumption of interrupted simulations.
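The caching pattern behind the resumption support can be sketched as follows; the `cached` helper and the `.npy` storage are illustrative choices of ours, while the actual implementation stores intermediate results in `HDF5`:

```python
import numpy as np
from pathlib import Path

def cached(cache_dir, key, compute):
    """Return a cached intermediate result if present, else compute and
    store it. A hit skips the computation entirely, which is what lets an
    interrupted run resume from its last completed step."""
    path = Path(cache_dir) / f"{key}.npy"
    if path.exists():
        return np.load(path)
    result = compute()
    np.save(path, result)
    return result
```

Keying the cache on the step's inputs (snapshot, bin configuration) ensures that a restarted run recomputes only what is actually missing.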
The import of data from the original #nbody simulation has been generalized into a reference class to allow for easier extension to other simulations. This was part of a larger overhaul of the codebase to improve modularity and readability.
The cumulative effect of the above changes and further code optimizations yields a general speedup of the painting procedure. One contribution comes from the usage of `Pylians` by @Pylians, which provides efficient `C` implementations of the mapping of individual particles onto the grid. This additionally allows for a rigorous implementation of redshift space distortions (RSD) that utilizes the exact velocity information of each dark matter particle individually. Previous implementations of RSD in #beorn were based on approximations of the velocity field derived from the density field. The impact of RSD on the 21-cm signal has been discussed e.g. by @Ross_2021 but is not the focus of this work.
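The per-particle RSD mapping amounts to the standard plane-parallel shift along the line of sight; the helper below is an illustrative sketch, and the subsequent grid assignment (shown only as a comment) would be done by `Pylians`:

```python
import numpy as np

def apply_rsd(pos, vel, axis, a, H, box_size):
    """Shift each particle along the chosen line of sight by v_los / (a H),
    the standard plane-parallel RSD mapping, using the exact velocity of
    every dark-matter particle (periodic box assumed).
    Units assumed: pos, box_size [Mpc/h]; vel [km/s]; H [km/s/(Mpc/h)]."""
    shifted = pos.copy()
    shifted[:, axis] = (pos[:, axis] + vel[:, axis] / (a * H)) % box_size
    return shifted

# The shifted positions are then mapped onto the grid; with Pylians this is
# e.g. MAS_library.MA(shifted.astype(np.float32), delta, box_size, MAS="CIC").
```

Shifting the particles before the mass assignment is what replaces the earlier density-field-based approximation of the velocity field.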