Thursday, July 21, 2011

Accuracy and Semi-Analytic Modelling

I submitted an article today (you can find it on arXiv here) which explores how the accuracy of semi-analytic models of galaxy formation is limited by the input dark matter halo merger trees.

Often, merger trees are extracted from N-body simulations of large scale structure. The usual approach in such simulations has been to dump out all of the particle information at a number of "snapshot" times. These are then post-processed to find dark matter halos which are then linked together to make merger trees. Those merger trees are fed into semi-analytic models which populate them with galaxies.
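To make the tree-building step concrete, here's a minimal sketch (in Python, purely for illustration) of one common idea: linking halos between consecutive snapshots by the particles they share. The function name, the 0.5 threshold, and the toy data are all hypothetical; real tree-builders use more careful matching criteria.

```python
# Illustrative sketch only: link halos across two snapshots by shared
# particle IDs. Each halo is represented simply as a set of particle IDs.

def link_progenitors(halos_early, halos_late, min_shared_fraction=0.5):
    """For each halo at the later snapshot, find halos at the earlier
    snapshot contributing at least `min_shared_fraction` of their
    particles to it. Returns {late_halo_id: [early_halo_ids]}."""
    links = {}
    for late_id, late_particles in halos_late.items():
        progenitors = []
        for early_id, early_particles in halos_early.items():
            shared = len(early_particles & late_particles)
            if shared >= min_shared_fraction * len(early_particles):
                progenitors.append(early_id)
        links[late_id] = progenitors
    return links

# Toy example: halo "b" at the later time inherits most particles of
# earlier halos 1 and 2, so both are identified as its progenitors.
halos_early = {1: {10, 11, 12, 13}, 2: {20, 21, 22}}
halos_late = {"b": {10, 11, 12, 20, 21, 22, 30}}
print(link_progenitors(halos_early, halos_late))  # {'b': [1, 2]}
```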

An obvious question is, "How many snapshots should I take?". The equally obvious answer is, "As many as you can!". But simulations are large, and storing data from many snapshots quickly uses vast amounts of storage space. So the real question is, "How many snapshots do I need to get an accurate answer?" Surprisingly, this question hasn't been answered. A few papers contain short statements that more-or-less say "We checked that we have enough snapshots," but give no details. Beyond that, nothing.

Fortunately, Galacticus is ideally suited to answering this question. Using Galacticus, I was able to construct merger trees with effectively infinite time resolution, and then degrade these to appear as they would if they'd been extracted from an N-body simulation with a finite number of snapshots. I could then see how the average properties of galaxies changed as I gradually increased the number of snapshots used.
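Here's a minimal sketch of what I mean by "degrading": snapping each node's (essentially continuous) time onto the nearest of N discrete snapshot times. The function names and the uniform-in-log(a) snapshot spacing are assumptions for illustration, not taken from Galacticus itself.

```python
import numpy as np

def snapshot_times(n_snapshots, a_min=0.02, a_max=1.0):
    """Expansion factors spaced uniformly in log(a) -- a common
    (but here assumed) choice of snapshot spacing."""
    return np.logspace(np.log10(a_min), np.log10(a_max), n_snapshots)

def degrade(node_times, snapshots):
    """Map each continuous node time onto the nearest snapshot time."""
    snapshots = np.sort(snapshots)
    idx = np.searchsorted(snapshots, node_times)
    idx = np.clip(idx, 1, len(snapshots) - 1)
    left, right = snapshots[idx - 1], snapshots[idx]
    return np.where(node_times - left < right - node_times, left, right)

# With only 16 snapshots, nearby merger events can land on the same
# snapshot time: here the last two events both map to a ~ 0.77.
events = np.array([0.31, 0.33, 0.70, 0.71])
print(degrade(events, snapshot_times(16)))
```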

Quantitatively, the answer is that 128 or more snapshots are sufficient to get most average galaxy properties to an accuracy of 5%. With fewer snapshots, errors begin to increase rapidly. For reference, the much-utilized Millennium simulation used 64 snapshots - not too bad, but few enough to lead to 20-30% errors in some quantities.
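If you wanted to run this kind of convergence test yourself, the logic is simple: double the number of snapshots until the change in some average galaxy property drops below your tolerance. A hedged sketch, where mean_property stands in for a full Galacticus run and the toy 1/N error model merely mimics the trend described above:

```python
def converged_snapshot_count(mean_property, n_reference=1024, tol=0.05):
    """Double the snapshot count until the fractional error, measured
    against a high-resolution reference run, falls below `tol`."""
    reference = mean_property(n_reference)
    n = 16
    while n < n_reference:
        error = abs(mean_property(n) - reference) / abs(reference)
        print(f"N = {n:4d}: fractional error = {error:.3f}")
        if error < tol:
            return n
        n *= 2
    return n_reference

# Toy stand-in for a full model run, with error falling roughly as 1/N.
toy = lambda n: 1.0 + 6.4 / n
print("converged at N =", converged_snapshot_count(toy))  # 128 here
```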

You might argue that errors at this level aren't too important (certainly there are other uncertainties in the galaxy formation calculation that are larger than this). I would agree that, quantitatively, that's true. Of course, we didn't know that the errors were this small until the test was done! Perhaps more importantly, checking for convergence with respect to numerical parameters (such as the number of timesteps) is just good practice in any kind of computational physics and should always be done.

Another nice feature of Galacticus: because it's free, anyone can run similar tests themselves (example parameter files are available here), and figure out how many snapshots they should use before they run an expensive N-body simulation.
