Wednesday, August 24, 2011

The Tortoise and the Hare

It's not uncommon that I hear from criticism that "Galacticus is too slow...". I can see this point of view, but I think it's missing something really, really crucial - something that Aesop understand in his fable of "The Tortoise and the Hare". The bottom line: it's better to get there slowly than not at all. Or, more specifically, it's better to get the right answer slowly than the wrong answer quickly.

First of all, we have to define what we mean by "slow" in this case. Typically, it's used to mean that Galacticus is slow compared to other, comparable models of galaxy formation.  That's correct. But the comparison is only really fair if the models being compared are solving the same physics, at the same level of complexity. Galacticus typically runs more complicated models of black hole evolution and star formation than other models for example. Fortunately, Galacticus can mimic the implementations of other models, as I discussed here. So, let's take the Baugh et al. (2005) model and run it within Galacticus, then compare it to the same model run with its original code.

It's already been demonstrated (see this article) that Galacticus produces results which are converged to the level of a percent or better due to the way that it constructs galaxy merger trees and solves the differential equations that describe galaxy formation. To do a fair comparison we need to run the Baugh et al. (2005) model using its original code to a comparable level of accuracy. The accuracy in that case is controlled by the number of steps, Nstep, used in solving the baryonic physics - this is a fixed number in the original Baugh et al. (2005) code, whereas in Galacticus steps are chosen adaptively to preserve a specified tolerance.

A typical number used is Nstep=100. So, we'll compare results for Nstep=100 and Nstep=300. As you can see in the plot below (which shows the mean masses of hot and cold gas and stars in dark matter halos of different masses at z=0), 100 steps is clearly not enough to get a comparable level of convergence. For example, the mean stellar mass in high mass halos is systematically reduced when using 300 steps by around 5%.


OK, so let's try Nstep=1000:
It's an improvement (at least for high-mass halos - not sure what's going on in the low-mas systems....) but still not perfect.

So, how long do these calculations take. Here are benchmarks for running these models (same physics, same implementation, same set of merger trees, same computer - so they're directly comparable):

Galacticus         :  31208 seconds
Original (Nstep= 100):   2070 seconds
Original (Nstep= 300):  22270 seconds
Original (Nstep=1000): 262866 seconds

So, Galacticus is certainly slower than the original model run with Nstep=100, but very similar in speed to the Nstep=300 case and much faster than Nstep=1000.

So, is Galacticus slow? Well, it depends what you want! It's certainly slower than some other models when they're run at their usual level of accuracy. Match accuracies though and it's as fast, if not faster.

Should we care about 5% accuracy though? After all, much of the physics going in to the models is uncertain by significant amounts (certainly much larger than 5%). I think the answer is that we should care. First, in any kind of computational physics we really should absolutely always do convergence tests and get our results to whatever accuracy you want. I'd aim for 1% in these models. Second, Galacticus and other models in its class are being used in large scale, Bayesian searches of the model parameter space to generate constraints on parameters by fitting to observational datasets. Many of those observational data have errors that are much smaller than 5% - we really want the model to be numerically accurate at a higher level than the data. Otherwise, the derived constraints will be subject to biases that arise purely from numerical artifacts that we could have removed by performing a more accurate calculation.

I'm hoping that we'll see more emphasis placed on assessing the accuracy of results from galaxy formation models in the future......

No comments:

Post a Comment