Saturday, October 20, 2012

Galacticus v0.9.1 released

After just over one year of development, v0.9.1 of Galacticus is now released! That means that v0.9.1 is now considered "stable" and will receive only bug fixes (not any new features). It also means that v0.9.0 is now considered obsolete and will no longer be supported.

Most of the work on Galacticus v0.9.1 has been to improve robustness, resulting in a much more stable code. However, there are some new features too:
  • Several new "simplified" implementations of galaxy physics have been added. These are designed to allow Galacticus to run simplistic models, perhaps for comparison to other semi-analytic models, or to explore the importance of including more detailed physics. They also typically result in shorter run times (at the cost of reduced realism). These new additions include:
    • a galactic structure solver in which radii scale linearly with angular momentum;
    • an even simpler galactic structure solver which simply fixes radii of galaxies to the virial radius times the spin parameter (multiplied by a user-specified coefficient);
    • simple "power-law" scaling algorithms for cooling rates, feedback in disks and star formation timescales in disks;
  • New algorithms have been added to solve the Press-Schechter excursion set problem, allowing for modelling of non-CDM cosmologies (for further details see the previous blog post);
  • The reionization epoch (used to cut off accretion into low mass halos) can now be specified via the optical depth to reionization instead of specifying the redshift directly;
  • The Run_Galacticus.pl script has some new features:
    • Allows for generic analysis scripts to be run on each completed model;
    • Models can be run on a Condor cluster.
  • Merger trees constructed within Galacticus can now be exported into either Galacticus' native file format or the IRATE format.
  • AGN luminosities can now be computed in any filter, using SEDs which depend on the black hole bolometric luminosity, and including the effects of absorption at X-ray wavelengths;
  • The black hole/AGN feedback model has been improved to allow "radio-mode" feedback to occur only when the halo is in the "hot-mode" regime, and to allow energy injected into the hot halo to drive mass out of the halo if the input power exceeds the cooling rate;
  • The Wetzel & White (2010) satellite merger timescale algorithm is now available;
  • It is now possible to construct lightcones:
    • Includes a script which grabs all merger trees in a lightcone from the Millennium Database;
    • New output and output filtering options allow for only galaxies within the lightcone to be output and for their positions to be translated to lightcone coordinates.
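To give a flavour of how simple the new "simplified" options listed above are, here is a minimal Python sketch of the fixed-radius structure solver and a generic power-law scaling. This is illustrative only, not the Galacticus source: the function names, argument names, units and the choice of which quantity the power law depends on are all assumptions made for this example.

```python
import math


def fixed_radius(virial_radius: float, spin: float, coefficient: float = 1.0) -> float:
    """Galaxy radius fixed to (coefficient x spin parameter x virial radius).

    Hypothetical helper: the name and signature are assumptions, not the
    Galacticus API. The result has the same units as virial_radius.
    """
    return coefficient * spin * virial_radius


def power_law_scaling(x: float, normalization: float, reference: float, exponent: float) -> float:
    """Generic power-law scaling: normalization * (x / reference)**exponent.

    Which halo or disk property plays the role of x (e.g. a circular
    velocity) is an assumption left to the caller.
    """
    return normalization * (x / reference) ** exponent


# Example: a halo with a 200 kpc virial radius and spin parameter 0.05.
radius = fixed_radius(virial_radius=200.0, spin=0.05, coefficient=0.5)

# Example: a timescale of 1 Gyr at 200 km/s, scaling as velocity**-1.5.
timescale = power_law_scaling(x=100.0, normalization=1.0, reference=200.0, exponent=-1.5)
```

Models like these trade realism for speed, which is exactly the point of the simplified implementations.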
Development now moves to v0.9.2 of Galacticus which has an entirely re-written internal class structure for nodes and galaxies. This makes it very easy to add new components and extend existing ones, and does away with the need to write significant amounts of boilerplate for each new component. Other plans for v0.9.2 include doing away with several of the assumptions which are forced on N-body merger trees to allow handling of fly-by encounters, further optimizations, and improvements to the physics in several areas.

Sunday, September 16, 2012

What if dark matter isn't just plain vanilla CDM?

The cold dark matter (CDM) hypothesis has been enormously successful in describing a wide variety of cosmological observations - from the Cosmic Microwave Background to the large scale distribution of galaxies and the properties of galaxies themselves. But, there are a few areas where things aren't so clear cut. These cases are all somewhat ambiguous - because they all involve the complex physics of galaxies and baryons which we don't understand well enough to make really strong statements. But, it's intriguing enough that there's been a lot of interest in alternative types of dark matter recently.

As a result of this, we've just pushed substantial new functionality to Galacticus v0.9.1 (coinciding with a new paper, which appears on arXiv today) which allows it to follow the formation of dark matter structure in non-CDM universes. It's currently set up to work specifically for warm dark matter (WDM) cases, but the underlying algorithms can handle any type of dark matter (for those in the know, I'll add the caveat that it's any type of dark matter whose physics can be described by modifications to the linear theory power spectrum and the barrier for collapse in excursion set theory), due to a neat numerical algorithm figured out by my former student Arya Farahi.

The results look good. Here's a comparison of the dark matter halo mass function with an N-body simulation of warm dark matter by Schneider et al. (2012):

The yellow/red circles were measured from Schneider et al.'s simulation (at low masses the simulation becomes unreliable - shown by the tiny circles - so ignore it in that region). The solid blue line is the result from Galacticus - in excellent agreement with the N-body result and showing the expected cut-off at low masses (well, not that low, Schneider et al. use a relatively light, 0.25 keV, WDM particle which isn't consistent with current constraints - but here we're only concerned with comparing the two techniques).

Schneider et al. only include one effect of WDM - the suppression of small scale power in the linear theory power spectrum. A secondary effect, due to the velocity dispersion of the WDM particles, is to make it more difficult for small overdense regions of the Universe to collapse. To make a fair comparison, I ignored that effect in Galacticus also. But, if I add it back in, the result is the green dot-dashed line - it has a very strong effect on the number of low mass halos around the cut-off. (If you want to know what the pink dotted line is all about, check out the preprint!)
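For readers who want to see what the first effect (suppression of small scale power) looks like in practice, here is a small Python sketch. It uses a fitting form that is widely quoted in the WDM literature, T(k) = [1 + (αk)^(2ν)]^(-5/ν), applied to the CDM power spectrum as P_WDM(k) = T(k)² P_CDM(k). I'm assuming this form purely for illustration; it is not necessarily the exact expression used inside Galacticus. The cut-off scale alpha is left as a free input, and the default nu = 1.12 is an assumed value of the order usually quoted.

```python
import math


def wdm_transfer(k: float, alpha: float, nu: float = 1.12) -> float:
    """Relative WDM/CDM transfer function, T(k) = [1 + (alpha*k)^(2 nu)]^(-5/nu).

    alpha sets the cut-off scale (units of 1/k); nu controls the sharpness.
    Both are treated as free parameters in this sketch.
    """
    return (1.0 + (alpha * k) ** (2.0 * nu)) ** (-5.0 / nu)


def wdm_power(k: float, cdm_power: float, alpha: float, nu: float = 1.12) -> float:
    """Suppress a CDM power spectrum value by the square of the transfer function."""
    return cdm_power * wdm_transfer(k, alpha, nu) ** 2


def half_mode_wavenumber(alpha: float, nu: float = 1.12) -> float:
    """Wavenumber at which T(k) = 1/2.

    Solving [1 + (alpha*k)^(2 nu)]^(-5/nu) = 1/2 gives
    (alpha*k)^(2 nu) = 2^(nu/5) - 1.
    """
    return (2.0 ** (nu / 5.0) - 1.0) ** (1.0 / (2.0 * nu)) / alpha


# Example with a hypothetical cut-off scale alpha = 0.1 (in units of 1/k).
k_half = half_mode_wavenumber(alpha=0.1)
suppression = [wdm_transfer(k, alpha=0.1) for k in (0.1, 1.0, 10.0, 100.0)]
```

On large scales (small k) the transfer function tends to 1, so the CDM result is recovered; on small scales the power drops off steeply, which is what produces the cut-off in the halo mass function. The second effect (the velocity dispersion raising the barrier for collapse) is not captured by this sketch.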

The modular nature of Galacticus makes it relatively easy to add in new algorithms such as this - all of its galaxy formation physics will continue to work happily with merger trees built from WDM.

Saturday, May 12, 2012

Galacticus Statistics Page

Sometimes procrastination can have interesting results... I was avoiding diving back into a debugging session earlier in the week and so finally finished putting up a simple web page showing a few interesting statistics on the Galacticus project.

The page currently shows three charts (these are annotated timelines powered by Google - easy to use and a nice way to show this information, even if they are Flash based....).

The first chart shows the Galacticus revision number (i.e. the latest entry in its Bazaar archive) as a function of time. Turns out that we average about 1.1 commits per day!

The second chart shows source lines of code (measured with the excellent SLOCCount, augmented by my own count of embedded directives for the build system). There have been about 20k new lines added in the past year, bringing in a lot of new functionality. I'm hoping to see this curve drop significantly in the next few months as I merge in some very substantial changes to the way in which Galacticus stores and manipulates the objects that it uses to represent galaxies.

Finally, the third chart shows the total instruction fetch count (measured by callgrind) for a simple Galacticus model that gets run automatically every day to benchmark the code. I've only been saving this data for the past few weeks, but already there's a noticeable decline due to a few recent optimizations. This is another line that I'm hoping to see drop significantly - there are a few patches waiting to merge in that should improve performance (particularly when running under OpenMP).

Wednesday, January 25, 2012

Reproducibility of Galacticus Models (and any other kind of model)

The blog has been a little too quiet recently as I've been focussing on a couple of new science results which should be finished in the next month or two. But, here's something from a side-project that grew out of a conversation with Matthew Turk over on Google+.

I suspect that 99.999% of all scientists have had this experience: You do some calculation and you make a figure showing the results. All good stuff. Then, six months later after you've finished teaching for the year, you come back to that calculation and check that you've remembered how to do it by remaking that figure. But, this time, it doesn't look quite the same........

This is a common example of the problem of reproducibility in science. Often, the final output of a calculation is the result of a long and complicated set of calculations, in which many different choices had to be made at each step. Sure, we should keep careful notes of how every step proceeded and all of the choices made, but..... sometimes we forget.

So, why not have all of the relevant information stored automatically? Seems like an obvious idea. The next question is where to store it. The Google+ discussion I mentioned above led to the idea of using the fact that many image formats (e.g. JPG, PDF etc.) can store arbitrary metadata alongside their graphical contents. (In JPEGs, this is often used to store information about the camera settings used to take the photo, for example.) Making use of this facility, it's possible to store all relevant metadata required to recreate a figure derived from Galacticus - and to store it in the same file as the figure. That way, the metadata is never separated from the figure and never gets lost. If those figures form part of a scientific publication that gets uploaded to arXiv.org, then the instructions necessary to reproduce the result are stored forever.

So, today's revision of Galacticus includes some functionality to achieve this. When a figure is created a whole bunch of information is now stored inside of it, including:
  • the specific version and revision of Galacticus used;
  • the entire source changeset (i.e. any differences between the source code used to compile Galacticus for this model and the mainline branch of Galacticus) - this now gets stored in the main Galacticus output file too (thanks to Matt for this suggestion);
  • library and compiler versions, and Makefile options used to build Galacticus;
  • all input parameters passed to Galacticus;
  • the complete analysis script used to create the figure;
  • the UUID of the Galacticus model from which the results were derived.
There's also a tool provided which will extract this information from files, list the commands needed to recreate the exact version of Galacticus used, and generate a suitable input parameter file to reproduce the original model.
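The general idea is easy to try for yourself. Here is a minimal Python sketch (standard library only) that embeds a block of provenance metadata in a PNG image as a text chunk and reads it back out again. To be clear, this is not the Galacticus tool: PNG is used only because its chunk layout is simple to write by hand, and the keyword "GalacticusProvenance", the function names and the example values (revision number, parameter values, file name) are all made up for illustration.

```python
import json
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build a PNG chunk: length, type, data, CRC-32 of type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(png: bytes):
    """Yield (type, data) for each chunk in a PNG byte string."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        yield png[pos + 4:pos + 8], png[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 (length) + 4 (type) + data + 4 (CRC)


def embed_metadata(png: bytes, keyword: str, metadata: dict) -> bytes:
    """Insert metadata as a JSON-encoded tEXt chunk just before IEND.

    Assumes the file ends with the (12-byte) IEND chunk, as valid PNGs do.
    """
    text = json.dumps(metadata, sort_keys=True)  # ASCII-only by default
    payload = keyword.encode("latin-1") + b"\x00" + text.encode("latin-1")
    return png[:-12] + make_chunk(b"tEXt", payload) + png[-12:]


def extract_metadata(png: bytes, keyword: str):
    """Return the metadata stored under keyword, or None if absent."""
    for chunk_type, data in iter_chunks(png):
        if chunk_type == b"tEXt":
            key, _, text = data.partition(b"\x00")
            if key.decode("latin-1") == keyword:
                return json.loads(text.decode("latin-1"))
    return None


def tiny_png() -> bytes:
    """A 1x1 white grayscale PNG, standing in for a real figure."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\xff")  # filter byte + one pixel
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))


# Hypothetical provenance record; all values are placeholders.
provenance = {
    "version": "0.9.1",
    "revision": 1234,
    "parameters": {"exampleParameter": 0.5},
    "analysisScript": "makeFigure.pl",
}
figure = embed_metadata(tiny_png(), "GalacticusProvenance", provenance)
recovered = extract_metadata(figure, "GalacticusProvenance")
```

Because the metadata lives in an ancillary chunk, image viewers simply ignore it, while anyone with the file can recover the full record later.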

I think this is a really useful way to improve the reproducibility of results - and the ideas are applicable to any scientific work. I'll be advocating loudly for adopting a similar approach in other scientific tools!