Table of contents

Back to the main page
1 Non-linear structure-formation

Non-linear structure-formation

Figure: The linear power-spectrum today compared to the result of full non-linear simulations. We see that scales $k\gtrsim 0.1 h/$Mpc have gone non-linear.

Linear perturbation theory is all we need to study the CMB (with some small exceptions, e.g. gravitational lensing of the CMB requires us knowing the non-linear power-spectrum), however for structure formation the density contrast goes non-linear in the late Universe and linear perturbation theory breaks down. To be able to predict clustering on smaller scales the main approach is to do numerical simulations. Our Universe is complicated and there are many levels for which we can perform such simulations: with dark matter and baryonic gas, including effects of supernova explosion and super massive black holes at the center of galaxies, including radiation transfer of photons and so on and so on. We will here focus on the simplest case of pure dark matter simulations where all mass is treated as collisionless. This is a very good approximation for many things, however it has its shortcomings in that we (obviously) cannot form galaxies, but we can form the dark matter halos for which galaxies form inside and we can compute statistical observables like the matter power-spectrum quite accurately. Here we will go through the basic theory and algorithms for performing such simulations from generating initial conditions, performing the numerical simulations and anayzing the output. We will also briefly talk about how one can make it more realistic by adding in more physics and some other approaches to non-linear structure formation than N-body simulations.

The Vlasov-Poisson equations

We already know the equation that we really want to solve to determine the non-linear evolution for a collisionless fluid: the full fucking Boltzmann equation. We have already derived this one in linear perturbation theory, but lets do it again and not discarding any second order terms like we did in our previous derivation. We start with the distribution function $f = f(t,\vec{x},\vec{p})$ and write the Boltzmann equation as $$\frac{df}{dt} = \frac{\partial f}{\partial t} + \nabla_x f \cdot \frac{d\vec{x}}{dt} + \nabla_p f \cdot \frac{d\vec{p}}{dt} = 0$$ where the momentum is defined (slightly differently as we did before) as $$\vec{p} = m a^2 \frac{d\vec{x}}{dt}$$ and the geodesic equation for this quantity gives us $$\frac{d\vec{p}}{dt} = -m\nabla_x \Phi$$ and as usual the Poisson equation gives us $$\nabla_x^2 \Phi = 4\pi G a^2 (\rho - \overline{\rho})$$ where $$\rho = m\int \frac{d^3p}{(2\pi)^3} f$$ The full Boltzmann equation can then be written $$\frac{\partial f}{\partial t} + \nabla_x f \cdot \frac{\vec{p}}{ma^2} - m\nabla_p f \cdot \nabla_x \Phi = 0$$ $$\frac{\partial f}{\partial t} + \nabla_x f \cdot \frac{\vec{p}}{ma^2} - m\nabla_p f \cdot \nabla_x \Phi = 0$$ $$\nabla_x^2 \Phi = 4\pi G a^2 (m\int \frac{d^3p}{(2\pi)^3} f - \overline{\rho})$$ This closed system is the so-called Vlasov-Poisson system and governs the evolution of a self-gravitating collisionless fluid of particles of mass $m$. The only problem with it is that it's $7$ dimensional so terriby expensive to solve numerically. We are not going to attempt this, though some people have solved this equation numerically. Instead we are going to use the N-body technique to approximate the distribution function by using a set of tracer particles to sample it instead. This is the main way of simulating non-linear structure formation of cold dark matter in cosmology today. We take $$f = \sum_{i=1}^N \delta(\vec{x} - \vec{x}_i(t)) \delta(\vec{p} - \vec{p}_i(t))$$ and by substituting this into the Vlasov-Poisson system we get $$\frac{d^2\vec{x}}{d\tau^2} = -\nabla \Phi$$ which is just the usual Newtonian laws of motion (in an expanding Universe, but this is "hidden" by the choice of coordinates $d\tau = \frac{dt}{a^2}$) for how the particles move. The density needed in the Poisson equation can then formally (this is not how its done in practice, but we'll get back to this) be computed as $$\rho = \sum_{i=1}^N m\delta(\vec{x} - \vec{x}_i(t))$$ As long as we have enough particles we will be able to accurately sample the distrbution function. Thus the rough outline of a simulation is as follows

  • Generate a set of particles $\{\vec{x}_i,\vec{v}_i\}_{i=1}^N$ in a periodic box with comoving size $B$.
  • Then start time-stepping. For each time-step we compute the force $\nabla\Phi$, for example by binning the paricles to a grid to get $\rho$ and solving the Poisson equation (for example by Fourier transforms).
  • Then we update the positions and velocities of the particles using some integration scheme for example $$\vec{x}_{i+1} = \vec{x}_i + \Delta \tau \vec{v}_i$$ $$\vec{v}_{i+1} = \vec{v}_i - \Delta \tau (\nabla\Phi)_i$$
  • Do this until we reach the present time.

From the particles we can compute anything we want: power-spectra, locate halos, computing lensing of light around structures etc. etc. etc. In the next section we will go through in more detail how to do these steps (and the various algorithms that have been invented for this purpose): the initial conditions, the time-stepping, how to compute forces and how to analyze the data.

Lagrangian Perturbation Theory

The kind of perturbation theory we go through in this course is Eulerian perturbation theory: we look at perturbations to fluid quantities. Another common way of doing perturbation theory is to consider particles making up the fluid and follow their trajectories. This is called Lagrangian perturbation theory and (among many other things) it's used to generate initial conditions (particle positions and velocities) for doing non-linear simulations of structure formation. On small scales the density contrast $\delta$ will get of the order of unity for which linear perturbation theory breaks down and to be able to simulate structure formation in this regime we need simulations. In this short note we will go through the basics of how one can do this.

We will in this note use a time-coordinate $\tau$ which is defined via ${\rm d}\tau = \frac{{\rm d}t}{a^2}$ and dots below are time-derivatives wrt $\tau$. This is easier since the geodesic equation $$\frac{d^2{\bf x}}{dt^2} + 2H\frac{d{\bf x}}{dt} = - \frac{1}{a^2}\nabla\Phi$$ becomes $$\frac{d^2 {\bf x}}{d\tau^2} = -\nabla\phi$$ where $\phi = a^2\Phi$ so the equation looks just as in Newtonian gravity where $\phi$ is described by the Poisson equation $$\nabla^2 \phi = \beta \delta$$ where $\beta = 4\pi G a^4 \overline{\rho} = \frac{3}{2}\Omega_M a$.

In Lagrangian perturbation theory we describe the position ${\bf x}$ of a particle as a function of its initial position ${\bf q}$ as: $${\bf x} = {\bf q} + {\bf \Psi}({\bf q},\tau)$$ where $\Psi$ is the so-called displacement field (not to be confused with the meric perturbation $\Psi$ - this is an unrelated quantity). Mass conservation gives us $\rho({\bf x,\tau}){\rm d}^3{\bf x} = \rho({\bf q}){\rm d}^3{\bf q}$ so the density contrast is given by $\delta = \frac{1}{J}-1$ where $J = \left|\frac{d^3{\bf x}}{d^3{\bf q}}\right|$ is the determinant of the Jacobian of the transformation between ${\bf q}$ and ${\bf x}$.

The equations of motion are as usual given by the Poisson equation and the geodesic equation $$\nabla^2_{\bf x} \phi = \beta \delta$$ $$\ddot{{\bf x}} = -\nabla_{\bf x} \phi$$ where $\beta = 4\pi G a^4\overline{\rho}$. To find the equations for ${\bf \Psi}$ we take the gradient $\nabla_{\bf x}$ of the last equation to arrive at (we use a subscript ${\bf x}$ to not confuse it with a gradient wrt ${\bf q}$ which is easy to do in these kinds of calculations) $$\nabla_{\bf x} \cdot \ddot{{\bf x}} = -\nabla_{\bf x}^2 \phi = \beta\delta \implies J\nabla_{\bf x} \cdot \ddot{{\bf x}} = \beta (1-J)$$ We now expand in a perturbative series $${\bf \Psi} = \epsilon {\bf \Psi}^{(1)} + \epsilon^2 {\bf \Psi}^{(2)} + \ldots$$ where $\epsilon = 1$ is a parameter there just to keep track of the order or each term.

Let's compute the Jacobian to first order. The matrix has components $\delta_{ij} + {\bf \Psi}_{i,j}$ where a comma denotes a derivative wrt the coordinate that follows (note that these are ${\bf q}$ derivatives and not ${\bf x}$!). We have $$ \begin{align} J = \left|\begin{array}{ccc} 1 + {\bf \Psi}_{1,1} & {\bf \Psi}_{1,2} & {\bf \Psi}_{1,3} \\ {\bf \Psi}_{2,1} & 1 + {\bf \Psi}_{2,2} & {\bf \Psi}_{2,3} \\ {\bf \Psi}_{3,1} & {\bf \Psi}_{3,2} & 1 + {\bf \Psi}_{3,3} \\ \end{array}\right| = 1 + ({\bf \Psi}_{1,1} + {\bf \Psi}_{2,2} + {\bf \Psi}_{3,3}) + \nonumber\\ ({\bf \Psi}_{1,1}{\bf \Psi}_{2,2}+{\bf \Psi}_{2,2}{\bf \Psi}_{3,3}+{\bf \Psi}_{3,3}{\bf \Psi}_{1,1} - {\bf \Psi}_{1,2}{\bf \Psi}_{2,1} - {\bf \Psi}_{2,3}{\bf \Psi}_{3,2} - {\bf \Psi}_{3,1}{\bf \Psi}_{1,3}) + \ldots \end{align} $$ where the omitted terms have three factors of ${\bf \Psi}$ (so will be atleast third order when we do a perturbative expansion). Thus to first order we simply have $J = 1 + \epsilon \nabla_{\bf q}\cdot {\bf \Psi}^{(1)} + \mathcal{O}(\epsilon^2)$. The next term we need is $$\nabla_{\bf x} \cdot \ddot{{\bf x}} = \nabla_{\bf x} \cdot \ddot{{\bf \Psi}}$$ To compute this we can write $$\nabla_{{\bf x}}\cdot \ddot{{\bf \Psi}} = \frac{d{\bf q}_k}{d{\bf x}_i}\frac{d}{d{\bf q}_k}\ddot{{\bf \Psi}}_i$$ To first order we simply get $\epsilon\nabla_{\bf q}\cdot \ddot{{\bf \Psi}}^{(1)}$. This gives us the first order equation $$\nabla_{\bf q}\cdot \ddot{{\bf \Psi}}^{(1)} = \beta \nabla_{\bf q}\cdot {\bf \Psi}^{(1)}$$The initial conditions we have to specify is the initial density. At the initial time $\nabla_{\bf q}\cdot {\bf \Psi}_{\rm ini}^{(1)} = -\delta_{\rm ini}$ where $\delta_{\rm ini}$ is the initial density contrast.

This equation is separable. We can write ${\bf \Psi}^{(1)}({\bf q},\tau) = D_1(\tau) {\bf \Psi}^{(1)}({\bf q},\tau_{\rm ini})$. Assuming the velocity field is irrotational we can write it as the gradient of a scalar-field ${\bf \Psi}^{(1)} = \nabla \phi^{(1)}$ and we arrive at an ODE for the growth factor $D_1$ (this is the same equation as we have for the Eulerian matter density contrast $\delta_m$ on sub-horizon scales): $$\ddot{D_1} = \beta D_1$$ where (by definition) $D_1(\tau_{\rm ini}) = 1$ (and the other initial condition follows from knowing the growing mode solution in the matter dominated era $D_1\propto a$ so $\frac{d\log D_1}{d\log a} = 1$) and a PDE for the displacement-field: $$\nabla^2_{\bf q} \phi^{(1)}_{\rm ini} = -\delta_{\rm ini}$$ This is easily solved using Fourier transforms $$\phi^{(1)}_{\rm ini} = \mathcal{F}^{-1}[\frac{1}{k^2} \mathcal{F}[\delta_{\rm ini}]]$$ This allows us to create particles from a given density-field as follows

  • Generate a density-field at some redshift (say a gaussian random field generated from a linear power-spectrum)
  • Compute the growth-factors and the displacement field by solving the PDE above using Fourier transforms
  • Make a uniform distribution of particles (these are the initial ${\bf q}$ positions)
  • Displace the particles according to ${\bf x} = {\bf q} + {\bf \Psi}$
  • Give the particles a velocity according to $\dot{{\bf x}} = \dot{{\bf \Psi}}$ (notice that to first order this is just $\frac{\dot{D_1}}{D_1} {\bf \Psi}$ where the derivative of the growth-factor is evaluated at the initial redshift)

One can go to further and compute equations for any order you want. The next order is left as an exercise and gives the result ($\Psi^{(2)}({\bf q},\tau) = D_2(\tau)\nabla_{\bf q} \phi^{(2)}_{\rm ini}({\bf q})$) $$\nabla^2_{\bf q} \phi^{(2)}_{\rm ini} = \left(\frac{D_2}{D_1^2}\right)_{\rm ini}[(\phi^{(1)}_{\rm ini,xx})^2 + (\phi^{(1)}_{\rm ini,yy})^2 + (\phi^{(1)}_{\rm ini,zz})^2 - (\phi^{(1)}_{\rm ini,xy})^2 - (\phi^{(1)}_{\rm ini,yz})^2 - (\phi^{(1)}_{\rm ini,zx})^2]$$ $$\ddot{D_2} = \beta(D_2 - D_1^2)$$ where $\phi^{(1)}_{\rm ini,xy} = \frac{\partial^2 \phi^{(1)}_{\rm ini}}{\partial x \partial y}$ and similar for the other terms. The growing mode in a matter dominated Universe has $D_2 = -\frac{3}{7}D_1^2$ so the prefactor $\left(\frac{D_2}{D_1^2}\right)_{\rm ini} = - \frac{3}{7}$ and the initial conditions for $D_2$ are $D_2(\tau_{\rm ini}) = -\frac{3}{7}$ (and $\frac{dD_2}{d\log a} = -\frac{6}{7}D_1^2 = -\frac{6}{7}$). The PDE above is easily solved using Fourier transforms: $$\phi^{(1)}_{\rm ini,xy} = \mathcal{F}^{-1}[-k_xk_y \mathcal{F}[\phi^{(1)}_{\rm ini}]]$$ so from $\phi^{(1)}_{\rm ini}$ in Fourier-space we only need to multiply the grid by $k$'s and transform back to real-space to compute the right hand side in the PDE above (we need to do $6$ Fourier transforms to get all the $xx,yy,zz,xy,yz,zx$ terms needed). Given the right hand-side in real-space we can solve for $\phi^{(2)}_{\rm ini}$ by again using Fourier transforms: $$\phi^{(2)}_{\rm ini} = \mathcal{F}^{-1}[-\frac{1}{k^2} \mathcal{F}[S_{\rm right-hand-side-term}]]$$

See the bottom of page 20 - page 23 of these notes for more details about Lagrangian perturbation theory.

Algorithms for solving the Poisson equation

TODO: Direct summation (Particle-Particle), Fourier transforms (Particle-Mesh), Tree methods, Hybrid methods (Tree-PM, $P^3$M), Relaxation methods, Fast approximate methods

TODO: B-spline interpolations: Nearest Grid Point (NGP), Cloud in Cell (CIC), Triangular Shaped Cloud (TSC), ...

Algorithms for time-integration

TODO: Symplectic integrators, Leapfrog, Yoshida, etc.

Algorithms for power-spectrum estimation

The power-spectrum is just the square of the Fourier components. Starting with an overdensity field $\delta(\vec{x})$ we can compute this by first taking the (discrete) Fourier transform to get $\hat{\delta}(\vec{k})$, define some bins $k_1,k_2,\ldots,k_i = k_1 + i\Delta k$ and estimate the power-spectrum by computing $$P(k) = \left\lt |\delta(\vec{k})|^2\right\gt = \frac{\sum_{k-\Delta k/2 \lt |\vec{k}| \lt k + \Delta k/2}|\delta(\vec{k})|^2}{\sum_{k-\Delta k/2 \lt |\vec{k}| \lt k + \Delta k/2} 1}$$ where the mean is taken over all Fourier modes with $k-\Delta k/2 \lt |\vec{k}| \lt k + \Delta k/2$ where $\Delta k$ defines the size of the binning we use. There are some effects that are important to be aware off that one should correct to get accurate results.

By discretizing the density field on a grid we only sample it at discrete points $(\frac{i}{N},\frac{j}{N},\frac{k}{N})B$ with $i,j,k\in \{0,1,\ldots,N-1\}$ and the resulting Fourier transforms will then only gives us the Fourier modes $(\frac{i}{N},\frac{j}{N},\frac{k}{N})\frac{2\pi}{B}$ with $i,j,k\in\{-N/2,\ldots,-1,0,1,\ldots,N/2\}$. The largest frequency we can sample is to so-called Nyquist frequency $k_{\rm ny} = \frac{N}{2}\frac{2\pi}{B}$ corresponding to a wavelength equal to twice the grid-size $\Delta x = B/N$. The smallest frequency we have access to is the so-called fundamental mode $k_{\rm fund} = \frac{2\pi}{B}$ corresponding to a wavelength the size of the box. These two scales sets the resolution of a numerical power-spectrum.

Sample variance:

Remember cosmic variance? We have the same effect in simulations. If the density field we started off with is a gaussian random field then for a given wavenumber $k$ we only have $N_k \simeq \frac{4\pi k^2\Delta k}{k_{\rm fund}^3}$ modes availiable to estimate the power-spectrum from so there is a sample variance $$\Delta P(k) = \frac{P(k)}{\sqrt{N_k}}$$ associated with this and the power-spectrum will have a big uncertainity on the largest scales. One can beat down this variance by running many simulations with different initial conditions (which effectivey gives us more modes) or use a bigger boxsize such that the modes of interest are well sampled. One "trick" of getting around this almost completely, that turns out to work very well and not give biased results for a wide range of observables even though its not a perfect gaussian random field anymore, is to use so-called amplitude fixed initial conditions (see initial conditions section). This can also be combined with paired simulations: we run two simulations where the second one has all the phases inverted (to an overdensity becomes an underdensity) and compute the mean power-spectra of these two simulations.

Discrete Fourier transform:

The cosmological density field is continuous, while in a simulation we often discretize it on a grid to obtain the density field at $N^3$ points $(i_xB/N_,i_yB/N,i_zB/N)$ with $i_x = 0,1,2,\ldots, N-1$ and similary for $y$ and $z$. We can then perform a discrete Fourier transform by computing the sum $$f(\vec{k}) = \sum_{\vec{x}} f(\vec{x}) e^{-i\vec{k}\cdot\vec{x}}$$ to get the Fourier modes on a grid $\vec{k} = (2\pi i_x/BN, 2\pi i_y/BN, 2\pi i_z/BN)$ with $i_x = -N/2, \ldots, -1,0,1,\ldots, N/2-1$ and similary for $y$ and $z$. The inverse transform is given by $$f(\vec{x}) = \frac{(2\pi)^3}{B^3}\sum_{\vec{k}} f(\vec{k}) e^{i\vec{k}\cdot\vec{x}}$$ where the prefactor is the discrete equivalent of the $\frac{1}{(2\pi)^3}$ factor we need in continuous Fourier transform / inverse transform. There exists fast algorithms for computing such Fourier transforms that can do these sums with a $N\log(N)$ complexity (for example in C/C++ one can download and use the FFTW library).


When we compute the density field by assigning particles to a grid we compute $W*\delta$ where $W$ is our density assignment kernel. By the convolution theorem $\mathcal{F}[A*B] = \mathcal{F}[A]\mathcal{F}[B]$ so the fourier transform so the Fourier transform we compute is really $$\delta_{\rm Computed}(\vec{k}) = W(\vec{k})\delta(\vec{k})$$ The B-spline density assignement kernels are defined as $W = H$ for NGP ($H$ being the tophat), $W = H*H$ for CIC and in general $W = H*\ldots *H$ for a $N$th order B-spline kernel. The Fourier transform of a 1D tophat of size $\Delta x$ (in this case the grid-spacing) is simply $\frac{\sin(k\Delta x/2)}{k\Delta x/2}$ so in 3 dimensions the $N$th B-spline kernel is $$W(\vec{k}) = \left(\frac{\sin(k_x\Delta x/2)}{k_x\Delta x/2} \frac{\sin(k_y\Delta x)}{k_y\Delta x/2}\frac{\sin(k_z\Delta x/2)}{k_z\Delta x/2}\right)^N$$ Thus even though the density assignment for high order looks very complicated it has a simple Fourier space window function. The density field we want, $\delta(\vec{k})$, can be found from the one we compute, $\delta_{\rm Computed}(\vec{k})$, by simply dividing by the window function. This is called deconvolution (we undo the convolution that we did when doing the density field assignement). In the figure below we show the result of this deconvolution.


Aliasing occurs whenever the use of discrete elements to capture or produce a continuous signal causes frequency ambiguity. When a signal being sampled has components at frequencies higher than the Nyquist frequency $k = \pi / B$ (the sampling frequency we use that is defined by the grid-size we use) then every periodic signal with a frequency greater than the Nyquist frequency will look exactly like some other periodic signal at a frequency less than the Nyquist frequency which will be folded into the frequencies we have when we compute the dicrete Fourier transform. This comes into play when we compute the Fourier transform of the density field. Due to the finite sampling of the convolved density field we get this aliasing effect and the density field will get periodic images. The easiest way to get around this issue is to use a large enough grid such that aliasing is not a problem for the modes you are interested in, but that can be computationally expensive. There is luckily a much easier method which is to use a technique called interlacing.

Interacing consists of computing two density fields. One by assigning particles to a grid and one by assigning it to a grid that is displaced by $\Delta x/2$ in each direction. In practice this is done by adding $(\Delta x/2, \Delta x/2, \Delta x/2)$ to the particle positions when assigning this second density field. The Fourier transform of the two density fields are the same up to their aliasing properties and has the nice effect that if we add the two Fourier transforms $\frac{\delta_1(\vec{k})+ \delta_2(\vec{k})}{2}$ then we cancel out the leading alias contribution and therefore allows us to get trustable power-spectra to much larger $k$ (up to the Nyquist frequency) than without.

Shot-noise correction:

The power-spectrum is the Fourier transform of the two-point correlation function which counts how many pairs of particles we have at a given separation. Whenever we work with point-particles then the particle and itself will be at zero separation. Thus the self-correlation of particles leads to $\xi(r=0) \propto N_{\rm particles}$ which is equivalent to a broadband correction $\frac{1}{n}$ to $P(k)$ where $n = N_{\rm particles} / B^3$ is the number density of the particles in our box. The power-spectrum we compute must be corrected for this by subtracting this so-called shot-noise contribution. This is straight forward, we simply subtract off a constant to the power-spectrum we have computed $$P(k) \to P(k) - \frac{1}{n}$$ This is mainly an issue when we work with galaxies (or halos) where the number-density is not very high. In the figure below we show the effect of shot-noise.

Direct summation:

The overdensity field in our simulation box can be exactly represented as $$\frac{\rho}{\overline{\rho}} = 1 + \delta(\vec{x}) = V\frac{1}{N_{\rm particles}}\sum_{i=1}^{N_{\rm particles}} \frac{m_i}{\overline{m}} \delta_D(\vec{x} - \vec{x}_i)$$ where $m_i,\vec{x}_i$ is the mass/position of particle $i$, $\overline{m} = \sum m_i/N_{\rm particles}$ is the mean particle mass, $V=B^3$ is the volume of the box, and $\delta_D$ the Dirac delta function. We can use the disrete Fourier transform of the delta function $$\delta_D(\vec{x}) = \frac{1}{V}\sum_{\vec{k}} e^{i\vec{k}\cdot \vec{x}}$$ to get $$\delta(\vec{k}) = -\frac{V\delta_{\vec{k},0}}{(2\pi)^3} + \frac{V}{(2\pi)^3}\frac{1}{N_{\rm particles}}\sum_{i=1}^{N_{\rm particles}} \frac{m_i}{\overline{m}} e^{-i\vec{k}\cdot \vec{x}_i}$$ This allows us to compute exactly the Fourier components of the density field from which we can estimate the power-spectrum. Doing it this way we have no issue with aliasing and we don't need to do any deconvolution (but shot-noise still needs to be subtracted), however its very expensive as computing this sum for all the same grid-points as we would compute it for by performing the Fourier transform requires $N^3\cdot N_{\rm particles}$ operations! This method is therefore mainly used as a test of other faster methods.

Algorithms for halo-finding

TODO: Friends-of-friends, Phase-space FoF, spherical overdensity, unbounding, ...

Other things

TODO: massive neutrinos, relativistic effects, finite-box effects, mode-coupling, baryonic effects, ...