# Overview: FFTWGrid

The FFTWGrid class is a class for holding either a real or a Fourier grid in the same grid (to save memory) and to perform in-place Fourier transforms of the grid with FFTW. With MPI the grid cells are divided among the tasks accoring to their $x$ coordinate (see Figure 1 below). We also have the possibillity to allocate extra slices on the left and right of the local grid to keep boundary values from neighboring tasks that can be filled whenever we need it. This feature is very useful for several algorithms for example when we assign particles to the grid to create a density field and the kernel we use are broader than 1 grid cell (i.e. anything beyond Nearest-Grid-Point assignment) and when we want to compute derivatives of the grid.

See the headerfile FFTWGrid.h for all availiable methods and for how they are implemented. Below is an examples on how to use the grid which shows how to use it and some more info about the availiable methods.

# How the grid is structured

The grid is stored as a one dimensional array and the index of a real cell in the main grid is $i = iz + 2(N/2+1)(iy + N\cdot (ix + \ldots))$, i.e. the last coordinate runs first and we have some extra padding in the first dimension due to how FFTW works. The index of a fourier (a complex number) cell is $i = ikz + (N/2+1)(iky + N\cdot (ikx+\ldots))$. The $N/2+1$ is because a real-to-complex FFT nessesarily satisfies $f(k) = f^*(-k)$ so only the positive $k_z$ values are stored. This is only important if you want to do the loops over the grid yourself and not use the built in methods which takes care of this for you.

On a given task a cell will have local coordinates $(ix,iy,iz)$ where $ix\in[0,n_{x-\rm local})$ and $iy,iz\in[0,N)$. The relation between local and global coordinates are given by $ix_{\rm global} = n_{x-\rm start-local} + ix_{\rm local}$ (the other coords being the same). All methods asking for coord assume you give it local coordinates.

On top of the regular cells we have extra cells on the left and the right of the grid. These are for holding the value of cells on neighboring tasks and can be filled by running grid.communicate_boundaries(). The cells on the left can be accessed with local coordinates $ix=-1,-2,\ldots$ and so on and the cells on the right can be accessed with $ix = n_{x-\rm local}, \ldots$. The extra cells are laid out in memory as if they were normal cells.

Figure 1: Here we show the full grid. The red cells are the active cells belonging to a given task. The pink cells are extra left and right slices from the neighboring tasks that we also store.

# Internal Types

The grid uses the internal type FloatType for real quantities and ComplexType for fourier quantities. This is just double and std::complex<double> unless we use single or long double precision (see FFTWGlobal.h for type definitions). Gridcell indices (local ones) have the type IndexIntType which is typically set to be long long int (it has to be signed if you want to change it). Coordinates are just normal int's.

# Accessing the raw data

We provide iterators for looping over all the local gridcells in either real-space or fourier-space (and methods for getting the position/fourier wavenumber of the cell in the global grid):

// Loop over all real cells
for(auto && real_index : grid.get_real_range()){
// Get the value in the cell
auto value = grid.get_real_from_index(real_index);

// Assign a value to the cell
grid.set_real_from_index(real_index, value);
}

// Loop over all fourier cells
for(auto && fourier_index : grid.get_fourier_range()){
std::array<double, NDIM> kvec;
double kmag2;

// Get the Fourier wavevector k_vec and the norm |k_vec|^2 of the current cell
grid.get_fourier_wavevector_and_norm2_by_index(fourier_index, kvec, kmag2);

// Fetch the value in the cell
auto value = grid.get_fourier_from_index(fourier_index);

// Set the value in the cell
grid.set_fourier_from_index(fourier_index, value);
}


If you want to parallelize the range based for loop over real cells this is easily done by splitting the loop into a loop over slices plus a loop over cells in the slices and adding the OpenMP pragma to the former loop. For fourier cells there is no padding to worry about so one can just have one raw loop over all the cells and parallelize that or do it the same was for for the real case:

// Loop over all real cells in parallel
// Each tread work on a different x-slice
#pragma omp parallel for
for(int islice = 0; islice < grid.get_local_nx(); islice++){
// Loop over all real cells in a given slice (fixed x-value)
for(auto && real_index : grid.get_real_range(islice, islice+1)){
...
}
}

// Loop over all Fourier cells in parallel
// Each tread work on a different kx-slice
#pragma omp parallel for
for (int islice = 0; islice < grid.get_local_nx(); islice++) {
// Loop over all fourier cells in a given slice (fixed kx value)
for (auto && fourier_index : grid.get_fourier_range(islice, islice + 1)) {
...
}
}


If you want access to the raw data there are methods for getting that:

FloatType *get_real_grid_left();               // The left most slice (start at slice ix = -nleft_extra)
FloatType *get_real_grid();                    // The main grid       (start at slice ix = 0)
FloatType *get_real_grid_right();              // The right grid      (start at slice ix = NLocal_x)
FloatType *get_real_grid_by_slice(int slice);  // Get the ix'th slice (i.e. -nleft_extra <= ix < NLocal_x+nright_extra)
ComplexType *get_fourier_grid();               // The main Fourier grid


So for example if you insist on writing explicit loops for going through the grid and not use the built in methods then you can do it like this (here for a 3D grid):

FFTWGrid<3> grid(Nmesh);
auto local_nx = grid.get_local_nx();
auto local_x_start = grid.get_local_x_start();

// Loop over all the local real grid cells
auto *r = grid.get_real_grid();
for(int i = 0; i < local_nx; i++){
int iglobal = i + local_x_start;
for(int j = 0; j < Nmesh; j++){
for(int k = 0; k < Nmesh; k++){
// The current cell has global coordinates (iglobal,j,k)

// Index of the cell in the real grid
long long int real_index = (i*Nmesh + j)*2*(Nmesh/2+1) + k;

// Set the value in the cell
r[real_index] = 0.0;
}
}
}

// Loop over the local fourier grid cells
auto *f = grid.get_fourier_grid();
for(int i = 0; i < local_nx; i++){
int iglobal = i + local_x_start;
for(int j = 0; j < Nmesh; j++){
for(int k = 0; k < Nmesh/2+1; k++){
// Index of the cell in the fourier grid
long long int fourier_index = (i*Nmesh + j)*(Nmesh/2+1) + k;

// The Fourier wavevector corresponding to the current cell
double kx = iglobal <= Nmesh/2 ? iglobal : iglobal - Nmesh;
double ky = j       <= Nmesh/2 ? j       : j       - Nmesh;
double kz = k       <= Nmesh/2 ? k       : k       - Nmesh;

// Set the value in the cell
f[fourier_index] = 0.0;
}
}
}


We provide some simple function of filling the grid with data:

// Fill the whole grid with a constant value
void fill_real_grid(const FloatType val);
void fill_fourier_grid(const ComplexType val);

// Fill the main grid from a function specifying the value at a given position
void fill_real_grid(std::function<FloatType(std::array<double,NDIM>&)> & func);
void fill_fourier_grid(std::function<ComplexType(std::array<double,NDIM>&)> & func);

// For example fill the grid with sin(2pi(x+y))
std::function<FloatType(std::array<double,NDIM>&)> & func) = [](std::array<double,NDIM>& pos){
return std::sin(2*M_PI*(pos[0] + pos[1]));
}
grid.fill_real_grid(func);


# Fourier transforms

The code uses the same convention as FFTW, here in 1D and non-discrete for simplicity: $$f(k) = \int f(x)e^{-ikx}dx$$ $$f(x) = \frac{1}{2\pi}\int f(k)e^{ikx}dk$$ So for example with derivatives $\nabla f \to +ik f$. The normalization factor we get from taking a transform and then the inverse transform, $\frac{1}{N^{\rm NDIM}}$, is automaticallly taken into account when we do a complex-to-real transform. You can transform the fields as follows:

// Transform to Fourier space
grid.fftw_r2c();

// Transform back to real space (grid is now the same as we started with up to numerical errors - normalization is taken care of)
grid.fftw_c2r();


To save a grid to file and later load it you can use the class methods:

void load_from_file(std::string fileprefix);
void dump_to_file(std::string fileprefix);


This will create files fileprefix.X with X being the task number (FML::ThisTask). There is no support for reading in a grid run on a different number of tasks.

# Status of the grid

If you want to keep control over if the grid is in real or fourier space you can specify this:

// Create a grid and set the status of grid to be in real space
FFTWGrid<3> grid(Nmesh);
grid.set_grid_status_real(true);

// Perform a Fourier transform
grid.fftw_r2c();

// The status of the grid is now in Fourier space
assert( grid.get_grid_status_real() == false );


Whenever we do a transform the status of the grid is updated. If you have DEBUG_FFTWGRID defined then if you try to do a Fourier transform to real space and the status is already real then you will get a warning.

# Planning and FFTW wisdom

The default FFTW planning flag is FFTW_ESTIMATE as this does not need to touch the data in the grid (otherwise we would have to make copies). If you want to make better plans you can use the class methods below for creating FFTW wisdom:

void create_wisdow(int planner_flag, int use_nthreads);
void save_wisdow(std::string filename) const;

// Make a plan
FFTWGrid<3> grid(Nmesh);


Here planner_flag is FFTW_MEASURE, FFTW_PATIENT or FFTW_EXCHAUSTIVE. These methods take care of communicating wisdom to all tasks. Note that you have to create wisdom before filling the grid with data otherwise it will get destroyed. use_nthreads is how any threads we want to use in the FFT and only relevant if USE_FFTW_THREADS is set.

# Memory Logging

If you want to keep track of how much memory are allocated by the grids at any given time one can use USE_MEMORYLOG. This makes a map of all big allocations that can be printed out. For the memory logger to output to you exactly what grids are allocated you must provide a label for this:

// Make a grid
FFTWGrid<3> grid(Nmesh);



# Communicate boundaries

To get the boundary values from neighboring tasks in the extra slices we simply call:

grid.communicate_boundaries();


# Example: Use the class to solve the Poisson equation

Create a grid with Nmesh gridcells per dimension having n_left and n_right extra x-slices on the left and right for holding the boundary from the neighboring tasks (needed by some agorithms, in this example its not needed):

const int NDIM = 3;
const int Nmesh = 16;
const int n_left = 1;
const int n_right = 1;
FML::GRID::FFTWGrid<NDIM> grid(Nmesh, n_left, n_right);

// Number of x-slices on current task
auto local_nx = grid.get_local_nx();

// The cell with local x-coord=0 has x-coord of this in the global grid
auto local_x_start = grid.get_local_x_start();
Loop through all the real grid cells and set the value of the grid cells to be $\phi(x,y,z) = \sin(2\pi x)$:
for(auto && real_index : grid.get_real_range()){
// A vector with the local coordinates (ix,iy,iz) of the cell
auto coord = grid.get_coord_from_index(real_index);

// x position of the cell in the global grid
double x_global = (coord[0] + local_x_start) / double(Nmesh);

// Get the value in the cell
auto value = grid.get_real_from_index(real_index);

// Assign a value to the cell: sin(2pi x)
value = std::sin(2.0 * M_PI * x_global );
grid.set_real_from_index(real_index, value);
}

Perform a Fourier transform of the grid to Fourier space:

grid.fftw_r2c();

Loop over all the Fourier grid cells and divide by $-k^2$:

std::array<double,NDIM> kvec;
double kmag2;
for(auto && fourier_index : grid.get_fourier_range()){
// Get the Fourier wavevector k_vec and the norm |k_vec|^2 of the current cell
grid.get_fourier_wavevector_and_norm2_by_index(fourier_index, kvec, kmag2);

// Fetch the value in the cell
auto value = grid.get_fourier_from_index(fourier_index);

// Divide the cell by -k^2 and assign it and set DC mode to 0
if(kmag2 > 0.0)
grid.set_fourier_from_index(fourier_index, -value / kmag2);
else
grid.set_fourier_from_index(fourier_index, 0.0);
}

Perform a Fourier transform of the grid back to real space:

grid.fftw_c2r();

We have now solved the Poisson equation $\nabla^2 \phi = \sin(2\pi x)$ and have the solution $\phi(x,y,z) = -\frac{1}{4\pi^2}\sin(2\pi x)$ in real space in the grid. Note that the code above works just as well in serial as in parallel as the parallelization is taken into account behing the scenes.