Knowing in detail the structure of the source code is not necessary for the average user. However, the directories where most of the practical work is done are:
*`run`: this is where the compiled executables and lookup tables will reside after compilation.
*`test`: this contains several subdirectories, each of which refers to a specific compilation mode. For instance, compiling WRF for large-eddy simulation will link some executables in `em_les`, while compiling WRF for real-case simulations will link some other executables and lookup tables in `em_real`. Most of the test subdirectories refer to simple idealized simulations, some of which are two-dimensional. These test cases are used to validate the model's dynamical core (e.g., to check whether it correctly reproduces analytical solutions of the Euler or Navier-Stokes equations).
In some cases, editing the model source code is necessary. This mostly happens in these directories:
*`dyn_em`: this contains the source code of the dynamical core of the model ("model dynamics") and of part of the initialization programs.
*`phys`: this contains the source code of the parameterization schemes ("model physics").
*`Registry`: large chunks of the WRF source code are generated automatically at compile time, based on the information contained in a text file called `Registry`. This file specifies for instance what model variables are saved in the output, and how.
### Compiling the model
WRF is written in compiled languages (mostly Fortran and C), so it needs to be compiled before execution. It relies on external software libraries at compilation and runtime, so these libraries have to be available on the system where WRF runs.
In general, compiled WRF versions are already available on all of our servers (SRVX1, JET, VSC4, VSC5) from the expert users. So, the easiest way of getting started is to copy a compiled version of the code from them (see below).
However, we describe the typical workflow of the compilation, for anyone that wishes to try it out. There are three steps: (i) make libraries available, (ii) configure, (iii) compile.
#### Make the prerequisite libraries available
In most cases, precompiled libraries can be made available to the operating system using environment modules. Environment modules modify the Linux shell environment so that the operating system knows where to find specific executables, include files, software libraries, and documentation. Each server has its own set of available modules. As of 1.3.2023, WRF is known to compile and run with the following module collections.
Load modules with `module load LIST-OF-MODULE-NAMES`, unload them one by one with `module unload LIST-OF-MODULE-NAMES`, unload all of them at the same time with `module purge`, get information about a specific module with `module show MODULE_NAME`. Modules may depend on each other. If the system is set up properly, a request to load one module will automatically load any other prerequisite ones.
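For illustration only, a sketch of a typical module session (the actual module names and versions depend on the server; check `module avail`):

```sh
module purge                                             # start from a clean environment
module load intel openmpi hdf5 netcdf-c netcdf-fortran   # hypothetical module names
module list                                              # verify what is loaded
```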
After loading modules, it is also recommended to set the `NETCDF` environment variable to the root directory of the netCDF installation. On srvx1, jet and VSC4, use `module show` to see which directory is correct. For instance:
module-whatis {NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the Fortran distribution.}
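A sketch of the two commands (the module name and the path are assumptions; use the prefix reported by `module show` on your server):

```sh
module show netcdf-fortran                  # module name is an assumption; prints the installation prefix
export NETCDF=/path/to/netcdf-installation  # placeholder for the root directory shown above
```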
Important note: **The environment must be consistent between compilation and runtime. If you compile WRF with a set of modules loaded, you must run it with the same set of modules**.
#### Configure WRF for compilation
This will test the system to check that all libraries can be properly linked. Type `./configure`, pick a generic dmpar INTEL (ifort/icc) configuration (usually 15), answer 1 when asked if you want to compile for nesting, then hit enter. "dmpar" means "distributed memory parallelization" and enables running WRF in parallel computing mode. For test compilations or for a toy setup, you might also choose a "serial" configuration.
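A minimal way to run the configuration step while keeping a log (option numbers may differ between WRF versions):

```sh
./configure 2>&1 | tee configure.log   # pick 15 (INTEL ifort/icc, dmpar), then 1 (basic nesting)
```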
If all goes well, the configuration will end with a message like this:
This is actually a misleading error message. The problem has nothing to do with NETCDF4 not being available, but with the operating system not detecting correctly all the dependencies of the NETCDF libraries. Solving this problem requires manually editing the configuration files (see below).
The configure script stores the model configuration in a file called `configure.wrf`. This file is specific to the source code version, to the server where the code is compiled, and to the software environment. If you have a working `configure.wrf` file for a given source code/server/environment, back it up.
To solve the NETCDF4 error on srvx1: first, run `configure` and interrupt the process (`Ctrl+C`) before it raises the NetCDF warning; so, `configure.wrf` will not be deleted. Then, make the following changes to the automatically-generated `configure.wrf`:
The first file, `configure.wrf`, is the result of the (wrong) automatic configuration. The second file, `configure.wrf.dmpar` is the manually fixed one. In the latter, additional library link directives (`-lnetcdf` and `-lhdf5`) are added to the variable `LIB_EXTERNAL`, and the full paths to these extra libraries are added to the variable `DEP_LIB_PATH`.
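As a sketch of what the fix amounts to (the library paths are placeholders and depend on the loaded modules):

```sh
# Hypothetical excerpt of the manually fixed configure.wrf (paths are placeholders):
#   LIB_EXTERNAL  = ... -lnetcdf -lhdf5
#   DEP_LIB_PATH  = ... -L/path/to/netcdf/lib -L/path/to/hdf5/lib
# Comparing the broken and fixed files shows exactly what changed:
diff configure.wrf configure.wrf.dmpar
```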
#### Compile WRF
You always compile WRF for a specific model configuration. The ones we use most commonly are `em_les` (for large-eddy simulation), `em_quarter_ss` (for idealized mesoscale simulations), and `em_real` (for real-case forecasts). So type one of the following, depending on what you want to get:
```sh
./compile em_les > compile.log 2>&1 &
./compile em_quarter_ss > compile.log 2>&1 &
./compile em_real > compile.log 2>&1 &
```
The `> compile.log` tells the operating system to redirect the output stream from the terminal to a file called `compile.log`. The `2>&1` tells the operating system to merge the standard and error output streams, so `compile.log` will contain both regular output and error messages. The final `&` tells the operating system to run the job in the background, and returns to the terminal prompt.
The compiled code will be created in the `run` directory, and some of the compiled programs will be linked in one of the `test/em_les`, `test/em_quarter_ss` or `test/em_real` directories, depending on the chosen target. Executable WRF files typically have names ending with `.exe` (this is purely a convention; the suffix is not needed for them to run).
Compilation may take half an hour or so. A successful compilation ends with:
If it does not, you have a problem, and there is no unique solution. Take a closer look at `compile.log` and you might be able to diagnose it.
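For instance (the exact set of `.exe` files depends on the compilation target):

```sh
ls -l run/*.exe                      # wrf.exe plus ideal.exe or real.exe should be present
grep -iE "error|fatal" compile.log   # locate the first compiler error, if any
```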
### Copying compiled WRF code
### Running WRF in a software container
### Running an idealized simulation
### Running a real-case simulation
### Output and restart files
incl. how to modify output paths
### Suggested workflow
### Analysing model output
Things to remember:
* staggered grid (Arakawa-C)
* mass-based vertical coordinate (level height AGL is time-dependent)
* terrain-following coordinate system (curvilinear)
* in the model output, some variables are split into base state + perturbation
[Python interface to WRF](https://wrf-python.readthedocs.io/en/latest/)
Example of a very basic Python class to create an object from a WRF run, initialized with only some basic information:
```py
import netCDF4
import numpy as np

class wrfrun:
    """Minimal wrapper around a WRF output (wrfout) file."""
    def __init__(self, filename):
        self.filename = filename
        self.nc = netCDF4.Dataset(filename)
        self.dx = self.nc.DX                       # grid spacing in x (m)
        self.dy = self.nc.DY                       # grid spacing in y (m)
        self.nx = self.nc.dimensions['west_east'].size
        self.ny = self.nc.dimensions['south_north'].size
        self.x = np.arange(0, self.nx*self.dx, self.dx)
        self.y = np.arange(0, self.ny*self.dy, self.dy)
        self.valid_times = self.nc['XTIME'][:]*60  # minutes since start -> seconds
        self.current_time = 0                      # index of the current output time
    def set_time(self, step):
        self.current_time = step
    def add_wind(self):
        # Read the staggered wind components at the current time ...
        udum = self.nc['U'][self.current_time, :, :, :]
        vdum = self.nc['V'][self.current_time, :, :, :]
        wdum = self.nc['W'][self.current_time, :, :, :]
        # ... and destagger them to the mass grid points (Arakawa-C grid).
        self.u = 0.5*(udum[:, :, :-1] + udum[:, :, 1:])
        self.v = 0.5*(vdum[:, :-1, :] + vdum[:, 1:, :])
        self.w = 0.5*(wdum[:-1, :, :] + wdum[1:, :, :])
        del udum, vdum, wdum
```
The last method adds the 3D wind components at a specific time, after destaggering them to the mass grid points.
The `wrfrun` class is then used as follows:
```py
wrf = wrfrun("./wrfout_d01_0001-01-01_00:00:00")
wrf.set_time(36)
wrf.add_wind()
```
Variables are then accessible as `wrf.u`, `wrf.v` etc.
### Important namelist settings
## Advanced usage
### Changing the source code
### Conditional compilation
Most Fortran compilers allow passing the source code through a C preprocessor (CPP; sometimes also called the Fortran preprocessor, FPP) to allow for conditional compilation: preprocessor directives make it possible to compile portions of the source code selectively.
In the WRF source code, Fortran files have an `.F` extension. The preprocessor parses these files and creates corresponding `.f90` files, which are then compiled by the Fortran compiler.
This means:
1. When editing the source code, always work on the .F files, otherwise changes will be lost on the next compilation.
2. In the .F files, it is possible to include `#ifdef` and `#ifndef` directives for conditional compilation.
For instance, in `dyn_em/module_initialize_ideal.F`, the following bits of code define the model orography for idealized large-eddy simulation runs. Four possibilities are given: `MTN`, `EW_RIDGE`, `NS_RIDGE`, and `NS_VALLEY`. A quick way to select one block is to comment out its `#ifdef` and `#endif` lines with `!`, so that the enclosed code is always compiled; the cleaner way, via `configure.wrf`, is described below. If none of the blocks is selected at compile time, none of these code lines is compiled and `grid%ht(i,j)` (the model orography) is set to 0:
```fortran
#ifdef MTN
DO j=max(ys,jds),min(ye,jde-1)
DO i=max(xs,ids),min(xe,ide-1)
grid%ht(i,j) = mtn_ht * 0.25 * &
( 1. + COS ( 2*pi/(xe-xs) * ( i-xs ) + pi ) ) * &
( 1. + COS ( 2*pi/(ye-ys) * ( j-ys ) + pi ) )
ENDDO
ENDDO
#endif
#ifdef EW_RIDGE
DO j=max(ys,jds),min(ye,jde-1)
DO i=ids,ide
grid%ht(i,j) = mtn_ht * 0.50 * &
( 1. + COS ( 2*pi/(ye-ys) * ( j-ys ) + pi ) )
ENDDO
ENDDO
#endif
#ifdef NS_RIDGE
DO j=jds,jde
DO i=max(xs,ids),min(xe,ide-1)
grid%ht(i,j) = mtn_ht * 0.50 * &
( 1. + COS ( 2*pi/(xe-xs) * ( i-xs ) + pi ) )
ENDDO
ENDDO
#endif
#ifdef NS_VALLEY
DO i=ids,ide
DO j=jds,jde
grid%ht(i,j) = mtn_ht
ENDDO
ENDDO
xs=ids !-1
xe=xs + 20000./config_flags%dx
DO j=jds,jde
DO i=max(xs,ids),min(xe,ide-1)
grid%ht(i,j) = mtn_ht - mtn_ht * 0.50 * &
( 1. + COS ( 2*pi/(xe-xs) * ( i-xs ) + pi ) )
ENDDO
ENDDO
#endif
```
To control conditional compilation:
1. Search for the variable `ARCHFLAGS` in `configure.wrf`
2. Add the desired define statement at the bottom. For instance, to selectively compile the `NS_VALLEY` block above, do the following:
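For instance, a sketch of the change (the exact content of the `ARCHFLAGS` definition differs between `configure.wrf` files):

```sh
# In configure.wrf, append the flag to the end of the ARCHFLAGS definition, e.g.:
#   ARCHFLAGS = ... \
#               -DNS_VALLEY
# Then recompile; note that './clean -a' would delete configure.wrf, so back it up first.
./compile em_les > compile.log 2>&1 &
```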
### Running LES with online computation of resolved-fluxes turbulent fluxes
WRFlux
## Data assimilation (DA)
### Observation nudging
### Variational DA
WRFDA
### Ensemble DA
We cover this separately. See DART-WRF.
## Specific tasks
### Before running the model
#### Defining the vertical grid
#### Customizing model orography
#### Defining a new geographical database
#### Using ECMWF data as IC/BC
Long story short: you should link grib1 files and process them with `ungrib.exe` using `Vtable.ECMWF_sigma`.
In more detail, for several years ECMWF has been distributing a mixture of grib2 and grib1 files. Namely:
* grib1 files for surface and soil model levels.
* grib2 files for atmospheric model levels.
The WPS has a predefined Vtable for grib1 files from ECMWF, so the easiest way to process ECMWF data is to:
1. convert model-level grib2 files to grib1
2. if necessary, for every time stamp, concatenate the model-level and surface grib1 files into a single file. This is only necessary if the grib1 and grib2 data were downloaded as separate sets of GRIB files.
3. process the resulting files with ungrib after linking `ungrib/Variable_Tables/Vtable.ECMWF_sigma` as `Vtable`
In detail:
1. Conversion to grib1 (needs the `grib_set` utility from ecCodes):
```sh title="convert to grib1"
for i in det.CROSSINN.mlv.20190913.0000.f*.grib2; do
  j=$(basename "$i" .grib2)
  grib_set -s deletePV=1,edition=1 "$i" "$j"
done
```
2. Concatenation of grib files (two sets of files, `*mlv*` and `*sfc*`, with names ending with "grib1" yield a new set of files with names ending with "grib"; everything is grib1):
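The concatenation itself is a plain byte-wise `cat`; a sketch with hypothetical file-name patterns (adapt them to your mlv/sfc naming scheme):

```sh
# File name patterns are hypothetical; GRIB records can simply be concatenated.
for m in *.mlv.*.grib1; do
  s=${m/.mlv./.sfc.}                             # surface file for the same time stamp
  cat "$m" "$s" > "$(basename "$m" .grib1).grib"
done
```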
An alternative procedure would be to convert everything to grib2 instead of grib1. Then, one has to use a Vtable with grib2 information for the surface fields, for instance the one included here at the bottom. But: Data from the bottom soil level will not be read correctly with this Vtable, because the Level2 value for the bottom level is actually MISSING in grib2 files (at the moment of writing, 6 May 2022; this may be fixed in the future).
GRIB1| Level| From | To | metgrid | metgrid | metgrid |GRIB2|GRIB2|GRIB2|GRIB2|
Param| Type |Level1|Level2| Name | Units | Description |Discp|Catgy|Param|Level|
1. In the `cdo` regridding command, `-remapnn` specifies the interpolation engine, in this case nearest-neighbour (a hedged example call is sketched below, after the grid description). See alternatives here: <https://code.mpimet.mpg.de/projects/cdo/wiki/Tutorial#Horizontal-fields>
1. The file gridfile.lonlat.txt contains the grid specifications, e.g.:
gridtype = lonlat
gridsize = 721801
xsize = 1201
ysize = 601
xname = lon
xlongname = "longitude"
xunits = "degrees_east"
yname = lat
ylongname = "latitude"
yunits = "degrees_north"
xfirst = 5.00
xinc = 0.01
yfirst = 43.00
yinc = 0.01
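As a reference, a hedged sketch of such a regridding call (input and output file names are hypothetical):

```sh
# gridfile.lonlat.txt is the grid description listed above; file names are placeholders.
cdo -remapnn,gridfile.lonlat.txt input.grib input_lonlat.grib
```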
#### Subsetting model output
#### Further compression of model output (data packing)
#### 3D visualization
For 3D visualization of WRF output, it is recommended to use either [Paraview](https://www.paraview.org/) or [Mayavi](https://docs.enthought.com/mayavi/mayavi/).
* Both programs are based on the Visualization Toolkit ([VTK](https://vtk.org/)) libraries, so the resulting visualizations are rather similar in the end.
* Both programs can be used interactively from a graphical user interface or in batch mode (i.e., writing the visualization directives in a Python script).
* While Paraview requires converting model data into one of a few supported formats, Mayavi supports direct rendering of NumPy objects, so it is easier to integrate into Python code.
* It is recommended to run 3D visualization software on GPUs. Running on a CPU (e.g., your own laptop) is possible, but will be extremely slow. The CPU is not the only bottleneck, because visualization software also uses a lot of memory; rendering 3D fields, in particular, is out of reach for normal laptops with 8 or 16 GB of RAM. Paraview is available on VSC5 and should soon be available on srvx8. Currently, Mayavi must be installed by individual users as a Python package.
**Notes for readers/contributors: (1) Mayavi is untested yet. (2) It would be useful to add example batch scripts for both Paraview and Mayavi.**
##### Paraview workflow
1. Pre-requisite: [download](https://www.paraview.org/download/) and install the Paraview application on your computer.
1. Log in to VSC5 in a terminal window.
1. On VSC5, convert the WRF output to a format that Paraview can ingest. One option is to use [siso](https://github.com/TheBB/SISO).
The first and second statements handle 3D and 2D WRF output, respectively. They process the native WRF output in NetCDF format and return collections of files in VTS format (the VTK format for structured grids). There will be two independent datasets (for 3D and 2D output).
1. In the VSC5 terminal, request access to a GPU node. One of the private IMGW nodes has a GPU, and can be accessed with specific account/partition/quality of service directives.
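A sketch of the request (all three Slurm names are placeholders for the IMGW-specific values):

```sh
# Replace the placeholders with the IMGW-specific Slurm account, partition and QOS.
salloc -N 1 --account=<imgw_account> --partition=<gpu_partition> --qos=<gpu_qos>
```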
```
salloc: job 233600 queued and waiting for resources
salloc: job 233600 has been allocated resources
salloc: Granted job allocation 233600
salloc: Waiting for resource configuration
salloc: Nodes n3072-006 are ready for job
```
1. Once the GPU node becomes available, open up a new terminal session on your local machine, and set up an ssh tunnel to the GPU node through the login node.
This will redirect TCP/IP traffic from port 11111 of your local machine to port 11112 of the VSC5 GPU node, through the VSC5 login node. Port numbers are arbitrary, but the remote port (11112) needs to match the Paraview server settings (see below).
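For instance, a sketch of the tunnel command (substitute your own user name, the VSC5 login host you normally use, and the node name assigned by `salloc`):

```sh
# Local port 11111 is forwarded to port 11112 on the GPU node via the login node.
ssh -L 11111:n3072-006:11112 sserafin4@vsc5.vsc.ac.at
```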
1. In the VSC5 terminal, log in to the GPU node:
```sh
(zen3) [sserafin4@l50 ~]$ ssh n3072-006
Warning: Permanently added 'n3072-006,10.191.72.6' (ECDSA) to the list of known hosts.
sserafin4@n3072-006's password:
(zen3) [sserafin4@n3072-006 ~]$
```
1. In the VSC5 terminal on the GPU node, load the Paraview module and start the Paraview server:
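For example (the module name is an assumption; check `module avail` on the node):

```sh
module load ParaView                 # module name may differ; check 'module avail'
pvserver --server-port=11112         # must match the remote port of the ssh tunnel
```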
1. On your local machine, open the Paraview client (graphical user interface, GUI). Then select File > Connect and enter the url of the Paraview server (`localhost:11111`). Select the datasets you want to display and work on them in the GUI. Save the Paraview state to avoid repeating work at the next session. Paraview has extensive [documentation](https://docs.paraview.org/en/latest/UsersGuide/index.html), tutorials ([one](https://docs.paraview.org/en/latest/Tutorials/SelfDirectedTutorial/index.html), [two](https://public.kitware.com/Wiki/The_ParaView_Tutorial) and [three](https://public.kitware.com/Wiki/images/b/bc/ParaViewTutorial56.pdf)) and a [wiki](https://public.kitware.com/Wiki/ParaView).
##### Mayavi workflow
Not tested yet.
##### Creating a video
Whether done with Paraview or with Mayavi, the visualization will result in a collection of png files, e.g., `InnValley.%04d.png`. There are several tools to convert individual frames into movies, among them `ffmpeg` and `apngasm`. At the moment neither of them is available on IMGW servers (precompiled binaries are available through `apt-get` for Ubuntu).
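For instance, with `ffmpeg` (frame rate and codec settings are just an example):

```sh
# Encode the png frame sequence into an mp4 movie.
ffmpeg -framerate 10 -i InnValley.%04d.png -c:v libx264 -pix_fmt yuv420p innvalley.mp4
```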