All GCHP run directories have default simulation-specific run-time settings that are set when you create a run directory. You will likely want to change these settings. This page goes over how to do this.
GCHP is controlled using a set of configuration files that are included in the GCHP run directory.
Several run-time settings must be set consistently across multiple files.
Inconsistencies may result in your program crashing or yielding unexpected results.
To avoid mistakes and make run configuration easier, the bash shell script runConfig.sh is included in all run directories to set the most commonly changed config file settings from one location. Sourcing this script updates multiple config files to use the values specified in runConfig.sh. This is done automatically prior to running GCHP if you use any of the example run scripts, or you can do it at the command line. Information about which settings are changed, and in which files, is sent to the script's standard output.
To source the script, type the following:
$ source runConfig.sh
You may also use it in silent mode if you wish to update files but not display settings on the screen:
$ source runConfig.sh --silent
While using runConfig.sh to configure common settings makes run configuration much simpler, it comes with a major caveat. If you manually edit a config file setting that is also set in runConfig.sh then your manual update will be overridden via string replacement. Please get very familiar with the options in runConfig.sh and be conscientious about not updating the same setting elsewhere.
You generally will not need to know more about the GCHP configuration files beyond what is listed on this page. However, for a comprehensive description of all configuration files used by GCHP see the last section of this user manual.
To change the number of nodes and cores for your run you must update settings in two places: (1) runConfig.sh, and (2) your run script. The runConfig.sh file contains detailed instructions on how to set resource parameter options and what they mean. Look for the Compute Resources section in the script. Then update the resource request in your run script to match the resources set in runConfig.sh.
It is important to be smart about your resource allocation. To do this it is useful to understand how GCHP works with respect to distribution of nodes and cores across the grid. At least one unique core is assigned to each face on the cubed sphere, resulting in a constraint of at least six cores to run GCHP. The same number of cores must be assigned to each face, resulting in another constraint of total number of cores being a multiple of six. Communication between the cores occurs only during transport processes.
While any number of cores is valid as long as it is a multiple of six (although there is an upper limit per resolution), you will typically start to see negative effects due to excessive communication if a core is handling less than around one hundred grid cells or a cluster of grid cells that are not approximately square.
You can determine how many grid cells are handled per core by analyzing your grid resolution and resource allocation.
For example, if running at C24 with six cores, each face is handled by one core (6 faces / 6 cores) and contains 576 cells (24x24). Each core therefore processes 576 cells, and since each core handles one face, each core communicates with four other cores (the four surrounding faces). Maximizing the squareness of the block of grid cells per core is done automatically within runConfig.sh if the variable NXNY_AUTO is enabled.
Further discussion about domain decomposition is in
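As a quick check, the cells-per-core arithmetic described above can be sketched in shell (the resolution and core count below are example values, not defaults):

```shell
# Estimate grid cells handled per core for a given cubed-sphere
# resolution and total core count (the count must be a multiple of six).
cs_res=24          # cube side length, e.g. C24
total_cores=24     # total cores requested (multiple of 6)

cores_per_face=$(( total_cores / 6 ))
cells_per_core=$(( cs_res * cs_res / cores_per_face ))
echo "Each of the ${total_cores} cores handles ~${cells_per_core} cells"
```

With these example values each core handles 144 cells, comfortably above the roughly one hundred cells per core below which excessive communication typically degrades performance.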
There is an option to split up a single simulation into separate serial jobs. To use this option, do the following:

1. Update runConfig.sh with your full simulation (all runs) start and end dates, and the duration per segment (single run). Also update the number of runs option (NUM_RUNS) to reflect the total number of jobs that will be submitted. Carefully read the comments in runConfig.sh to ensure you understand how it works.

2. Optionally turn on monthly diagnostics (Monthly_Diag). Only turn on monthly diagnostics if your run duration is monthly.

3. Use gchp.multirun.run as your run script, or adapt it if your cluster does not use SLURM. It is located in the runScriptSamples subdirectory of your run directory. As with the regular gchp.run, you will need to update the file with compute resources consistent with runConfig.sh. Note that you should not submit this run script directly; that is done automatically by the script described in the next step.

4. Use gchp.multirun.sh to submit your job, or adapt it if your cluster does not use SLURM. It is located in the runScriptSamples/ subdirectory of your run directory. For example, to submit your series of jobs, type:

$ ./gchp.multirun.sh
There is much documentation in the headers of both gchp.multirun.run and gchp.multirun.sh that is worth reading and getting familiar with, although not entirely necessary to get the multi-run option working.
If you have not done so already, it is worth trying out a simple multi-segmented run of short duration to demonstrate that the multi-segmented run configuration and scripts work on your system.
For example, you could do a 3-hour simulation with 1-hour duration and number of runs equal to 3.
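For that short test, the relevant runConfig.sh settings might look roughly like the following. This is a hedged sketch: the variable names other than NUM_RUNS and Monthly_Diag are assumptions, so check the comments in your own runConfig.sh for the exact names and formats.

```shell
# Hypothetical excerpt from runConfig.sh for a 3-hour simulation
# split into three 1-hour serial jobs (names/formats are illustrative):
Start_Time="20190701 000000"   # start of the full simulation
End_Time="20190701 030000"     # end of the full simulation
Duration="00000000 010000"     # duration per run segment (1 hour)
NUM_RUNS=3                     # total number of jobs to submit
Monthly_Diag=0                 # monthly diagnostics off for sub-monthly runs
```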
The multi-run script assumes use of SLURM, and a separate SLURM log file is created for each run.
There is also a log file called multirun.log with high-level information such as the start, end, duration, and job IDs for all jobs submitted. If a run fails then all remaining scheduled jobs are cancelled and a message about this is sent to that log file. Inspect this and your other log files, as well as output in the OutputDir/ directory, prior to using this setup for longer duration runs.
For runs at very high resolution or with a small number of processors you may run into a domains stack size error. This is caused by exceeding the domains stack size memory limit set at run-time; the error will be apparent from the message in your log file. If this occurs you can increase the domains stack size in file input.nml. The default is set to 20000000.
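For reference, the setting in input.nml is a Fortran namelist entry that looks roughly like the following. This is a sketch; the namelist group name and any other entries should be checked against the input.nml shipped with your run directory.

```
&fms_nml
  domains_stack_size = 20000000
/
```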
GCHP uses a cubed sphere grid rather than the traditional lat-lon grid used in GEOS-Chem Classic. While regular lat-lon grids are typically designated as ΔLat ⨉ ΔLon (e.g. 4⨉5), cubed sphere grids are designated by the side-length of the cube. In GCHP we specify this as CX (e.g. C24 or C180). The simple rule of thumb for determining the roughly equivalent lat-lon resolution for a given cubed sphere resolution is to divide 90 by the side length. Using this rule you can quickly match C24 with about 4x5, C90 with 1 degree, C360 with quarter degree, and so on.
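The rule of thumb can be applied quickly at the command line (the resolutions below are just example values):

```shell
# Approximate lat-lon resolution in degrees for a few cubed-sphere
# side lengths, using the divide-90-by-side-length rule of thumb.
for c in 24 90 180 360; do
  awk -v c="$c" 'BEGIN { printf "C%-4d ~ %.2f degrees\n", c, 90 / c }'
done
```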
To change your grid resolution in the run directory, edit the CS_RES integer parameter in runConfig.sh section Internal Cubed Sphere Resolution to the cube side length you wish to use. To use a uniform global grid resolution, also make sure that the STRETCH_GRID option is disabled.
GCHP has the capability to run with a stretched grid, meaning one portion of the globe is stretched to fine resolution. Set the stretched grid parameters in runConfig.sh section Internal Cubed Sphere Resolution, following the instructions in that section of the file.
You can toggle all primary GEOS-Chem components, including the type of mixing, from within runConfig.sh. The settings in that file will update the relevant configuration files for you. Look for section Turn Components On/Off in runConfig.sh. Other settings in this section, beyond the component on/off toggles, include using CH4 emissions in UCX and initializing stratospheric H2O in UCX.
Model timesteps, both chemistry and dynamics, are configured within runConfig.sh. They are set to match GEOS-Chem Classic default values for low resolutions for comparison purposes but can be updated, with caution. Timesteps are automatically reduced for high resolution runs. Read the documentation in the runConfig.sh section Timesteps before changing them.
Set simulation start and end times in runConfig.sh section Simulation Start, End, Duration, # runs.
Read the comments in the file for a complete description of the options.
Typically a “CAP” runtime error indicates a problem with the start, end, and duration settings. If you encounter an error with the word “CAP” near it, double-check that these settings make sense.
All GCHP run directories come with symbolic links to initial restart files for commonly used cubed sphere resolutions. The appropriate restart file is automatically chosen based on the cubed sphere resolution you set in runConfig.sh. You may override the default restart file with your own by specifying the restart filename in runConfig.sh section Initial Restart File. Beware that it is your responsibility to make sure it is the proper grid resolution.
Unlike GEOS-Chem Classic, HEMCO restart files are not used in GCHP. HEMCO restart variables may be included in the initial species restart file, or they may be excluded and HEMCO will start with default values. GCHP initial restart files that come with the run directories do not include HEMCO restart variables, but all output restart files do.
Because file I/O impacts GCHP performance, it is a good idea to turn off reading of emissions files that you do not need. You can turn emissions inventories on or off the same way you would in GEOS-Chem Classic, by setting the inventories to true or false at the top of configuration file HEMCO_Config.rc. All emissions that are turned off in this way will be ignored when GCHP uses ExtData.rc to read files, thereby speeding up the model.
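As an illustration, the toggles at the top of HEMCO_Config.rc look roughly like the following (the inventory names shown are just examples; check your own file for the actual list and alignment):

```
# %%% EXTENSION SWITCHES %%%
0       Base                   : on    *
    --> CEDS                   :       true
    --> MEGAN                  :       false
```

Setting an inventory to false here causes GCHP to skip the corresponding file reads in ExtData.rc.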
For emissions that do not have an on/off toggle at the top of the file, you can prevent GCHP from reading them by commenting them out in HEMCO_Config.rc. No updates to ExtData.rc would be necessary. If you instead comment out the emissions in ExtData.rc but not HEMCO_Config.rc then GCHP will fail with an error when looking for the file information.
Another option to skip file read for certain files is to replace the file path in ExtData.rc with /dev/null. However, if you want to turn these inputs back on at a later time you should preserve the original path by commenting out the original line.
There are two steps for adding new emissions inventories to GCHP:

1. Add the inventory information to HEMCO_Config.rc.
2. Add the inventory information to ExtData.rc.
To add information to HEMCO_Config.rc, follow the same rules as you would for adding a new emissions inventory to GEOS-Chem Classic. Note that not all information in HEMCO_Config.rc is used by GCHP. This is because HEMCO is only used by GCHP to handle emissions after they are read, e.g. scaling and applying hierarchy; all functions related to HEMCO file read are skipped. This means that you could put garbage for the file path and units in HEMCO_Config.rc without running into problems with GCHP, as long as the syntax is what HEMCO expects. However, we recommend that you fill in HEMCO_Config.rc in the same way you would for GEOS-Chem Classic, both for consistency and to avoid potential format check errors.
Staying consistent with the information that you put into HEMCO_Config.rc, add the inventory information to ExtData.rc, following the guidelines listed at the top of the file and using existing inventories as examples. You can ignore all entries in HEMCO_Config.rc that are copies of another entry, since putting these in ExtData.rc would result in reading the same variable in the same file twice. HEMCO interprets the copied variables, denoted by dashes in the HEMCO_Config.rc entry, separately from file read.
A few common errors encountered when adding new input emissions files to GCHP are:

Your input file contains integer values. Beware that the MAPL I/O component in GCHP does not read or write integers. If your data contain integers then you should reprocess the file to contain floating point values instead.

Your data latitude and longitude dimensions are in the wrong order. Lat must always come before lon in your input arrays, a requirement true for both GCHP and GEOS-Chem Classic.

Your 3D input data are mapped to the wrong levels in GEOS-Chem (a silent error). If you read in 3D data and assign the resulting import to a GEOS-Chem state variable such as State_Met, then you must flip the vertical axis during the assignment; see the relevant source files for examples of how this is done.
You have a typo in either HEMCO_Config.rc or ExtData.rc. Errors in HEMCO_Config.rc typically result in the model crashing right away, while errors in ExtData.rc typically result in a problem later on during ExtData read. When encountering errors such as these, always try running with the MAPL debug flags on in runConfig.sh (maximizes output to gchp.log) and with Warnings and Verbose set to 3 in HEMCO_Config.rc (maximizes output to HEMCO.log). Another useful strategy is to find config file entries for similar input files and compare them against the entry for your new file. Directly comparing the file metadata may also lead to insights into the problem.
See documentation in the HISTORY.rc config file for instructions on how to output diagnostic collections on lat-lon grids.
The MAPL component in GCHP has the option to output restart files (also called checkpoint files) prior to run end. The frequency of restart file write may be at regular time intervals (regular frequency) or at specific programmed times (irregular frequency). These periodic output restart files contain the date and time in their filenames.
Enabling this feature is a good idea if you plan on doing a long simulation and you are not splitting your run into multiple jobs. If the run crashes unexpectedly then you can restart mid-run rather than start over from the beginning.
Update settings for checkpoint restart outputs in runConfig.sh. Instructions for configuring both regular and irregular frequency restart files are included in the file.
To turn diagnostic collections on or off, comment out (“#”) collection names in the “COLLECTIONS” list at the top of file HISTORY.rc. Collections cannot be turned on or off from runConfig.sh. However, all diagnostic collections that come with the run directory have frequency, duration, and mode auto-set within runConfig.sh. That file contains a list of time-averaged collections and a list of instantaneous collections, and allows setting a frequency and duration to apply to all collections in each list. Look for section Output Diagnostics within runConfig.sh. To avoid auto-update of a certain collection, remove it from the list in runConfig.sh. If adding a new collection, you can add it to the appropriate list to enable auto-update of its frequency, duration, and mode.
Adding a new diagnostics collection in GCHP is the same as for GEOS-Chem Classic netCDF diagnostics. You must add your collection to the collection list in HISTORY.rc and then define it further down in the file. Any 2D or 3D arrays that are stored within the GEOS-Chem objects State_Met, State_Chm, or State_Diag may be included as fields in a collection. State_Met variables must be preceded by “Met_”, State_Chm variables must be preceded by “Chem_”, and State_Diag variables should not have a prefix.
See the HISTORY.rc file for examples. Once implemented, you can either incorporate the new collection settings into runConfig.sh for auto-update, or you can manually configure all settings in HISTORY.rc. See the Output Diagnostics section of runConfig.sh for more information.
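As an illustration, a new collection entry in HISTORY.rc might look roughly like the following. The collection name and field list are hypothetical, and the exact template and attribute formats should be checked against the existing collections in your own HISTORY.rc:

```
COLLECTIONS: 'SpeciesConc',
             'MyDiag',
::
  MyDiag.template:    '%y4%m2%d2_%h2%n2z.nc4',
  MyDiag.frequency:   010000,
  MyDiag.duration:    240000,
  MyDiag.mode:        'time-averaged',
  MyDiag.fields:      'SpeciesConc_O3  ', 'GCHPchem',
                      'Met_AIRDEN      ', 'GCHPchem',
::
```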
There is an option to automatically generate monthly diagnostics by submitting month-long simulations as separate jobs.
Splitting up the simulation into separate jobs is a requirement for monthly diagnostics because MAPL History requires a fixed number of hours set for diagnostic frequency and file duration.
The monthly mean diagnostic option automatically updates
HISTORY.rc diagnostic settings each month to reflect the number of days in that month taking into account leap years.
To use the monthly diagnostics option, first read and follow instructions for splitting a simulation into multiple jobs (see separate section on this page).
Prior to submitting your run, enable monthly diagnostics in
runConfig.sh by searching for variable “Monthly_Diag” and changing its value from 0 to 1.
Be sure to always start your monthly diagnostic runs on the first day of the month.
Besides compiling with
CMAKE_BUILD_TYPE=Debug, there are a few settings you can configure to boost your chance of successful debugging.
All of them involve sending additional print statements to the log files.
Set Turn on debug printout? in input.geos to T to turn on extra GEOS-Chem print statements in the main log file. Set the MAPL debug flags in runConfig.sh to 1 to turn on extra MAPL print statements in ExtData, the component that handles input. Set the Verbose and Warnings settings in HEMCO_Config.rc to their maximum values of 3 to send the maximum number of prints to HEMCO.log.
None of these options require recompiling. Be aware that all of them will slow down your simulation. Be sure to set them back to the default values after you are finished debugging.
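For reference, the input.geos and HEMCO_Config.rc settings described above look roughly like the following (the exact alignment varies by file; the runConfig.sh debug variable is omitted here since its name is documented in the script itself):

```
# In input.geos (simulation menu):
Turn on debug printout?: T

# In the SETTINGS section of HEMCO_Config.rc:
Verbose:                     3
Warnings:                    3
```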