11) Internals
In this tutorial, you will learn how to use internals within experiments
for
WRF, a free and open-source
application for atmospheric research and operational forecasting applications.
This tutorial builds off of concepts introduced in previous tutorials. Please make sure you review those before starting with this tutorial’s content.
Create a Workspace
To begin with, you need a workspace to configure the experiments. This can be created with the following command:
$ ramble workspace create internals_wrf
Activate the Workspace
Several of Ramble’s commands require an activated workspace to function properly. Activate the newly created workspace using the following command: (NOTE: you only need to run this if you do not currently have the workspace active).
$ ramble workspace activate internals_wrf
Configure Experiment Definitions
To being with, you need to configure the workspace. The workspace’s root
location can be seen under the Location output of:
$ ramble workspace info
Alternatively, the files can be edited directly with:
$ ramble workspace edit
Within the ramble.yaml file, write the following contents, which is the
final configuration from a previous tutorial.
NOTE: This workspace utilizes the spack package manager. As a result, it
requires spack is installed and available in your path. Modifications to
the package_manager variant will change this behavior.
# Copyright 2022-2025 The Ramble Authors
# Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
# https://www.apache.org/licenses/LICENSE-2.0> or the MIT license
# <LICENSE-MIT or https://opensource.org/licenses/MIT>, at your
# option. This file may not be copied, modified, or distributed
# except according to those terms.
ramble:
variants:
package_manager: spack
env_vars:
set:
OMP_NUM_THREADS: '{n_threads}'
variables:
processes_per_node: 16
n_ranks: '{processes_per_node}*{n_nodes}'
batch_submit: '{execute_experiment}'
mpi_command: mpirun -n {n_ranks}
applications:
wrfv4:
workloads:
CONUS_12km:
experiments:
scaling_{n_nodes}:
variables:
n_nodes: [1, 2]
software:
packages:
gcc9:
pkg_spec: gcc@9.4.0
intel-mpi:
pkg_spec: intel-oneapi-mpi@2021.11.0
compiler: gcc9
wrfv4:
pkg_spec: wrf@4.2 build_type=dm+sm compile_type=em_real nesting=basic ~chem
~pnetcdf
compiler: gcc9
environments:
wrfv4:
packages:
- intel-mpi
- wrfv4
The above configuration will execute 2 experiments, comprising a basic scaling study on 2 different sets of nodes. This is primarily defined by the use of vector experiments, which are documented in the vector logic portion of the workspace configuration file documentation. Vector experiments were also introduced in the vector and matrix tutorial.
Experiment Internals
In Ramble, the concept of internals allows a user to override some aspects
of a workload within the workspace configuration file. More information about
internals can be seen at Controlling Internals.
The internals block within a workspace configuration file can be used to
define custom executables, and control the order of executables within an
experiment.
In this tutorial, you will define new executables for tracking the start and end timestamp of each experiment, and properly inject these into the experiment order.
Define New Executables
The definition of a new executable lives within an internals block. Below
is an example of defining a new executable called start_time which time in
seconds since 1970-01-01 00:00 UTC:
internals:
custom_executables:
start_time:
template:
- 'date +%s'
use_mpi: false
redirect: '{experiment_run_dir}/start_time'
Within this start_time definition, the template attribute takes a list
of strings which will be injected as part of this executable. The use_mpi
attribute tells Ramble if this executable apply the mpi_command variable
definition as a prefix to every entry of the template attribute. The
redirect attribute defines the file each portion of template should be
redirected into.
Not shown above is the output_capture attribute, which defines the operator
used for capturing the output from the portions of template (the default is
&>).
By default, this would define the actual command to be:
date +%s &> {experiment_run_dir}/start_time
Edit your workspace configuration file using:
$ ramble workspace edit
Within this file, use the example above to define two new executables
start_time and end_time. Make sure you change the value of redirect
in the end_time executable definition. The resulting file should look like
the following:
# Copyright 2022-2025 The Ramble Authors
# Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
# https://www.apache.org/licenses/LICENSE-2.0> or the MIT license
# <LICENSE-MIT or https://opensource.org/licenses/MIT>, at your
# option. This file may not be copied, modified, or distributed
# except according to those terms.
ramble:
variants:
package_manager: spack
env_vars:
set:
OMP_NUM_THREADS: '{n_threads}'
variables:
processes_per_node: 16
n_ranks: '{processes_per_node}*{n_nodes}'
batch_submit: '{execute_experiment}'
mpi_command: mpirun -n {n_ranks}
applications:
wrfv4:
workloads:
CONUS_12km:
experiments:
scaling_{n_nodes}:
internals:
custom_executables:
start_time:
template:
- date +%s
redirect: '{experiment_run_dir}/start_time'
use_mpi: false
end_time:
template:
- date +%s
redirect: '{experiment_run_dir}/end_time'
use_mpi: false
variables:
n_nodes: [1, 2]
software:
packages:
gcc9:
pkg_spec: gcc@9.4.0
intel-mpi:
pkg_spec: intel-oneapi-mpi@2021.11.0
compiler: gcc9
wrfv4:
pkg_spec: wrf@4.2 build_type=dm+sm compile_type=em_real nesting=basic ~chem
~pnetcdf
compiler: gcc9
environments:
wrfv4:
packages:
- intel-mpi
- wrfv4
Defining Executable Order
At this point, start_time and end_time are defined as new executables,
however they are not added to your experiments. To verify this, execute:
$ ramble workspace setup --dry-run
and examine the execute_experiment scripts in your experiment directories.
date +%s should not be present in any of these. To fix this issue, we need
to modify the order of the executables for the workload your experiments are
using.
Currently, when controlling the order of executables, the entire order of executables must be defined. To see the current list of executables for your experiments, execute:
$ ramble info --attrs workloads -v -p "CONUS_12km" wrfv4
This prints the information for the CONUS_12km workload in the wrfv4 application definition.
Executables: definition lists the order of executables used for this
workload. As an example, you might see the following:
Executables: ['cleanup', 'copy', 'fix_12km', 'execute']
Some executables are provided through the builtin functionality. These are
executable commands that are injected by default from the object definitions.
To be able to see these, you can execute:
$ ramble info --attrs builtins -v wrfv4
This command should print something like the following:
############
# builtins #
############
builtin::env_vars:
name: env_vars
required: True
injection_method: prepend
depends_on: []
Now, edit the workspace configuration file with:
$ ramble workspace edit
And define the order of the executables for your experiments to include
start_time and end_time in the correct locations. To do this, add a
executables attribute to the internals dictionary. The contents of
executables are a list of executable names provided in the order you
want them to be executed.
For the purposes of this tutorial, add start_time directly before
execute and end_time directly after execute. The resulting
configuration file should look like the following:
# Copyright 2022-2025 The Ramble Authors
# Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
# https://www.apache.org/licenses/LICENSE-2.0> or the MIT license
# <LICENSE-MIT or https://opensource.org/licenses/MIT>, at your
# option. This file may not be copied, modified, or distributed
# except according to those terms.
ramble:
variants:
package_manager: spack
env_vars:
set:
OMP_NUM_THREADS: '{n_threads}'
variables:
processes_per_node: 16
n_ranks: '{processes_per_node}*{n_nodes}'
batch_submit: '{execute_experiment}'
mpi_command: mpirun -n {n_ranks}
applications:
wrfv4:
workloads:
CONUS_12km:
experiments:
scaling_{n_nodes}:
internals:
custom_executables:
start_time:
template:
- date +%s
redirect: '{experiment_run_dir}/start_time'
use_mpi: false
end_time:
template:
- date +%s
redirect: '{experiment_run_dir}/end_time'
use_mpi: false
executables:
- builtin::env_vars
- package_manager_builtin::spack::spack_source
- package_manager_builtin::spack::spack_activate
- cleanup
- copy
- fix_12km
- start_time
- execute
- end_time
variables:
n_nodes: [1, 2]
software:
packages:
gcc9:
pkg_spec: gcc@9.4.0
intel-mpi:
pkg_spec: intel-oneapi-mpi@2021.11.0
compiler: gcc9
wrfv4:
pkg_spec: wrf@4.2 build_type=dm+sm compile_type=em_real nesting=basic ~chem
~pnetcdf
compiler: gcc9
environments:
wrfv4:
packages:
- intel-mpi
- wrfv4
NOTE Omitting any executables from the executables list will
prevent it from being used in the generated experiments.
Execute Experiments
Now that you have made the appropriate modifications, set up, execute, and analyze the new experiments using:
$ ramble workspace setup
$ ramble on
$ ramble workspace analyze
This creates a results file in the root of the workspace that contains
extracted figures of merit. If the experiments were successful, this file will
show the following results:
Average Timestep Time: Time (in seconds) on average each timestep takes
Cumulative Timestep Time: Time (in seconds) spent executing all timesteps
Minimum Timestep Time: Minimum time (in seconds) spent on any one timestep
Maximum Timestep Time: Maximum time (in seconds) spent on any one timestep
Number of timesteps: Count of total timesteps performed
Avg. Max Ratio Time: Ratio of Average Timestep Time and Maximum Timestep Time
Examining the experiment run directories, you should see start_time and
end_time files which contain the output of our custom executables.
Using Executable Injection
In addition to the full explicit method of injecting an executable shown above, you can inject executables relative to existing executables in the experiment’s executable list, this is documented in the internals config section and workspace internals documentation sections.
As an example, the following YAML could replace the executables section of
your existing configuration with the following:
executable_injection:
- name: start_time
order: before
relative_to: execute
- name: end_time
order: after
relative_to: execute
Go ahead and edit the workspace configuration file with:
$ ramble workspace edit
Replace the executables block with the executable_injection block
presented above. The resulting configuration file should look like the
following:
# Copyright 2022-2025 The Ramble Authors
# Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
# https://www.apache.org/licenses/LICENSE-2.0> or the MIT license
# <LICENSE-MIT or https://opensource.org/licenses/MIT>, at your
# option. This file may not be copied, modified, or distributed
# except according to those terms.
ramble:
variants:
package_manager: spack
variables:
processes_per_node: 4
n_ranks: '{processes_per_node}*{n_nodes}'
batch_submit: '{execute_experiment}'
mpi_command: mpirun -n {n_ranks}
applications:
wrfv4:
workloads:
CONUS_12km:
experiments:
scaling_{n_nodes}:
internals:
custom_executables:
start_time:
template:
- date +%s
redirect: '{experiment_run_dir}/start_time'
use_mpi: false
end_time:
template:
- date +%s
redirect: '{experiment_run_dir}/end_time'
use_mpi: false
executable_injection:
- name: start_time
order: before
relative_to: execute
- name: end_time
order: after
relative_to: execute
variables:
n_nodes: [1, 2, 4]
software:
packages:
gcc9:
pkg_spec: gcc@9.4.0
intel-mpi:
pkg_spec: intel-oneapi-mpi@2021.11.0
compiler: gcc9
wrfv4:
pkg_spec: wrf@4.2 build_type=dm+sm compile_type=em_real nesting=basic ~chem
~pnetcdf
compiler: gcc9
environments:
wrfv4:
packages:
- intel-mpi
- wrfv4
Execute Experiments
Now that you have made the appropriate modifications, set up, execute, and analyze the new experiments using:
$ ramble workspace setup
$ ramble on
$ ramble workspace analyze
This creates a results file in the root of the workspace that contains
extracted figures of merit. If the experiments were successful, this file will
show the following results:
Average Timestep Time: Time (in seconds) on average each timestep takes
Cumulative Timestep Time: Time (in seconds) spent executing all timesteps
Minimum Timestep Time: Minimum time (in seconds) spent on any one timestep
Maximum Timestep Time: Maximum time (in seconds) spent on any one timestep
Number of timesteps: Count of total timesteps performed
Avg. Max Ratio Time: Ratio of Average Timestep Time and Maximum Timestep Time
Examining the experiment run directories, you should see start_time and
end_time in the same places as they were when you ran the explicitly
defined order experiments.
Clean the Workspace
Once you are finished with the tutorial content, make sure you deactivate your workspace:
$ ramble workspace deactivate
Additionally, you can remove the workspace and all of its content with:
$ ramble workspace remove internals_wrf