.. Copyright 2022-2025 The Ramble Authors

   Licensed under the Apache License, Version 2.0 or the MIT license,
   at your option. This file may not be copied, modified, or distributed
   except according to those terms.

.. _variable_expansion_and_indirection_and_stack_parameterization_tutorial:

=======================================================================
8) Variable Expansion, Indirection, and Software Stack Parameterization
=======================================================================

In this tutorial, you will learn how to use variable expansion, indirection,
and software stack parameterization when generating experiments. For this
tutorial, we will use WRF, a free and open-source application for atmospheric
research and operational forecasting applications.

This tutorial builds on concepts introduced in previous tutorials. Please make
sure you review those before starting with this tutorial's content.

**NOTE:** In this tutorial, you will encounter expected errors when copying and
pasting the commands. This is to help show situations you might run into when
trying to use Ramble on your own, and illustrate how you might fix them.

Create a Workspace
------------------

To begin with, you need a workspace to configure the experiments. This can be
created with the following command:

.. code-block:: console

    $ ramble workspace create var_expansion_and_indirection

Activate the Workspace
----------------------

Several of Ramble's commands require an activated workspace to function
properly. Activate the newly created workspace using the following command:
(NOTE: you only need to run this if you do not currently have the workspace
active).

.. code-block:: console

    $ ramble workspace activate var_expansion_and_indirection

Configure Experiment Definitions
--------------------------------

To begin with, you need to configure the workspace. The workspace's root
location can be seen under the ``Location`` output of:

.. code-block:: console

    $ ramble workspace info

Additionally, the files can be edited directly with:

.. code-block:: console

    $ ramble workspace edit

Within the ``ramble.yaml`` file, write the following contents, which are the
final configuration from the previous tutorial.

.. literalinclude:: ../../../../examples/tutorial_8_base_config.yaml
   :language: YAML

The above configuration will execute 4 experiments, comprising a basic scaling
study on two different node counts across two different platforms. You will
expand this definition to perform the same sweep over multiple MPI
implementations. Over the course of this tutorial, you will learn how to use
variable expansion and indirection to construct more complex experiments.

Define Additional MPI and Parameterize Software Environments
------------------------------------------------------------

To begin with, you will parameterize the software stack definitions to generate
experiments using both IntelMPI and OpenMPI. For this section, you can focus on
the ``software`` portion of the ``ramble.yaml`` configuration file. For more
information on how this section is constructed, see the
:ref:`Software config section` documentation.

To start with, you will create an OpenMPI package definition. This might look
like the following:

.. code-block:: YAML

    packages:
      openmpi:
        pkg_spec: openmpi@3.1.6 +orterunprefix

In the definition of the Intel MPI package above, you'll see we originally
specified a ``compiler`` attribute (with the value of ``gcc9``). This can be
explicitly selected if you like. However, when using Spack, Ramble generates
Spack environments with ``unify: true`` (see Spack's environment documentation
for more details). As a result, OpenMPI should be compiled with the same
compiler used for WRF.

We also need to generate additional software environments. We will parameterize
the generation of these using a new variable definition:

.. code-block:: YAML

    environments:
      wrfv4-{mpi_name}:
        packages:
        - '{mpi_name}'
        - wrfv4
        variables:
          mpi_name: ['intel-mpi', 'openmpi']

This definition creates two software environments: one named
``wrfv4-intel-mpi`` and another named ``wrfv4-openmpi``.

However, the definition of ``mpi_name`` can be hoisted to the workspace level
because we need to include it in the experiment generation as well. The result
might look like the following:

.. literalinclude:: ../../../../examples/tutorial_8_mpi_config.yaml
   :language: YAML

**NOTE** The reference to ``{mpi_name}`` within the environment package list is
escaped using single quotes. This is to prevent YAML from parsing this as a
dictionary.

At this point, execute:

.. code-block:: console

    $ ramble workspace info

This should result in the following error:

.. code-block:: console

    ==> Error: Experiment wrfv4.CONUS_12km.scaling_1_platform1 is not unique.

This happens because you have implicitly defined 8 experiments (2 from
``n_nodes``, times 2 from ``platform_config``, times another 2 from
``mpi_name``) but have not updated the experiment name template, so several
experiments render to the same name. To resolve this, add ``{mpi_name}`` into
the experiment name template. Additionally, you may explicitly add ``mpi_name``
into the matrix. The result might look like the following:

.. literalinclude:: ../../../../examples/tutorial_8_mpi_matrix_config.yaml
   :language: YAML

Variable Expansion and Indirection
----------------------------------

At this stage, you have defined a workspace that will execute 8 experiments. It
is important to point out that different MPI implementations have different
command line flags for controlling their behavior. The existing ``mpi_command``
should work fine with both Intel MPI and OpenMPI, but to illustrate how variable
expansion and indirection can be used, you will now add a flag to control the
number of MPI ranks per compute node.

For Intel MPI this is:

.. code-block:: console

    -ppn {processes_per_node}

While in OpenMPI this is:

.. code-block:: console

    --map-by ppr:{processes_per_node}:node

One way to define this is to define ``mpi_command`` as a list variable, with
the appropriate MPI command line arguments. Then you can define an explicit zip
that combines ``mpi_command`` and ``mpi_name``. However, for the purposes of
this tutorial you will instead use variable expansion and indirection to look
up variable definitions.

In Ramble, every variable can be defined as a combination of other variables.
For example:

.. code-block:: YAML

    variables:
      processes_per_node: 4
      n_nodes: 2
      n_ranks: '{processes_per_node}*{n_nodes}'

This results in ``n_ranks`` having a value of 8, as each of the variable
references is expanded and then the math is evaluated.

Additionally, variable references are allowed to be nested to parameterize
which variables you want to use. For example:

.. code-block:: YAML

    variables:
      openmpi_args: '--np {n_ranks} --map-by ppr:{processes_per_node}:node -x OMP_NUM_THREADS'
      intel-mpi_args: '-n {n_ranks} -ppn {processes_per_node}'
      mpi_command: 'mpirun {{mpi_name}_args}'

This allows the ``mpi_command`` definition to change based on the definition of
``mpi_name``: the inner reference ``{mpi_name}`` selects which ``*_args``
variable the outer reference looks up. This is called variable indirection.

If we employ variable indirection to help parameterize the MPI arguments as
shown above, the resulting configuration might look like the following:

.. literalinclude:: ../../../../examples/tutorial_8_expansion_indirection_config.yaml
   :language: YAML

**NOTE** The arguments for the various MPI implementations may not run on your
system if you require additional arguments. To be able to execute these on your
system, make sure you modify these appropriately.

At this point, you have described the 8 experiments you want to run, but they
are still not completely defined. Running:

.. code-block:: console

    $ ramble workspace setup --dry-run

should result in the following error:

.. code-block:: console

    ==> Error: Environment wrfv4 is not defined.

This is because the default software environment every application uses is
named the same as the application (in this case, both would be named
``wrfv4``). You changed the name of the software environment, but didn't
connect each experiment to the proper environment.

Controlling Experiment Software Environments
--------------------------------------------

To control the software environment used within an experiment, Ramble allows
you to use the ``env_name`` variable definition. Because ``mpi_name`` is a list
variable, you might want ``env_name`` to be a list that is zipped with
``mpi_name`` to make sure they are iterated over together. However, you may
also use variable indirection and expansion to fix this issue.

For the purposes of this tutorial, we will use indirection instead of explicit
zips. The resulting configuration file might look like the following:

.. literalinclude:: ../../../../examples/tutorial_8_software_environments_config.yaml
   :language: YAML

In this case, we defined ``env_name`` to be ``wrfv4-{mpi_name}``, which matches
the definition of the software environments.

Dry Run Setup
-------------

Before executing the experiments, you can perform a dry run:

.. code-block:: console

    $ ramble workspace setup --dry-run

Then examine the contents of the rendered ``execute_experiment`` scripts in
some experiment directories. Looking at these, you should see the correct MPI
arguments within the relevant experiments.

.. include:: shared/wrf_execute.rst

Clean the Workspace
-------------------

Once you are finished with the tutorial content, make sure you deactivate your
workspace:

.. code-block:: console

    $ ramble workspace deactivate

Additionally, you can remove the workspace and all of its content with:

.. code-block:: console

    $ ramble workspace remove var_expansion_and_indirection

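
Recap of Variable Definitions
-----------------------------

For reference, the variable definitions introduced over the course of this
tutorial can be condensed into a single fragment. This is only a sketch and not
a complete ``ramble.yaml``: where each definition lives in your workspace, and
the values of ``n_ranks`` and ``processes_per_node``, are assumed to come from
the configuration files included in the sections above.

.. code-block:: YAML

    variables:
      # List variable: one set of experiments per MPI implementation
      mpi_name: ['intel-mpi', 'openmpi']
      # Per-implementation arguments, selected through indirection
      openmpi_args: '--np {n_ranks} --map-by ppr:{processes_per_node}:node -x OMP_NUM_THREADS'
      intel-mpi_args: '-n {n_ranks} -ppn {processes_per_node}'
      # The inner reference picks which *_args variable is expanded
      mpi_command: 'mpirun {{mpi_name}_args}'
      # Connects each experiment to the matching software environment
      env_name: 'wrfv4-{mpi_name}'

As an illustration, in an experiment where ``mpi_name`` is ``openmpi``,
``env_name`` expands to ``wrfv4-openmpi`` and ``{{mpi_name}_args}`` resolves to
the contents of ``openmpi_args``. Assuming ``processes_per_node`` is 4 and
``n_nodes`` is 2 (so ``n_ranks`` is 8, as in the earlier example), the rendered
command would be expected to read
``mpirun --np 8 --map-by ppr:4:node -x OMP_NUM_THREADS``. You can check this
against the rendered ``execute_experiment`` scripts produced by the dry run.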