Success Criteria

Ramble supports various methods of defining success criteria, which mark experiments as successfully completed, or failed when analyzing a workspace.

This document describes the supported functionality, and how to use success criteria in general.

Types of Success Criteria

Success criteria can be defined in two places: * An application.py file * A ramble.yaml workspace config file

Success criteria defined in an application.py file applies to all experiments this application generates. These are intended to provide more of a “did my experiment complete” success criteria.

Success criteria defined in a workspace config file only applies to those specific experiments. These criteria can evaluate if an experiment performed adequately.

The application.py interface is documented in the application developers guide.

The rest of this document focuses on how to write success criteria in a workspace configuration file. But it will describe general aspects of success criteria that can apply to either definition.

Defining Success Criteria

Success criteria defined in a workspace config file are all merged together, following the logic for optional sections in the workspace config file.

A block of success criteria is defined as a list of independent criteria that should be evaluated.

success_criteria:
- name: criteria1
  mode: 'string'
  match: 'Completed'
- name: criteria2
  mode: 'application_function'

Success criteria can mix modes, and are all evaluated. If any success criteria returns false, the experiment is marked as FAILED in the output of ramble workspace analyze.

Success Criteria Modes

Each success criteria has a mode. Currently, there are three supported modes.

'application_function'
'fom_comparison'
'string'

We will cover each of these supported modes in more detail below.

Mode: Application Function

A success criteria with mode='application_function' will use an application.py’s defined evaluate_success function to test for experiment success. This hook allows an application developer to perform more complicated actions that are specific to the experiment.

NOTE: A success criteria with mode='application_function' is added by default to all experiments. The base evaluate_success function always returns True.

Mode: FOM Comparison

Success criteria with mode='fom_comparison' allow experiments to define complex mathematical logic to evaluate success of an experiment. Defining these criteria within a workspace configuration file is as follows:

success_criteria:
- name: my_criteria
  mode: fom_comparison
  fom_name: 'application_fom_*'
  [fom_context: 'fom_context_*']
  formula: '{value} > 1.0'

The above example shows the full interface for defining success criteria with mode='fom_comparison'. The available attributes are:

name: The name of the specific criteria
mode: The mode of the specific criteria
fom_name: The name of the FOM to use for evaluating the formula
formula: The formula to evaluate for success.
fom_context: (Optional) The context fom_name should exist in. Defaults to null.

Both fom_name and fom_context support `python style globbing<https://docs.python.org/3/library/fnmatch.html>`_.

When using the globbing functionality, all contexts that match the fom_context argument are searched. Within each context, all FOMs that match fom_name are tested with formula.

The formula attribute can access any variables defined within an experiment. Additionally, an extra value variable is defined, which takes the value of the FOM as extracted from the output.

Mode: String

Success criteria with mode='string' allow experiments to define expected string regular expression matching. This criteria mode is useful if an application prints a string whenever it successfully completes. Defining these criteria within a workspace configuration file is as follows:

success_criteria:
- name: my_criteria
  mode: 'string'
  match: '\s+Completed\s+'
  [file: '{log_file}]

The above shows the full interface for degining success criteria with mode='string'. The available attributes are:

name: The name of the specific criteria.
mode: The mode of the specific criteria.
match: The regular expression used to test for success.
file: (Optional) The file (or variable that refers to a file) to test for the match in.

If match is found inside file this criteria is marked as success. If it is not found, then the criteria is marked as failed.