Success Criteria
Ramble supports several ways of defining success criteria, which mark experiments as successfully completed or failed when a workspace is analyzed.
This document describes the supported functionality, and how to use success criteria in general.
Types of Success Criteria
Success criteria can be defined in two places:
* An application.py file
* A ramble.yaml workspace config file
Success criteria defined in an application.py file apply to all experiments the application generates. They are intended as “did my experiment complete” checks.
Success criteria defined in a workspace config file apply only to those specific experiments. These criteria can evaluate whether an experiment performed adequately.
The application.py interface is documented in the application developers guide.
The rest of this document focuses on how to write success criteria in a workspace configuration file, but it also describes general aspects of success criteria that apply to either definition.
Defining Success Criteria
Success criteria defined in a workspace config file are all merged together, following the logic for optional sections in the workspace config file.
A block of success criteria is defined as a list of independent criteria that should be evaluated.
success_criteria:
- name: criteria1
  mode: 'string'
  match: 'Completed'
- name: criteria2
  mode: 'application_function'
Success criteria can mix modes, and all of them are evaluated. If any success
criterion returns false, the experiment is marked as FAILED in the output of
ramble workspace analyze.
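The "all criteria must pass" rule can be pictured with a minimal Python sketch. It is not Ramble's implementation: the result values are made up, and the 'SUCCESS' label is an assumption (only FAILED is named in the text above).

```python
# Hypothetical per-criterion results for one experiment. The names mirror the
# example block above; the boolean values are made up for illustration.
criteria_results = {
    "criteria1": True,   # mode 'string': the pattern was found
    "criteria2": False,  # mode 'application_function': the hook returned False
}


def experiment_status(results):
    """Return 'FAILED' if any criterion is false, else 'SUCCESS' (assumed label)."""
    return "SUCCESS" if all(results.values()) else "FAILED"


status = experiment_status(criteria_results)
print(status)
```

A single failing criterion is enough to fail the experiment, regardless of which mode produced it.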
Success Criteria Modes
Each success criterion has a mode. Three modes are currently supported:
* 'application_function'
* 'fom_comparison'
* 'string'
We will cover each of these supported modes in more detail below.
Mode: Application Function
A success criterion with mode='application_function' calls the
evaluate_success function defined in the application.py file to test for
experiment success. This hook allows an application developer to perform more
complicated checks that are specific to the experiment.
NOTE: A success criterion with mode='application_function' is added by
default to all experiments. The base evaluate_success function always
returns True.
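As a rough illustration of overriding the hook, the sketch below uses a stand-in base class rather than Ramble's real application class. The class names, the results_path attribute, and the zero-argument evaluate_success signature are all assumptions made for this example; the application developers guide has the actual interface.

```python
import os


class BaseApplication:
    """Stand-in for Ramble's application base class (not the real class)."""

    def evaluate_success(self):
        # Mirrors the documented default: always report success.
        return True


class MyApplication(BaseApplication):
    """Hypothetical application that requires a non-empty results file."""

    def __init__(self, results_path):
        # results_path is an illustrative attribute, not a Ramble variable.
        self.results_path = results_path

    def evaluate_success(self):
        # Succeed only if the results file exists and has content.
        return (
            os.path.isfile(self.results_path)
            and os.path.getsize(self.results_path) > 0
        )
```

The point of the hook is that the check can be arbitrary Python, which the other two modes cannot express.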
Mode: FOM Comparison
Success criteria with mode='fom_comparison' allow experiments to define
mathematical logic, based on a figure of merit (FOM), that decides whether an
experiment succeeded. They are defined in a workspace configuration file as
follows:
success_criteria:
- name: my_criteria
  mode: fom_comparison
  fom_name: 'application_fom_*'
  [fom_context: 'fom_context_*']
  formula: '{value} > 1.0'
The above example shows the full interface for defining success criteria with
mode='fom_comparison'. The available attributes are:
* name: The name of the specific criteria
* mode: The mode of the specific criteria
* fom_name: The name of the FOM to use for evaluating the formula
* formula: The formula to evaluate for success.
* fom_context: (Optional) The context fom_name should exist in. Defaults to null.
Both fom_name and fom_context support Python-style globbing
(https://docs.python.org/3/library/fnmatch.html).
When using the globbing functionality, all contexts that match the
fom_context argument are searched. Within each context, all FOMs that match
fom_name are tested with formula.
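The two-level search described above can be sketched with Python's fnmatch module. This is not Ramble's code: the FOM data and the matching_foms helper are hypothetical, and the use of the case-sensitive fnmatchcase is an assumption.

```python
from fnmatch import fnmatchcase

# Hypothetical extracted FOMs, grouped by context. All names are made up.
foms_by_context = {
    "null": {"application_fom_total": 4.2},
    "fom_context_rank0": {"application_fom_bw": 1.5, "other_fom": 0.1},
    "fom_context_rank1": {"application_fom_bw": 0.9},
    "unrelated_context": {"application_fom_bw": 7.0},
}


def matching_foms(foms, context_glob, name_glob):
    """Yield (context, fom_name, value) for every FOM matching both globs."""
    for context, context_foms in foms.items():
        # First filter contexts, then FOM names within each matching context.
        if not fnmatchcase(context, context_glob):
            continue
        for fom_name, value in context_foms.items():
            if fnmatchcase(fom_name, name_glob):
                yield context, fom_name, value


matches = list(
    matching_foms(foms_by_context, "fom_context_*", "application_fom_*")
)
```

Here only the two fom_context_rank* contexts are searched, and within them only application_fom_bw is selected; each selected value would then be tested with formula.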
The formula attribute can access any variables defined within an
experiment. Additionally, an extra value variable is defined, which takes
the value of the FOM as extracted from the output.
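To make the role of value concrete, here is a small sketch of expanding and evaluating a formula. It is an illustration under assumptions, not Ramble's expander: placeholders are filled with str.format, only basic arithmetic and a single comparison are supported, and the n_ranks variable in the usage line is just an example experiment variable.

```python
import ast
import operator

# Operators this sketch understands; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_COMPARE = {
    ast.Gt: operator.gt,
    ast.GtE: operator.ge,
    ast.Lt: operator.lt,
    ast.LtE: operator.le,
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
}


def _evaluate(node):
    """Evaluate a small arithmetic or comparison expression tree."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_evaluate(node.operand)
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if (
        isinstance(node, ast.Compare)
        and len(node.ops) == 1
        and type(node.ops[0]) in _COMPARE
    ):
        compare = _COMPARE[type(node.ops[0])]
        return compare(_evaluate(node.left), _evaluate(node.comparators[0]))
    raise ValueError("unsupported expression")


def formula_passes(formula, variables, fom_value):
    """Expand {placeholders}, then evaluate the resulting expression."""
    # 'value' is the extra variable holding the extracted FOM value.
    expanded = formula.format(**{**variables, "value": fom_value})
    return bool(_evaluate(ast.parse(expanded, mode="eval").body))


passed = formula_passes("{value} > 1.0", {}, 1.5)
per_rank_ok = formula_passes("{value} / {n_ranks} >= 2.0", {"n_ranks": 4}, 6.0)
```

With these inputs the first formula expands to 1.5 > 1.0 and passes, while the second expands to 6.0 / 4 >= 2.0 and fails.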
Mode: String
Success criteria with mode='string' allow experiments to define a regular
expression that is expected to match in an output file. This mode is useful
if an application prints a string whenever it successfully completes. These
criteria are defined in a workspace configuration file as follows:
success_criteria:
- name: my_criteria
  mode: 'string'
  match: '\s+Completed\s+'
  [file: '{log_file}']
The above shows the full interface for defining success criteria with
mode='string'. The available attributes are:
* name: The name of the specific criteria.
* mode: The mode of the specific criteria.
* match: The regular expression used to test for success.
* file: (Optional) The file (or variable that refers to a file) to test for the match in.
If match is found inside file, this criterion is marked as successful. If it
is not found, the criterion is marked as failed.
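The effect of the example pattern can be checked with Python's re module. This sketch assumes the pattern is searched for anywhere in the text; whether Ramble applies it per line or to the whole file, and whether it anchors the match, is not stated here. The helper name and the log contents are hypothetical.

```python
import re


def string_criterion_passes(pattern, text):
    """Return True if the regular expression matches anywhere in the text."""
    return re.search(pattern, text) is not None


pattern = r"\s+Completed\s+"

# Made-up log contents for illustration.
log_ok = "step 10 done\n  Completed  \n"
log_bad = "step 10 done\nCompleted"

ok = string_criterion_passes(pattern, log_ok)
bad = string_criterion_passes(pattern, log_bad)
```

Note that the \s+ on both sides requires whitespace before and after Completed, so a log that ends immediately after the word does not match under this sketch. A looser pattern such as 'Completed' avoids that.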