yaecs

yaecs.config_history module

class ConfigHistory(folder_path, config_filters=None, difference_processor=None, ignore_for_graphing=None, add_relevant_edges=False, tolerance=0, group_by=None, metrics=None, display_span=False, config_class=<class 'yaecs.config.config.Configuration'>)

Bases: object

Configuration history manager.

compute_span()

Computes span of parameter in experiments.

compute_metrics(metrics)

Computes all defined metrics to prepare for printing.

compute_graph(add_relevant_edges=False, tolerance=0)

Computes the graph itself before filling it in.

compute_colors(scheme='date', fill='top', legend=True, base_color='white', class_scheme='/set312/', number_scheme='/blues9/')

Gets the colour scheme and interpolates it properly.

draw_graph(path='graph.png', scheme='date', fill='top', legend=True)

Function called to draw the computed graph.

format_metrics(index)

Format computed metrics for printing.

format_span()

Format the span of a param over experiments for printing.

compute_difference_matrix()

Compute difference matrix to prepare for printing.

static get_experiment_name_from_file(file, folder, name=None)

Extracts the experiment name from the name of a file saved in an experiment folder.

static format_list(list_to_format)

Formats a list for printing.

static make_processor(argument)

Returns a processor created from input.

yaecs.experiment module

class Experiment(config, main_function, experiment_name=None, run_name=None, params_filter_fn=None, log_modified_params_only=True, only_params_to_log=None, params_not_to_log=None, description_formatter=None)

Bases: object

Class automating tracking using different tracking packages.

Creates an instance of the Experiment class, which wraps around a main function.

Parameters:
  • config (Configuration) – config used for the experiment

  • main_function (Callable) – function to run to perform the experiment

  • experiment_name (Optional[str]) – name of the experiment, defaults to name of the folder set as experiment path in the config

  • run_name (Optional[str]) – name of the run, defaults to the index of the run in the experiment folder

  • params_filter_fn (Optional[Callable[[Configuration], List[str]]]) – function to use instead of the default filter to get the list of the names of the parameters to log to the tracker from the config. If this is used, then ‘log_modified_params_only’, ‘only_params_to_log’ and ‘params_not_to_log’ are ignored.

  • log_modified_params_only (bool) – whether the parameters filtered by the other arguments are only those that changed compared to the default config (True) or all the parameters of the config (False)

  • only_params_to_log (Optional[List[str]]) – if provided, only the parameters whose names are given will be filtered and logged

  • params_not_to_log (Optional[List[str]]) – if provided, parameters whose names are given will be filtered out

  • description_formatter (Optional[Callable[[Optional[str]], str]]) – optional function to use to format the provided run description instead of the default formatter self.default_formatter

default_formatter(description)

This function formats the provided description before passing it to the trackers.

Parameters:

description (Optional[str]) – provided description to format. You can use the tag %h once in the description. Everything before this tag will be considered the header of the description

Raises:

RuntimeError – when more than one header tag is detected in the description

Return type:

str

Returns:

formatted description
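
The header-tag rule described above can be illustrated with a minimal sketch. It is not YAECS's formatter: the helper name split_header and the (header, body) return value are assumptions made for illustration, and the actual formatted output of default_formatter is not documented here.

```python
from typing import Optional, Tuple

HEADER_TAG = "%h"


def split_header(description: Optional[str]) -> Tuple[str, str]:
    """Hypothetical helper: split a description into (header, body)."""
    if description is None:
        return "", ""
    # At most one header tag is allowed, mirroring the documented RuntimeError.
    if description.count(HEADER_TAG) > 1:
        raise RuntimeError("Only one header tag is allowed in a description.")
    if HEADER_TAG not in description:
        return "", description.strip()
    # Everything before the tag is the header, the rest is the body.
    header, body = description.split(HEADER_TAG)
    return header.strip(), body.strip()


header, body = split_header("Baseline %h First run with the default learning rate")
```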

run(run_description=None, **kwargs)

Creates all variations of the config and starts a run for each of them.

Parameters:
  • run_description (Optional[str]) – if passed, will serve as a description for the purpose of the current run. You can use the tag %h once in the description. Everything before this tag will be considered the header of the description

  • kwargs – arguments to pass to the main function aside from the config and the tracker

Return type:

Any

Returns:

whatever the main function returns

run_single(run_description=None, **kwargs)

Runs a single experiment with the defined config. Does not check the config variations.

Parameters:
  • run_description (Optional[str]) – if passed, will serve as a description for the purpose of the current run. You can use the tag %h once in the description. Everything before this tag will be considered the header of the description

  • kwargs – arguments to pass to the main function aside from the config and the tracker

Return type:

Any

Returns:

whatever the main function returns

class Tracker(tracker_config, experiment, experiment_name=None, run_name=None, params_filter_fn=None, log_modified_params_only=True, only_params_to_log=None, params_not_to_log=None)

Bases: object

Class created by Experiment to log values.

Reads the tracker config from the general config to create a Tracker object used for logging during the run.

Parameters:
  • tracker_config (Dict[str, Any]) – tracker config from the general config

  • experiment (Experiment) – passed automatically from the instance of Experiment this tracker originates from

  • experiment_name (Optional[str]) – name for the experiment (inferred from the experiment path if not provided)

  • run_name (Optional[str]) – name for the run (inferred from the experiment path if not provided)

  • params_filter_fn (Optional[Callable]) – function to use instead of the default filter to get the list of the names of the parameters to log to the tracker from the config. If this is used, then ‘log_modified_params_only’, ‘only_params_to_log’ and ‘params_not_to_log’ are ignored.

  • log_modified_params_only (bool) – whether the parameters filtered by the other arguments are only those that changed compared to the default config (True) or all the parameters of the config (False)

  • only_params_to_log (Optional[List[str]]) – if provided, only the parameters whose names are given will be filtered and logged

  • params_not_to_log (Optional[List[str]]) – if provided, parameters whose names are given will be filtered out

default_filter(config)

Default parameters filtering function. Uses the pre-defined ‘log_modified_params_only’, ‘only_params’ and ‘except_params’ attributes to filter out the parameters of the config. It starts with a list of all the params which were modified from the default config, except if log_modified_params_only was set to False in which case it starts from all the parameters in the config. Then, it filters out all hooks, then all parameters in except_params, and finally only keeps those that are also in only_params.

Parameters:

config (Configuration) – instance of Configuration from which to filter the parameters

Return type:

List[str]

Returns:

the list of the filtered parameters names
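
The filtering order described for default_filter can be sketched in plain Python. This is an illustration under assumptions, not the library's code: the function name and its list-based inputs are hypothetical, and "hooks" is interpreted here as the names of hooked parameters.

```python
from typing import Iterable, List, Optional


def filter_param_names(
    all_params: Iterable[str],
    modified_params: Iterable[str],
    hooked_params: Iterable[str],
    log_modified_params_only: bool = True,
    only_params: Optional[Iterable[str]] = None,
    except_params: Optional[Iterable[str]] = None,
) -> List[str]:
    """Hypothetical stand-in reproducing the documented filtering steps."""
    # Step 1: start from the modified parameters, or from all of them.
    names = list(modified_params if log_modified_params_only else all_params)
    # Step 2: drop hooked parameters.
    hooked = set(hooked_params)
    names = [name for name in names if name not in hooked]
    # Step 3: drop explicitly excluded parameters.
    if except_params is not None:
        excluded = set(except_params)
        names = [name for name in names if name not in excluded]
    # Step 4: keep only the explicitly requested parameters.
    if only_params is not None:
        kept = set(only_params)
        names = [name for name in names if name in kept]
    return names


all_names = ["lr", "epochs", "save_dir", "seed"]
modified = ["lr", "save_dir", "seed"]
hooked = ["save_dir"]
default_result = filter_param_names(all_names, modified, hooked)
```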

extract_names()

Gets experiment and run names, using either names given when instantiating the Experiment (if provided) or inferring them from the config’s experiment path. If nothing is provided and no experiment path is defined, raises an error.

Raises:

RuntimeError – if no experiment path is defined in the config and no experiment or run name is provided

Return type:

Tuple[str, str]

Returns:

the experiment name and the run name

start_run(description=None)

Initialises the configured trackers, which most of the time means preparing their logger in self.loggers.

Return type:

None

log_scalar(name, value, step=None, sub_logger=None, description=None, main_process_only=False)

Logs the given value under the given name at given step in the configured trackers. The description is optional and can only be used if the tracker is tensorboard.

Parameters:
  • name (str) – name for the value to be logged

  • value (Union[float, int]) – value to be logged

  • step (Optional[int]) – step at which the value is logged. If not provided, will default to 0 for the tensorboard tracker, will default to -1 for the basic tracker and will be logged as a “single value” for the clearml tracker

  • sub_logger (Optional[str]) – if specified, logs to corresponding sub-logger. Can be interpreted as a sub-folder for the scalar name most of the time, but in the case of tensorboard will actually use a different summary writer

  • description (Optional[str]) – only used for the tensorboard tracker, corresponds to a short description of the value

  • main_process_only (bool) – do not try to log in pytorch-lightning sub-processes

Return type:

None

log_scalars(dictionary, step=None, sub_logger=None, main_process_only=False)

Logs several values contained in a dictionary, one by one using Tracker.log_scalar.

Parameters:
  • dictionary (Dict[str, Any]) – dictionary containing the (name, value) pairs to be logged

  • step (Optional[int]) – step at which the value is logged. If not provided, will default to 0 for the tensorboard tracker, will default to -1 for the basic tracker and will be logged as a “single value” for the clearml tracker

  • sub_logger (Optional[str]) – if specified, logs to corresponding sub-logger. Can be interpreted as a sub-folder for the scalar name most of the time, but in the case of tensorboard will actually use a different summary writer

  • main_process_only (bool) – do not try to log in pytorch-lightning sub-processes

Return type:

None
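
As a rough illustration of the "one by one" behaviour, the sketch below uses a toy stand-in class rather than the real Tracker. The class name ToyTracker and its records attribute are hypothetical; only the method names and the step and sub_logger arguments come from the signatures above.

```python
from typing import Any, Dict, List, Optional, Tuple


class ToyTracker:
    """Toy stand-in for a tracker: it stores logged values in a list."""

    def __init__(self) -> None:
        self.records: List[Tuple[Optional[str], str, Any, Optional[int]]] = []

    def log_scalar(self, name: str, value: Any, step: Optional[int] = None,
                   sub_logger: Optional[str] = None) -> None:
        self.records.append((sub_logger, name, value, step))

    def log_scalars(self, dictionary: Dict[str, Any], step: Optional[int] = None,
                    sub_logger: Optional[str] = None) -> None:
        # Each (name, value) pair is forwarded to log_scalar with the same step.
        for name, value in dictionary.items():
            self.log_scalar(name, value, step=step, sub_logger=sub_logger)


tracker = ToyTracker()
tracker.log_scalars({"loss": 0.42, "accuracy": 0.9}, step=3, sub_logger="val")
```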

class BasicTrackerContext(logger_path, runs, current)

Bases: object

Class used to set up the context for YAECS’ basic tracker.

Initialises a context used to declare the loggers required by the basic tracker.

Parameters:
  • logger_path (str) – path used by the basic tracker to log

  • runs (Optional[int]) – number of runs in the experiment

  • current (Optional[int]) – index of current run from 0 to runs-1

class MLFContext

Bases: object

Class used to set up the context for mlflow’s tracker.

class CMLContext(tracker)

Bases: object

Class used to set up the context for ClearML’s tracker.

Initialises a context used to close the ClearML runs when they are done.

Parameters:

tracker (Tracker) – tracker object where to find the ClearML runs

yaecs.pytorch_lightning_utils module

yaecs.user_utils module

tqdm_file()

Utility function which returns a file to which users can log their TQDM bars to make them YAECS-friendly.

Returns:

TQDM file

get_template_class(default_config_path=None, pre_processing_dict=None, post_processing_dict=None, experiment_path=None, tracker_config=None, additional_configs_suffix=None, additional_configs_prefix=None, variations_suffix=None, variations_prefix=None, grids_suffix=None, grids_prefix=None)

Creates a template Configuration subclass to use in a small project where little customisation is needed.

Parameters:
  • default_config_path (Union[str, dict, None]) – path to the default config to use for the template

  • pre_processing_dict (Optional[Dict[str, Callable]]) – pre-processing dict to use for the template. If this gets large, consider implementing the subclass yourself as this will be clearer and more flexible.

  • post_processing_dict (Optional[Dict[str, Callable]]) – post-processing dict to use for the template. If this gets large, consider implementing the subclass yourself as this will be clearer and more flexible.

  • experiment_path (Optional[str]) – automatically adds relevant pre-processing rules to treat the given parameter name as the experiment path

  • tracker_config (Optional[str]) – automatically adds relevant pre-processing rules to treat the given parameter name as the tracker config

  • additional_configs_suffix (Optional[str]) – automatically adds relevant pre-processing rules to consider parameter names ending with ‘additional_configs_suffix’ as paths to additional config files

  • additional_configs_prefix (Optional[str]) – automatically adds relevant pre-processing rules to consider parameter names starting with ‘additional_configs_prefix’ as paths to additional config files

  • variations_suffix (Optional[str]) – automatically adds relevant pre-processing rules to consider parameter names ending with ‘variations_suffix’ as config variations

  • variations_prefix (Optional[str]) – automatically adds relevant pre-processing rules to consider parameter names starting with ‘variations_prefix’ as config variations

  • grids_suffix (Optional[str]) – automatically adds relevant pre-processing rules to consider parameter names ending with ‘grids_suffix’ as grids

  • grids_prefix (Optional[str]) – automatically adds relevant pre-processing rules to consider parameter names starting with ‘grids_prefix’ as grids

Return type:

Type[Configuration]

Returns:

a template Configuration subclass

make_config(*configs, config_class=None, pre_processing_dict=None, post_processing_dict=None, experiment_path=None, tracker_config=None, additional_configs_suffix=None, additional_configs_prefix=None, variations_suffix=None, variations_prefix=None, grids_suffix=None, grids_prefix=None, fallback='{}', pattern='--config', **class_building_kwargs)

One-liner wrapper to create a config from dicts/strings without the need for declaring a subclass. Useful for scripts or jupyter notebooks. Impractical/hacky for larger projects.

Parameters:
  • configs (Union[str, dict, List[Union[str, dict]]]) – dicts and strings defining a config

  • config_class (Optional[Type[Configuration]]) – class to use to build the configuration. If not provided, use a template instead.

  • pre_processing_dict (Optional[Dict[str, Callable]]) – pre-processing dict to use for the template. If this gets large, consider implementing the subclass yourself as this will be clearer and more flexible. Only used if config_class is not provided.

  • post_processing_dict (Optional[Dict[str, Callable]]) – post-processing dict to use for the template. If this gets large, consider implementing the subclass yourself as this will be clearer and more flexible.

  • experiment_path (Optional[str]) – automatically adds relevant pre-processing rules to treat the given parameter name as the experiment path. Only used if config_class is not provided.

  • tracker_config (Optional[str]) – automatically adds relevant pre-processing rules to treat the given parameter name as the tracker config. Only used if config_class is not provided.

  • additional_configs_suffix (Optional[str]) – automatically adds relevant pre-processing rules to consider parameter names ending with ‘additional_configs_suffix’ as paths to additional config files. Only used if config_class is not provided.

  • additional_configs_prefix (Optional[str]) – automatically adds relevant pre-processing rules to consider parameter names starting with ‘additional_configs_prefix’ as paths to additional config files. Only used if config_class is not provided.

  • variations_suffix (Optional[str]) – automatically adds relevant pre-processing rules to consider parameter names ending with ‘variations_suffix’ as config variations. Only used if config_class is not provided.

  • variations_prefix (Optional[str]) – automatically adds relevant pre-processing rules to consider parameter names starting with ‘variations_prefix’ as config variations. Only used if config_class is not provided.

  • grids_suffix (Optional[str]) – automatically adds relevant pre-processing rules to consider parameter names ending with ‘grids_suffix’ as grids. Only used if config_class is not provided.

  • grids_prefix (Optional[str]) – automatically adds relevant pre-processing rules to consider parameter names starting with ‘grids_prefix’ as grids. Only used if config_class is not provided.

  • fallback (Union[List[Union[str, dict]], str, dict, None]) – if provided, use this as a fallback when no config is provided in argv. The default value “{}” stands for “by default do not merge anything if no merge pattern is found in argv”

  • pattern (str) – pattern to use to find the config in argv.

  • class_building_kwargs – same kwargs as those used in all Configuration constructors. Only used if config_class is not provided.

Return type:

Configuration

Returns:

config object

yaecs.yaecs_utils module

add_to_csv(csv_path, name, value, step)

Adds a logged value to the csv containing previously logged values.

Parameters:
  • csv_path (str) – path to the csv containing the logged values

  • name (str) – name of the value to log

  • value (Any) – value to log

  • step (int) – step for which to log the value

Return type:

None

assign_order(order=0)

Decorator used to give an order to a processing function. If several processing functions would be called at a given step, they are called in increasing order.

Parameters:

order (Union[Real, Priority]) – order to give the function

Return type:

Callable[[Callable], Callable]

Returns:

decorated function
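
A minimal sketch of the ordering idea follows. It does not use YAECS: the decorator name with_order, the two processing functions and the sorting loop are hypothetical, and storing the order in an attribute named "order" is inferred from the get_order description.

```python
from typing import Callable, List


def with_order(order: float = 0) -> Callable[[Callable], Callable]:
    """Hypothetical decorator attaching an 'order' attribute to a function."""
    def decorator(func: Callable) -> Callable:
        func.order = order
        return func
    return decorator


@with_order(10)
def to_absolute_path(value: str) -> str:
    return "/data/" + value


@with_order(-10)
def strip_spaces(value: str) -> str:
    return value.strip()


# Functions registered for the same step are applied in increasing order,
# regardless of the order in which they were declared.
processors: List[Callable] = [to_absolute_path, strip_spaces]
value = " run1 "
for processor in sorted(processors, key=lambda func: func.order):
    value = processor(value)
```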

assign_yaml_tag(processor_tag, processor_type, replacement_type_hint='Any')

Decorator used to mark a function as a processor added automatically as pre or post processing function (as defined by processor_type) to parameters tagged with !type:<processor_tag>. Their type hint will be replaced by the type hint defined as replacement_type_hint.

Parameters:
  • processor_tag (str) – tag to use to mark a param in YAML as auto-processed by this function

  • processor_type (str) – ‘pre’ or ‘post’, type of processing function to add

  • replacement_type_hint (str) – type hint to use for any param tagged with this auto-processor

Return type:

Callable[[Callable], Callable]

Returns:

decorated function

compare_string_pattern(name, pattern)

Returns True when string ‘name’ matches string ‘pattern’, with the ‘*’ character matching any number of characters.

Parameters:
  • name (str) – name to compare

  • pattern (str) – pattern to match

Return type:

bool

Returns:

result of comparison
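
One way to obtain this kind of wildcard matching is to translate the pattern into a regular expression. The sketch below is an independent illustration, not the library's implementation: the function name matches_pattern is hypothetical, and it assumes that ‘*’ may also match zero characters.

```python
import re


def matches_pattern(name: str, pattern: str) -> bool:
    """Hypothetical helper: '*' matches any run of characters, possibly empty."""
    # Escape every literal piece so that '.' and other regex symbols
    # are compared verbatim, then join the pieces with '.*'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.fullmatch(regex, name, flags=re.DOTALL) is not None


nested_match = matches_pattern("model.layers.size", "model.*.size")
```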

compose(*functions)

Returns the composition of the functions given as argument. Functions are applied from left to right, i.e. compose(f, g, h)(x) = h(g(f(x))).

Parameters:

functions (Callable) – all functions to compose, applied from left to right

Return type:

Callable

Returns:

the composed function
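
The left-to-right convention can be checked with a small stand-alone sketch. The name compose_left_to_right is hypothetical and the implementation with functools.reduce is only one possible way to obtain the documented behaviour.

```python
from functools import reduce
from typing import Any, Callable


def compose_left_to_right(*functions: Callable) -> Callable:
    """Hypothetical equivalent: the first function given is applied first."""
    def composed(value: Any) -> Any:
        return reduce(lambda accumulated, func: func(accumulated), functions, value)
    return composed


add_one_then_double = compose_left_to_right(lambda x: x + 1, lambda x: x * 2)
result = add_one_then_double(3)  # (3 + 1) * 2
```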

dict_apply(dictionary, function)

Returns a copy of dict ‘dictionary’ where function ‘function’ was applied to all values.

Parameters:
  • dictionary (dict) – dictionary to copy

  • function (Callable) – function to map

Return type:

dict

Returns:

copied dictionary
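
A minimal sketch of the same idea, assuming a flat dictionary (how nested dictionaries are handled is not stated above). The function name apply_to_values is hypothetical.

```python
from typing import Any, Callable, Dict


def apply_to_values(dictionary: Dict[Any, Any], function: Callable) -> Dict[Any, Any]:
    """Hypothetical equivalent: build a new dict, leaving the input untouched."""
    return {key: function(value) for key, value in dictionary.items()}


original = {"lr": "0.1", "momentum": "0.9"}
converted = apply_to_values(original, float)
```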

escape_symbols(string_to_escape, symbols)

Takes a string ‘string_to_escape’ as input and escapes characters as defined in ‘symbols’.

Parameters:
  • string_to_escape (str) – string where the escaping operation takes place

  • symbols (Union[List[str], str]) – list of strings to escape or string containing the characters to escape

Return type:

str

Returns:

escaped string
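
The sketch below shows one plausible reading of this function. It assumes that escaping means prefixing each symbol with a backslash, which is not stated above, and the name escape_with_backslash is hypothetical.

```python
from typing import List, Union


def escape_with_backslash(string_to_escape: str, symbols: Union[List[str], str]) -> str:
    """Hypothetical helper: prefix every occurrence of each symbol with a backslash."""
    escaped = string_to_escape
    # Iterating over a str yields its characters, so both input forms work.
    for symbol in symbols:
        escaped = escaped.replace(symbol, "\\" + symbol)
    return escaped


escaped_pattern = escape_with_backslash("a*b[c]", "*[")
```

In this simple form the backslash itself should not be listed after other symbols, since the backslashes added earlier would then be escaped again.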

format_str(config_path_or_dictionary, size=200)

Format helper to shorten configs to display depending on logging level.

Parameters:
  • config_path_or_dictionary (Union[str, dict]) – config to display

  • size (int) – number of characters allowed to display

Return type:

str

Returns:

the formatted string

get_config_from_argv(pattern, fallback=None)

Get a configuration from the command line arguments.

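Parameters:
  • pattern – pattern to use to find the config in argv

  • fallback – value to fall back on when no config is found in argv

Return type: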

List[str]

Returns:

the configuration

get_quasi_bash_sys_argv(string_to_convert)

If a string is passed as input, processes it as sys.argv would in a bash shell. It gives exactly what sys.argv would if the script was used in a bash terminal, except that escaped ‘!’ in quotes are properly escaped and the escape symbol is removed, contrary to bash (which would keep the escape for some obscure reason).

Parameters:

string_to_convert (str) – string to process

Return type:

List[str]

Returns:

the list of strings that sys.argv would give
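
For comparison, the standard library's shlex.split performs a similar POSIX-style tokenisation. The sketch below uses shlex only, not YAECS, and the command string is made up for illustration. It also shows the case singled out above: inside double quotes, shlex keeps the backslash before ‘!’, whereas the description states that this function removes it.

```python
import shlex

# Quotes group words into a single argument and are then removed.
command = "main.py --name 'my run' --lr 0.1"
argv_like = shlex.split(command)

# Inside double quotes the backslash before '!' is kept by shlex.
with_escaped_bang = shlex.split(r'--msg "hi\!"')
```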

get_order(func)

If input function has an “order” attribute, returns it. Otherwise, returns Priority.INDIFFERENT.

Parameters:

func (Callable) – function to get the order of

Return type:

Union[Real, Priority]

Returns:

the order value
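
The fallback behaviour can be sketched with getattr and a default. The enum ToyPriority and the helpers read_order and as_number are hypothetical stand-ins. In particular, converting an enum member to a number through its value is an assumption about how such orders could be compared, not a description of YAECS internals.

```python
from enum import Enum
from typing import Callable, Union


class ToyPriority(Enum):
    """Reduced stand-in for a priority enum."""
    OFTEN_FIRST = -10
    INDIFFERENT = 0
    OFTEN_LAST = 10


def read_order(func: Callable) -> Union[float, ToyPriority]:
    # Functions without an 'order' attribute get the neutral priority.
    return getattr(func, "order", ToyPriority.INDIFFERENT)


def as_number(order: Union[float, ToyPriority]) -> float:
    # Assumption: enum members are compared through their numeric value.
    return order.value if isinstance(order, ToyPriority) else order


def undecorated(value):
    return value


def decorated(value):
    return value


decorated.order = 3.5
default_order = read_order(undecorated)
```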

get_param_as_parsable_string(param)

Gets given value as a string that can be parsed by the Configuration. The string is formatted so that it can be used either as is in a bash shell (i.e., python main.py --param_name string) or with merge_from_command_line (i.e., config.merge_from_command_line(f"--param_name {string}")).

Parameters:

param (Any) – parameter value to be returned as a valid string

Raises:

TypeError – if the type of ‘param’ cannot be enforced

Return type:

str

Returns:

string usable in the command line to reproduce the value of param

hook(hook_name)

Decorator used to keep track of registered params.

Parameters:

hook_name (str) – name of the hook to store

Return type:

Callable[[Callable], Callable]

Returns:

decorated function

is_type_valid(value, config_class)

Checks whether input ‘value’ can be saved in a YAML file by Configuration’s YAML Dumper.

Parameters:
  • value (Any) – value to check the type of

  • config_class (type) – Configuration class, which must be passed as argument to avoid circular imports :(

Return type:

bool

Returns:

result of the test

lazy_import(name)

Imports a module in such a way that it is only loaded in memory when it is actually used. Implementation from https://docs.python.org/3/library/importlib.html#implementing-lazy-imports.

Parameters:

name (str) – name of the module to load

Return type:

ModuleType

Returns:

the loaded module
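
For reference, the recipe from the linked importlib documentation looks like the sketch below. It is reproduced from the Python documentation rather than from the YAECS source, and the choice of the colorsys module for the demonstration is arbitrary.

```python
import importlib.util
import sys
from types import ModuleType


def lazy_import(name: str) -> ModuleType:
    # Wrap the module's real loader in a LazyLoader so that the module body
    # only executes on first attribute access.
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module


colorsys = lazy_import("colorsys")
# The module is actually loaded here, when an attribute is first used.
hsv = colorsys.rgb_to_hsv(1, 0, 0)
```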

new_print(*args, sep=' ', end='', file=None, **keywords)

Replaces the builtin print function during an experiment run such that printed messages are also logged. Please note that the default file (None) logs to logging’s root logger, which always goes to the next line after each message. Therefore, the ‘end’ param does not replace \n as usual, but adds a suffix after the message and before the \n.

Parameters:
  • args – objects to print

  • sep (str) – how to separate the different objects

  • end (str) – suffix to add after the message

  • file (Optional[TextIOWrapper]) – file to print to, defaults to logging to logging’s root logger with level logging.INFO

  • keywords – might contain ‘flush’, in which case an error is raised

Raises:

TypeError – when the keyword arguments contain ‘flush’

Return type:

None
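
The behaviour of ‘end’ as a suffix can be illustrated with a small stand-alone sketch built on the logging module. The function name logged_print is hypothetical, and what happens when a file is given is an assumption, since only the default case is described above.

```python
import io
import logging


def logged_print(*args, sep=" ", end="", file=None, **keywords):
    """Hypothetical print replacement that sends messages to the root logger."""
    if "flush" in keywords:
        raise TypeError("'flush' is not supported by this print replacement.")
    # 'end' is appended to the message, the logger adds the line break itself.
    message = sep.join(str(arg) for arg in args) + end
    if file is None:
        logging.getLogger().info(message)
    else:
        # Assumption: when a file is given, write the message followed by a newline.
        file.write(message + "\n")


# Capture what the root logger emits so that the result can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(message)s"))
root_logger = logging.getLogger()
root_logger.addHandler(handler)
root_logger.setLevel(logging.INFO)

logged_print("epoch", 3, sep=": ", end=" done")
captured = stream.getvalue()
```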

parse_type(string_to_process)

Parses an input string containing the type info for a parameter into a complex type as understood by the Configuration.check_type function.

Parameters:

string_to_process (str) – string to parse for type

Return type:

Union[type, tuple, list, dict, set, int]

Returns:

complex type

class Priority(value)

Bases: Enum

Define priority levels which can be used to qualify when a processing function should be performed.

ALWAYS_FIRST = -20
OFTEN_FIRST = -10
INDIFFERENT = 0
SITUATIONAL = 0
OFTEN_LAST = 10
ALWAYS_LAST = 20

recursive_set_attribute(obj, key, value)

Recursively gets attributes of ‘obj’ until object.__setattr__ can be used to force-set parameter ‘key’ to value ‘value’.

Parameters:
  • obj (Any) – object where to set the key to the value

  • key (str) – attribute of the object to set recursively

  • value (Any) – value to set

Return type:

None
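
The sketch below illustrates the general technique of walking down to a nested object and bypassing a restrictive __setattr__ with object.__setattr__. It assumes that ‘key’ is a dot-separated path, which is not stated above, and the names Locked and force_set are hypothetical.

```python
from typing import Any


class Locked:
    """Toy object that refuses normal attribute assignment."""

    def __init__(self, **attributes: Any) -> None:
        for name, value in attributes.items():
            object.__setattr__(self, name, value)

    def __setattr__(self, name: str, value: Any) -> None:
        raise AttributeError("This object is read-only.")


def force_set(obj: Any, key: str, value: Any) -> None:
    """Hypothetical helper: follow a dotted key, then force-set the last attribute."""
    head, _, rest = key.partition(".")
    if rest:
        # Not at the last level yet: recurse into the sub-object.
        force_set(getattr(obj, head), rest, value)
    else:
        # Bypass any custom __setattr__ defined on the object's class.
        object.__setattr__(obj, head, value)


config = Locked(model=Locked(size=1))
force_set(config, "model.size", 2)
```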

set_function_attribute(func, attribute_name, value)

Adds an attribute to a function object.

Parameters:
  • func (Callable) – function to add the attribute to

  • attribute_name (str) – name of the attribute to add

  • value (Any) – value of the attribute

Return type:

None

update_state(state_descriptor)

Decorator used to store useful information in Configuration._state when using some recursive functions. Kind of a hack, but very useful to keep track of the loading state and also to debug.

Parameters:

state_descriptor (str) – string indicating what to store in Configuration._state

Return type:

Callable[[Callable], Callable]

Returns:

decorated function

class TqdmLogFormatter(logger)

Bases: object

Context setting formatters used in logging handlers for tqdm bars. See https://github.com/tqdm/tqdm/issues/313

class TqdmLogger(logger)

Bases: StringIO

File to use in tqdm to make it log its bars to a logger. See https://github.com/tqdm/tqdm/issues/313

write(buffer)

Write string to file.

Returns the number of characters written, which is always equal to the length of the string.

flush()

Flush write buffers, if applicable.

This is not implemented for read-only and non-blocking streams.
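
The pattern discussed in the linked tqdm issue can be sketched as follows. This is an illustration rather than the YAECS class: the names BarToLogger and ListHandler are hypothetical, the logging level is assumed to be INFO, and the bar text is written by hand so that the example does not depend on tqdm.

```python
import io
import logging
from typing import List


class BarToLogger(io.StringIO):
    """File-like object that forwards the last written line to a logger."""

    def __init__(self, logger: logging.Logger, level: int = logging.INFO) -> None:
        super().__init__()
        self.logger = logger
        self.level = level
        self.last_line = ""

    def write(self, buffer: str) -> int:
        # Keep only the latest bar state, without carriage returns or padding.
        self.last_line = buffer.strip("\r\n\t ")
        return len(buffer)

    def flush(self) -> None:
        if self.last_line:
            self.logger.log(self.level, self.last_line)


class ListHandler(logging.Handler):
    """Handler storing messages in a list, used here to inspect the output."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


bar_logger = logging.getLogger("bar_demo")
bar_logger.setLevel(logging.INFO)
bar_logger.propagate = False
list_handler = ListHandler()
bar_logger.addHandler(list_handler)

bar_file = BarToLogger(bar_logger)
written = bar_file.write("\r 50%|#####     |")
bar_file.flush()
```

With tqdm installed, such an object would typically be passed through tqdm's file argument.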