enb.config package

Submodules

enb.config.aini module

Automatic file-based configuration logic based on the INI format.

The enb framework supports configuration files with the .ini extension and a format compatible with Python’s configparser (https://docs.python.org/3/library/configparser.html#module-configparser), i.e., similar to Windows INI files.

File-based configuration is used to determine the default value of enb.config.options and its CLI. Furthermore, users may easily extend file-based configuration to their own needs.

When enb is imported, the following configuration files are read, in the given order. Order is important because read properties overwrite any previously set values.

  1. The enb.ini file provided with the enb library installation.

  2. The enb.ini at the user’s enb configuration dir. This path is determined using the appdirs library and depends on the OS. On many Linux systems, this dir is ~/.config/enb.

  3. All *.ini files located in the same folder as the invoked script, in case-insensitive lexicographical order. No recursive folder search is performed.
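The override behavior of this read order can be reproduced with the standard library alone. The following is a minimal sketch, not enb's actual loader: the file names, the `[enb.config.options]` section title and the keys are assumptions chosen for illustration.

```python
import configparser
import os
import tempfile


def read_in_order(paths):
    """Read INI files in the given order; later files overwrite earlier values."""
    parser = configparser.ConfigParser()
    for path in paths:
        parser.read(path)
    return parser


with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical stand-ins for the installation-wide and script-local files.
    global_path = os.path.join(tmp_dir, "enb.ini")
    local_path = os.path.join(tmp_dir, "Project.ini")
    with open(global_path, "w") as f:
        f.write("[enb.config.options]\nverbose = 0\nplot_dir = ./plots\n")
    with open(local_path, "w") as f:
        f.write("[enb.config.options]\nverbose = 2\n")

    # Script-local files are sorted ignoring case before being read.
    local_paths = sorted([local_path], key=lambda p: os.path.basename(p).lower())
    merged = read_in_order([global_path] + local_paths)

verbose = merged["enb.config.options"]["verbose"]
plot_dir = merged["enb.config.options"]["plot_dir"]
print(verbose, plot_dir)
```

The key redefined in the later file takes the later value, while keys defined only in the earlier file are kept.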

class enb.config.aini.AdditionalIniParser

Bases: ArgumentParser

__init__()
get_extra_ini_paths()
class enb.config.aini.Ini(*args, **kwargs)

Bases: object

Class of the enb.config.ini object, which exposes file-defined configurations.

__init__()
property all_ini_paths

Get a list of all used ini paths.

extra_ini_paths = []
get_key(section, name)

Return the value read for the given key in the given section (if it exists), after applying ast.literal_eval to the value to recognize any valid Python literals (essentially numbers, lists, dicts, tuples, booleans and None).
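As an illustration of what the literal evaluation step does, the sketch below combines configparser and ast.literal_eval directly. It does not call enb; the `[example]` section, its keys and the `get_literal` helper are hypothetical.

```python
import ast
import configparser

parser = configparser.ConfigParser()
parser.read_string("""
[example]
count = 3
ratios = [0.5, 1.5]
enabled = True
nothing = None
label = 'text'
""")


def get_literal(section, name):
    """Return the stored string parsed as a Python literal."""
    return ast.literal_eval(parser[section][name])


values = {name: get_literal("example", name) for name in parser["example"]}
print(values)
```

Without this step, configparser would return every value as a string, e.g., `'3'` instead of `3`.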

global_ini_path = '/data/Dropbox/desarrollo/experiment-notebook.git/enb/config/enb.ini'
local_ini_paths = []
property sections_by_name

Get a list of all configparser.Section instances, including the default section.

update_from_path(ini_path)

Update the current configuration by reading the contents of ini_path.

user_ini_path = '/home/miguelinux/.config/enb/enb.ini'
enb.config.aini.managed_attributes(cls)

Decorator for classes so that their (class) attributes are set based on the .ini files found. Attributes starting with _ are not considered.

Values are read from the section named after the class’s fully qualified name (e.g., using the [enb.aanalysis.ScalarValueAnalyzer] header in one of the .ini files).

Note that keys in that section that do not correspond to attributes present in the definition of cls are ignored, i.e., new attributes are not added to cls.
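The rules above can be sketched with a simplified stand-in decorator. This is not enb's implementation: `managed_attributes_sketch`, the `Analyzer` class and its attributes are hypothetical, the parser is passed explicitly instead of being discovered from .ini files, and only attributes defined directly in the class body are considered.

```python
import ast
import configparser


def managed_attributes_sketch(parser):
    """Build a class decorator that overrides existing public class attributes
    with the values found in the section named after the class."""
    def decorator(cls):
        section = f"{cls.__module__}.{cls.__qualname__}"
        if parser.has_section(section):
            for key, raw_value in parser[section].items():
                # Private names and names not defined in the class are skipped.
                if key.startswith("_") or key not in vars(cls):
                    continue
                setattr(cls, key, ast.literal_eval(raw_value))
        return cls
    return decorator


class Analyzer:
    bar_width = 0.5
    _internal = "untouched"


# The section title is the fully qualified class name, computed here so that
# the sketch works regardless of the module it is run from.
parser = configparser.ConfigParser()
section = f"{Analyzer.__module__}.{Analyzer.__qualname__}"
parser.read_dict({section: {
    "bar_width": "0.75",
    "_internal": "'changed'",
    "new_attribute": "1",
}})

# Equivalent to placing @managed_attributes_sketch(parser) above the class.
Analyzer = managed_attributes_sketch(parser)(Analyzer)
print(Analyzer.bar_width, Analyzer._internal, hasattr(Analyzer, "new_attribute"))
```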

enb.config.aoptions module

Implementation of the classes behind enb.config.options and its CLI.

Option configuration in enb is centralized through enb.config.options. Several key aspects should be highlighted:

  • Properties defined in enb.config.options are used by enb modules, and can also be used by scripts using enb (host scripts).

  • Many core enb functions have optional arguments with default None values. Those functions will often substitute None for the corresponding value in enb.config.options, e.g., to locate the plot output directory.

  • Scripts using enb (host scripts) may alter values in enb.config.options, e.g., before calling enb methods. Properties are accessed and modified with enb.config.options.property and enb.config.options.property = value, respectively. You may want to use the from enb.config import options line in your host scripts to reduce verbosity.

  • The CLI can be used to set initial values of enb.config.options properties using -* and --* arguments. Running any script that imports enb with -h will show you detailed help on all available options and their default values.

  • The default values for enb.config.options and its CLI are obtained through enb.config.ini, described below.

An important note should be made about the interaction between enb.config.options and ray. When ray spawns new (local or remote) processes to serve as workers, the Options singleton is initialized for each of those processes, with the catch that ray does not pass the user’s CLI parameters. Therefore, different enb.config.options values would be present in the parent script and the ray workers. To mitigate this problem, the @enb.parallel_ray.remote decorator is provided as a substitute for ray.remote(), so that the options in effect when the remote method is called are available to that method at their regular location (enb.config.options).

class enb.config.aoptions.DirOptions

Bases: object

Options regarding default data directories.

analysis_dir(value)

Directory to store analysis results.

base_dataset_dir(value)

Directory to be used as source of input files for indices in the get_df method of tables and experiments.

It should be an existing, readable directory.

base_tmp_dir(value)

Temporary dir used for intermediate data storage.

This can be useful when experiments make heavy use of tmp and memory is limited, avoiding out-of-RAM crashes at the cost of potentially slower execution time.

The dir is created when defined if necessary.

base_version_dataset_dir(value)

Base dir for versioned folders.

default_external_binary_dir = None
default_tmp_dir = '/dev/shm'
external_bin_base_dir(value)

External binary base dir.

In case a centralized repository is defined at the project or system level.

persistence_dir(value)

Directory where persistence files are to be stored.

plot_dir(value)

Directory to store produced plots.

project_root(value)

Project root path. It should not normally be modified.

reconstructed_dir(value)

Base directory where reconstructed versions are to be stored.

class enb.config.aoptions.ExecutionOptions

Bases: object

General execution options.

chunk_size(value)

Chunk size used when running ATable’s get_df(). Each processed chunk is made persistent before processing the next one. This parameter can be used to control the trade-off between error tolerance and overall speed.
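The trade-off can be pictured with a generic chunked loop. This is a minimal sketch of the idea only, not ATable's code: `process_in_chunks`, `compute` and `persist` are hypothetical names, and an in-memory list stands in for persistent storage.

```python
def process_in_chunks(indices, chunk_size, compute, persist):
    """Compute results chunk by chunk, persisting each chunk before the next starts."""
    for start in range(0, len(indices), chunk_size):
        chunk = indices[start:start + chunk_size]
        persist({index: compute(index) for index in chunk})


saved_chunks = []  # in-memory stand-in for persistent storage
process_in_chunks(
    indices=["a", "b", "c", "d", "e"],
    chunk_size=2,
    compute=str.upper,
    persist=saved_chunks.append,
)
print(saved_chunks)
```

With smaller chunks, less work is lost if a failure interrupts the run, at the cost of more frequent persistence operations.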

cpu_limit(value)

Maximum number of CPUs to use for computation in this machine. See https://miguelinux314.github.io/experiment-notebook/cluster_setup.html for details on how to set the resources employed in remote computation nodes.

disable_progress_bar(value)

If this flag is enabled, no progress bar is employed (useful to minimize the stdout volume of long-running experiments).

force(value)

Force calculation of pre-existing results, if available?

Note that should an error occur while re-computing a given index, that index is dropped from the persistent support.

force_sanity_checks(value)

If this flag is used, extra sanity checks are performed by enb during the execution of this script. The trade-off for rare error condition detection is a slower execution time.

no_new_results(value)

If True, ATable’s get_df method relies entirely on the loaded persistence data, no new rows are computed. This can be useful to speed up the rendering process, for instance to try different aesthetic plotting options. Use this option only if you know you need it.

progress_report_period(value)

Default minimum time in seconds between progress report updates, when get_df() is invoked and computation is being processed in parallel.

quick(value)

Perform a quick test with a subset of the input samples?

If specified q>0 times, a subset of the first q target indices is employed in most get_df methods of ATable instances.

repetitions(value)

Number of repetitions when calculating execution times.

This value allows computation of more reliable execution times in some experiments, but is normally most representative in combination with -s to use a single execution process at a time.

report_wall_time(value)

If this flag is activated, the wall time instead of the CPU time is reported by default by tcall.get_status_output_time.

selected_columns(value)

List of selected column names for computation.

If one or more column names are provided, all others are ignored. Multiple columns can be expressed, separated by spaces.

class enb.config.aoptions.GeneralOptions

Bases: object

Group of uncategorized options.

extra_ini_paths(value)

Additional .ini files to be used to attain file-based configurations, in addition to the default ones (system, user and project). If defined more than once, the last definition sets the list instead of appending to a common list of extra ini paths.
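The "last definition wins" behavior matches how argparse's default store action treats a repeated list-valued argument. The sketch below uses plain argparse and assumes a flag named after the property; it is not enb's own parser.

```python
import argparse

parser = argparse.ArgumentParser()
# Hypothetical flag mirroring the property name; nargs="+" collects one list
# per occurrence, and the default "store" action keeps only the last one.
parser.add_argument("--extra_ini_paths", nargs="+", default=[])

args = parser.parse_args(
    ["--extra_ini_paths", "a.ini", "--extra_ini_paths", "b.ini", "c.ini"])
print(args.extra_ini_paths)
```

To pass several files, list them all after a single occurrence of the argument.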

verbose(value)

Be verbose? Repeat for more. Change at any time to increase the logger’s verbosity.

class enb.config.aoptions.LoggingOptions(*args, **kwargs)

Bases: OptionsBase

Options controlling what and how is printed and/or logged to files.

default_print_level(value)

Selects the default log level equivalent to a regular print-like message. It is most effective when combined with log_print set to True.

log_level_prefix(value)

If True, logged messages include a prefix, e.g., based on their priority.

selected_log_level(value)

Maximum log level / minimum priority required when printing messages.

show_prefix_level(value)
class enb.config.aoptions.Options(*args, **kwargs)

Bases: GeneralOptions, ExecutionOptions, DirOptions, RayOptions, LoggingOptions

Class of the enb.config.options object, which exposes options for all modules, allowing CLI-based parameter setting.

Classes wishing to expand the set of global options can be defined above, using the @OptionsBase.property decorator for new properties. Making Options inherit from those classes is optional, but allows IDEs to automatically detect available properties in enb.config.options.

Parameters in this class should be defined so that no positional or otherwise mandatory arguments are required. This is due to interactions with ray for parallelization purposes, which result in sys.argv differing in the orchestrating and host processes.

class enb.config.aoptions.OptionsBase(*args, **kwargs)

Bases: SingletonCLI

Global options for all modules, without any positional or required argument.

property non_default_properties
normalize_dir_value(value)
class enb.config.aoptions.RayOptions

Bases: object

Options related to the ray library, used for parallel/distributed computing only when --ssh_cluster_csv_path (or, equivalently, --ssh_csv) is employed.

disable_swap(value)

If this flag is used, then swap memory will not be allowed by ray. By default, swap memory is enabled. Note that your system may become unstable if swap memory is used (especially a big portion thereof).

no_remote_mount_needed(value)

If this flag is used, the calling script’s project root path is assumed to be valid AND synchronized (e.g., via NFS). By default, remote mounting via sshfs and vde2 is employed.

preshutdown_wait_seconds(value)

A wait period can be held before shutting down ray. This allows displaying messages produced by child processes (e.g., stack traces) in case of abrupt termination of enb client code.

ray_port(value)

Ray port and first port that need to be open in case a cluster is to be set up. Refer to https://miguelinux314.github.io/experiment-notebook/installation.html for further information on this.

ray_port_count(value)

Total number of consecutive ports that can be assumed to be open after ray_port. For instance, if ray_port is 11000 and ray_port_count is 1000, then ports 11000-11999 will be used for parallelization and (if so-configured) enb clusters.
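The port arithmetic in this example can be checked with a short helper. `port_range` is a hypothetical function written for this illustration and is not part of enb.

```python
def port_range(ray_port, ray_port_count):
    """Return the inclusive (first, last) ports implied by the two options."""
    if ray_port_count < 1:
        raise ValueError("ray_port_count must be at least 1")
    return ray_port, ray_port + ray_port_count - 1


first_port, last_port = port_range(ray_port=11000, ray_port_count=1000)
print(first_port, last_port)
```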

ssh_cluster_csv_path(value)

Path to the CSV file containing an enb ssh cluster configuration. See https://miguelinux314.github.io/experiment-notebook/installation.html.

worker_script_name(value)

Base name of ray’s worker scripts, invoked to run tasks in parallel processes. You don’t need to change this unless you want to use custom ray workers.

enb.config.aoptions.get_options(from_main=False)

Deprecated - use from enb.config import options.

Deprecated since version 0.2.7: This will be removed in 0.3.1.

enb.config.aoptions.propagates_options(f)

Decorator for local (as opposed to ray.remote) functions so that they propagate options properly to child workers. The decorated function must accept an “options” argument. Furthermore, the current working dir is set to the project root so that any relative paths stored are correctly handled.

enb.config.aoptions.set_options(new_option_dict)

Update global options with a dictionary of values.

enb.config.singleton_cli module

Module to define global option classes that can be instantiated only once, and that can semi-automatically create command-line interfaces based on the user’s definition of configurable variables.

Basic usage:

`options = GlobalOptions()`

Properties are added by decorating functions. Multiple inheritance is possible with classes that decorate CLI properties; just make sure to subclass from GlobalOptions.

class enb.config.singleton_cli.ExistingDirAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)

Bases: PathAction

ArgumentParser action that verifies that argument is an existing dir

classmethod assert_valid_value(target_dir)

Assert that target_dir is a readable dir

class enb.config.singleton_cli.ListAddOptionsAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)

Bases: ValidationAction

classmethod assert_valid_value(value)
class enb.config.singleton_cli.NonnegativeFloatAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)

Bases: ValidationAction

Check that a numerical value is greater than or equal to zero.

classmethod assert_valid_value(value)
class enb.config.singleton_cli.PathAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)

Bases: ValidationAction

classmethod modify_value(value)
class enb.config.singleton_cli.PositiveFloatAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)

Bases: ValidationAction

Check that a numerical value is greater than zero.

classmethod assert_valid_value(value)
class enb.config.singleton_cli.PositiveIntegerAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)

Bases: PositiveFloatAction

Check that value is an integer and greater than zero.

classmethod assert_valid_value(value)
class enb.config.singleton_cli.ReadableDirAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)

Bases: ExistingDirAction

ArgumentParser action that verifies that argument is an existing, readable dir

classmethod assert_valid_value(target_dir)

Assert that target_dir is a readable dir

class enb.config.singleton_cli.ReadableFileAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)

Bases: PathAction

Validate that an argument is an existing file.

classmethod assert_valid_value(value)
class enb.config.singleton_cli.ReadableOrCreableDirAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)

Bases: ExistingDirAction

classmethod assert_valid_value(target_dir)

Assert that target_dir is a readable dir, or its parent exists and is writable.

class enb.config.singleton_cli.SingletonCLI(*args, **kwargs)

Bases: object

Singleton class that holds a set of CLI options.

When first instantiated, it reads CLI options, retaining class defaults when not specified. On subsequent instantiations, a reference to the first instance is returned. Therefore, all modules share the same options instance (beware of this when modifying values).

New CLI properties can be defined using the enb.config.singleton_cli.SingletonCLI.property() decorator. Properties are named after the decorated function.

The following internal attributes control the class’ behavior:

  • _parsed_properties: a live dictionary of stored properties

  • _setter_functions: a dictionary storing the decorated functions that serve as setters of new variable values.
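The shared-instance behavior described above is what a singleton metaclass provides. The following is a generic sketch of that pattern; `Singleton` and `SharedOptions` are hypothetical names and the code is not taken from enb.

```python
class Singleton(type):
    """Metaclass that creates at most one instance per class."""
    _instances = {}

    def __call__(cls, *args, **kwargs):
        # The initializer runs only the first time the class is called.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class SharedOptions(metaclass=Singleton):
    def __init__(self):
        self.verbose = 0


first = SharedOptions()
second = SharedOptions()
second.verbose = 3
print(first is second, first.verbose)
```

Because both names refer to the same object, a change made through one reference is visible through the other, which is why modifying values affects every module.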

__init__()

Initializer guaranteed to be called once thanks to the Singleton metaclass.

classmethod assert_setter_signature(f)

Assert that f has a valid setter signature, or raise a SyntaxError.

items()
print_help()
classmethod property(*aliases, group_name=None, group_description=None, **kwargs)

Decorator for (optional) properties that can be automatically parsed using argparse, and also programmatically (the setter is created by default when the getter is defined).

Automatic CLI help is produced based on the docstring (sets the help= argument) and kwargs (these may overwrite the help string).

Functions being decorated play a role similar to the @x.setter-decorated function in the regular @property protocol, with the following important observations:

  • Decorated functions are called whenever options.property_name = value is used, where options is a SingletonCLI instance and property_name is one of its defined properties.

  • The decorated functions’ docstrings are used as help for those arguments.

  • If None is returned (e.g., when the function body is a single pass line), the property is updated with the original value, without any transformation. There is no need to update the enb.config.options instance directly.

  • If a non-None value is returned, that value is used instead. To set a property value to None, self._parsed_properties dict must be updated manually by the decorated function.

  • Subclasses may choose to raise an exception if an attempt is made to set a read-only property.

  • CLI validation capabilities are provided by the argparse.Action subclasses defined above.

Note that modules and client code may choose to act differently from what these options intend.

Default values are taken from the file-based configuration proxy in aini.

Parameters:
  • aliases – a list of aliases that can be used for the property in the command line.

  • group_name – the name of the group of parameters to be used. If None, the defining class’s name is adapted. If unavailable, the last employed group is assumed.

  • group_description – the description of the group of parameters to be used. If None, the defining class’s docstring is used. If unavailable, the last employed group is assumed.

  • kwargs – remaining arguments to be passed when initializing argparse.ArgumentParser instances. See that class for detailed help on available parameters and usage.
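The return-value rule for decorated functions can be illustrated with a small stand-in class. This is a sketch of the rule only, assuming nothing about enb's internals beyond the two attribute names listed above: `OptionsSketch`, `register` and `set` are hypothetical, and no CLI parsing is involved.

```python
import os


class OptionsSketch:
    """Stand-in for a SingletonCLI instance; not enb's implementation."""

    def __init__(self):
        self._parsed_properties = {}
        self._setter_functions = {}

    def register(self, setter):
        """Store a setter function under its own name."""
        self._setter_functions[setter.__name__] = setter
        return setter

    def set(self, name, value):
        returned = self._setter_functions[name](self, value)
        # None means "store the value as given"; anything else replaces it.
        self._parsed_properties[name] = value if returned is None else returned


options = OptionsSketch()


@options.register
def verbose(self, value):
    """Be verbose? The value is stored without transformation."""
    pass


@options.register
def plot_dir(self, value):
    """Directory to store plots, normalized before being stored."""
    return os.path.normpath(value)


options.set("verbose", 2)
options.set("plot_dir", "./plots/")
print(options._parsed_properties)
```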

update(other, trigger_events=False)

Update self with other, using None value items from other’s items() method.

Parameters:
  • other – dict-like object with key-value pairs to be used to update self.

  • trigger_events – if True, the setter functions are used to assign any items found. If False, self’s attributes are updated directly without using those methods.

class enb.config.singleton_cli.ValidationAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)

Bases: Action

Base class for defining custom parser validation actions.

classmethod assert_valid_value(value)
classmethod check_valid_value(value)
classmethod modify_value(value)
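For readers unfamiliar with custom argparse actions, the sketch below shows the general validation pattern using only the standard library. `PositiveIntegerSketch` and the `--repetitions` flag are hypothetical; the sketch borrows the `assert_valid_value` name from the class above but does not reproduce enb's code.

```python
import argparse


class PositiveIntegerSketch(argparse.Action):
    """Hypothetical action that accepts only integers greater than zero."""

    @classmethod
    def assert_valid_value(cls, value):
        if value <= 0:
            raise ValueError(f"{value} is not greater than zero")

    def __call__(self, parser, namespace, values, option_string=None):
        try:
            self.assert_valid_value(values)
        except ValueError as ex:
            # parser.error prints the message and exits with status 2.
            parser.error(f"{option_string}: {ex}")
        setattr(namespace, self.dest, values)


parser = argparse.ArgumentParser()
parser.add_argument("--repetitions", type=int, default=1,
                    action=PositiveIntegerSketch)

accepted = parser.parse_args(["--repetitions", "5"]).repetitions
try:
    parser.parse_args(["--repetitions", "0"])
    rejected = False
except SystemExit:
    rejected = True
print(accepted, rejected)
```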
class enb.config.singleton_cli.ValidationTemplateNameAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)

Bases: ValidationAction

Validate that a name for a template is proper.

classmethod assert_valid_value(value)
class enb.config.singleton_cli.WritableDirAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)

Bases: ExistingDirAction

ArgumentParser action that verifies that argument is an existing, writable dir

classmethod assert_valid_value(target_dir)

Assert that target_dir is a writable dir

class enb.config.singleton_cli.WritableOrCreableDirAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)

Bases: ExistingDirAction

ArgumentParser action that verifies that argument is either an existing writable dir or a writable parent exists.

classmethod assert_valid_value(target_dir)

Assert that target_dir is a writable dir, or its parent exists and is writable.

enb.config.singleton_cli.property_class(base_option_cls: SingletonCLI)

Decorator for classes solely intended to define new properties to base_option_cls.

Decorated classes can still make use of @base_option_cls.property normally. Properties defined like that are regrouped into the appropriate argument groups.

Any non-decorated method that accepts exactly one argument is assumed to be a property with None as default value.

Parameters:

base_option_cls – all properties defined in the decorated class are added or updated in the property definition of this class

Module contents

# The config module

## Introduction

The config module deals with two main aspects:

  1. It provides the enb.config.options object with global configurations shared among enb modules and accessible to scripts using enb. These options can be accessed and set programmatically (e.g., enb.config.options.verbose += 1), and also through the CLI (see details below, or run with -h a python script that imports enb).

  2. It provides the enb.config.ini object to access properties defined in .ini files. These affect the default CLI values and can be easily extended by users to support file-based configuration. See below for details on this part.

Both enb.config.options and enb.config.ini are argparse.Namespace instances. After a more detailed description of these two tools, a summary of configuration setting priority is also provided.

## enb.config.options and CLI interface

Option configuration in enb is centralized through enb.config.options. Several key aspects should be highlighted:

  • Properties defined in enb.config.options are used by enb modules, and can also be used by scripts using enb (host scripts).

  • Many core enb functions have optional arguments with default None values. Those functions will often substitute None for the corresponding value in enb.config.options, e.g., to locate the plot output directory.

  • Scripts using enb (host scripts) may alter values in enb.config.options, e.g., before calling enb methods. Properties are accessed and modified with enb.config.options.property and enb.config.options.property = value, respectively. You may want to use the from enb.config import options line in your host scripts to reduce verbosity.

  • The CLI can be used to set initial values of enb.config.options properties using -* and --* arguments. Running any script that imports enb with -h will show you detailed help on all available options and their default values.

  • The default values for enb.config.options and its CLI are obtained through enb.config.ini, described below.

An important note should be made about the interaction between enb.config.options and ray. When ray spawns new (local or remote) processes to serve as workers, the Options singleton is initialized for each of those processes, with the catch that ray does not pass the user’s CLI parameters. Therefore, different enb.config.options values would be present in the parent script and the ray workers. To mitigate this problem, an options parameter is defined and passed to many of these functions, e.g., with f.remote(options=ray.put(enb.config.options)) if f is your @enb.parallel.parallel-decorated function. The @enb.config.propagates_options decorator provides a slightly cleaner way of automating this mitigation.

## enb.config.ini file-based configuration

See the enb.config.aini module for further information on how this is handled.

## Effective parameter values

Based on the above description and references, the values in enb.config.options will be given by the first of these options:

  1. Programmatically set properties, e.g., enb.config.options.verbose += 2. The last set value is used.

  2. Parameters -* and --* passed directly to the invoked script.

  3. Default CLI parameters specified in any *.ini files in the same folder as the invoked script (this can be empty).

  4. Default CLI parameters specified in the enb.ini file at the user’s enb configuration dir (e.g., ~/.config/enb/enb.ini).
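The priority list above behaves like a layered lookup in which the first layer that defines a key wins. The sketch below models this with collections.ChainMap; it only illustrates the lookup order and is not how enb stores its options. All dictionaries and values are hypothetical.

```python
from collections import ChainMap

# Hypothetical values at each level, highest priority first.
programmatic = {}
cli_arguments = {"verbose": 1}
script_dir_ini = {"verbose": 0, "plot_dir": "./figures"}
user_ini = {"plot_dir": "./plots", "cpu_limit": 4}

effective = ChainMap(programmatic, cli_arguments, script_dir_ini, user_ini)
before = (effective["verbose"], effective["plot_dir"], effective["cpu_limit"])

# A programmatic assignment takes priority over every other source.
programmatic["verbose"] = 3
after = effective["verbose"]
print(before, after)
```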

From there on, many enb functions adhere to the following principle:

  1. If a parameter is set to a non-None value, that value is used.

  2. If a parameter with default value None is set to None or not specified, its value is set based on the properties in enb.config.options.
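This principle corresponds to the common "None means use the global default" idiom. In the sketch below, a SimpleNamespace stands in for enb.config.options so that the code runs without enb installed; `render_plot` is a hypothetical function, not part of the library.

```python
from types import SimpleNamespace

# Stand-in for enb.config.options.
options = SimpleNamespace(plot_dir="./plots")


def render_plot(plot_dir=None):
    """Return the directory where a plot would be written."""
    if plot_dir is None:
        # Resolved at call time, so later changes to options are honored.
        plot_dir = options.plot_dir
    return plot_dir


default_dir = render_plot()
explicit_dir = render_plot(plot_dir="./custom")
options.plot_dir = "./other"
updated_dir = render_plot()
print(default_dir, explicit_dir, updated_dir)
```

Resolving the fallback inside the function body, rather than in the signature, is what lets host scripts change the option after import and still see the new value used.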

enb.config.report_configuration()

Return a string describing the current configuration status.