The enb command-line tools

enb integrates well with the command line interface (CLI). There are two main ways of using the CLI with enb:

  • Using the enb program from the command line

  • Setting the values of enb.config.options from the command line when invoking your enb-using scripts.

Each way is explored in the following sections.

The enb program

If you installed enb, most likely you can run enb from your command line and access its main CLI. An example output of running enb help (to display usage help) is shown next:

usage: enb [-h] {plugin,show,help} ...

CLI to the experiment notebook (enb) framework (see https://github.com/miguelinux314/experiment-notebook).

Several subcommands are available; use `enb <subcommand> -h` to show help about any specific command.

options:
  -h, --help          show this help message and exit

subcommands:
  Available enb CLI commands.

  {plugin,show,help}
    plugin            Install and manage plugins.
    show              Show useful information about enb and enb projects.
    help              Show this help and exit.

Listing available plugins and templates

The enb library comes packed with several plugins and templates. To access them, use the enb plugin subcommand. In particular, enb plugin -h shows all available options.

Use enb plugin list to get a list of all available plugins and templates. You can add extra parameters for filtering and/or -v for extra details.

Showing 49 plugins.
You can add arguments to filter this list, and/or use the --exclude argument.
Add -v for extra information on the listed plugins.

         analysis-gallery :: Self-contained gallery of data analysis and plotting examples.
         arithmetic_codec :: Arithmetic codec (8 bit).
           basic-workflow :: Basic, self-contained example of enb's workflow.
                      bwt :: Application of the Burrows-Wheeler Transform (BWT).
               ccsds122x1 :: Wrappers for CCSDS 122.1 (privative).
               ccsds124x0 :: Wrappers for CCSDS 124.0-B-1 (privative).
           cluster-config :: Template of a CSV configuration file for enb clusters.
              colors-dark :: Color template for dark background terminals.
           colors-default :: Default color template for light or dark background terminals.
                  emporda :: Wrapper for the emporda codec (AC and CCSDS versions).
                  enb.ini :: Copy a full enb.ini configuration in the destination project folder.
 entropy-codec-comparison :: Template for the comparison of entropy codecs.
               experiment :: Generic experiment template. Run `enb plugin list experiment` for specific experiment templates.
                    fapec :: Wrappers for FAPEC (privative).
     file_version_example :: Template and usage example for the FileVersionTable class for dataset curation.
                     flif :: Wrapper for the FLIF compressor.
                    fpack :: Wrapper for the FPACK compressor.
                      fpc :: FPC codec wrappers.
                    fpzip :: Wrappers for FPZIP codecs.
                      fse :: Wrappers for FSE codecs. Includes a Huffman-only entropy codec.
                     hdf5 :: Codec wrappers for the h5py library.
                     hevc :: Wrapper for the HEVC / H.265 codec.
                  huffman :: Huffman codec (8 bit).
          iraf_photometry :: Extra photometry information from image files.
                     jpeg :: Reference JPEG and JPEG-LS implementation.
                   jpegxl :: Wrapper for a reference JPEG-XL implementation.
                   kakadu :: Wrappers for Kakadu JPEG 2000 (privative).
                       lc :: LC Framework codec generation and application tools.
                     lcnl :: Wrappers for LCNL CCSDS 123.0-B-2 (privative).
     lossless-compression :: Template for lossless compression experiments.
        lossy-compression :: Template for lossy compression experiments.
                    lpaq8 :: Implementation of the LPAQ8 algorithm.
                      lz4 :: Wrapper for a LZ4 codec.
                   marlin :: Marlin-based image compressor.
             matplotlibrc :: Copy matplotlib's default rc file into the destination directory.
                   mcalic :: Wrapper for E. Magli et al.'s M-CALIC codec.
            montecarlo-pi :: Demo project that approximates pi in a distributed way.
                  montsec :: Wrapper for the montsec codec.
                    ndzip :: Wrapper for ndzip.
  port-experiment-example :: Self-contained port scanning experiment example.
                      rle :: Run-Length Encoding codec.
                     spdp :: Wrapper for the spdp codec.
                    speck :: Wrapper for the SPECK codec.
              test-codecs :: Install all codec plugins verify their availability.
                      v2f :: Wrapper of a codec based on V2F forests (privative).
                      vvc :: Wrapper for the VVC / H.266 codec.
                      zfp :: Wrapper for the zfp library.
                      zip :: Assortment of lz-based and bzip-based codecs.
                     zstd :: Zstandard library wrapper.

The following plugin tags have been defined and can be used for filtering:

  - documentation    (  8 plugins) Documentation examples referenced in the user manual.
  - codec            ( 33 plugins) Data compression/decompression class definitions.
  - data compression ( 38 plugins) Data compression tools.
  - template         ( 14 plugins) Templates formatteable into the installation dir.
  - project          ( 11 plugins) Project templates, including configuration files.
  - test             (  1  plugin) Plugins for testing purposes.
  - image            ( 19 plugins) Tools for image processing (including compression and analysis).
  - privative        (  6 plugins) Plugins requiring additional privative software.

Run with -v for authorship and additional information, with -h for full help.
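
The filtering described above can also be driven from a script. The following is a minimal sketch, assuming that enb is on the PATH and that a tag such as codec is accepted as a filter argument, as the listing suggests. The helper names plugin_list_command and list_plugins are hypothetical and not part of enb.

```python
import shutil
import subprocess


def plugin_list_command(*filters, verbose=False):
    """Build the argument list for `enb plugin list [filters...] [-v]`."""
    command = ["enb", "plugin", "list", *filters]
    if verbose:
        command.append("-v")
    return command


def list_plugins(*filters, verbose=False):
    """Run `enb plugin list` and return its standard output as text.

    Returns None when the enb executable is not found on the PATH.
    """
    if shutil.which("enb") is None:
        return None
    result = subprocess.run(
        plugin_list_command(*filters, verbose=verbose),
        capture_output=True, text=True, check=False)
    return result.stdout
```

For instance, list_plugins("codec", verbose=True) would run enb plugin list codec -v and return whatever the installed enb version prints.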

Plugin and template installation

To install a plugin or template, use the following syntax:

enb plugin install <plugin-name> <destination-folder>

For example, enb plugin install zip ./codecs/zip_codecs should produce something similar to:

Installing zip into [...]/codecs/zip_codecs...
Building zip into [...]/codecs/zip_codecs...
Plugin 'zip' successfully installed into './codecs/zip_codecs'.

Note

You can add calls to the enb.plugins.install() method within your scripts to automatically install (unless already installed) and import a plugin, e.g.,

enb.plugins.install("zip")
enb.plugins.install(name, target_dir=None, overwrite=False, automatic_import=True)

Install an Installable by name into target_dir.

Parameters:
  • name – name of the installable (e.g., plugin) to be installed. Run enb plugin list in the CLI to get a list of all available installables.

  • target_dir – If target_dir is None, it is set to plugins/<plugin_name> by default.

  • overwrite – If overwrite is False and target_dir already exists, no action is taken.

  • automatic_import – If True, the installable is imported as a module.
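
The parameters above can be combined as in the following sketch. It assumes that enb is installed and relies only on the signature documented above. The wrapper install_plugin and the helper default_target_dir are hypothetical names introduced for illustration.

```python
from pathlib import Path


def default_target_dir(plugin_name, base_dir="plugins"):
    """Mirror the documented default location: plugins/<plugin_name>."""
    return Path(base_dir) / plugin_name


def install_plugin(plugin_name, target_dir=None, overwrite=False):
    """Install a plugin (unless already installed) and import it."""
    # Imported lazily so that this module loads even if enb is missing.
    import enb

    if target_dir is None:
        target_dir = default_target_dir(plugin_name)
    # With overwrite=False, an existing target_dir is left untouched.
    enb.plugins.install(
        plugin_name,
        target_dir=str(target_dir),
        overwrite=overwrite,
        automatic_import=True)
```

Calling install_plugin("zip") would then target plugins/zip, whereas install_plugin("zip", target_dir="codecs/zip_codecs") matches the destination used in the CLI example above.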

CLI with scripts using enb

If your script includes import enb, then you can pass --option=value arguments when invoking it to set the values of enb.config.options.

In addition, to get a list of all available options, you can run your script with the -h option. An example output is shown next:

usage: basic_workflow.py [-h] [-v] [--ini [INI ...]] [--cpu_limit CPU_LIMIT]
                         [-f] [-q] [--render_only] [--chunk_size CHUNK_SIZE]
                         [--repetitions REPETITIONS] [--report_wall_time]
                         [--force_sanity_checks]
                         [--selected_columns SELECTED_COLUMNS [SELECTED_COLUMNS ...]]
                         [--progress_report_period PROGRESS_REPORT_PERIOD]
                         [--disable_progress_bar]
                         [--project_root PROJECT_ROOT]
                         [--base_dataset_dir BASE_DATASET_DIR]
                         [--persistence_dir PERSISTENCE_DIR]
                         [--reconstructed_dir RECONSTRUCTED_DIR]
                         [--base_version_dataset_dir BASE_VERSION_DATASET_DIR]
                         [--base_tmp_dir BASE_TMP_DIR]
                         [--external_bin_base_dir EXTERNAL_BIN_BASE_DIR]
                         [--plot_dir PLOT_DIR] [--analysis_dir ANALYSIS_DIR]
                         [--ssh_csv SSH_CSV] [--disable_swap]
                         [--worker_script_name WORKER_SCRIPT_NAME]
                         [--preshutdown_wait_seconds PRESHUTDOWN_WAIT_SECONDS]
                         [--ray_port RAY_PORT]
                         [--ray_port_count RAY_PORT_COUNT]
                         [--no_remote_mount_needed]
                         [--selected_log_level {core,error,warning,message,verbose,informative,debug}]
                         [--default_print_level {core,error,warning,message,verbose,informative,debug}]
                         [--log_level_prefix {True,False}]
                         [--show_prefix_level {core,error,warning,message,verbose,informative,debug}]

A number of options can be set via the command line interface, then accessed
via enb.config.options.property_name. All of them are optional, and may be
interpreted differently by enb core modules and client code.

options:
  -h, --help            show this help message and exit

General Options:
  Group of uncategorized options.

  -v, --verbose         Be verbose? Repeat for more. Change at any time to
                        increase the logger's verbosity. (default: 0)
  --ini [INI ...], --extra_ini_paths [INI ...]
                        Additional .ini files to be used to attain file-based
                        configurations, in addition to the default ones
                        (system, user and project). If defined more than once,
                        the last definition sets the list instead of appending
                        to a common list of extra ini paths. (default: [])

Execution Options:
  General execution options.

  --cpu_limit CPU_LIMIT
                        Maximum number of CPUs to use for computation in this
                        machine See
                        https://miguelinux314.github.io/experiment-
                        notebook/cluster_setup.html for details on how to set
                        the resources employed in remote computation nodes.
                        (default: None)
  -f, --force, --overwrite
                        Force calculation of pre-existing results, if
                        available? Note that should an error occur while re-
                        computing a given index, that index is dropped from
                        the persistent support. (default: 0)
  -q, --quick           Perform a quick test with a subset of the input
                        samples? If specified q>0 times, a subset of the first
                        q target indices is employed in most get_df methods
                        from ATable instances (default: 0)
  --render_only, --no_new_results
                        If True, ATable's get_df method relies entirely on the
                        loaded persistence data, no new rows are computed.
                        This can be useful to speed up the rendering process,
                        for instance to try different aesthetic plotting
                        options. Use this option only if you know you need it.
                        (default: False)
  --chunk_size CHUNK_SIZE
                        Chunk size used when running ATable's get_df(). Each
                        processed chunk is made persistent before processing
                        the next one. This parameter can be used to control
                        the trade-off between error tolerance and overall
                        speed. (default: None)
  --repetitions REPETITIONS
                        Number of repetitions when calculating execution
                        times. This value allows computation of more reliable
                        execution times in some experiments, but is normally
                        most representative in combination with -s to use a
                        single execution process at a time. (default: 1)
  --report_wall_time    If this flag is activated, the wall time instead of
                        the CPU time is reported by default by
                        tcall.get_status_output_time. (default: False)
  --force_sanity_checks
                        If this flag is used, extra sanity checks are
                        performed by enb during the execution of this script.
                        The trade-off for rare error condition detection is a
                        slower execution time. (default: False)
  --selected_columns SELECTED_COLUMNS [SELECTED_COLUMNS ...]
                        List of selected column names for computation. If one
                        or more column names are provided, all others are
                        ignored. Multiple columns can be expressed, separated
                        by spaces. (default: None)
  --progress_report_period PROGRESS_REPORT_PERIOD
                        Default minimum time in seconds between progress
                        report updates, when get_df() is invoked and
                        computation is being processed in parallel. (default:
                        1)
  --disable_progress_bar
                        If this flag is enabled, no progress bar is employed
                        (useful to minimize the stdout volume of long-running
                        experiments). (default: False)

Dir Options:
  Options regarding default data directories.

  --project_root PROJECT_ROOT
                        Project root path. It should not normally be modified.
                        (default: /data/Dropbox/desarrollo/experiment-notebook
                        .git/enb/plugins/template_basic_workflow_example)
  --base_dataset_dir BASE_DATASET_DIR
                        Directory to be used as source of input files for
                        indices in the get_df method of tables and
                        experiments. It should be an existing, readable
                        directory. (default:
                        /data/Dropbox/desarrollo/experiment-notebook.git/enb/p
                        lugins/template_basic_workflow_example/datasets)
  --persistence_dir PERSISTENCE_DIR
                        Directory where persistence files are to be stored.
                        (default: /data/Dropbox/desarrollo/experiment-notebook
                        .git/enb/plugins/template_basic_workflow_example/persi
                        stence_basic_workflow.py)
  --reconstructed_dir RECONSTRUCTED_DIR
                        Base directory where reconstructed versions are to be
                        stored. (default: None)
  --base_version_dataset_dir BASE_VERSION_DATASET_DIR
                        Base dir for versioned folders. (default:
                        /data/Dropbox/desarrollo/experiment-notebook.git/enb/p
                        lugins/template_basic_workflow_example/datasets)
  --base_tmp_dir BASE_TMP_DIR
                        Temporary dir used for intermediate data storage. This
                        can be useful when experiments make heavy use of tmp
                        and memory is limited, avoiding out-of-RAM crashes at
                        the cost of potentially slower execution time. The dir
                        is created when defined if necessary. (default:
                        /dev/shm)
  --external_bin_base_dir EXTERNAL_BIN_BASE_DIR
                        External binary base dir. In case a centralized
                        repository is defined at the project or system level.
                        (default: None)
  --plot_dir PLOT_DIR   Directory to store produced plots. (default:
                        /data/Dropbox/desarrollo/experiment-notebook.git/enb/p
                        lugins/template_basic_workflow_example/plots)
  --analysis_dir ANALYSIS_DIR
                        Directory to store analysis results. (default:
                        /data/Dropbox/desarrollo/experiment-notebook.git/enb/p
                        lugins/template_basic_workflow_example/analysis)

Ray Options:
  Options related to the ray library, used for parallel/distributed
  computing only when --ssh_cluster_csv_path (or, equivalently --ssh_csv)
  are employed.

  --ssh_csv SSH_CSV, --ssh_cluster_csv_path SSH_CSV
                        Path to the CSV file containing a enb ssh cluster
                        configuration. See
                        https://miguelinux314.github.io/experiment-
                        notebook/installation.html. (default: None)
  --disable_swap        If this flag is used, then swap memory will not be
                        allowed by ray. By default, swap memory is enabled.
                        Note that your system may become unstable if swap
                        memory is used (specially a big portion thereof).
                        (default: False)
  --worker_script_name WORKER_SCRIPT_NAME
                        Base name of ray's worker scripts, invoked to run
                        tasks in parallel processes. You don't need to change
                        this unless you want to use custom ray workers.
                        (default: default_worker.py)
  --preshutdown_wait_seconds PRESHUTDOWN_WAIT_SECONDS
                        A wait period can be held before shutting down ray.
                        This allows displaying messages produced by child
                        processes (e.g., stack traces) in case of abrupt
                        termination of enb client code. (default: 0.5)
  --ray_port RAY_PORT   Ray port and first port that need to be open in case a
                        cluster is to be set up. Refer to
                        https://miguelinux314.github.io/experiment-
                        notebook/installation.html for further information on
                        this. (default: 11000)
  --ray_port_count RAY_PORT_COUNT
                        Total number of consecutive ports that can be assumed
                        to be open after `ray_port`. For instance, if
                        `ray_port` is 11000 and `ray_port_count` is 1000, then
                        ports 11000-11999 will be used for parallelization and
                        (if so-configured) enb clusters. (default: 500)
  --no_remote_mount_needed
                        If this flag is used, the calling script's project
                        root path is assumed to be valid AND synchronized
                        (e.g., via NFS). By default, remote mounting via sshfs
                        and vde2 is employed. (default: False)

Logging Options:
  Options controlling what and how is printed and/or logged to files.

  --selected_log_level {core,error,warning,message,verbose,informative,debug}
                        Maximum log level / minimum priority required when
                        printing messages. (default: message)
  --default_print_level {core,error,warning,message,verbose,informative,debug}
                        Selects the default log level equivalent to a regular
                        print-like message. It is most effective when combined
                        with log_print set to True. (default: message)
  --log_level_prefix {True,False}
                        If True, logged messages include a prefix, e.g., based
                        on their priority. (default: True)
  --show_prefix_level {core,error,warning,message,verbose,informative,debug}

Note

*.ini files in the project root are searched for to look for default values for the attributes of enb.config.options. You can copy and modify the default enb.ini template to your project if you would like to use file-based configuration.
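
To give a rough idea of what file-based configuration looks like, the sketch below parses a small, hypothetical *.ini fragment with Python's standard configparser module. The section name [enb.config.options] and the keys shown are assumptions made for illustration. The authoritative layout is the enb.ini template itself, which can be copied with enb plugin install enb.ini <destination-folder>.

```python
import configparser

# Hypothetical file contents: the section and key names are assumptions,
# check the enb.ini template for the actual layout.
EXAMPLE_INI = """
[enb.config.options]
verbose = 1
repetitions = 3
"""

parser = configparser.ConfigParser()
parser.read_string(EXAMPLE_INI)

# configparser returns every value as a string.
file_defaults = dict(parser["enb.config.options"])
print(file_defaults)  # {'verbose': '1', 'repetitions': '3'}
```

This only shows how such a file is structured and read in general. enb performs its own parsing and type conversion of the *.ini files it finds.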

Note

You can also modify enb.config.options in your code, e.g., enb.config.options.verbose = 5. If you do, the assignment is applied after the CLI parameters are parsed, and therefore overwrites any values set via --option=value.
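
The order of precedence described in this note can be illustrated without enb. The following sketch uses the standard argparse module as a stand-in for the CLI parsing step. parse_options is a hypothetical helper, not enb's implementation, and only two of the options listed above are reproduced.

```python
import argparse


def parse_options(argv):
    """Stand-in for the CLI parsing step (hypothetical, not enb's code)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-v", "--verbose", action="count", default=0)
    parser.add_argument("--repetitions", type=int, default=1)
    return parser.parse_args(argv)


# Step 1: values arrive from the command line.
options = parse_options(["-vv", "--repetitions=3"])
print(options.verbose, options.repetitions)  # 2 3

# Step 2: an assignment in the script runs later, so it wins.
options.verbose = 5
print(options.verbose, options.repetitions)  # 5 3
```

Options that the script does not assign, such as repetitions here, keep the value given on the command line.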

A set of predefined parameters is available in enb.config.options, the single instance of the enb.config.AllOptions class. When running a script that imports enb.config, -h can be passed as an argument to show a help message with all available options.

All these options (whether provided via the CLI or not) can be read and/or changed with code like the following:

from enb.config import options
print(f"Verbose level: {options.verbose}")