The enb command-line tools
enb is friendly with the command-line interface (CLI). There are two main ways of using the CLI with enb:
- Using the enb program from the command line.
- Setting the values of enb.config.options from the command line when invoking your enb-using scripts.
Each way is explored in the following sections.
The enb program
If you installed enb, most likely you can run enb from your command line and access its main CLI. An example output of running enb help (to display usage help) is shown next:
usage: enb [-h] {plugin,show,help} ...
CLI to the experiment notebook (enb) framework (see https://github.com/miguelinux314/experiment-notebook).
Several subcommands are available; use `enb <subcommand> -h` to show help about any specific command.
options:
-h, --help show this help message and exit
subcommands:
Available enb CLI commands.
{plugin,show,help}
plugin Install and manage plugins.
show Show useful information about enb and enb projects.
help Show this help and exit.
Listing available plugins and templates
The enb library comes packed with several plugins and templates. To access them, use enb plugin.
In particular, enb plugin -h shows all available options.
Use enb plugin list to get a list of all available plugins and templates. You can add extra parameters for filtering and/or -v for extra details.
Showing 49 plugins.
You can add arguments to filter this list, and/or use the --exclude argument.
Add -v for extra information on the listed plugins.
analysis-gallery :: Self-contained gallery of data analysis and plotting examples.
arithmetic_codec :: Arithmetic codec (8 bit).
basic-workflow :: Basic, self-contained example of enb's workflow.
bwt :: Application of the Burrows-Wheeler Transform (BWT).
ccsds122x1 :: Wrappers for CCSDS 122.1 (privative).
ccsds124x0 :: Wrappers for CCSDS 124.0-B-1 (privative).
cluster-config :: Template of a CSV configuration file for enb clusters.
colors-dark :: Color template for dark background terminals.
colors-default :: Default color template for light or dark background terminals.
emporda :: Wrapper for the emporda codec (AC and CCSDS versions).
enb.ini :: Copy a full enb.ini configuration in the destination project folder.
entropy-codec-comparison :: Template for the comparison of entropy codecs.
experiment :: Generic experiment template. Run `enb plugin list experiment` for specific experiment templates.
fapec :: Wrappers for FAPEC (privative).
file_version_example :: Template and usage example for the FileVersionTable class for dataset curation.
flif :: Wrapper for the FLIF compressor.
fpack :: Wrapper for the FPACK compressor.
fpc :: FPC codec wrappers.
fpzip :: Wrappers for FPZIP codecs.
fse :: Wrappers for FSE codecs. Includes a Huffman-only entropy codec.
hdf5 :: Codec wrappers for the h5py library.
hevc :: Wrapper for the HEVC / H.265 codec.
huffman :: Huffman codec (8 bit).
iraf_photometry :: Extra photometry information from image files.
jpeg :: Reference JPEG and JPEG-LS implementation.
jpegxl :: Wrapper for a reference JPEG-XL implementation.
kakadu :: Wrappers for Kakadu JPEG 2000 (privative).
lc :: LC Framework codec generation and application tools.
lcnl :: Wrappers for LCNL CCSDS 123.0-B-2 (privative).
lossless-compression :: Template for lossless compression experiments.
lossy-compression :: Template for lossy compression experiments.
lpaq8 :: Implementation of the LPAQ8 algorithm.
lz4 :: Wrapper for a LZ4 codec.
marlin :: Marlin-based image compressor.
matplotlibrc :: Copy matplotlib's default rc file into the destination directory.
mcalic :: Wrapper for E. Magli et al.'s M-CALIC codec.
montecarlo-pi :: Demo project that approximates pi in a distributed way.
montsec :: Wrapper for the montsec codec.
ndzip :: Wrapper for ndzip.
port-experiment-example :: Self-contained port scanning experiment example.
rle :: Run-Length Encoding codec.
spdp :: Wrapper for the spdp codec.
speck :: Wrapper for the SPECK codec.
test-codecs :: Install all codec plugins verify their availability.
v2f :: Wrapper of a codec based on V2F forests (privative).
vvc :: Wrapper for the VVC / H.266 codec.
zfp :: Wrapper for the zfp library.
zip :: Assortment of lz-based and bzip-based codecs.
zstd :: Zstandard library wrapper.
The following plugin tags have been defined and can be used for filtering:
- documentation ( 8 plugins) Documentation examples referenced in the user manual.
- codec ( 33 plugins) Data compression/decompression class definitions.
- data compression ( 38 plugins) Data compression tools.
- template ( 14 plugins) Templates formatteable into the installation dir.
- project ( 11 plugins) Project templates, including configuration files.
- test ( 1 plugin) Plugins for testing purposes.
- image ( 19 plugins) Tools for image processing (including compression and analysis).
- privative ( 6 plugins) Plugins requiring additional privative software.
Run with -v for authorship and additional information, with -h for full help.
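If you want to post-process the listing from a script, the rows shown above follow a simple `name :: description` shape. The following is a minimal sketch of a parser for that shape; `parse_plugin_listing` is a hypothetical helper (not part of enb), and it assumes the text has already been captured and contains no terminal color codes.

```python
def parse_plugin_listing(text):
    """Map plugin name -> description for rows shaped like 'name :: description'."""
    plugins = {}
    for line in text.splitlines():
        # Only listing rows contain the ' :: ' separator; the header and the
        # tag summary lines do not, so they are skipped.
        name, separator, description = line.partition(" :: ")
        if separator and name.strip():
            plugins[name.strip()] = description.strip()
    return plugins


# A few rows copied from the listing above, plus lines that must be ignored.
sample = """Showing 49 plugins.
bwt :: Application of the Burrows-Wheeler Transform (BWT).
zip :: Assortment of lz-based and bzip-based codecs.
zstd :: Zstandard library wrapper.
- codec ( 33 plugins) Data compression/decompression class definitions.
"""

listing = parse_plugin_listing(sample)
print(sorted(listing))
```

Partitioning on the first separator only keeps descriptions intact even if they happen to contain `::` themselves.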
Plugin and template installation
To install a plugin or template, use the following syntax:
enb plugin install <plugin-name> <destination-folder>
For example, enb plugin install zip ./codecs/zip_codecs should produce something similar to:
Installing zip into [...]/codecs/zip_codecs...
Building zip into [...]/codecs/zip_codecs...
Plugin 'zip' successfully installed into './codecs/zip_codecs'.
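The same command can be issued from a Python script, for instance in a project setup step. The sketch below only assembles the documented `enb plugin install <plugin-name> <destination-folder>` syntax and runs it when an `enb` executable is found on the PATH. The helper names are hypothetical, and treating a non-zero exit status as failure is an assumption about the CLI rather than documented behavior.

```python
import shutil
import subprocess


def build_install_command(plugin_name, destination):
    """Return the argument list for `enb plugin install <name> <folder>`."""
    return ["enb", "plugin", "install", plugin_name, str(destination)]


def install_via_cli(plugin_name, destination):
    """Run the install command; return False if enb is missing or the call fails."""
    if shutil.which("enb") is None:
        # The enb program is not available on PATH in this environment.
        return False
    result = subprocess.run(
        build_install_command(plugin_name, destination), check=False)
    # Assumption: a zero exit status means the installation succeeded.
    return result.returncode == 0


print(" ".join(build_install_command("zip", "./codecs/zip_codecs")))
```

Passing the arguments as a list (instead of a single shell string) avoids quoting problems when the destination folder contains spaces.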
Note
You can add calls to the enb.plugins.install() method within your scripts to automatically install (unless already installed) and import a plugin, e.g.,
enb.plugins.install("zip")
- enb.plugins.install(name, target_dir=None, overwrite=False, automatic_import=True)
Install an Installable by name into target_dir.
- Parameters:
name – name of the installable (e.g., plugin) to be installed. Run enb plugin list in the CLI to get a list of all available installables.
target_dir – If target_dir is None, it is set to plugins/<plugin_name> by default.
overwrite – If overwrite is False and target_dir already exists, no action is taken.
automatic_import – If True, the installable is imported as a module.
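When a script depends on several plugins, the call documented above can be wrapped in a small loop. This is a sketch under stated assumptions: `install_plugins` and its `installer` parameter are hypothetical names introduced here, and the keyword arguments are the ones listed in the signature above. The `installer` parameter exists only so that the helper can be exercised without enb installed.

```python
import os


def install_plugins(names, target_root="plugins", installer=None):
    """Install each named plugin into <target_root>/<name>.

    Returns a dict mapping each plugin name to its target directory.
    """
    if installer is None:
        # Deferred import: enb is only required when no installer is injected.
        import enb
        installer = enb.plugins.install

    installed = {}
    for name in names:
        target_dir = os.path.join(target_root, name)
        # overwrite=False leaves an existing target_dir untouched,
        # as described in the parameter list above.
        installer(name, target_dir=target_dir,
                  overwrite=False, automatic_import=True)
        installed[name] = target_dir
    return installed


# Typical use in a script where enb is installed:
# install_plugins(["zip", "zstd"])
```

The default `target_root` mirrors the documented default location, plugins/<plugin_name>.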
CLI with scripts using enb
If your script includes import enb, then you can pass --option=value arguments when invoking it to set the values of enb.config.options.
In addition, to get a list of all available options, you can run your script with the -h option. An example of its output is shown next:
usage: basic_workflow.py [-h] [-v] [--ini [INI ...]] [--cpu_limit CPU_LIMIT]
[-f] [-q] [--render_only] [--chunk_size CHUNK_SIZE]
[--repetitions REPETITIONS] [--report_wall_time]
[--force_sanity_checks]
[--selected_columns SELECTED_COLUMNS [SELECTED_COLUMNS ...]]
[--progress_report_period PROGRESS_REPORT_PERIOD]
[--disable_progress_bar]
[--project_root PROJECT_ROOT]
[--base_dataset_dir BASE_DATASET_DIR]
[--persistence_dir PERSISTENCE_DIR]
[--reconstructed_dir RECONSTRUCTED_DIR]
[--base_version_dataset_dir BASE_VERSION_DATASET_DIR]
[--base_tmp_dir BASE_TMP_DIR]
[--external_bin_base_dir EXTERNAL_BIN_BASE_DIR]
[--plot_dir PLOT_DIR] [--analysis_dir ANALYSIS_DIR]
[--ssh_csv SSH_CSV] [--disable_swap]
[--worker_script_name WORKER_SCRIPT_NAME]
[--preshutdown_wait_seconds PRESHUTDOWN_WAIT_SECONDS]
[--ray_port RAY_PORT]
[--ray_port_count RAY_PORT_COUNT]
[--no_remote_mount_needed]
[--selected_log_level {core,error,warning,message,verbose,informative,debug}]
[--default_print_level {core,error,warning,message,verbose,informative,debug}]
[--log_level_prefix {True,False}]
[--show_prefix_level {core,error,warning,message,verbose,informative,debug}]
A number of options can be set via the command line interface, then accessed
via enb.config.options.property_name. All of them are optional, and may be
interpreted differently by enb core modules and client code.
options:
-h, --help show this help message and exit
General Options:
Group of uncategorized options.
-v, --verbose Be verbose? Repeat for more. Change at any time to
increase the logger's verbosity. (default: 0)
--ini [INI ...], --extra_ini_paths [INI ...]
Additional .ini files to be used to attain file-based
configurations, in addition to the default ones
(system, user and project). If defined more than once,
the last definition sets the list instead of appending
to a common list of extra ini paths. (default: [])
Execution Options:
General execution options.
--cpu_limit CPU_LIMIT
Maximum number of CPUs to use for computation in this
machine See
https://miguelinux314.github.io/experiment-
notebook/cluster_setup.html for details on how to set
the resources employed in remote computation nodes.
(default: None)
-f, --force, --overwrite
Force calculation of pre-existing results, if
available? Note that should an error occur while re-
computing a given index, that index is dropped from
the persistent support. (default: 0)
-q, --quick Perform a quick test with a subset of the input
samples? If specified q>0 times, a subset of the first
q target indices is employed in most get_df methods
from ATable instances (default: 0)
--render_only, --no_new_results
If True, ATable's get_df method relies entirely on the
loaded persistence data, no new rows are computed.
This can be useful to speed up the rendering process,
for instance to try different aesthetic plotting
options. Use this option only if you know you need it.
(default: False)
--chunk_size CHUNK_SIZE
Chunk size used when running ATable's get_df(). Each
processed chunk is made persistent before processing
the next one. This parameter can be used to control
the trade-off between error tolerance and overall
speed. (default: None)
--repetitions REPETITIONS
Number of repetitions when calculating execution
times. This value allows computation of more reliable
execution times in some experiments, but is normally
most representative in combination with -s to use a
single execution process at a time. (default: 1)
--report_wall_time If this flag is activated, the wall time instead of
the CPU time is reported by default by
tcall.get_status_output_time. (default: False)
--force_sanity_checks
If this flag is used, extra sanity checks are
performed by enb during the execution of this script.
The trade-off for rare error condition detection is a
slower execution time. (default: False)
--selected_columns SELECTED_COLUMNS [SELECTED_COLUMNS ...]
List of selected column names for computation. If one
or more column names are provided, all others are
ignored. Multiple columns can be expressed, separated
by spaces. (default: None)
--progress_report_period PROGRESS_REPORT_PERIOD
Default minimum time in seconds between progress
report updates, when get_df() is invoked and
computation is being processed in parallel. (default:
1)
--disable_progress_bar
If this flag is enabled, no progress bar is employed
(useful to minimize the stdout volume of long-running
experiments). (default: False)
Dir Options:
Options regarding default data directories.
--project_root PROJECT_ROOT
Project root path. It should not normally be modified.
(default: /data/Dropbox/desarrollo/experiment-notebook
.git/enb/plugins/template_basic_workflow_example)
--base_dataset_dir BASE_DATASET_DIR
Directory to be used as source of input files for
indices in the get_df method of tables and
experiments. It should be an existing, readable
directory. (default:
/data/Dropbox/desarrollo/experiment-notebook.git/enb/p
lugins/template_basic_workflow_example/datasets)
--persistence_dir PERSISTENCE_DIR
Directory where persistence files are to be stored.
(default: /data/Dropbox/desarrollo/experiment-notebook
.git/enb/plugins/template_basic_workflow_example/persi
stence_basic_workflow.py)
--reconstructed_dir RECONSTRUCTED_DIR
Base directory where reconstructed versions are to be
stored. (default: None)
--base_version_dataset_dir BASE_VERSION_DATASET_DIR
Base dir for versioned folders. (default:
/data/Dropbox/desarrollo/experiment-notebook.git/enb/p
lugins/template_basic_workflow_example/datasets)
--base_tmp_dir BASE_TMP_DIR
Temporary dir used for intermediate data storage. This
can be useful when experiments make heavy use of tmp
and memory is limited, avoiding out-of-RAM crashes at
the cost of potentially slower execution time. The dir
is created when defined if necessary. (default:
/dev/shm)
--external_bin_base_dir EXTERNAL_BIN_BASE_DIR
External binary base dir. In case a centralized
repository is defined at the project or system level.
(default: None)
--plot_dir PLOT_DIR Directory to store produced plots. (default:
/data/Dropbox/desarrollo/experiment-notebook.git/enb/p
lugins/template_basic_workflow_example/plots)
--analysis_dir ANALYSIS_DIR
Directory to store analysis results. (default:
/data/Dropbox/desarrollo/experiment-notebook.git/enb/p
lugins/template_basic_workflow_example/analysis)
Ray Options:
Options related to the ray library, used for parallel/distributed
computing only when --ssh_cluster_csv_path (or, equivalently --ssh_csv)
are employed.
--ssh_csv SSH_CSV, --ssh_cluster_csv_path SSH_CSV
Path to the CSV file containing a enb ssh cluster
configuration. See
https://miguelinux314.github.io/experiment-
notebook/installation.html. (default: None)
--disable_swap If this flag is used, then swap memory will not be
allowed by ray. By default, swap memory is enabled.
Note that your system may become unstable if swap
memory is used (specially a big portion thereof).
(default: False)
--worker_script_name WORKER_SCRIPT_NAME
Base name of ray's worker scripts, invoked to run
tasks in parallel processes. You don't need to change
this unless you want to use custom ray workers.
(default: default_worker.py)
--preshutdown_wait_seconds PRESHUTDOWN_WAIT_SECONDS
A wait period can be held before shutting down ray.
This allows displaying messages produced by child
processes (e.g., stack traces) in case of abrupt
termination of enb client code. (default: 0.5)
--ray_port RAY_PORT Ray port and first port that need to be open in case a
cluster is to be set up. Refer to
https://miguelinux314.github.io/experiment-
notebook/installation.html for further information on
this. (default: 11000)
--ray_port_count RAY_PORT_COUNT
Total number of consecutive ports that can be assumed
to be open after `ray_port`. For instance, if
`ray_port` is 11000 and `ray_port_count` is 1000, then
ports 11000-11999 will be used for parallelization and
(if so-configured) enb clusters. (default: 500)
--no_remote_mount_needed
If this flag is used, the calling script's project
root path is assumed to be valid AND synchronized
(e.g., via NFS). By default, remote mounting via sshfs
and vde2 is employed. (default: False)
Logging Options:
Options controlling what and how is printed and/or logged to files.
--selected_log_level {core,error,warning,message,verbose,informative,debug}
Maximum log level / minimum priority required when
printing messages. (default: message)
--default_print_level {core,error,warning,message,verbose,informative,debug}
Selects the default log level equivalent to a regular
print-like message. It is most effective when combined
with log_print set to True. (default: message)
--log_level_prefix {True,False}
If True, logged messages include a prefix, e.g., based
on their priority. (default: True)
--show_prefix_level {core,error,warning,message,verbose,informative,debug}
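A few behaviors described in the help text above are easier to see in action: -v can be repeated, a repeated --ini replaces the previous list instead of appending to it, and --selected_columns takes several space-separated values. The sketch below reproduces them with a stand-in parser built on Python's argparse. It is not enb's own parser; the option types (for example, an integer --cpu_limit) are assumptions made for illustration.

```python
import argparse

# Stand-in parser mirroring a handful of the options listed above.
parser = argparse.ArgumentParser(description="Stand-in parser, not enb's own.")
parser.add_argument("-v", "--verbose", action="count", default=0)
parser.add_argument("--ini", "--extra_ini_paths", dest="extra_ini_paths",
                    nargs="*", default=[])
parser.add_argument("--cpu_limit", type=int, default=None)  # assumed integer
parser.add_argument("--selected_columns", nargs="+", default=None)

args = parser.parse_args([
    "-vv",                            # repeated flag: verbosity level 2
    "--ini", "a.ini",                 # first definition...
    "--ini", "b.ini", "c.ini",        # ...is replaced by the last one
    "--cpu_limit=4",                  # --option=value form
    "--selected_columns", "x", "y",   # several values separated by spaces
])

print(args.verbose, args.extra_ini_paths, args.cpu_limit, args.selected_columns)
```

With this stand-in, only the last --ini definition survives, which matches the "last definition sets the list" wording in the help text.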
Note
enb searches the project root for *.ini files, which provide default values for the attributes of enb.config.options. If you would like to use file-based configuration, you can copy the default enb.ini template into your project (e.g., with enb plugin install enb.ini <destination-folder>) and modify it.
Note
You can also modify enb.config.options in your code, e.g., enb.config.options.verbose = 5. Such assignments run after the CLI arguments are parsed, and therefore overwrite any values set via --option=value.
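The two notes above imply an order of precedence: values from .ini files act as defaults, command-line arguments override them, and assignments in code override both. The following sketch imitates that order with the standard configparser and argparse modules. It does not use enb; the `[options]` section name and the `resolve_options` helper are hypothetical and do not necessarily reflect the layout of a real enb.ini file.

```python
import argparse
import configparser

# Hypothetical file contents; a real enb.ini may be organized differently.
INI_TEXT = """
[options]
repetitions = 2
cpu_limit = 2
"""


def resolve_options(ini_text, cli_args):
    """Use file values as defaults, then let CLI arguments override them."""
    file_config = configparser.ConfigParser()
    file_config.read_string(ini_text)
    section = file_config["options"]

    parser = argparse.ArgumentParser()
    parser.add_argument("--repetitions", type=int,
                        default=section.getint("repetitions"))
    parser.add_argument("--cpu_limit", type=int,
                        default=section.getint("cpu_limit"))
    return parser.parse_args(cli_args)


options = resolve_options(INI_TEXT, ["--cpu_limit=8"])
print(options.repetitions, options.cpu_limit)  # file default, CLI override

# An assignment in code runs after parsing, so it has the final word.
options.repetitions = 5
print(options.repetitions)
```

Keeping overrides in code to a minimum makes it easier to tell which of the three sources produced a given value.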
A set of predefined parameters is available in enb.config.options, the single instance of the enb.config.AllOptions class.
When running a script that imports enb.config, -h can be passed as an argument to show a help message with all available options.
All these options (whether provided via the CLI or not) can be read and/or changed with code like the following:
from enb.config import options
print(f"Verbose level: {options.verbose}")