enb.compression package
Submodules
enb.compression.codec module
Codecs implement the compress and decompress methods as well as a name and label to identify and represent them.
A param_dict is passed on initialization that describes the configuration of each codec instance. Codecs may choose the number of parameters and their names.
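The interface described above can be illustrated with a minimal sketch that does not import enb. The class name ZlibSketchCodec, the "level" parameter and the choice of zlib are assumptions made only for this example; an actual codec would subclass one of the classes documented below.

```python
import os
import tempfile
import zlib


class ZlibSketchCodec:
    """Hypothetical stand-in that mimics the codec interface (not an enb class)."""

    def __init__(self, param_dict=None):
        # Each instance is configured by its own param_dict.
        self.param_dict = dict(param_dict) if param_dict is not None else {}

    def compress(self, original_path, compressed_path, original_file_info=None):
        level = self.param_dict.get("level", 6)  # hypothetical parameter name
        with open(original_path, "rb") as src, open(compressed_path, "wb") as dst:
            dst.write(zlib.compress(src.read(), level))

    def decompress(self, compressed_path, reconstructed_path, original_file_info=None):
        with open(compressed_path, "rb") as src, open(reconstructed_path, "wb") as dst:
            dst.write(zlib.decompress(src.read()))


def round_trip(codec, data):
    """Compress and decompress data through temporary files; return the result."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        original = os.path.join(tmp_dir, "original.raw")
        compressed = os.path.join(tmp_dir, "compressed.bin")
        reconstructed = os.path.join(tmp_dir, "reconstructed.raw")
        with open(original, "wb") as f:
            f.write(data)
        codec.compress(original, compressed)
        codec.decompress(compressed, reconstructed)
        with open(reconstructed, "rb") as f:
            return f.read()
```

Calling round_trip(ZlibSketchCodec({"level": 9}), data) should return the original bytes, which is the behavior expected from a lossless codec.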
- class enb.compression.codec.AbstractCodec(param_dict=None)
Bases:
ExperimentTask
Base class for all codecs.
- __init__(param_dict=None)
- Parameters:
param_dict – dictionary of parameters for this codec instance.
- compress(original_path: str, compressed_path: str, original_file_info=None) CompressionResults
Compress original_path into compressed_path using param_dict as params.
- Parameters:
original_path – path to the original file to be compressed
compressed_path – path to the compressed file to be created
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None.
- Returns:
(optional) a CompressionResults instance, or None (see self.compression_results_from_paths)
- compression_results_from_paths(original_path, compressed_path)
Get the default CompressionResults instance corresponding to the compression of original_path into compressed_path
- decompress(compressed_path, reconstructed_path, original_file_info=None)
Decompress compressed_path into reconstructed_path using param_dict as params (if needed).
- Parameters:
compressed_path – path to the input compressed file
reconstructed_path – path to the output reconstructed file
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None. Should only be actually used in special cases, since codecs are expected to store all needed metainformation in the compressed file.
- Returns:
(optional) a DecompressionResults instance, or None (see
self.decompression_results_from_paths)
- decompression_results_from_paths(compressed_path, reconstructed_path)
Return an enb.icompression.DecompressionResults instance given the compressed and reconstructed paths.
- property label
Label to be displayed for the codec. May not be strictly unique nor fully informative. By default, self’s class name is returned.
- property name
Name of the codec. Subclasses are expected to yield different values when different parameters are used. By default, the name is the class name followed by all elements in self.param_dict, sorted alphabetically.
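The default naming rule can be sketched as follows. The function name, the "key=value" formatting and the underscore separator are assumptions for illustration only; the exact string produced by enb may differ.

```python
def sketch_codec_name(class_name, param_dict=None):
    """Hypothetical naming rule: class name, then parameters sorted by key."""
    params = param_dict or {}
    parts = [class_name]
    for key in sorted(params):
        # Sorting the keys makes the name independent of insertion order.
        parts.append(f"{key}={params[key]}")
    return "_".join(parts)
```

With this rule, two instances of the same class configured with different parameters get different names, while their label could still be identical.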
- class enb.compression.codec.LosslessCodec(param_dict=None)
Bases:
AbstractCodec
An AbstractCodec that identifies itself as lossless.
- class enb.compression.codec.LossyCodec(param_dict=None)
Bases:
AbstractCodec
An AbstractCodec that identifies itself as lossy.
- class enb.compression.codec.NearLosslessCodec(param_dict=None)
Bases:
LossyCodec
An AbstractCodec that identifies itself as near lossless.
- class enb.compression.codec.PassthroughCodec
Bases:
LosslessCodec
Codec that simply copies the input into the output in both compression and decompression.
- __init__()
- Parameters:
param_dict – dictionary of parameters for this codec instance.
- compress(original_path: str, compressed_path: str, original_file_info=None)
Compress original_path into compressed_path using param_dict as params.
- Parameters:
original_path – path to the original file to be compressed
compressed_path – path to the compressed file to be created
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None.
- Returns:
(optional) a CompressionResults instance, or None (see self.compression_results_from_paths)
- decompress(compressed_path, reconstructed_path, original_file_info=None)
Decompress compressed_path into reconstructed_path using param_dict as params (if needed).
- Parameters:
compressed_path – path to the input compressed file
reconstructed_path – path to the output reconstructed file
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None. Should only be actually used in special cases, since codecs are expected to store all needed metainformation in the compressed file.
- Returns:
(optional) a DecompressionResults instance, or None (see
self.decompression_results_from_paths)
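A passthrough codec of this kind amounts to two file copies. The sketch below is an assumption about how such behavior could be written with the standard library; the function names are hypothetical and this is not the enb implementation.

```python
import os
import shutil
import tempfile


def passthrough_compress(original_path, compressed_path):
    # "Compression" is a plain byte-for-byte copy.
    shutil.copyfile(original_path, compressed_path)


def passthrough_decompress(compressed_path, reconstructed_path):
    # "Decompression" is the same copy in the other direction.
    shutil.copyfile(compressed_path, reconstructed_path)


def passthrough_sizes(data):
    """Return (original size, compressed size, whether reconstruction is exact)."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        original = os.path.join(tmp_dir, "original.raw")
        compressed = os.path.join(tmp_dir, "compressed.raw")
        reconstructed = os.path.join(tmp_dir, "reconstructed.raw")
        with open(original, "wb") as f:
            f.write(data)
        passthrough_compress(original, compressed)
        passthrough_decompress(compressed, reconstructed)
        with open(reconstructed, "rb") as f:
            exact = f.read() == data
        return os.path.getsize(original), os.path.getsize(compressed), exact
```

Since the compressed file has the same size as the original, such a codec can serve as a baseline with a compression ratio of 1.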
enb.compression.compression module
Data compression tools common to any compression modality.
- exception enb.compression.compression.CompressionException(original_path=None, compressed_path=None, file_info=None, status=None, output=None)
Bases:
Exception
Base class for exceptions raised during compression.
- __init__(original_path=None, compressed_path=None, file_info=None, status=None, output=None)
- class enb.compression.compression.CompressionExperiment(codecs, dataset_paths=None, csv_experiment_path=None, csv_dataset_path=None, dataset_info_table=None, overwrite_file_properties=False, reconstructed_dir_path=None, compressed_copy_dir_path=None, task_families=None)
Bases:
Experiment
This class allows seamless execution of compression experiments.
In the functions decorated with @atable.column_function, the row argument contains two magic properties, compression_results and decompression_results. These give access to the CompressionResults and DecompressionResults instances resulting respectively from compressing and decompressing according to the row index parameters. The paths referenced in the compression and decompression results are valid while the row is being processed, and are disposed of afterwards. Also, the image_info_row attribute gives access to the image metainformation (e.g., geometry).
- class CompressionDecompressionWrapper(file_path, codec, image_info_row, reconstructed_copy_dir=None, compressed_copy_dir=None)
Bases:
object
This class is instantiated for each row of the table, and added to a temporary column row_wrapper_column_name. Column-setting methods can then access this wrapper, and in particular its compression_results and decompression_results properties, which will run compression and decompression at most once. This way, many columns can be defined independently without needing to compress and decompress for each one.
- __init__(file_path, codec, image_info_row, reconstructed_copy_dir=None, compressed_copy_dir=None)
- Parameters:
file_path – path to the original image being compressed
codec – AbstractCodec instance to be used for compression/decompression
image_info_row – dict-like object with geometry and data type information about file_path
reconstructed_copy_dir – if not None, a copy of the reconstructed images is stored, based on the class of codec.
compressed_copy_dir – if not None, a copy of the compressed images is stored, based on the class of codec.
- property compression_results: CompressionResults
Perform the actual compression experiment for the selected row.
- property decompression_results: DecompressionResults
Perform the actual decompression experiment for the selected row.
- property numpy_dtype
Get the numpy dtype corresponding to the original image’s data format
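The "at most once" behavior of the wrapper's properties can be sketched with functools.cached_property. Whether enb uses this mechanism is not stated here; the class name LazyRowWrapper, its run counter and the placeholder computations are assumptions for illustration.

```python
from functools import cached_property


class LazyRowWrapper:
    """Hypothetical stand-in for a per-row wrapper (not the enb class)."""

    def __init__(self, payload):
        self.payload = payload
        self.compression_runs = 0  # counts how many times compression actually ran

    @cached_property
    def compression_results(self):
        # Runs on first access only; later accesses reuse the stored value.
        self.compression_runs += 1
        return {"compressed_size_bytes": len(self.payload) // 2}  # placeholder work

    @cached_property
    def decompression_results(self):
        # Decompression depends on compression, which is triggered here if needed.
        compressed = self.compression_results
        return {"reconstructed_size_bytes": 2 * compressed["compressed_size_bytes"]}
```

Several column functions can then read either property on the same wrapper, and the expensive step still runs a single time per row.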
- __init__(codecs, dataset_paths=None, csv_experiment_path=None, csv_dataset_path=None, dataset_info_table=None, overwrite_file_properties=False, reconstructed_dir_path=None, compressed_copy_dir_path=None, task_families=None)
- Parameters:
codecs – list of AbstractCodec instances. Note that codecs are compatible with the interface of ExperimentTask.
dataset_paths – list of paths to the files to be used as input for compression. If it is None, this list is obtained automatically from the configured base dataset dir.
csv_experiment_path – if not None, path to the CSV file giving persistence support to this experiment. If None, it is automatically determined within options.persistence_dir.
csv_dataset_path – if not None, path to the CSV file giving persistence support to the dataset file properties. If None, it is automatically determined within options.persistence_dir.
dataset_info_table – if not None, it must be an ImagePropertiesTable instance or subclass instance that can be used to obtain dataset file metainformation, and/or gather it from csv_dataset_path. If None, a new ImagePropertiesTable instance is created and used for this purpose.
overwrite_file_properties – if True, file properties are recomputed before starting the experiment. Useful for temporary and/or random datasets. Note that overwriting of the experiment results themselves is controlled in the call to get_df.
reconstructed_dir_path – if not None, a directory where reconstructed images are to be stored.
compressed_copy_dir_path – if not None, it gives the directory where a copy of the compressed images is to be stored. It may not be generated for images for which all columns are known.
task_families – if not None, it must be a list of TaskFamily instances. It is used to set the “family_label” column for each row. If the codec is not found within the families, a default label is set indicating so.
- property codecs
- Returns:
an iterable of defined codecs
- property codecs_by_name
Alias for
tasks_by_name
- column_to_properties = {'bpppc': ColumnProperties('name'='bpppc', 'fun'=<function CompressionExperiment.set_bpppc>, 'label'='Compressed data rate (bpppc)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compressed_file_sha256': ColumnProperties('name'='compressed_file_sha256', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'="Compressed file's SHA256", 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compressed_size_bytes': ColumnProperties('name'='compressed_size_bytes', 'fun'=<function CompressionExperiment.set_compressed_data_size>, 'label'='Compressed data size (Bytes)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_efficiency_1byte_entropy': ColumnProperties('name'='compression_efficiency_1byte_entropy', 'fun'=<function CompressionExperiment.set_efficiency>, 'label'='Compression efficiency 1byte entropy', 'plot_min'=0, 'plot_max'=1, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False, 'labytesel'='Compression efficiency (1bytes entropy)'), 'compression_efficiency_2byte_entropy': ColumnProperties('name'='compression_efficiency_2byte_entropy', 'fun'=<function CompressionExperiment.set_efficiency>, 'label'='Compression efficiency 2byte entropy', 'plot_min'=0, 'plot_max'=2, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False, 'labytesel'='Compression efficiency (2bytes entropy)'), 'compression_efficiency_4byte_entropy': 
ColumnProperties('name'='compression_efficiency_4byte_entropy', 'fun'=<function CompressionExperiment.set_efficiency>, 'label'='Compression efficiency 4byte entropy', 'plot_min'=0, 'plot_max'=4, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False, 'labytesel'='Compression efficiency (4bytes entropy)'), 'compression_memory_kb': ColumnProperties('name'='compression_memory_kb', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Compression memory usage (KB)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_ratio': ColumnProperties('name'='compression_ratio', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Compression ratio', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_ratio_dr': ColumnProperties('name'='compression_ratio_dr', 'fun'=<function CompressionExperiment.set_compression_ratio_dr>, 'label'='Compression ratio', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_time_seconds': ColumnProperties('name'='compression_time_seconds', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Compression time (s)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'decompression_memory_kb': ColumnProperties('name'='decompression_memory_kb', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Decompression memory usage (KB)', 'plot_min'=0, 
'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'decompression_time_seconds': ColumnProperties('name'='decompression_time_seconds', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Decompression time (s)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'family_label': ColumnProperties('name'='family_label', 'fun'=<function Experiment.set_family_label>, 'label'='Family label', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'lossless_reconstruction': ColumnProperties('name'='lossless_reconstruction', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Lossless?', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'param_dict': ColumnProperties('name'='param_dict', 'fun'=<function Experiment.set_param_dict>, 'label'='Param dict', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=True, 'has_iterable_values'=False, 'has_object_values'=False), 'repetitions': ColumnProperties('name'='repetitions', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Number of compression/decompression repetitions', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'task_apply_time': ColumnProperties('name'='task_apply_time', 'fun'=<function Experiment.set_task_apply_time>, 'label'='Task apply time', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 
'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'task_label': ColumnProperties('name'='task_label', 'fun'=<function Experiment.set_task_label>, 'label'='Task label', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'task_name': ColumnProperties('name'='task_name', 'fun'=<function Experiment.set_task_name>, 'label'='Task name', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False)}
The column_to_properties attribute keeps track of what columns have been defined, and the methods that need to be called to compute them. The keys of this attribute can be used to determine the columns defined in a given class or instance. The values are ColumnProperties instances, which can be set manually after definition and before calling get_df on Analyzer subclasses.
- property compression_results: CompressionResults
Get the current compression results from self.codec_results. This property is intended to be read from functions that set columns of a row. It triggers the compression of that row’s sample with that row’s codec if it hasn’t been compressed yet. Otherwise, None is returned.
- compute_one_row(filtered_df, index, loc, column_fun_tuples, overwrite)
Process a single row of an ATable instance, returning a Series object corresponding to that row. If an error is detected, an exception is returned instead of a Series. Note that the exception is not raised here; it is intended to be detected by compute_target_rows(), i.e., the dispatcher function.
- Parameters:
filtered_df – DataFrame retrieved from persistent storage, with index compatible with loc. The loc argument itself need not be present in filtered_df, but is used to avoid recomputing in case overwrite is not True and columns had been set.
index – index value or values corresponding to the row to be processed.
loc – location compatible with .loc of filtered_df (although it might not be present there), and that will be set into the full loaded_df also using its .loc accessor.
column_fun_tuples – a list of (column, fun) tuples, where fun is to be invoked to fill column
overwrite – if True, existing values are overwritten with newly computed data. Otherwise, only missing or None columns are populated (and therefore only their column functions called)
- Returns:
a pandas.Series instance corresponding to this row, with a column named as given by self.private_index_column set to the loc argument passed to this function.
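The "return the exception, let the dispatcher detect it" convention described for compute_one_row can be sketched in isolation. The function names and the toy computation below are hypothetical and do not reproduce enb's dispatcher.

```python
def compute_row_sketch(row_value):
    """Return a result, or the exception instance itself if computing fails."""
    try:
        return 1 / row_value  # placeholder for the per-row column functions
    except Exception as ex:
        # The exception is returned, not raised, so that the caller decides.
        return ex


def dispatch_rows_sketch(row_values):
    """Collect per-row results and raise once if any row reported an exception."""
    results = [compute_row_sketch(value) for value in row_values]
    failures = [r for r in results if isinstance(r, Exception)]
    if failures:
        raise RuntimeError(f"{len(failures)} row(s) failed") from failures[0]
    return results
```

Returning the exception keeps the worker function safe to run in parallel workers, and leaves error reporting to a single place.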
- dataset_files_extension = 'raw'
Default input sample extension. It affects the result of enb.atable.get_all_test_files.
- property decompression_results: DecompressionResults
Get the current decompression results from self.codec_results. This property is intended to be read from functions that set columns of a row. It triggers the compression and decompression of that row’s sample with that row’s codec if they have not been compressed yet. Otherwise, None is returned.
- default_file_properties_table_class
alias of
ImagePropertiesTable
- row_wrapper_column_name = '_codec_wrapper'
- set_bpppc(index, row)
- set_comparison_results(index, row)
Perform a compression-decompression cycle and store the comparison results
- set_compressed_data_size(index, row)
- set_compression_ratio_dr(index, row)
Set the compression ratio calculated based on the dynamic range of the input samples, as opposed to 8*bytes_per_sample.
- set_efficiency(index, row)
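The difference between a ratio based on 8*bytes_per_sample and one based on the dynamic range can be shown with a small worked example. The formulas below are assumptions inferred from the column labels and descriptions above, not a transcription of enb's code, and the function names are hypothetical.

```python
def sketch_bpppc(compressed_size_bytes, samples):
    """Assumed rate: compressed bits divided by the number of samples."""
    return 8 * compressed_size_bytes / samples


def sketch_compression_ratio(samples, bytes_per_sample, compressed_size_bytes):
    """Assumed ratio using the stored size, i.e., 8*bytes_per_sample bits per sample."""
    return (8 * bytes_per_sample * samples) / (8 * compressed_size_bytes)


def sketch_compression_ratio_dr(samples, dynamic_range_bits, compressed_size_bytes):
    """Assumed ratio using only the bits actually needed per sample."""
    return (dynamic_range_bits * samples) / (8 * compressed_size_bytes)
```

For 1000 samples stored in 2 bytes each, with a 12-bit dynamic range and a 500-byte compressed file, these formulas give 4.0 bits per sample, a ratio of 4.0 based on the stored size, and a ratio of 3.0 based on the dynamic range.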
- class enb.compression.compression.CompressionResults(codec_name=None, codec_param_dict=None, original_path=None, compressed_path=None, compression_time_seconds=None, maximum_memory_kb=None)
Bases:
object
Base class that defines the minimal fields that are returned by a call to a coder’s compress() method (or produced by the CompressionExperiment instance).
- __init__(codec_name=None, codec_param_dict=None, original_path=None, compressed_path=None, compression_time_seconds=None, maximum_memory_kb=None)
- Parameters:
codec_name – codec’s reported_name
codec_param_dict – dictionary of parameters to the codec
original_path – path to the input original file
compressed_path – path to the output compressed file
compression_time_seconds – effective average compression time in seconds
maximum_memory_kb – maximum resident memory in kilobytes
- exception enb.compression.compression.DecompressionException(compressed_path=None, reconstructed_path=None, file_info=None, status=None, output=None)
Bases:
Exception
Base class for exceptions raised during decompression.
- __init__(compressed_path=None, reconstructed_path=None, file_info=None, status=None, output=None)
- class enb.compression.compression.DecompressionResults(codec_name=None, codec_param_dict=None, compressed_path=None, reconstructed_path=None, decompression_time_seconds=None, maximum_memory_kb=None)
Bases:
object
Base class that defines the minimal fields that are returned by a call to a coder’s decompress() method (or produced by the CompressionExperiment instance).
- __init__(codec_name=None, codec_param_dict=None, compressed_path=None, reconstructed_path=None, decompression_time_seconds=None, maximum_memory_kb=None)
- Parameters:
codec_name – codec’s reported_name
codec_param_dict – dictionary of parameters to the codec
compressed_path – path to the compressed file used as input for decompression
reconstructed_path – path to the reconstructed file after decompression
decompression_time_seconds – effective decompression time in seconds
maximum_memory_kb – maximum resident memory in kilobytes
- class enb.compression.compression.GeneralLosslessExperiment(codecs, dataset_paths=None, csv_experiment_path=None, csv_dataset_path=None, dataset_info_table=None, overwrite_file_properties=False, reconstructed_dir_path=None, compressed_copy_dir_path=None, task_families=None)
Bases:
LosslessCompressionExperiment
Lossless compression experiment for general data contents.
- codec_results: CompressionDecompressionWrapper | None
- column_to_properties = {'bpppc': ColumnProperties('name'='bpppc', 'fun'=<function CompressionExperiment.set_bpppc>, 'label'='Compressed data rate (bpppc)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compressed_file_sha256': ColumnProperties('name'='compressed_file_sha256', 'fun'=<function LosslessCompressionExperiment.set_comparison_results>, 'label'="Compressed file's SHA256", 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compressed_size_bytes': ColumnProperties('name'='compressed_size_bytes', 'fun'=<function CompressionExperiment.set_compressed_data_size>, 'label'='Compressed data size (Bytes)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_efficiency_1byte_entropy': ColumnProperties('name'='compression_efficiency_1byte_entropy', 'fun'=<function CompressionExperiment.set_efficiency>, 'label'='Compression efficiency 1byte entropy', 'plot_min'=0, 'plot_max'=1, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False, 'labytesel'='Compression efficiency (1bytes entropy)'), 'compression_efficiency_2byte_entropy': ColumnProperties('name'='compression_efficiency_2byte_entropy', 'fun'=<function CompressionExperiment.set_efficiency>, 'label'='Compression efficiency 2byte entropy', 'plot_min'=0, 'plot_max'=2, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False, 'labytesel'='Compression efficiency (2bytes entropy)'), 'compression_efficiency_4byte_entropy': 
ColumnProperties('name'='compression_efficiency_4byte_entropy', 'fun'=<function CompressionExperiment.set_efficiency>, 'label'='Compression efficiency 4byte entropy', 'plot_min'=0, 'plot_max'=4, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False, 'labytesel'='Compression efficiency (4bytes entropy)'), 'compression_memory_kb': ColumnProperties('name'='compression_memory_kb', 'fun'=<function LosslessCompressionExperiment.set_comparison_results>, 'label'='Compression memory usage (KB)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_ratio': ColumnProperties('name'='compression_ratio', 'fun'=<function LosslessCompressionExperiment.set_comparison_results>, 'label'='Compression ratio', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_ratio_dr': ColumnProperties('name'='compression_ratio_dr', 'fun'=<function CompressionExperiment.set_compression_ratio_dr>, 'label'='Compression ratio', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_time_seconds': ColumnProperties('name'='compression_time_seconds', 'fun'=<function LosslessCompressionExperiment.set_comparison_results>, 'label'='Compression time (s)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'decompression_memory_kb': ColumnProperties('name'='decompression_memory_kb', 'fun'=<function LosslessCompressionExperiment.set_comparison_results>, 'label'='Decompression memory 
usage (KB)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'decompression_time_seconds': ColumnProperties('name'='decompression_time_seconds', 'fun'=<function LosslessCompressionExperiment.set_comparison_results>, 'label'='Decompression time (s)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'family_label': ColumnProperties('name'='family_label', 'fun'=<function Experiment.set_family_label>, 'label'='Family label', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'lossless_reconstruction': ColumnProperties('name'='lossless_reconstruction', 'fun'=<function LosslessCompressionExperiment.set_comparison_results>, 'label'='Lossless?', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'param_dict': ColumnProperties('name'='param_dict', 'fun'=<function Experiment.set_param_dict>, 'label'='Param dict', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=True, 'has_iterable_values'=False, 'has_object_values'=False), 'repetitions': ColumnProperties('name'='repetitions', 'fun'=<function LosslessCompressionExperiment.set_comparison_results>, 'label'='Number of compression/decompression repetitions', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'task_apply_time': ColumnProperties('name'='task_apply_time', 'fun'=<function Experiment.set_task_apply_time>, 'label'='Task apply time', 'semilog_x'=False, 'semilog_y'=False, 
'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'task_label': ColumnProperties('name'='task_label', 'fun'=<function Experiment.set_task_label>, 'label'='Task label', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'task_name': ColumnProperties('name'='task_name', 'fun'=<function Experiment.set_task_name>, 'label'='Task name', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False)}
The column_to_properties attribute keeps track of what columns have been defined, and the methods that need to be called to compute them. The keys of this attribute can be used to determine the columns defined in a given class or instance. The values are ColumnProperties instances, which can be set manually after definition and before calling get_df on Analyzer subclasses.
- dataset_files_extension = ''
Default input sample extension. It affects the result of enb.atable.get_all_test_files.
- default_file_properties_table_class
alias of
GenericFilePropertiesTable
- class enb.compression.compression.GenericFilePropertiesTable(csv_support_path=None, base_dir=None)
Bases:
ImagePropertiesTable
File properties table that considers the input path as a 1D, u8be array.
- column_to_properties = {'big_endian': ColumnProperties('name'='big_endian', 'fun'=<function GenericFilePropertiesTable.set_image_geometry>, 'label'='Big endian', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'bytes_per_sample': ColumnProperties('name'='bytes_per_sample', 'fun'=<function GenericFilePropertiesTable.set_bytes_per_sample>, 'label'='Bytes per sample', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'component_count': ColumnProperties('name'='component_count', 'fun'=<function GenericFilePropertiesTable.set_image_geometry>, 'label'='Components', 'plot_min'=1, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'corpus': ColumnProperties('name'='corpus', 'fun'=<function FilePropertiesTable.set_corpus>, 'label'='Corpus name', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'dtype': ColumnProperties('name'='dtype', 'fun'=<function ImageGeometryTable.set_column_dtype>, 'label'='Numpy dtype', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'dynamic_range_bits': ColumnProperties('name'='dynamic_range_bits', 'fun'=<function ImagePropertiesTable.set_dynamic_range_bits>, 'label'='Dynamic range (bits)', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'entropy_1B_bps': ColumnProperties('name'='entropy_1B_bps', 'fun'=<function ImagePropertiesTable.set_file_entropy>, 'label'='Entropy 
(bits, 1-byte samples)', 'plot_min'=0, 'plot_max'=8, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'entropy_2B_bps': ColumnProperties('name'='entropy_2B_bps', 'fun'=<function ImagePropertiesTable.set_file_entropy>, 'label'='Entropy (bits, 2-byte samples)', 'plot_min'=0, 'plot_max'=16, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'entropy_4B_bps': ColumnProperties('name'='entropy_4B_bps', 'fun'=<function ImagePropertiesTable.set_file_entropy>, 'label'='Entropy (bits, 4-byte samples)', 'plot_min'=0, 'plot_max'=32, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'float': ColumnProperties('name'='float', 'fun'=<function GenericFilePropertiesTable.set_image_geometry>, 'label'='Float', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'height': ColumnProperties('name'='height', 'fun'=<function GenericFilePropertiesTable.set_image_geometry>, 'label'='Height', 'plot_min'=1, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'sample_max': ColumnProperties('name'='sample_max', 'fun'=<function GenericFilePropertiesTable.set_sample_stats>, 'label'='Max sample value (byte samples)', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'sample_min': ColumnProperties('name'='sample_min', 'fun'=<function GenericFilePropertiesTable.set_sample_stats>, 'label'='Min sample value (byte samples)', 'semilog_x'=False, 
'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'samples': ColumnProperties('name'='samples', 'fun'=<function ImageGeometryTable.set_samples>, 'label'='Sample count', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'sha256': ColumnProperties('name'='sha256', 'fun'=<function FilePropertiesTable.set_hash_digest>, 'label'='sha256 hex digest', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'signed': ColumnProperties('name'='signed', 'fun'=<function ImageGeometryTable.set_signed>, 'label'='Signed samples', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'size_bytes': ColumnProperties('name'='size_bytes', 'fun'=<function FilePropertiesTable.set_file_size>, 'label'='File size (bytes)', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'type_name': ColumnProperties('name'='type_name', 'fun'=<function ImageGeometryTable.set_type_name>, 'label'='Type name usable in file names', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'unique_sample_count': ColumnProperties('name'='unique_sample_count', 'fun'=<function GenericFilePropertiesTable.set_sample_stats>, 'label'='Number of different sample values', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'width': ColumnProperties('name'='width', 
'fun'=<function GenericFilePropertiesTable.set_image_geometry>, 'label'='Width', 'plot_min'=1, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False)}
The column_to_properties attribute keeps track of which columns have been defined and of the methods that need to be called to compute them. The keys of this attribute can be used to determine the columns defined in a given class or instance. The values are ColumnProperties instances, which can be set manually after definition and before calling the get_df method of Analyzer subclasses.
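The mapping described above can be pictured with a small stand-in. The `ColumnSpec` dataclass, the `ToyTable` class and the `set_file_size` placeholder below are hypothetical simplifications written for this sketch, not enb's own classes; they only mimic a dictionary from column name to a properties object whose fields can be adjusted before analysis.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ColumnSpec:
    """Hypothetical stand-in for a ColumnProperties instance (not enb's class)."""
    name: str
    fun: Optional[Callable] = None
    label: Optional[str] = None
    plot_min: Optional[float] = None
    plot_max: Optional[float] = None


def set_file_size(file_path, row):
    """Placeholder column-setting function; a real one would measure the file."""
    row["size_bytes"] = 0


class ToyTable:
    """Hypothetical table class exposing a column_to_properties mapping."""
    column_to_properties = {
        "size_bytes": ColumnSpec("size_bytes", fun=set_file_size,
                                 label="File size (bytes)"),
    }


# The keys tell which columns are defined.
defined_columns = sorted(ToyTable.column_to_properties)

# The values can be adjusted before the data are analyzed or plotted.
ToyTable.column_to_properties["size_bytes"].plot_min = 0
ToyTable.column_to_properties["size_bytes"].label = "Size (B)"
```

In this sketch the properties objects are shared at class level, so a change made this way affects every instance of the class.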
- set_bytes_per_sample(file_path, row)
Infer the number of bytes per sample from the file path.
- set_image_geometry(file_path, row)
Obtain the image’s geometry (width, height and number of components) based on the filename tags (and possibly its size)
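As a rough illustration of inferring sample size and geometry from filename tags, the sketch below parses a tag of the form `<type><bits><endianness>-<Z>x<Y>x<X>`. That tag layout, the regular expression and the `parse_name_tags` helper are assumptions made for this example; the actual tag rules used by the library may differ.

```python
import os
import re

# Assumed tag layout (hypothetical): <type><bits><endianness>-<Z>x<Y>x<X>,
# for example "scene-u16be-3x512x768.raw".
TAG_PATTERN = re.compile(
    r"(?P<kind>[usf])(?P<bits>8|16|32|64)(?P<endian>be|le)"
    r"-(?P<z>\d+)x(?P<y>\d+)x(?P<x>\d+)"
)


def parse_name_tags(file_path):
    """Return sample type and geometry inferred from the file name alone."""
    match = TAG_PATTERN.search(os.path.basename(file_path))
    if match is None:
        raise ValueError(f"No geometry tag found in {file_path!r}")
    return {
        "bytes_per_sample": int(match["bits"]) // 8,
        "float": match["kind"] == "f",
        "signed": match["kind"] != "u",
        "big_endian": match["endian"] == "be",
        "component_count": int(match["z"]),
        "height": int(match["y"]),
        "width": int(match["x"]),
    }


info = parse_name_tags("/data/scene-u16be-3x512x768.raw")
```

A consistency check between the file size and `width * height * component_count * bytes_per_sample` would be a natural next step when the size is also available.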
- set_sample_stats(file_path, row)
Set basic file statistics (unique count, min, max)
- verify_file_size = False
- class enb.compression.compression.LosslessCompressionExperiment(codecs, dataset_paths=None, csv_experiment_path=None, csv_dataset_path=None, dataset_info_table=None, overwrite_file_properties=False, reconstructed_dir_path=None, compressed_copy_dir_path=None, task_families=None)
Bases:
CompressionExperiment
Lossless data compression experiment. It fails if lossless reconstruction is not achieved.
- codec_results: CompressionDecompressionWrapper | None
- column_to_properties = {'bpppc': ColumnProperties('name'='bpppc', 'fun'=<function CompressionExperiment.set_bpppc>, 'label'='Compressed data rate (bpppc)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compressed_file_sha256': ColumnProperties('name'='compressed_file_sha256', 'fun'=<function LosslessCompressionExperiment.set_comparison_results>, 'label'="Compressed file's SHA256", 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compressed_size_bytes': ColumnProperties('name'='compressed_size_bytes', 'fun'=<function CompressionExperiment.set_compressed_data_size>, 'label'='Compressed data size (Bytes)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_efficiency_1byte_entropy': ColumnProperties('name'='compression_efficiency_1byte_entropy', 'fun'=<function CompressionExperiment.set_efficiency>, 'label'='Compression efficiency 1byte entropy', 'plot_min'=0, 'plot_max'=1, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False, 'labytesel'='Compression efficiency (1bytes entropy)'), 'compression_efficiency_2byte_entropy': ColumnProperties('name'='compression_efficiency_2byte_entropy', 'fun'=<function CompressionExperiment.set_efficiency>, 'label'='Compression efficiency 2byte entropy', 'plot_min'=0, 'plot_max'=2, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False, 'labytesel'='Compression efficiency (2bytes entropy)'), 'compression_efficiency_4byte_entropy': 
ColumnProperties('name'='compression_efficiency_4byte_entropy', 'fun'=<function CompressionExperiment.set_efficiency>, 'label'='Compression efficiency 4byte entropy', 'plot_min'=0, 'plot_max'=4, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False, 'labytesel'='Compression efficiency (4bytes entropy)'), 'compression_memory_kb': ColumnProperties('name'='compression_memory_kb', 'fun'=<function LosslessCompressionExperiment.set_comparison_results>, 'label'='Compression memory usage (KB)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_ratio': ColumnProperties('name'='compression_ratio', 'fun'=<function LosslessCompressionExperiment.set_comparison_results>, 'label'='Compression ratio', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_ratio_dr': ColumnProperties('name'='compression_ratio_dr', 'fun'=<function CompressionExperiment.set_compression_ratio_dr>, 'label'='Compression ratio', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_time_seconds': ColumnProperties('name'='compression_time_seconds', 'fun'=<function LosslessCompressionExperiment.set_comparison_results>, 'label'='Compression time (s)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'decompression_memory_kb': ColumnProperties('name'='decompression_memory_kb', 'fun'=<function LosslessCompressionExperiment.set_comparison_results>, 'label'='Decompression memory 
usage (KB)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'decompression_time_seconds': ColumnProperties('name'='decompression_time_seconds', 'fun'=<function LosslessCompressionExperiment.set_comparison_results>, 'label'='Decompression time (s)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'family_label': ColumnProperties('name'='family_label', 'fun'=<function Experiment.set_family_label>, 'label'='Family label', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'lossless_reconstruction': ColumnProperties('name'='lossless_reconstruction', 'fun'=<function LosslessCompressionExperiment.set_comparison_results>, 'label'='Lossless?', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'param_dict': ColumnProperties('name'='param_dict', 'fun'=<function Experiment.set_param_dict>, 'label'='Param dict', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=True, 'has_iterable_values'=False, 'has_object_values'=False), 'repetitions': ColumnProperties('name'='repetitions', 'fun'=<function LosslessCompressionExperiment.set_comparison_results>, 'label'='Number of compression/decompression repetitions', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'task_apply_time': ColumnProperties('name'='task_apply_time', 'fun'=<function Experiment.set_task_apply_time>, 'label'='Task apply time', 'semilog_x'=False, 'semilog_y'=False, 
'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'task_label': ColumnProperties('name'='task_label', 'fun'=<function Experiment.set_task_label>, 'label'='Task label', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'task_name': ColumnProperties('name'='task_name', 'fun'=<function Experiment.set_task_name>, 'label'='Task name', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False)}
The column_to_properties attribute keeps track of which columns have been defined and of the methods that need to be called to compute them. The keys of this attribute can be used to determine the columns defined in a given class or instance. The values are ColumnProperties instances, which can be set manually after definition and before calling the get_df method of Analyzer subclasses.
- set_comparison_results(index, row)
Perform a compression-decompression cycle and store the comparison results
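A compression-decompression cycle of this kind can be sketched with the standard library alone. The example below uses zlib as a stand-in codec and a hypothetical `lossless_round_trip` helper; it is not the library's implementation, and the result keys merely echo some of the column names listed above.

```python
import filecmp
import os
import tempfile
import time
import zlib


def lossless_round_trip(original_path, work_dir):
    """Compress, decompress and compare; fail if reconstruction is not lossless."""
    compressed_path = os.path.join(work_dir, "data.compressed")
    reconstructed_path = os.path.join(work_dir, "data.reconstructed")

    start = time.perf_counter()
    with open(original_path, "rb") as src, open(compressed_path, "wb") as dst:
        dst.write(zlib.compress(src.read()))
    compression_time = time.perf_counter() - start

    start = time.perf_counter()
    with open(compressed_path, "rb") as src, open(reconstructed_path, "wb") as dst:
        dst.write(zlib.decompress(src.read()))
    decompression_time = time.perf_counter() - start

    # Byte-by-byte comparison of the original and reconstructed files.
    lossless = filecmp.cmp(original_path, reconstructed_path, shallow=False)
    if not lossless:
        raise RuntimeError(f"Lossless reconstruction failed for {original_path}")

    return {
        "lossless_reconstruction": lossless,
        "compressed_size_bytes": os.path.getsize(compressed_path),
        # Assumed definition: original size divided by compressed size.
        "compression_ratio": os.path.getsize(original_path)
        / os.path.getsize(compressed_path),
        "compression_time_seconds": compression_time,
        "decompression_time_seconds": decompression_time,
    }


with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.raw")
    with open(sample_path, "wb") as sample_file:
        sample_file.write(bytes(range(256)) * 64)
    results = lossless_round_trip(sample_path, tmp_dir)
```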
enb.compression.fits module
FITS format manipulation tools. See https://fits.gsfc.nasa.gov/fits_documentation.html.
- class enb.compression.fits.FITSVersionTable(original_base_dir, version_base_dir)
Bases:
FileVersionTable, FilePropertiesTable
Read FITS files and convert them to raw files, sorting them by type (integer or float) and by bits per pixel.
- __init__(original_base_dir, version_base_dir)
- Parameters:
version_base_dir – path to the versioned base directory (versioned directories preserve names and structure within the base dir)
original_base_dir – path to the original directory (it must contain all indices requested later with self.get_df()). If None, options.base_dataset_dir is used.
- allowed_extensions = ['fit', 'fits']
- column_to_properties = {'corpus': ColumnProperties('name'='corpus', 'fun'=<function FileVersionTable.set_corpus>, 'label'='Corpus name', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'original_file_path': ColumnProperties('name'='original_file_path', 'fun'=<function FileVersionTable.set_original_file_path>, 'label'='Original file path', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'sha256': ColumnProperties('name'='sha256', 'fun'=<function FilePropertiesTable.set_hash_digest>, 'label'='sha256 hex digest', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'size_bytes': ColumnProperties('name'='size_bytes', 'fun'=<function FilePropertiesTable.set_file_size>, 'label'='File size (bytes)', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'version_name': ColumnProperties('name'='version_name', 'fun'=<function FileVersionTable.column_version_name>, 'label'='Version name', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'version_time': ColumnProperties('name'='version_time', 'fun'=<function FileVersionTable.set_version_time>, 'label'='Versioning time (s)', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False)}
The column_to_properties attribute keeps track of which columns have been defined and of the methods that need to be called to compute them. The keys of this attribute can be used to determine the columns defined in a given class or instance. The values are ColumnProperties instances, which can be set manually after definition and before calling the get_df method of Analyzer subclasses.
- fits_extension = 'fit'
- get_default_target_indices()
Get the list of samples in self.original_base_dir and its subdirs that have extension self.dataset_files_extension.
- original_to_versioned_path(original_path)
Get the path of the versioned file corresponding to original_path. This function will replicate the folder structure within self.original_base_dir.
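Replicating the folder structure amounts to taking the path relative to the original base directory and re-rooting it under the versioned one. The standalone `mirror_path` function below is a hypothetical sketch of that idea, not the method itself; in particular, it keeps the file name unchanged, whereas a real versioning table might also change the extension.

```python
import os


def mirror_path(original_path, original_base_dir, version_base_dir):
    """Map a file under original_base_dir to the same relative spot in version_base_dir."""
    relative_path = os.path.relpath(
        os.path.abspath(original_path), os.path.abspath(original_base_dir)
    )
    return os.path.join(os.path.abspath(version_base_dir), relative_path)


versioned = mirror_path(
    original_path="/data/fits/survey/a/img.fit",
    original_base_dir="/data/fits",
    version_base_dir="/data/raw",
)
```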
- set_version_repetitions(file_path, row)
Set the number of times the versioning process is performed.
- version(input_path, output_path, row)
Create a version of input_path and write it into output_path.
- Parameters:
input_path – path to the file to be versioned
output_path – path where the version should be saved
row – metainformation available using super().get_df for input_path
- Returns:
if not None, the time in seconds it took to perform the (forward) versioning.
- version_name = 'FitsToRaw'
- class enb.compression.fits.FITSWrapperCodec(compressor_path, decompressor_path, param_dict=None, output_invocation_dir=None, signature_in_name=False)
Bases:
WrapperCodec
Raw images are coded into FITS before compression with the wrapper, and FITS is decoded to raw after decompression.
- compress(original_path: str, compressed_path: str, original_file_info=None)
Compress original_path into compressed_path using param_dict as params.
- Parameters:
original_path – path to the original file to be compressed
compressed_path – path to the compressed file to be created
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None.
- Returns:
(optional) a CompressionResults instance, or None (see self.compression_results_from_paths)
- decompress(compressed_path, reconstructed_path, original_file_info=None)
Decompress compressed_path into reconstructed_path using param_dict as params (if needed).
- Parameters:
compressed_path – path to the input compressed file
reconstructed_path – path to the output reconstructed file
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None. Should only be actually used in special cases, since codecs are expected to store all needed metainformation in the compressed file.
- Returns:
(optional) a DecompressionResults instance, or None (see self.decompression_results_from_paths)
enb.compression.icompression module
Image compression experiment module.
- class enb.compression.icompression.GiciLibHelper
Bases:
object
Definition of helper methods that can be used with software based on the GiciLibs (see gici.uab.cat/GiciWebPage/downloads.php).
- file_info_to_data_str(original_file_info)
- file_info_to_endianness_str(original_file_info)
- get_gici_geometry_str(original_file_info)
Get a string to be passed to the -ig or -og parameters. The ‘-ig’ or ‘-og’ part is not included in the returned string.
- class enb.compression.icompression.LossyCompressionExperiment(codecs, dataset_paths=None, csv_experiment_path=None, csv_dataset_path=None, dataset_info_table=None, overwrite_file_properties=False, reconstructed_dir_path=None, compressed_copy_dir_path=None, task_families=None)
Bases:
CompressionExperiment
Lossy compression of raw image files.
- codec_results: CompressionDecompressionWrapper | None
- column_to_properties = {'bpppc': ColumnProperties('name'='bpppc', 'fun'=<function CompressionExperiment.set_bpppc>, 'label'='Compressed data rate (bpppc)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compressed_file_sha256': ColumnProperties('name'='compressed_file_sha256', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'="Compressed file's SHA256", 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compressed_size_bytes': ColumnProperties('name'='compressed_size_bytes', 'fun'=<function CompressionExperiment.set_compressed_data_size>, 'label'='Compressed data size (Bytes)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_efficiency_1byte_entropy': ColumnProperties('name'='compression_efficiency_1byte_entropy', 'fun'=<function CompressionExperiment.set_efficiency>, 'label'='Compression efficiency 1byte entropy', 'plot_min'=0, 'plot_max'=1, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False, 'labytesel'='Compression efficiency (1bytes entropy)'), 'compression_efficiency_2byte_entropy': ColumnProperties('name'='compression_efficiency_2byte_entropy', 'fun'=<function CompressionExperiment.set_efficiency>, 'label'='Compression efficiency 2byte entropy', 'plot_min'=0, 'plot_max'=2, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False, 'labytesel'='Compression efficiency (2bytes entropy)'), 'compression_efficiency_4byte_entropy': 
ColumnProperties('name'='compression_efficiency_4byte_entropy', 'fun'=<function CompressionExperiment.set_efficiency>, 'label'='Compression efficiency 4byte entropy', 'plot_min'=0, 'plot_max'=4, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False, 'labytesel'='Compression efficiency (4bytes entropy)'), 'compression_memory_kb': ColumnProperties('name'='compression_memory_kb', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Compression memory usage (KB)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_ratio': ColumnProperties('name'='compression_ratio', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Compression ratio', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_ratio_dr': ColumnProperties('name'='compression_ratio_dr', 'fun'=<function CompressionExperiment.set_compression_ratio_dr>, 'label'='Compression ratio', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_time_seconds': ColumnProperties('name'='compression_time_seconds', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Compression time (s)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'decompression_memory_kb': ColumnProperties('name'='decompression_memory_kb', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Decompression memory usage (KB)', 'plot_min'=0, 
'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'decompression_time_seconds': ColumnProperties('name'='decompression_time_seconds', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Decompression time (s)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'family_label': ColumnProperties('name'='family_label', 'fun'=<function Experiment.set_family_label>, 'label'='Family label', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'lossless_reconstruction': ColumnProperties('name'='lossless_reconstruction', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Lossless?', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'mse': ColumnProperties('name'='mse', 'fun'=<function LossyCompressionExperiment.set_MSE>, 'label'='MSE', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'pae': ColumnProperties('name'='pae', 'fun'=<function LossyCompressionExperiment.set_PAE>, 'label'='PAE', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'param_dict': ColumnProperties('name'='param_dict', 'fun'=<function Experiment.set_param_dict>, 'label'='Param dict', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=True, 'has_iterable_values'=False, 'has_object_values'=False), 'psnr_bps': 
ColumnProperties('name'='psnr_bps', 'fun'=<function LossyCompressionExperiment.set_PSNR_nominal>, 'label'='PSNR (dB)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'psnr_dr': ColumnProperties('name'='psnr_dr', 'fun'=<function LossyCompressionExperiment.set_PSNR_dynamic_range>, 'label'='PSNR (dB)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'repetitions': ColumnProperties('name'='repetitions', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Number of compression/decompression repetitions', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'task_apply_time': ColumnProperties('name'='task_apply_time', 'fun'=<function Experiment.set_task_apply_time>, 'label'='Task apply time', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'task_label': ColumnProperties('name'='task_label', 'fun'=<function Experiment.set_task_label>, 'label'='Task label', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'task_name': ColumnProperties('name'='task_name', 'fun'=<function Experiment.set_task_name>, 'label'='Task name', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False)}
The column_to_properties attribute keeps track of which columns have been defined and of the methods that need to be called to compute them. The keys of this attribute can be used to determine the columns defined in a given class or instance. The values are ColumnProperties instances, which can be set manually after definition and before calling the get_df method of Analyzer subclasses.
- set_MSE(index, row)
Set the mean squared error of the reconstructed image.
- set_PAE(index, row)
Set the peak absolute error (maximum absolute pixelwise difference) of the reconstructed image.
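The two error measures above have simple definitions, sketched here on flat sequences of sample values. The function names are hypothetical and the sketch ignores file reading and data types; it only shows the arithmetic.

```python
def mean_squared_error(original, reconstructed):
    """Average of the squared samplewise differences."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("Inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def peak_absolute_error(original, reconstructed):
    """Largest absolute samplewise difference."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("Inputs must be non-empty and of equal length")
    return max(abs(a - b) for a, b in zip(original, reconstructed))


original_samples = [10, 20, 30, 40]
reconstructed_samples = [10, 22, 29, 40]
mse = mean_squared_error(original_samples, reconstructed_samples)
pae = peak_absolute_error(original_samples, reconstructed_samples)
```

For these samples the differences are 0, -2, 1 and 0, so the MSE is 5/4 = 1.25 and the PAE is 2.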
- set_PSNR_dynamic_range(index, row)
Set the PSNR assuming dynamic range given by dynamic_range_bits.
- set_PSNR_nominal(index, row)
Set the PSNR assuming nominal dynamic range given by bytes_per_sample.
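Both PSNR variants can be read as the same formula, 10·log10(MAX² / MSE) with MAX = 2^bits − 1, differing only in where the bit depth comes from. The sketch below assumes that reading; the `psnr_db` helper and the choice to return infinity when the MSE is zero are conventions of this example, not confirmed behavior of the library.

```python
import math


def psnr_db(mse, bits):
    """PSNR in dB for a given MSE, with peak value 2**bits - 1."""
    if mse == 0:
        return math.inf  # convention of this sketch for perfect reconstruction
    max_value = 2 ** bits - 1
    return 10 * math.log10(max_value ** 2 / mse)


# Nominal dynamic range: bits derived from bytes_per_sample (here 2 bytes).
psnr_nominal = psnr_db(mse=1.25, bits=8 * 2)

# Dynamic range given by dynamic_range_bits (here 12 bits actually used).
psnr_dynamic_range = psnr_db(mse=1.25, bits=12)
```

With the same MSE, the nominal variant is never smaller than the dynamic-range variant, because the nominal bit depth is at least the number of bits actually used.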
- class enb.compression.icompression.SpectralAngleTable(codecs, dataset_paths=None, csv_experiment_path=None, csv_dataset_path=None, dataset_info_table=None, overwrite_file_properties=False, reconstructed_dir_path=None, compressed_copy_dir_path=None, task_families=None)
Bases:
LossyCompressionExperiment
Lossy compression experiment that computes spectral angle “distance” measures between the original and the reconstructed images.
Subclasses of LossyCompressionExperiment may inherit from this one to automatically add the data columns defined here.
- codec_results: CompressionDecompressionWrapper | None
- column_to_properties = {'bpppc': ColumnProperties('name'='bpppc', 'fun'=<function CompressionExperiment.set_bpppc>, 'label'='Compressed data rate (bpppc)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compressed_file_sha256': ColumnProperties('name'='compressed_file_sha256', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'="Compressed file's SHA256", 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compressed_size_bytes': ColumnProperties('name'='compressed_size_bytes', 'fun'=<function CompressionExperiment.set_compressed_data_size>, 'label'='Compressed data size (Bytes)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_efficiency_1byte_entropy': ColumnProperties('name'='compression_efficiency_1byte_entropy', 'fun'=<function CompressionExperiment.set_efficiency>, 'label'='Compression efficiency 1byte entropy', 'plot_min'=0, 'plot_max'=1, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False, 'labytesel'='Compression efficiency (1bytes entropy)'), 'compression_efficiency_2byte_entropy': ColumnProperties('name'='compression_efficiency_2byte_entropy', 'fun'=<function CompressionExperiment.set_efficiency>, 'label'='Compression efficiency 2byte entropy', 'plot_min'=0, 'plot_max'=2, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False, 'labytesel'='Compression efficiency (2bytes entropy)'), 'compression_efficiency_4byte_entropy': 
ColumnProperties('name'='compression_efficiency_4byte_entropy', 'fun'=<function CompressionExperiment.set_efficiency>, 'label'='Compression efficiency 4byte entropy', 'plot_min'=0, 'plot_max'=4, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False, 'labytesel'='Compression efficiency (4bytes entropy)'), 'compression_memory_kb': ColumnProperties('name'='compression_memory_kb', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Compression memory usage (KB)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_ratio': ColumnProperties('name'='compression_ratio', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Compression ratio', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_ratio_dr': ColumnProperties('name'='compression_ratio_dr', 'fun'=<function CompressionExperiment.set_compression_ratio_dr>, 'label'='Compression ratio', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_time_seconds': ColumnProperties('name'='compression_time_seconds', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Compression time (s)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'decompression_memory_kb': ColumnProperties('name'='decompression_memory_kb', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Decompression memory usage (KB)', 'plot_min'=0, 
'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'decompression_time_seconds': ColumnProperties('name'='decompression_time_seconds', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Decompression time (s)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'family_label': ColumnProperties('name'='family_label', 'fun'=<function Experiment.set_family_label>, 'label'='Family label', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'lossless_reconstruction': ColumnProperties('name'='lossless_reconstruction', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Lossless?', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'max_spectral_angle_deg': ColumnProperties('name'='max_spectral_angle_deg', 'fun'=<function SpectralAngleTable.set_spectral_distances>, 'label'='Max spectral angle (deg)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'mean_spectral_angle_deg': ColumnProperties('name'='mean_spectral_angle_deg', 'fun'=<function SpectralAngleTable.set_spectral_distances>, 'label'='Mean spectral angle (deg)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'mse': ColumnProperties('name'='mse', 'fun'=<function LossyCompressionExperiment.set_MSE>, 'label'='MSE', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 
'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'pae': ColumnProperties('name'='pae', 'fun'=<function LossyCompressionExperiment.set_PAE>, 'label'='PAE', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'param_dict': ColumnProperties('name'='param_dict', 'fun'=<function Experiment.set_param_dict>, 'label'='Param dict', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=True, 'has_iterable_values'=False, 'has_object_values'=False), 'psnr_bps': ColumnProperties('name'='psnr_bps', 'fun'=<function LossyCompressionExperiment.set_PSNR_nominal>, 'label'='PSNR (dB)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'psnr_dr': ColumnProperties('name'='psnr_dr', 'fun'=<function LossyCompressionExperiment.set_PSNR_dynamic_range>, 'label'='PSNR (dB)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'repetitions': ColumnProperties('name'='repetitions', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Number of compression/decompression repetitions', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'task_apply_time': ColumnProperties('name'='task_apply_time', 'fun'=<function Experiment.set_task_apply_time>, 'label'='Task apply time', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'task_label': 
ColumnProperties('name'='task_label', 'fun'=<function Experiment.set_task_label>, 'label'='Task label', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'task_name': ColumnProperties('name'='task_name', 'fun'=<function Experiment.set_task_name>, 'label'='Task name', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False)}
The column_properties attribute keeps track of which columns have been defined, and the methods that need to be called to compute them. The keys of this attribute can be used to determine the columns defined in a given class or instance. The values are |ColumnProperties| instances, which can be set manually after definition and before calling |Analyzer| subclasses’ get_df.
- get_spectral_angles_deg(index, row)
Return a sequence of spectral angles (in degrees), one per (x,y) position in the image, flattened in raster order.
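As a rough illustration of the quantity described above, the following standard-library sketch computes one spectral angle per pixel. It assumes the angle is measured between each original pixel's spectral vector and its reconstructed counterpart; the function names and the handling of all-zero pixels are assumptions of this sketch, not enb's implementation.

```python
import math


def spectral_angle_deg(original_pixel, reconstructed_pixel):
    """Angle in degrees between two spectral vectors (one value per band)."""
    dot = sum(a * b for a, b in zip(original_pixel, reconstructed_pixel))
    norm_a = math.sqrt(sum(a * a for a in original_pixel))
    norm_b = math.sqrt(sum(b * b for b in reconstructed_pixel))
    if norm_a == 0 or norm_b == 0:
        # Assumption of this sketch: an all-zero pixel yields an angle of 0.
        return 0.0
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cosine))


def spectral_angles_deg(original_pixels, reconstructed_pixels):
    """Both inputs are lists of pixels in raster order; each pixel is a
    sequence with one sample per spectral band."""
    return [spectral_angle_deg(p, q)
            for p, q in zip(original_pixels, reconstructed_pixels)]
```

Columns such as max_spectral_angle_deg and mean_spectral_angle_deg would then correspond to the maximum and the mean of the returned list.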
- set_spectral_distances(index, row)
- class enb.compression.icompression.StructuralSimilarity(codecs, dataset_paths=None, csv_experiment_path=None, csv_dataset_path=None, dataset_info_table=None, overwrite_file_properties=False, reconstructed_dir_path=None, compressed_copy_dir_path=None, task_families=None)
Bases:
CompressionExperiment
Set the Structural Similarity (SSIM) and Multi-Scale Structural Similarity metrics (MS-SSIM) to measure the similarity between two images.
- Authors:
- codec_results: CompressionDecompressionWrapper | None
- column_to_properties = {'bpppc': ColumnProperties('name'='bpppc', 'fun'=<function CompressionExperiment.set_bpppc>, 'label'='Compressed data rate (bpppc)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compressed_file_sha256': ColumnProperties('name'='compressed_file_sha256', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'="Compressed file's SHA256", 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compressed_size_bytes': ColumnProperties('name'='compressed_size_bytes', 'fun'=<function CompressionExperiment.set_compressed_data_size>, 'label'='Compressed data size (Bytes)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_efficiency_1byte_entropy': ColumnProperties('name'='compression_efficiency_1byte_entropy', 'fun'=<function CompressionExperiment.set_efficiency>, 'label'='Compression efficiency 1byte entropy', 'plot_min'=0, 'plot_max'=1, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False, 'labytesel'='Compression efficiency (1bytes entropy)'), 'compression_efficiency_2byte_entropy': ColumnProperties('name'='compression_efficiency_2byte_entropy', 'fun'=<function CompressionExperiment.set_efficiency>, 'label'='Compression efficiency 2byte entropy', 'plot_min'=0, 'plot_max'=2, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False, 'labytesel'='Compression efficiency (2bytes entropy)'), 'compression_efficiency_4byte_entropy': 
ColumnProperties('name'='compression_efficiency_4byte_entropy', 'fun'=<function CompressionExperiment.set_efficiency>, 'label'='Compression efficiency 4byte entropy', 'plot_min'=0, 'plot_max'=4, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False, 'labytesel'='Compression efficiency (4bytes entropy)'), 'compression_memory_kb': ColumnProperties('name'='compression_memory_kb', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Compression memory usage (KB)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_ratio': ColumnProperties('name'='compression_ratio', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Compression ratio', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_ratio_dr': ColumnProperties('name'='compression_ratio_dr', 'fun'=<function CompressionExperiment.set_compression_ratio_dr>, 'label'='Compression ratio', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'compression_time_seconds': ColumnProperties('name'='compression_time_seconds', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Compression time (s)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'decompression_memory_kb': ColumnProperties('name'='decompression_memory_kb', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Decompression memory usage (KB)', 'plot_min'=0, 
'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'decompression_time_seconds': ColumnProperties('name'='decompression_time_seconds', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Decompression time (s)', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'family_label': ColumnProperties('name'='family_label', 'fun'=<function Experiment.set_family_label>, 'label'='Family label', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'lossless_reconstruction': ColumnProperties('name'='lossless_reconstruction', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Lossless?', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'ms_ssim': ColumnProperties('name'='ms_ssim', 'fun'=<function StructuralSimilarity.set_StructuralSimilarity>, 'label'='MS-SSIM', 'plot_max'=1, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'param_dict': ColumnProperties('name'='param_dict', 'fun'=<function Experiment.set_param_dict>, 'label'='Param dict', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=True, 'has_iterable_values'=False, 'has_object_values'=False), 'repetitions': ColumnProperties('name'='repetitions', 'fun'=<function CompressionExperiment.set_comparison_results>, 'label'='Number of compression/decompression repetitions', 'plot_min'=0, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 
'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'ssim': ColumnProperties('name'='ssim', 'fun'=<function StructuralSimilarity.set_StructuralSimilarity>, 'label'='SSIM', 'plot_max'=1, 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'task_apply_time': ColumnProperties('name'='task_apply_time', 'fun'=<function Experiment.set_task_apply_time>, 'label'='Task apply time', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'task_label': ColumnProperties('name'='task_label', 'fun'=<function Experiment.set_task_label>, 'label'='Task label', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'task_name': ColumnProperties('name'='task_name', 'fun'=<function Experiment.set_task_name>, 'label'='Task name', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False)}
The column_properties attribute keeps track of which columns have been defined, and the methods that need to be called to compute them. The keys of this attribute can be used to determine the columns defined in a given class or instance. The values are |ColumnProperties| instances, which can be set manually after definition and before calling |Analyzer| subclasses’ get_df.
- compute_SSIM(img1, img2, max_val=255, filter_size=11, filter_sigma=1.5, k1=0.01, k2=0.03, full=False)
Return the Structural Similarity Map between img1 and img2.
This function attempts to match the functionality of ssim_index_new.m by Zhou Wang: http://www.cns.nyu.edu/~lcv/ssim/msssim.zip
Author’s Python implementation: https://github.com/dashayushman/TAC-GAN/blob/master/msssim.py
- Parameters:
img1 – Numpy array holding the first RGB image batch.
img2 – Numpy array holding the second RGB image batch.
max_val – the dynamic range of the images (i.e., the difference between the maximum and the minimum allowed values).
filter_size – Size of blur kernel to use (will be reduced for small images).
filter_sigma – Standard deviation for Gaussian blur kernel (will be reduced for small images).
k1 – Constant used to maintain stability in the SSIM calculation (0.01 in the original paper).
k2 – Constant used to maintain stability in the SSIM calculation (0.03 in the original paper).
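To make the role of max_val, k1 and k2 concrete, here is a deliberately simplified sketch of the SSIM formula. It computes a single score from whole-image statistics, whereas the documented method uses a Gaussian window (filter_size, filter_sigma) to produce local statistics. The function name global_ssim and the flat-list input are assumptions of this sketch.

```python
def global_ssim(img1, img2, max_val=255, k1=0.01, k2=0.03):
    """SSIM computed from whole-image statistics (no Gaussian windowing).

    img1 and img2 are flat sequences of samples of equal, non-zero length.
    """
    n = len(img1)
    if n == 0 or n != len(img2):
        raise ValueError("images must be non-empty and of equal size")
    # Stabilizing constants, scaled by the dynamic range.
    c1 = (k1 * max_val) ** 2
    c2 = (k2 * max_val) ** 2
    mu1 = sum(img1) / n
    mu2 = sum(img2) / n
    var1 = sum((a - mu1) ** 2 for a in img1) / n
    var2 = sum((b - mu2) ** 2 for b in img2) / n
    cov = sum((a - mu1) * (b - mu2) for a, b in zip(img1, img2)) / n
    numerator = (2 * mu1 * mu2 + c1) * (2 * cov + c2)
    denominator = (mu1 ** 2 + mu2 ** 2 + c1) * (var1 + var2 + c2)
    return numerator / denominator
```

Identical inputs give a score of 1; the constants c1 and c2 keep the ratio well defined when means or variances are close to zero.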
- cumpute_MSSIM(img1, img2, max_val=255, filter_size=11, filter_sigma=1.5, k1=0.01, k2=0.03, weights=None)
Return the MS-SSIM score between img1 and img2.
This function implements Multi-Scale Structural Similarity (MS-SSIM) Image Quality Assessment according to Zhou Wang’s paper, “Multi-scale structural similarity for image quality assessment” (2003). Link: https://ece.uwaterloo.ca/~z70wang/publications/msssim.pdf
Author’s MATLAB implementation: http://www.cns.nyu.edu/~lcv/ssim/msssim.zip
Author’s Python implementation: https://github.com/dashayushman/TAC-GAN/blob/master/msssim.py
Authors documentation:
- Parameters:
img1 – Numpy array holding the first RGB image batch.
img2 – Numpy array holding the second RGB image batch.
max_val – the dynamic range of the images (i.e., the difference between the maximum and the minimum allowed values).
filter_size – Size of blur kernel to use (will be reduced for small images).
filter_sigma – Standard deviation for Gaussian blur kernel (will be reduced for small images).
k1 – Constant used to maintain stability in the SSIM calculation (0.01 in the original paper).
k2 – Constant used to maintain stability in the SSIM calculation (0.03 in the original paper).
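The final combination step of MS-SSIM can be sketched as follows, assuming the per-scale values have already been computed. The weights are the five-scale values reported in Wang's 2003 paper; whether enb uses them when weights=None is an assumption here, and combine_ms_ssim is a hypothetical helper name.

```python
# Five-scale weights reported in Wang et al. (2003).
DEFAULT_WEIGHTS = (0.0448, 0.2856, 0.3001, 0.2363, 0.1333)


def combine_ms_ssim(mcs_per_scale, ssim_coarsest, weights=DEFAULT_WEIGHTS):
    """Combine per-scale measurements into a single MS-SSIM score.

    mcs_per_scale: mean contrast-structure values for every scale except
        the coarsest one, ordered from finest to coarsest.
    ssim_coarsest: mean SSIM value at the coarsest scale.
    All values are assumed to be positive in this sketch.
    """
    if len(mcs_per_scale) != len(weights) - 1:
        raise ValueError("expected one value per scale except the coarsest")
    score = ssim_coarsest ** weights[-1]
    for mcs, weight in zip(mcs_per_scale, weights[:-1]):
        score *= mcs ** weight
    return score
```

Between scales, both images are typically low-pass filtered and downsampled by a factor of two before the next measurement is taken; that step is omitted here.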
- set_StructuralSimilarity(index, row)
enb.compression.jpg module
JPEG manipulation (e.g., curation) tools.
- class enb.compression.jpg.JPEGCurationTable(original_base_dir, version_base_dir, csv_support_path=None)
Bases:
PNGCurationTable
Given a directory tree containing JPEG images, copy those images into a new directory tree in raw BSQ format adding geometry information tags to the output names recognized by enb.isets.load_array_bsq.
- column_to_properties = {'corpus': ColumnProperties('name'='corpus', 'fun'=<function FileVersionTable.set_corpus>, 'label'='Corpus name', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'original_file_path': ColumnProperties('name'='original_file_path', 'fun'=<function FileVersionTable.set_original_file_path>, 'label'='Original file path', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'sha256': ColumnProperties('name'='sha256', 'fun'=<function FilePropertiesTable.set_hash_digest>, 'label'='sha256 hex digest', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'size_bytes': ColumnProperties('name'='size_bytes', 'fun'=<function FilePropertiesTable.set_file_size>, 'label'='File size (bytes)', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'version_name': ColumnProperties('name'='version_name', 'fun'=<function FileVersionTable.column_version_name>, 'label'='Version name', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'version_time': ColumnProperties('name'='version_time', 'fun'=<function FileVersionTable.set_version_time>, 'label'='Versioning time (s)', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False)}
The column_properties attribute keeps track of which columns have been defined, and the methods that need to be called to compute them. The keys of this attribute can be used to determine the columns defined in a given class or instance. The values are |ColumnProperties| instances, which can be set manually after definition and before calling |Analyzer| subclasses’ get_df.
- dataset_files_extension = 'jpg'
Default input sample extension. It affects the result of enb.atable.get_all_test_files.
enb.compression.pgm module
Module to handle PGM (P5) and PPM (P6) images
- class enb.compression.pgm.PGMCurationTable(original_base_dir, version_base_dir, csv_support_path=None)
Bases:
PNGCurationTable
Given a directory tree containing PGM images, copy those images into a new directory tree in raw BSQ format adding geometry information tags to the output names recognized by enb.isets.load_array_bsq.
- column_to_properties = {'corpus': ColumnProperties('name'='corpus', 'fun'=<function FileVersionTable.set_corpus>, 'label'='Corpus name', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'original_file_path': ColumnProperties('name'='original_file_path', 'fun'=<function FileVersionTable.set_original_file_path>, 'label'='Original file path', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'sha256': ColumnProperties('name'='sha256', 'fun'=<function FilePropertiesTable.set_hash_digest>, 'label'='sha256 hex digest', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'size_bytes': ColumnProperties('name'='size_bytes', 'fun'=<function FilePropertiesTable.set_file_size>, 'label'='File size (bytes)', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'version_name': ColumnProperties('name'='version_name', 'fun'=<function FileVersionTable.column_version_name>, 'label'='Version name', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'version_time': ColumnProperties('name'='version_time', 'fun'=<function FileVersionTable.set_version_time>, 'label'='Versioning time (s)', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False)}
The column_properties attribute keeps track of which columns have been defined, and the methods that need to be called to compute them. The keys of this attribute can be used to determine the columns defined in a given class or instance. The values are |ColumnProperties| instances, which can be set manually after definition and before calling |Analyzer| subclasses’ get_df.
- dataset_files_extension = 'pgm'
Default input sample extension. It affects the result of enb.atable.get_all_test_files.
- class enb.compression.pgm.PGMWrapperCodec(compressor_path, decompressor_path, param_dict=None, output_invocation_dir=None, signature_in_name=False)
Bases:
WrapperCodec
Raw images are coded into PGM before compression with the wrapper, and PGM is decoded to raw after decompression.
- compress(original_path: str, compressed_path: str, original_file_info=None)
Compress original_path into compressed_path using param_dict as params.
- Parameters:
original_path – path to the original file to be compressed
compressed_path – path to the compressed file to be created
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None.
- Returns:
(optional) a CompressionResults instance, or None (see self.compression_results_from_paths)
- decompress(compressed_path, reconstructed_path, original_file_info=None)
Decompress compressed_path into reconstructed_path using param_dict as params (if needed).
- Parameters:
compressed_path – path to the input compressed file
reconstructed_path – path to the output reconstructed file
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None. Should only be actually used in special cases, since codecs are expected to store all needed metainformation in the compressed file.
- Returns:
(optional) a DecompressionResults instance, or None (see self.decompression_results_from_paths)
- class enb.compression.pgm.PPMCurationTable(original_base_dir, version_base_dir, csv_support_path=None)
Bases:
PNGCurationTable
Given a directory tree containing PPM images, copy those images into a new directory tree in raw BSQ format adding geometry information tags to the output names recognized by enb.isets.load_array_bsq.
- column_to_properties = {'corpus': ColumnProperties('name'='corpus', 'fun'=<function FileVersionTable.set_corpus>, 'label'='Corpus name', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'original_file_path': ColumnProperties('name'='original_file_path', 'fun'=<function FileVersionTable.set_original_file_path>, 'label'='Original file path', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'sha256': ColumnProperties('name'='sha256', 'fun'=<function FilePropertiesTable.set_hash_digest>, 'label'='sha256 hex digest', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'size_bytes': ColumnProperties('name'='size_bytes', 'fun'=<function FilePropertiesTable.set_file_size>, 'label'='File size (bytes)', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'version_name': ColumnProperties('name'='version_name', 'fun'=<function FileVersionTable.column_version_name>, 'label'='Version name', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'version_time': ColumnProperties('name'='version_time', 'fun'=<function FileVersionTable.set_version_time>, 'label'='Versioning time (s)', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False)}
The column_properties attribute keeps track of which columns have been defined, and the methods that need to be called to compute them. The keys of this attribute can be used to determine the columns defined in a given class or instance. The values are |ColumnProperties| instances, which can be set manually after definition and before calling |Analyzer| subclasses’ get_df.
- dataset_files_extension = 'ppm'
Default input sample extension. It affects the result of enb.atable.get_all_test_files.
- enb.compression.pgm.pgm_to_raw(input_path, output_path)
Read a file in PGM format and write its contents in raw format, which does not include any geometry or data type information.
- enb.compression.pgm.ppm_to_raw(input_path, output_path)
Read a file in PPM format and write its contents in raw format, which does not include any geometry or data type information.
- enb.compression.pgm.read_pgm(input_path, byteorder='>')
Return image data from a raw PGM file as a numpy array. Format specification: http://netpbm.sourceforge.net/doc/pgm.html
(From answer: https://stackoverflow.com/questions/7368739/numpy-and-16-bit-pgm)
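For readers unfamiliar with the format, the sketch below parses a binary (P5) PGM stream using only the standard library. It is not enb's read_pgm: it returns plain Python lists instead of a numpy array, it does not handle header comment lines, and the name parse_pgm_bytes is hypothetical.

```python
import re


def parse_pgm_bytes(data, byteorder=">"):
    """Parse binary PGM (P5) contents.

    Returns (width, height, maxval, rows), where rows is a list of
    `height` lists holding `width` integer samples each.
    """
    # Header: magic number, width, height and maxval separated by
    # whitespace, then a single whitespace byte before the samples.
    match = re.match(rb"P5\s+(\d+)\s+(\d+)\s+(\d+)\s", data)
    if match is None:
        raise ValueError("not a binary PGM (P5) stream")
    width, height, maxval = (int(group) for group in match.groups())
    # Samples take one byte when maxval < 256, two bytes otherwise.
    bytes_per_sample = 1 if maxval < 256 else 2
    order = "big" if byteorder == ">" else "little"
    expected = width * height * bytes_per_sample
    payload = data[match.end():match.end() + expected]
    if len(payload) < expected:
        raise ValueError("truncated sample data")
    samples = [int.from_bytes(payload[i:i + bytes_per_sample], order)
               for i in range(0, expected, bytes_per_sample)]
    rows = [samples[y * width:(y + 1) * width] for y in range(height)]
    return width, height, maxval, rows
```

The byteorder argument mirrors the documented signature: '>' selects big-endian samples, which is what the PGM specification prescribes for two-byte data.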
- enb.compression.pgm.read_ppm(input_path, byteorder='>')
Return image data from a raw PPM file as a numpy array. Format specification: http://netpbm.sourceforge.net/doc/pgm.html
(From answer: https://stackoverflow.com/questions/7368739/numpy-and-16-bit-pgm)
- enb.compression.pgm.write_pgm(array, bytes_per_sample, output_path, byteorder='>')
Write a 2D array indexed with [x,y] into output_path with PGM format.
- enb.compression.pgm.write_ppm(array, bytes_per_sample, output_path)
Write a 3-component 3D array indexed with [x,y,z] into output_path with PPM format.
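The [x,y,z] indexing convention means the array must be traversed row by row (y outermost) when serializing, because PPM stores pixels in raster order with the three components of each pixel interleaved. A minimal sketch, using nested lists instead of a numpy array and a hypothetical function name:

```python
def ppm_bytes(array, bytes_per_sample=1):
    """Serialize a 3-component image indexed as array[x][y][z] into
    binary PPM (P6) contents."""
    if bytes_per_sample not in (1, 2):
        raise ValueError("PPM samples take 1 or 2 bytes")
    width = len(array)
    height = len(array[0])
    maxval = 255 if bytes_per_sample == 1 else 65535
    chunks = [b"P6\n%d %d\n%d\n" % (width, height, maxval)]
    for y in range(height):          # rows, top to bottom
        for x in range(width):       # pixels, left to right
            for z in range(3):       # components of one pixel
                # Two-byte samples are stored most significant byte first.
                chunks.append(array[x][y][z].to_bytes(bytes_per_sample, "big"))
    return b"".join(chunks)
```

Writing the returned bytes to a file opened in binary mode yields a valid P6 file.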
enb.compression.png module
PNG manipulation (e.g., curation) tools.
- class enb.compression.png.PDFToPNG(input_pdf_dir, output_png_dir, csv_support_path=None)
Bases:
FileVersionTable
Take all .pdf files in input dir and save them as .png files into output_dir, maintaining the relative folder structure.
- __init__(input_pdf_dir, output_png_dir, csv_support_path=None)
- Parameters:
version_base_dir – path to the versioned base directory (versioned directories preserve names and structure within the base dir)
version_name – arbitrary name of this file version
original_base_dir – path to the original directory (it must contain all indices requested later with self.get_df()). If None, enb.config.options.base_dataset_dir is used
original_properties_table – instance of the file properties subclass to be used when reading the original data to be versioned. If None, a FilePropertiesTable is instanced automatically.
csv_support_path – path to the file where results (of the versioned data) are to be long-term stored. If None, one is assigned by default based on options.persistence_dir.
check_generated_files – if True, the table checks that each call to version() produces a file to output_path. Set to false to create arbitrarily named output files.
- column_to_properties = {'corpus': ColumnProperties('name'='corpus', 'fun'=<function FileVersionTable.set_corpus>, 'label'='Corpus name', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'original_file_path': ColumnProperties('name'='original_file_path', 'fun'=<function FileVersionTable.set_original_file_path>, 'label'='Original file path', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'sha256': ColumnProperties('name'='sha256', 'fun'=<function FilePropertiesTable.set_hash_digest>, 'label'='sha256 hex digest', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'size_bytes': ColumnProperties('name'='size_bytes', 'fun'=<function FilePropertiesTable.set_file_size>, 'label'='File size (bytes)', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'version_name': ColumnProperties('name'='version_name', 'fun'=<function FileVersionTable.column_version_name>, 'label'='Version name', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'version_time': ColumnProperties('name'='version_time', 'fun'=<function FileVersionTable.set_version_time>, 'label'='Versioning time (s)', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False)}
The column_properties attribute keeps track of which columns have been defined, and the methods that need to be called to compute them. The keys of this attribute can be used to determine the columns defined in a given class or instance. The values are |ColumnProperties| instances, which can be set manually after definition and before calling |Analyzer| subclasses’ get_df.
- dataset_files_extension = 'pdf'
Default input sample extension. It affects the result of enb.atable.get_all_test_files.
- version(input_path, output_path, row)
Create a version of input_path and write it into output_path.
- Parameters:
input_path – path to the file to be versioned
output_path – path where the version should be saved
row – metainformation available using super().get_df for input_path
- Returns:
if not None, the time in seconds it took to perform the (forward) versioning.
- class enb.compression.png.PNGCurationTable(original_base_dir, version_base_dir, csv_support_path=None)
Bases:
FileVersionTable
Given a directory tree containing PNG images, copy those images into a new directory tree in raw BSQ format adding geometry information tags to the output names recognized by load_array_bsq.
- __init__(original_base_dir, version_base_dir, csv_support_path=None)
- Parameters:
original_base_dir – path to the original directory (it must contain all indices requested later with self.get_df()). If None, options.base_dataset_dir is used
version_base_dir – path to the versioned base directory (versioned directories preserve names and structure within the base dir)
csv_support_path – path to the file where results (of the versioned data) are to be long-term stored. If None, one is assigned by default based on options.persistence_dir.
- column_to_properties = {'corpus': ColumnProperties('name'='corpus', 'fun'=<function FileVersionTable.set_corpus>, 'label'='Corpus name', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'original_file_path': ColumnProperties('name'='original_file_path', 'fun'=<function FileVersionTable.set_original_file_path>, 'label'='Original file path', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'sha256': ColumnProperties('name'='sha256', 'fun'=<function FilePropertiesTable.set_hash_digest>, 'label'='sha256 hex digest', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'size_bytes': ColumnProperties('name'='size_bytes', 'fun'=<function FilePropertiesTable.set_file_size>, 'label'='File size (bytes)', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'version_name': ColumnProperties('name'='version_name', 'fun'=<function FileVersionTable.column_version_name>, 'label'='Version name', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False), 'version_time': ColumnProperties('name'='version_time', 'fun'=<function FileVersionTable.set_version_time>, 'label'='Versioning time (s)', 'semilog_x'=False, 'semilog_y'=False, 'semilog_x_base'=10, 'semilog_y_base'=10, 'has_dict_values'=False, 'has_iterable_values'=False, 'has_object_values'=False)}
The column_properties attribute keeps track of which columns have been defined, and the methods that need to be called to compute them. The keys of this attribute can be used to determine the columns defined in a given class or instance. The values are |ColumnProperties| instances, which can be set manually after definition and before calling |Analyzer| subclasses’ get_df.
- dataset_files_extension = 'png'
Default input sample extension. It affects the result of enb.atable.get_all_test_files.
- version(input_path, output_path, row)
Transform PNG files into raw images with name tags recognized by isets.
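Raw BSQ (band sequential) output stores all samples of the first component, then all samples of the second, and so on, whereas decoded PNG data is usually pixel-interleaved. The reordering can be sketched as follows; the function name is hypothetical and the name tags that enb appends to the output file are not reproduced here.

```python
def interleaved_to_bsq(samples, component_count):
    """Reorder pixel-interleaved samples (R,G,B,R,G,B,...) into band
    sequential order (all R, then all G, then all B)."""
    if len(samples) % component_count != 0:
        raise ValueError("sample count must be a multiple of component_count")
    return [samples[i]
            for z in range(component_count)
            for i in range(z, len(samples), component_count)]
```

For example, the two RGB pixels (1, 2, 3) and (4, 5, 6) are stored as 1, 4, 2, 5, 3, 6 in BSQ order.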
- class enb.compression.png.PNGWrapperCodec(compressor_path, decompressor_path, param_dict=None, output_invocation_dir=None, signature_in_name=False)
Bases:
WrapperCodec
Raw images are coded into PNG before compression with the wrapper, and PNG is decoded to raw after decompression.
- compress(original_path: str, compressed_path: str, original_file_info=None)
Compress original_path into compressed_path using param_dict as params.
- Parameters:
original_path – path to the original file to be compressed
compressed_path – path to the compressed file to be created
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None.
- Returns:
(optional) a CompressionResults instance, or None (see self.compression_results_from_paths)
- decompress(compressed_path, reconstructed_path, original_file_info=None)
Decompress compressed_path into reconstructed_path using param_dict as params (if needed).
- Parameters:
compressed_path – path to the input compressed file
reconstructed_path – path to the output reconstructed file
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None. Should only be actually used in special cases, since codecs are expected to store all needed metainformation in the compressed file.
- Returns:
(optional) a DecompressionResults instance, or None (see self.decompression_results_from_paths)
- enb.compression.png.pdf_to_png(input_dir, output_dir)
Take all .pdf files in input dir and save them as .png files into output_dir, maintaining the relative folder structure.
It is perfectly valid for input_dir and output_dir to point to the same location, but input_dir must exist beforehand.
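The "relative folder structure" behaviour can be illustrated with a path-mapping sketch that performs no PDF rendering at all. Both function names are hypothetical, and the case-insensitive extension match is an assumption of this sketch.

```python
import os


def mirrored_png_path(pdf_path, input_dir, output_dir):
    """Map a .pdf path under input_dir to the .png path that occupies
    the same relative position under output_dir."""
    relative_path = os.path.relpath(pdf_path, input_dir)
    stem, _ = os.path.splitext(relative_path)
    return os.path.join(output_dir, stem + ".png")


def list_conversions(input_dir, output_dir):
    """Return (pdf_path, png_path) pairs for every .pdf under input_dir."""
    pairs = []
    for dir_path, _, file_names in os.walk(input_dir):
        for name in sorted(file_names):
            if name.lower().endswith(".pdf"):
                pdf_path = os.path.join(dir_path, name)
                pairs.append(
                    (pdf_path, mirrored_png_path(pdf_path, input_dir, output_dir)))
    return pairs
```

When input_dir and output_dir are the same, each .png simply lands next to its source .pdf.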
- enb.compression.png.raw_path_to_png(raw_path, png_path, image_properties_row=None)
Render a uint8 or uint16 raw image with 1, 3 or 4 components.
- Parameters:
raw_path – path to the image in raw format to render in png.
png_path – path where the png file is to be stored.
image_properties_row – if raw_path does not contain geometry information, this parameter should be a dict-like object that indicates width, height, number of components, bytes per sample, signedness and endianness if applicable.
- enb.compression.png.render_array_png(img, png_path)
Render a uint8 or uint16 image with 1, 3 or 4 components.
- Parameters:
img – image array indexed by [x,y,z].
png_path – path where the png file is to be stored.
enb.compression.tarlite module
Lite archiving format to write several files into a single one.
- class enb.compression.tarlite.TarliteReader(tarlite_path)
Bases:
object
Extract files created by TarliteWriter.
- __init__(tarlite_path)
- extract_all(output_dir_path)
Extract all files to output_dir_path.
- class enb.compression.tarlite.TarliteWriter(initial_input_paths=None)
Bases:
object
Input a series of file paths and output a single file with the contents of all inputs, plus some meta-information to reconstruct them. Files are stored flatly, i.e., only names are stored, discarding any information about their directory structure. Therefore, it is not possible to store two files with the same name, even if their input paths point to different files.
- __init__(initial_input_paths=None)
- add_file(input_path)
Add a file path to the list of pending ones. Note that files are not read until the write() method is invoked.
- write(output_path)
Save the current list of input paths into output_path.
- enb.compression.tarlite.tarlite_files(input_paths, output_tarlite_path)
Take a list of input paths and combine them into a single tarlite file.
- enb.compression.tarlite.untarlite_files(input_tarlite_path, output_dir_path)
Take a tarlite file and output the contents into the given directory. The file names are preserved.
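The flat storage model described for TarliteWriter and TarliteReader can be sketched with a toy archive format. The layout used here (a 4-byte header length, a JSON index of names and sizes, then the concatenated contents) is an assumption made for illustration and is not the actual tarlite on-disk format. The function names are also hypothetical.

```python
import json
import os
import struct


def write_flat_archive(input_paths, output_path):
    """Pack files into one archive, keeping only their base names."""
    names = [os.path.basename(p) for p in input_paths]
    if len(set(names)) != len(names):
        # Directory structure is discarded, so equal names would collide.
        raise ValueError("two input files share the same name")
    index = [[name, os.path.getsize(p)] for name, p in zip(names, input_paths)]
    header = json.dumps(index).encode("utf-8")
    with open(output_path, "wb") as archive:
        # Assumed layout: header length, JSON index, then raw contents.
        archive.write(struct.pack(">I", len(header)))
        archive.write(header)
        for path in input_paths:
            with open(path, "rb") as source:
                archive.write(source.read())


def extract_flat_archive(archive_path, output_dir_path):
    """Restore every stored file into output_dir_path under its stored name."""
    os.makedirs(output_dir_path, exist_ok=True)
    with open(archive_path, "rb") as archive:
        (header_length,) = struct.unpack(">I", archive.read(4))
        index = json.loads(archive.read(header_length).decode("utf-8"))
        for name, size in index:
            with open(os.path.join(output_dir_path, name), "wb") as target:
                target.write(archive.read(size))
```

The duplicate-name check mirrors the restriction stated above: since only names are stored, two inputs called the same would overwrite each other on extraction.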
enb.compression.wrapper module
Wrapper codec classes.
Existing codec implementations (including non-python binaries) can be easily added to enb via WrapperCodec (sub)classes.
- class enb.compression.wrapper.JavaWrapperCodec(compressor_jar, decompressor_jar, param_dict=None)
Bases:
WrapperCodec
Wrapper for *.jar codecs. The compression and decompression parameters are those that need to be passed to the ‘java’ command.
The compressor_jar and decompressor_jar attributes are added upon initialization based on the params to __init__.
- __init__(compressor_jar, decompressor_jar, param_dict=None)
- Parameters:
compressor_jar – path to the .jar file to be used for compression
decompressor_jar – path to the .jar file to be used for decompression
param_dict – name-value mapping of the parameters to be used for compression
output_invocation_dir – if not None, invocation strings are stored in this directory with name based on the codec and the sample’s full path.
signature_in_name – if True, the default codec name includes part of the hexdigest of the compressor and decompressor binaries being used
- class enb.compression.wrapper.LittleEndianWrapper(compressor_path, decompressor_path, param_dict=None, output_invocation_dir=None, signature_in_name=False)
Bases:
WrapperCodec
Wrapper with identical semantics as WrapperCodec, but performs a big endian to little endian conversion for (big-endian) 2-byte and 4-byte samples. If the input is flagged as little endian, e.g., if -u16le- is in the original file name, then no transformation is performed.
Codecs inheriting from this class automatically receive little-endian samples, and are expected to reconstruct little-endian files (which are then translated back to big endian if and only if the original image was flagged as big endian).
- compress(original_path: str, compressed_path: str, original_file_info=None)
Compress original_path into compressed_path using param_dict as params.
- Parameters:
original_path – path to the original file to be compressed
compressed_path – path to the compressed file to be created
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None.
- Returns:
(optional) a CompressionResults instance, or None (see self.compression_results_from_paths)
- decompress(compressed_path, reconstructed_path, original_file_info=None)
Decompress compressed_path into reconstructed_path using param_dict as params (if needed).
- Parameters:
compressed_path – path to the input compressed file
reconstructed_path – path to the output reconstructed file
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None. Should only be actually used in special cases, since codecs are expected to store all needed metainformation in the compressed file.
- Returns:
(optional) a DecompressionResults instance, or None (see self.decompression_results_from_paths)
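The endianness conversion described for LittleEndianWrapper amounts to reversing the bytes of every sample. A minimal sketch follows, assuming samples are stored contiguously with a fixed width. The function names and the big_endian flag are hypothetical and do not reflect how enb detects or applies the conversion.

```python
def swap_sample_endianness(data, bytes_per_sample):
    """Reverse the byte order of every fixed-width sample in data."""
    if len(data) % bytes_per_sample != 0:
        raise ValueError("data length is not a multiple of the sample width")
    swapped = bytearray(len(data))
    for offset in range(bytes_per_sample):
        # Byte `offset` of each output sample comes from the mirrored
        # position of the corresponding input sample.
        swapped[offset::bytes_per_sample] = data[
            bytes_per_sample - 1 - offset::bytes_per_sample]
    return bytes(swapped)


def to_little_endian(data, bytes_per_sample, big_endian):
    """Return little-endian samples, converting only when flagged big endian."""
    if not big_endian or bytes_per_sample == 1:
        return data
    return swap_sample_endianness(data, bytes_per_sample)
```

Applying swap_sample_endianness twice returns the original bytes, which is why the same operation can translate the reconstructed file back to big endian.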
- class enb.compression.wrapper.QuantizationWrapperCodec(codec: AbstractCodec, qstep: int)
Bases:
NearLosslessCodec
Perform uniform scalar quantization before compressing and after decompressing with a wrapped codec instance. Midpoint reconstruction is used in the dequantization stage.
- __init__(codec: AbstractCodec, qstep: int)
- Parameters:
codec – The codec instance used to compress and decompress the quantized data.
qstep – The quantization interval length
- compress(original_path: str, compressed_path: str, original_file_info=None)
Compress original_path into compressed_path using param_dict as params.
- Parameters:
original_path – path to the original file to be compressed
compressed_path – path to the compressed file to be created
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None.
- Returns:
(optional) a CompressionResults instance, or None (see self.compression_results_from_paths)
- decompress(compressed_path, reconstructed_path, original_file_info=None)
Decompress compressed_path into reconstructed_path using param_dict as params (if needed).
- Parameters:
compressed_path – path to the input compressed file
reconstructed_path – path to the output reconstructed file
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None. Should only be actually used in special cases, since codecs are expected to store all needed metainformation in the compressed file.
- Returns:
(optional) a DecompressionResults instance, or None (see self.decompression_results_from_paths)
- property label
Return the original codec label and the quantization parameter.
- property name
Return the original codec name and the quantization parameter.
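Uniform scalar quantization with midpoint reconstruction, as named in the QuantizationWrapperCodec description, can be sketched on plain integer lists. The exact rounding convention used by enb is not stated here, so the floor-division index and the qstep // 2 midpoint offset below are assumptions.

```python
def quantize(samples, qstep):
    """Map each integer sample to the index of its quantization interval."""
    return [sample // qstep for sample in samples]


def dequantize(indices, qstep):
    """Reconstruct each sample at the (integer) midpoint of its interval."""
    return [index * qstep + qstep // 2 for index in indices]


samples = [0, 1, 5, 10, 255]
indices = quantize(samples, qstep=4)           # [0, 0, 1, 2, 63]
reconstructed = dequantize(indices, qstep=4)   # [2, 2, 6, 10, 254]
```

Under these assumptions the absolute reconstruction error never exceeds qstep // 2, and qstep=1 leaves the data unchanged, which is consistent with a near-lossless codec.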
- class enb.compression.wrapper.ReindexWrapper(codec: AbstractCodec, width_bytes: int)
Bases:
AbstractCodec
Input samples are first reindexed to a contiguous support preserving the ordering (If x and y are two sample values present in the input file, then x < y <=> reindex(x) < reindex(y)).
Reindexed data are stored as unsigned, big-endian samples of the width configured on initialization. After reindexing, the codec passed to the initializer is used for compression.
The user is responsible for using a codec compatible with the type of the reindexed data, and a data type that can hold the number of unique samples present in the input file.
Note that only integer input samples are currently supported.
- __init__(codec: AbstractCodec, width_bytes: int)
- Parameters:
codec – the codec instance used to compress the reindexed data.
width_bytes – width in bytes of the reindexed samples.
- compress(original_path: str, compressed_path: str, original_file_info=None) CompressionResults
Compress original_path into compressed_path using param_dict as params.
- Parameters:
original_path – path to the original file to be compressed
compressed_path – path to the compressed file to be created
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None.
- Returns:
(optional) a CompressionResults instance, or None (see self.compression_results_from_paths)
- decompress(compressed_path: str, reconstructed_path: str, original_file_info=None) DecompressionResults
Decompress compressed_path into reconstructed_path using param_dict as params (if needed).
- Parameters:
compressed_path – path to the input compressed file
reconstructed_path – path to the output reconstructed file
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None. Should only be actually used in special cases, since codecs are expected to store all needed metainformation in the compressed file.
- Returns:
(optional) a DecompressionResults instance, or None (see self.decompression_results_from_paths)
- property label
Return the original codec label and the quantization parameter.
- property name
Return the original codec name and the quantization parameter
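The order-preserving reindexing described for ReindexWrapper can be sketched as a rank mapping over the sorted unique sample values. This is an illustration under assumptions: the function names are hypothetical, and how enb stores the mapping needed for decompression is not covered.

```python
def build_reindex_table(samples):
    """Map each distinct value to its rank among the sorted unique values."""
    return {value: index for index, value in enumerate(sorted(set(samples)))}


def reindex_to_bytes(samples, width_bytes):
    """Encode samples as unsigned big-endian ranks of width_bytes each."""
    table = build_reindex_table(samples)
    if len(table) > 256 ** width_bytes:
        raise ValueError("width_bytes cannot hold all unique sample values")
    reindexed = b"".join(
        table[value].to_bytes(width_bytes, "big") for value in samples)
    return reindexed, table


def restore_samples(reindexed, table, width_bytes):
    """Invert reindex_to_bytes given the table produced alongside it."""
    values = sorted(table)  # values[rank] is the original sample value
    return [
        values[int.from_bytes(reindexed[i:i + width_bytes], "big")]
        for i in range(0, len(reindexed), width_bytes)
    ]
```

Since ranks follow the sorted order, x < y holds exactly when the rank of x is smaller than the rank of y. The width check reflects the user's responsibility to pick a data type large enough for the number of unique samples.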
- class enb.compression.wrapper.WrapperCodec(compressor_path, decompressor_path, param_dict=None, output_invocation_dir=None, signature_in_name=False)
Bases:
AbstractCodec
A codec that uses an external process to compress and decompress.
- __init__(compressor_path, decompressor_path, param_dict=None, output_invocation_dir=None, signature_in_name=False)
- Parameters:
compressor_path – path to the executable to be used for compression
decompressor_path – path to the executable to be used for decompression
param_dict – name-value mapping of the parameters to be used for compression
output_invocation_dir – if not None, invocation strings are stored in this directory with name based on the codec and the sample’s full path.
signature_in_name – if True, the default codec name includes part of the hexdigest of the compressor and decompressor binaries being used
- compress(original_path: str, compressed_path: str, original_file_info=None)
Compress original_path into compressed_path using param_dict as params.
- Parameters:
original_path – path to the original file to be compressed
compressed_path – path to the compressed file to be created
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None.
- Returns:
(optional) a CompressionResults instance, or None (see self.compression_results_from_paths)
- decompress(compressed_path, reconstructed_path, original_file_info=None)
Decompress compressed_path into reconstructed_path using param_dict as params (if needed).
- Parameters:
compressed_path – path to the input compressed file
reconstructed_path – path to the output reconstructed file
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None. Should only be actually used in special cases, since codecs are expected to store all needed metainformation in the compressed file.
- Returns:
(optional) a DecompressionResults instance, or None (see self.decompression_results_from_paths)
- static get_binary_signature(binary_path)
Return a string with a (hopefully) unique signature for the contents of binary_path. By default, the first 5 digits of the sha-256 hexdigest are returned.
- get_compression_params(original_path, compressed_path, original_file_info)
Return a string (shell style) with the parameters to be passed to the compressor.
Same parameter semantics as AbstractCodec.compress().
- Parameters:
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None
- get_decompression_params(compressed_path, reconstructed_path, original_file_info)
Return a string (shell style) with the parameters to be passed to the decompressor. Same parameter semantics as AbstractCodec.decompress().
- Parameters:
original_file_info – a dict-like object describing original_path’s properties (e.g., geometry), or None
- property name
Return the codec’s name and parameters, also including the encoder and decoder hash summaries (so that changes in the reference binaries can be easily detected).
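Two ideas from WrapperCodec can be sketched with the standard library alone: a short signature taken from the SHA-256 hexdigest of a binary, and the conversion of a shell-style parameter string into an argument list for an external process. Both function names are hypothetical, and the way enb actually launches the process is not shown.

```python
import hashlib
import shlex


def binary_signature(binary_path, digits=5):
    """Return the first hex digits of the SHA-256 digest of a file's contents."""
    sha256 = hashlib.sha256()
    with open(binary_path, "rb") as binary_file:
        # Read in chunks so that large binaries are not loaded at once.
        for chunk in iter(lambda: binary_file.read(65536), b""):
            sha256.update(chunk)
    return sha256.hexdigest()[:digits]


def build_invocation(binary_path, params):
    """Split a shell-style parameter string into an argument list."""
    return [binary_path] + shlex.split(params)
```

A list such as the one returned by build_invocation could be passed to subprocess.run. A 5-digit prefix is short, so two different binaries may occasionally share a signature, which matches the "(hopefully) unique" wording above.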
Module contents
enb.compression: data compression in enb.
The compression and icompression modules implement enb.experiment.Experiment classes and other basic tools to facilitate them.
Several other modules are declared for specific compressed data formats.