legenddataflowscripts.utils package

Submodules

legenddataflowscripts.utils.alias_table module

legenddataflowscripts.utils.alias_table.alias_table(file, mapping)

Create HDF5 hard-link aliases for existing tables in an LH5 file.

Given a mapping of {source_path: alias_path} pairs, this function opens file in append mode and creates one HDF5 hard link per entry, so that the data can be accessed under both the original and the alias path. If alias_path is a list or tuple, each element is registered as a separate alias. After each alias is created, its parent groups are annotated with LGDO struct datatype metadata via convert_parents_to_structs().

The function can also accept a JSON-encoded string or a list of mappings (which are applied sequentially).

Parameters:
  • file (str or pathlib.Path) – Path to the LH5 (HDF5) file to modify.

  • mapping (str or dict or list) –

    One of:

    • A JSON string that decodes to a dict or list.

    • A dict mapping source paths to alias path(s).

    • A list of such dicts, applied sequentially (each element is handled by a recursive call).
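The accepted forms of mapping can all be reduced to a flat sequence of (source, alias) pairs before any link is created. The sketch below shows one way to do that normalization with the standard library only. iter_alias_pairs is a hypothetical helper, not part of the package, and the paths in the usage line are made up.

```python
import json


def iter_alias_pairs(mapping):
    """Yield (source_path, alias_path) pairs from any accepted mapping form.

    Hypothetical helper for illustration; not part of legenddataflowscripts.
    """
    if isinstance(mapping, str):
        # A JSON string is decoded first, then handled like a dict or list.
        mapping = json.loads(mapping)
    if isinstance(mapping, list):
        # A list of mappings is processed element by element, in order.
        for entry in mapping:
            yield from iter_alias_pairs(entry)
        return
    for source, aliases in mapping.items():
        # A single alias and a list/tuple of aliases are treated alike.
        if not isinstance(aliases, (list, tuple)):
            aliases = [aliases]
        for alias in aliases:
            yield source, alias


# Made-up paths: one source table registered under two aliases.
pairs = list(iter_alias_pairs('{"ch1/raw": ["det1/raw", "V001/raw"]}'))
```

With h5py, a hard link for each pair is typically created by assigning an existing object to a new name in a file opened in append mode, for example f[alias] = f[source].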

legenddataflowscripts.utils.alias_table.convert_parents_to_structs(h5group)

Recursively annotate HDF5 parent groups with LGDO struct datatype attributes.

When a new alias (hard link) is created inside an HDF5 file the parent groups may not carry the datatype attribute expected by the LGDO reader. This function walks up the HDF5 group hierarchy from h5group and ensures every ancestor group carries a datatype attribute of the form struct{child1,child2,…}.

Parameters:

h5group (h5py.Group) – Leaf group whose parent hierarchy should be annotated.
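The upward walk can be illustrated without an HDF5 file. The following is a minimal sketch that uses a small in-memory stand-in for groups. The Group class and annotate_parents are hypothetical, and two details are assumptions rather than documented behaviour: child names are listed in sorted order, and the root group is left untouched.

```python
class Group:
    """Tiny in-memory stand-in for an HDF5 group (illustration only)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        self.attrs = {}
        if parent is not None:
            parent.children[name] = self


def annotate_parents(group):
    """Give every non-root ancestor a struct{...} datatype listing its children."""
    parent = group.parent
    # Assumption: the walk stops before the root group.
    while parent is not None and parent.parent is not None:
        # Assumption: child names are listed in sorted order.
        fields = ",".join(sorted(parent.children))
        parent.attrs["datatype"] = "struct{" + fields + "}"
        parent = parent.parent


# Made-up hierarchy: /det1/raw/{energy,timestamp}
root = Group("/")
det = Group("det1", root)
raw = Group("raw", det)
energy = Group("energy", raw)
Group("timestamp", raw)
annotate_parents(energy)
```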

legenddataflowscripts.utils.cfgtools module

legenddataflowscripts.utils.cfgtools.get_channel_config(mapping, channel, default_key='__default__')

Return the configuration entry for channel with fallback to a default.

Looks up channel in mapping. If no entry is found, the value stored under default_key is returned instead. This mirrors the convention used throughout the dataflow configuration, where __default__ is reserved as a catch-all for channels that do not have an explicit entry.

Parameters:
  • mapping (collections.abc.Mapping) – A mapping from channel identifier to configuration value (e.g. a dict or dbetto.AttrsDict).

  • channel (str) – The channel identifier to look up.

  • default_key (str) – Fallback key used when channel is not present in mapping. Defaults to "__default__".

Returns:

object – Value associated with channel if present, otherwise the value associated with default_key.

Raises:

KeyError – If neither channel nor default_key is present in mapping.
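The fallback behaviour described above amounts to a two-step lookup. A minimal sketch follows. lookup_with_default is a hypothetical stand-in written for illustration, and the channel names and thresholds are made up.

```python
def lookup_with_default(mapping, channel, default_key="__default__"):
    """Return mapping[channel], falling back to mapping[default_key]."""
    if channel in mapping:
        return mapping[channel]
    # Raises KeyError when the default entry is missing as well.
    return mapping[default_key]


# Made-up configuration with one explicit channel and a catch-all entry.
config = {
    "__default__": {"threshold": 10},
    "ch001": {"threshold": 25},
}
explicit = lookup_with_default(config, "ch001")
fallback = lookup_with_default(config, "ch999")
```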

legenddataflowscripts.utils.convert_np module

legenddataflowscripts.utils.convert_np.convert_dict_np_to_float(dic)

Convert numpy scalars in a dictionary to native Python types.

Recursively converts all numpy scalar values (integers, floats, booleans) to their Python equivalents to ensure JSON/YAML serializability.

Parameters:

dic (dict) – The dictionary to convert.

Returns:

dict – The dictionary with all numpy scalars converted to Python types.

Return type:

dict

legenddataflowscripts.utils.log module

class legenddataflowscripts.utils.log.StreamToLogger(logger, log_level=40)

Bases: object

File-like stream object that redirects writes to a logger instance.

Wraps a logging.Logger so that it can be used wherever a writable file-like object is expected (e.g. as a replacement for sys.stderr). Each call to write() splits the incoming buffer on newlines and forwards each resulting line (including empty ones) to the underlying logger at the configured level.

Parameters:
  • logger (logging.Logger) – The logger instance to write to.

  • log_level (int) – Logging level used for every line written, e.g. logging.ERROR. Defaults to logging.ERROR.

Examples

Redirect stderr to a logger:

import logging, sys
from legenddataflowscripts.utils import StreamToLogger

log = logging.getLogger("myapp")
sys.stderr = StreamToLogger(log, logging.WARNING)

flush()

No-op flush required by the file-like interface.

write(buf)

Write buf to the logger, one log record per line.

Parameters:

buf (str) – Text to forward to the logger. Trailing whitespace is stripped from each line before logging.
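The line-splitting behaviour of write() can be sketched with the standard logging module alone. LineLogger and ListHandler below are hypothetical classes written for illustration. They follow the description above but are not the package's implementation.

```python
import logging


class LineLogger:
    """Minimal file-like object that forwards each written line to a logger."""

    def __init__(self, logger, log_level=logging.ERROR):
        self.logger = logger
        self.log_level = log_level

    def write(self, buf):
        # One log record per line; trailing whitespace is dropped.
        for line in buf.splitlines():
            self.logger.log(self.log_level, line.rstrip())

    def flush(self):
        # Nothing is buffered, so there is nothing to flush.
        pass


class ListHandler(logging.Handler):
    """Collects (level, message) tuples so the result can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append((record.levelno, record.getMessage()))


demo_logger = logging.getLogger("stream_to_logger_demo")
demo_logger.setLevel(logging.DEBUG)
demo_logger.propagate = False  # keep the demo output out of the root logger
handler = ListHandler()
demo_logger.addHandler(handler)

stream = LineLogger(demo_logger, logging.WARNING)
stream.write("first line   \n\nsecond line\n")
```

After the call, handler.messages holds three records at WARNING level: "first line", an empty message for the blank line, and "second line".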

legenddataflowscripts.utils.log.build_log(config_dict, log_file=None, fallback='prod')

Build and configure a logger from a configuration dictionary.

Accepts three forms for config_dict:

  • A string path to a logging properties file (JSON/YAML).

  • A plain logging dict (keys handlers, formatters, …) as consumed by logging.config.dictConfig().

  • A dataflow config dict that already contains a logging entry nested under its options key.

After the logger is created, sys.stderr is redirected to it at logging.ERROR level, and sys.excepthook is overridden so that unhandled exceptions are written to the same file handler.

Parameters:
  • config_dict (dict or str) – Logging configuration. See above for accepted forms.

  • log_file (str, optional) – Path to the log file. When provided, its parent directory is created automatically and the path is injected into the dataflow handler of the logging config.

  • fallback (str) – Name of the logger to return when config_dict does not contain a logging config. Defaults to "prod".

Returns:

logging.Logger – Configured logger instance.

Return type:

Logger
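The log_file handling can be sketched with logging.config.dictConfig(). The example below is an illustration under stated assumptions, not the package's code: configure_with_log_file and EXAMPLE_CONFIG are hypothetical, the handler is assumed to be a logging.FileHandler named dataflow, and the path is assumed to be stored under the handler's filename key. The redirection of sys.stderr and sys.excepthook is deliberately left out.

```python
import copy
import logging
import logging.config
import tempfile
from pathlib import Path

# Hypothetical plain logging dict with a file handler named "dataflow".
EXAMPLE_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {"simple": {"format": "%(levelname)s %(message)s"}},
    "handlers": {
        "dataflow": {
            "class": "logging.FileHandler",
            "formatter": "simple",
            "level": "INFO",
        }
    },
    "loggers": {"prod": {"handlers": ["dataflow"], "level": "INFO"}},
}


def configure_with_log_file(log_config, log_file, logger_name="prod"):
    """Create the log directory, inject the path, and apply the config."""
    cfg = copy.deepcopy(log_config)  # leave the caller's dict untouched
    path = Path(log_file)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Assumption: the file path lives under the handler's "filename" key.
    cfg["handlers"]["dataflow"]["filename"] = str(path)
    logging.config.dictConfig(cfg)
    return logging.getLogger(logger_name)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "logs" / "run.log"
    log = configure_with_log_file(EXAMPLE_CONFIG, target)
    log.info("hello")
    logged_text = target.read_text()
    # Detach and close the handler before the directory is removed.
    for file_handler in list(log.handlers):
        log.removeHandler(file_handler)
        file_handler.close()
```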

legenddataflowscripts.utils.plot_dict module

legenddataflowscripts.utils.plot_dict.fill_plot_dict(plot_class, data, plot_options, plot_dict=None)

Populate a dictionary with figures produced by calibration plot functions.

Iterates over plot_options and, for each entry, calls the specified function with plot_class and data as positional arguments followed by any keyword arguments defined in item["options"]. Results are stored in plot_dict under the corresponding key.

Parameters:
  • plot_class (object) – Calibration class instance passed as the first argument to each plot function (e.g. a CalAoE or LQCal instance).

  • data (pandas.DataFrame) – Event-level data passed as the second argument to each plot function.

  • plot_options (dict or None) – Mapping of {label: {"function": callable, "options": dict | None}}. If None or empty no figures are generated.

  • plot_dict (dict, optional) – Existing dictionary to append results to. A new empty dict is created when not provided.

Returns:

dict – Updated plot_dict with one entry per key in plot_options.
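The iteration described above can be sketched in a few lines. fill_plots and describe are hypothetical stand-ins written for illustration: describe returns a string where a real plot function would return a figure, and the string "cal" and the list of numbers stand in for the calibration object and the event data.

```python
def fill_plots(plot_class, data, plot_options, plot_dict=None):
    """Call each configured function and store its result under its label."""
    plot_dict = {} if plot_dict is None else plot_dict
    for key, item in (plot_options or {}).items():
        # "options" may be missing or None, meaning no keyword arguments.
        kwargs = item.get("options") or {}
        plot_dict[key] = item["function"](plot_class, data, **kwargs)
    return plot_dict


def describe(cal, data, title="untitled"):
    # Stand-in for a plot function; returns a string instead of a figure.
    return f"{title}: {cal} on {len(data)} events"


options = {
    "summary": {"function": describe, "options": {"title": "A/E"}},
    "plain": {"function": describe, "options": None},
}
plots = fill_plots("cal", [1, 2, 3], options)
```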

legenddataflowscripts.utils.pulser_removal module

legenddataflowscripts.utils.pulser_removal.get_pulser_mask(pulser_file)

Load and concatenate pulser event masks from one or more files.

Each file is expected to be a JSON or YAML file with a top-level mask key containing a boolean array. When multiple files are provided the individual masks are concatenated in order.

Parameters:

pulser_file (str or list of str) – Path or list of paths to pulser mask files.

Returns:

numpy.ndarray – Boolean array of shape (N,) where True marks pulser events.
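The load-and-concatenate step can be sketched with the standard library for the JSON case. load_masks is a hypothetical helper and the file names are made up. Unlike the documented function, this sketch returns a plain list of booleans rather than a numpy.ndarray and does not handle YAML files.

```python
import json
import tempfile
from pathlib import Path


def load_masks(pulser_file):
    """Concatenate the "mask" arrays of one or more JSON files into one list."""
    if isinstance(pulser_file, (str, Path)):
        files = [pulser_file]
    else:
        files = list(pulser_file)
    mask = []
    for name in files:
        with open(name) as handle:
            # Masks are appended in the order the files are given.
            mask.extend(bool(flag) for flag in json.load(handle)["mask"])
    return mask


with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "run1.json"
    second = Path(tmp) / "run2.json"
    first.write_text(json.dumps({"mask": [True, False]}))
    second.write_text(json.dumps({"mask": [False, False, True]}))
    single = load_masks(str(first))
    combined = load_masks([str(first), str(second)])
```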