legenddataflowscripts.par.geds.hit package

Submodules

legenddataflowscripts.par.geds.hit.aoe module

legenddataflowscripts.par.geds.hit.aoe.get_results_dict(aoe_class)

Extract serialisable results from a calibrated CalAoE object.

Parameters:

aoe_class (pygama.pargen.AoE_cal.CalAoE) – Calibrated A/E object after calling calibrate().

Returns:

dict – Mapping of run timestamp → results dictionary. Each entry contains the calibration energy parameter, drift-time parameter, correction fit results, low-side and two-sided survival fractions (both overall and per-run), and the A/E cut values.

legenddataflowscripts.par.geds.hit.aoe.par_geds_hit_aoe()

Calibrate the A/E (current-amplitude over energy) pulse-shape discriminant.

CLI entry point registered as par-geds-hit-aoe. Loads DSP data for a single detector channel, applies energy and pulser masks, and runs run_aoe_calibration() to derive the A/E mean, width, and cut values as a function of energy and (optionally) drift time.

Results are written to the --hit-pars file (JSON/YAML) and the calibration object is serialised to the --aoe-results file (pickle).

Notes

Command-line arguments

files (list of str) – One or more file lists (.filelist) containing DSP LH5 paths.
--pulser-file (str, optional) – Path to the pulser mask file.
--tcm-filelist (str, optional) – Unused placeholder.
--ecal-file (str) – Energy calibration output file (JSON/YAML with pars and results keys).
--eres-file (str) – Energy calibration pickle file containing calibration objects.
--inplots (str, optional) – Existing pickle plot file to merge with A/E plots.
--timestamp (str) – Run timestamp label (default "20000101T000000Z").
--log (str, optional) – Path to the log file.
--log-config (str, optional) – Logging configuration file.
--config-file (list of str) – A/E calibration configuration file(s).
--table-name (str) – LH5 table path within the DSP files.
--detector (str, optional) – Detector name used to look up override parameters.
--override-files (list of str, optional) – JSON/YAML file(s) with detector-specific override parameters.
--plot-file (str, optional) – Output path for diagnostic plots (pickle).
--hit-pars (str) – Output path for the A/E hit parameters (JSON/YAML).
--aoe-results (str) – Output path for the serialised A/E calibration object (pickle).
-d / --debug – Enable debug mode for additional diagnostic output.
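
A minimal sketch of how an invocation could be assembled from the arguments listed above. Only the option names come from this page; every file path and the table name are hypothetical placeholders, and which options are mandatory is an assumption.

```python
import shlex

# Hypothetical paths and table name; only the option names come from the list above.
cmd = [
    "par-geds-hit-aoe",
    "cal_dsp.filelist",                  # positional `files` argument
    "--ecal-file", "ecal_pars.yaml",
    "--eres-file", "ecal_objects.pkl",
    "--config-file", "aoe_config.yaml",
    "--table-name", "ch1057600/dsp",
    "--timestamp", "20000101T000000Z",
    "--hit-pars", "aoe_pars.yaml",
    "--aoe-results", "aoe_objects.pkl",
]

# Print a copy-pasteable shell command; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Building the command as a list keeps each option and its value together and avoids manual shell quoting.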

legenddataflowscripts.par.geds.hit.aoe.run_aoe_calibration(data, cal_dicts, results_dicts, object_dicts, plot_dicts, config, debug_mode=False, override_dict=None)

Run the A/E calibration and update all output dictionaries.

Instantiates CalAoE and calls calibrate() on data. All input mapping arguments are keyed by run timestamp and the corresponding output mappings are returned with the A/E results merged in.

Parameters:

data (pandas.DataFrame) – Event-level data containing the energy, current-amplitude, and drift-time parameters required by the A/E calibration.
cal_dicts (dict) – {timestamp: operations_dict} mapping of existing hit-level calibration operations. Updated in-place with the A/E cut expressions.
results_dicts (dict) – {timestamp: results_dict} mapping of calibration results from preceding steps (energy calibration, partition calibration).
object_dicts (dict) – {timestamp: object_dict} mapping of pickled calibration objects from preceding steps.
plot_dicts (dict) – {timestamp: plot_dict} mapping of existing plot dictionaries.
config (dict or str or list) – A/E calibration configuration. Must contain run_aoe (bool), current_param, energy_param, cal_energy_param, cut_field, and threshold.
debug_mode (bool) – Activates additional diagnostic output in CalAoE. Defaults to False.
override_dict (dict, optional) – Per-detector override parameters passed to calibrate().

Returns:

cal_dicts (dict) – Updated calibration operations mappings.
out_result_dicts (dict) – Updated results mappings including A/E results.
out_object_dicts (dict) – Updated object mappings including the CalAoE instance.
out_plot_dicts (dict) – Updated plot mappings including A/E diagnostic figures.
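
The timestamp-keyed bookkeeping described above can be pictured with the following sketch. It only illustrates the shape of the mappings: the helper `merge_results`, the `"aoe"` key, and the example values are hypothetical and are not taken from the package.

```python
def merge_results(results_dicts, new_results, key="aoe"):
    """Return per-timestamp copies of `results_dicts` with the new results added under `key`."""
    return {
        tstamp: {**previous, key: new_results[tstamp]}
        for tstamp, previous in results_dicts.items()
    }


# Results from a preceding step, keyed by run timestamp (values are made up).
results_dicts = {"20000101T000000Z": {"ecal": {"n_peaks": 8}}}
# Results of the new step, keyed the same way (values are made up).
aoe_results = {"20000101T000000Z": {"cut_value": -1.5}}

out_result_dicts = merge_results(results_dicts, aoe_results)
```

Returning new dictionaries leaves the input mapping untouched, which mirrors the separate `out_*` names in the return list.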

legenddataflowscripts.par.geds.hit.ecal module

legenddataflowscripts.par.geds.hit.ecal.baseline_tracking_plots(files, lh5_path, plot_options=None)

Generate baseline-tracking plots from a set of LH5 files.

Reads bl_mean, baseline, and timestamp from the LH5 files and calls each plot function specified in plot_options.

Parameters:

files (list of str) – Paths to the LH5 files to read.
lh5_path (str) – LH5 table path within the files.
plot_options (dict, optional) – Mapping of {label: {"function": callable, "options": dict | None}}. When None no plots are produced.

Returns:

dict – Plot dictionary with one entry per key in plot_options.
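
The plot_options mapping can be read as a small dispatch table. The sketch below shows one way such a mapping might be consumed; `fill_plots` and `count_rows` are hypothetical stand-ins, and the real function reads its data from LH5 files rather than taking it as an argument.

```python
def fill_plots(data, plot_options=None):
    """Call each configured plot function on `data` and collect the results by label."""
    plot_dict = {}
    if plot_options is None:
        return plot_dict  # no plots requested
    for label, item in plot_options.items():
        options = item.get("options") or {}  # treat None as "no extra keyword arguments"
        plot_dict[label] = item["function"](data, **options)
    return plot_dict


def count_rows(data, scale=1):
    """Stand-in for a plot function; returns a number instead of a figure."""
    return len(data) * scale


plots = fill_plots(
    [1, 2, 3],
    {
        "n": {"function": count_rows, "options": None},
        "2n": {"function": count_rows, "options": {"scale": 2}},
    },
)
```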

legenddataflowscripts.par.geds.hit.ecal.bin_baseline(data, parameter='bl_mean-baseline', dx=1, bl_range=None)

Bin an evaluated baseline-residual parameter into a histogram.

Parameters:

data (pandas.DataFrame) – Event-level data. The expression parameter is evaluated with pandas.DataFrame.eval().
parameter (str) – A pandas-eval expression for the baseline residual. Defaults to "bl_mean-baseline".
dx (float) – Bin width in ADC units. Defaults to 1.
bl_range (list of float, optional) – [low, high] range for the histogram. Defaults to [-500, 500].

Returns:

dict – Dictionary with keys "bl_array" (histogram counts) and "bins" (bin centres).
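
A minimal sketch of the binning described above, assuming the expression is evaluated with pandas.DataFrame.eval() and histogrammed with NumPy. The function name `bin_residual` is hypothetical and the edge handling may differ from the packaged implementation.

```python
import numpy as np
import pandas as pd


def bin_residual(data, parameter="bl_mean-baseline", dx=1, bl_range=None):
    """Histogram a pandas-eval expression in bins of width `dx`."""
    if bl_range is None:
        bl_range = [-500, 500]
    values = data.eval(parameter)  # e.g. the column difference bl_mean - baseline
    edges = np.arange(bl_range[0], bl_range[1] + dx, dx)
    counts, edges = np.histogram(values, bins=edges)
    centres = 0.5 * (edges[:-1] + edges[1:])
    return {"bl_array": counts, "bins": centres}


df = pd.DataFrame({"bl_mean": [100.2, 101.7, 99.1], "baseline": [100, 100, 100]})
hist = bin_residual(df, dx=1, bl_range=[-2, 2])
```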

legenddataflowscripts.par.geds.hit.ecal.bin_bl_stability(data, time_slice=180, parameter='bl_mean')

Compute median baseline value and spread binned in time.

Parameters:

data (pandas.DataFrame) – Full event-level data with a timestamp column.
time_slice (float) – Width of each time bin in seconds. Defaults to 180.
parameter (str) – Name of the baseline parameter. Defaults to "bl_mean".

Returns:

dict – Dictionary with keys "time" (bin centres), "baseline" (median baseline per bin), and "spread" (variance / √N per bin).
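
The time binning can be sketched with a pandas group-by. This is an illustration under assumptions, not the packaged code: `time_binned_stats` is a hypothetical name, the spread follows the "variance / √N" definition stated above using the population variance, and time bins without events are simply omitted.

```python
import numpy as np
import pandas as pd


def time_binned_stats(data, parameter="bl_mean", time_slice=180):
    """Median and variance/sqrt(N) of `parameter` in consecutive time bins."""
    t0 = data["timestamp"].min()
    # Integer index of the time bin each event falls into.
    bin_index = ((data["timestamp"] - t0) // time_slice).astype(int)
    grouped = data.groupby(bin_index)[parameter]
    median = grouped.median()
    spread = grouped.var(ddof=0) / np.sqrt(grouped.count())
    centres = t0 + (median.index.to_numpy() + 0.5) * time_slice
    return {"time": centres, "baseline": median.to_numpy(), "spread": spread.to_numpy()}


df = pd.DataFrame(
    {"timestamp": [0, 60, 120, 200, 300], "bl_mean": [10.0, 12.0, 14.0, 20.0, 22.0]}
)
stats = time_binned_stats(df, time_slice=180)
```

The same pattern applies to the pulser and 2614 keV stability functions, with the energy column in place of the baseline parameter.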

legenddataflowscripts.par.geds.hit.ecal.bin_pulser_stability(data, cal_energy_param, selection_string, pulser_field='is_pulser', time_slice=180)

Compute median pulser energy and its spread binned in time.

Parameters:

data (pandas.DataFrame) – Full event-level data including the pulser flag.
cal_energy_param (str) – Name of the calibrated energy column.
selection_string (str) – Unused; kept for a consistent plot-function signature.
pulser_field (str) – Boolean column identifying pulser events. Defaults to "is_pulser".
time_slice (float) – Width of each time bin in seconds. Defaults to 180.

Returns:

dict – Dictionary with keys "time" (bin centres), "energy" (median calibrated energy per bin), and "spread" (variance / √N per bin).

legenddataflowscripts.par.geds.hit.ecal.bin_spectrum(data, cal_energy_param, selection_string, cut_field='is_valid_cal', pulser_field='is_pulser', erange=(0, 3000), dx=0.5)

Bin events into a calibrated energy spectrum with pass, cut, and pulser counts.

Parameters:

data (pandas.DataFrame) – Event-level data.
cal_energy_param (str) – Name of the calibrated energy column.
selection_string (str) – Pandas query string identifying passing physics events.
cut_field (str) – Boolean column for the quality-cut acceptance flag. Defaults to "is_valid_cal".
pulser_field (str) – Boolean column for the pulser flag. Defaults to "is_pulser".
erange (tuple of float) – (low, high) energy range in keV. Defaults to (0, 3000).
dx (float) – Bin width in keV. Defaults to 0.5.

Returns:

dict – Dictionary with keys "bins" (bin centres), "counts" (passing-event histogram), "cut_counts" (events rejected by the cut), and "pulser_counts" (pulser-event histogram).

legenddataflowscripts.par.geds.hit.ecal.bin_stability(data, cal_energy_param, selection_string, time_slice=180, energy_range=(2585, 2660))

Compute median energy and spread near the 2614 keV line binned in time.

Parameters:

data (pandas.DataFrame) – Event-level data with a timestamp column.
cal_energy_param (str) – Name of the calibrated energy column.
selection_string (str) – Pandas query string applied to select physics events.
time_slice (float) – Width of each time bin in seconds. Defaults to 180.
energy_range (tuple of float) – (low, high) keV range used to select events near the 2614 keV peak. Defaults to (2585, 2660).

Returns:

dict – Dictionary with keys "time" (bin centres), "energy" (median calibrated energy per bin), and "spread" (variance / √N per bin).

legenddataflowscripts.par.geds.hit.ecal.bin_survival_fraction(data, cal_energy_param, selection_string, cut_field='is_valid_cal', pulser_field='is_pulser', erange=(0, 3000), dx=6)

Compute the cut survival fraction as a function of calibrated energy.

Parameters:

data (pandas.DataFrame) – Event-level data.
cal_energy_param (str) – Name of the calibrated energy column.
selection_string (str) – Pandas query string identifying passing physics events.
cut_field (str) – Boolean column for the quality-cut acceptance flag. Defaults to "is_valid_cal".
pulser_field (str) – Boolean column for the pulser flag. Defaults to "is_pulser".
erange (tuple of float) – (low, high) energy range in keV. Defaults to (0, 3000).
dx (float) – Bin width in keV. Defaults to 6.

Returns:

dict – Dictionary with keys "bins" (bin centres in keV) and "sf" (survival fraction in percent, regularised to avoid division by zero).
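
A sketch of a per-bin survival fraction, to make the regularisation concrete. Several details are assumptions rather than the packaged behaviour: the name `survival_fraction`, the definition of rejected events as non-pulser events failing cut_field, and the small constant `eps` added to numerator and denominator.

```python
import numpy as np
import pandas as pd


def survival_fraction(data, cal_energy_param, selection_string,
                      cut_field="is_valid_cal", pulser_field="is_pulser",
                      erange=(0, 3000), dx=6, eps=1e-10):
    """Percentage of events surviving the cut in each energy bin."""
    edges = np.arange(erange[0], erange[1] + dx, dx)
    passed = data.query(selection_string)[cal_energy_param]
    rejected = data[~data[cut_field] & ~data[pulser_field]][cal_energy_param]
    n_pass, _ = np.histogram(passed, bins=edges)
    n_cut, _ = np.histogram(rejected, bins=edges)
    # eps keeps empty bins finite (they evaluate to 100 % in this sketch).
    sf = 100 * (n_pass + eps) / (n_pass + n_cut + eps)
    centres = 0.5 * (edges[:-1] + edges[1:])
    return {"bins": centres, "sf": sf}


df = pd.DataFrame({
    "energy": [1.0, 2.0, 3.0, 4.0, 7.0, 8.0],
    "is_valid_cal": [True, True, True, False, True, False],
    "is_pulser": [False] * 6,
})
sf = survival_fraction(df, "energy", "is_valid_cal & ~is_pulser", erange=(0, 12), dx=6)
```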

legenddataflowscripts.par.geds.hit.ecal.get_err(x)

legenddataflowscripts.par.geds.hit.ecal.get_median(x)

legenddataflowscripts.par.geds.hit.ecal.get_results_dict(ecal_class, data, cal_energy_param, selection_string)

Extract serialisable calibration results from an HPGeCalibration object.

Parameters:

ecal_class (pygama.pargen.energy_cal.HPGeCalibration) – Fitted calibration object.
data (pandas.DataFrame) – Event-level data containing cal_energy_param.
cal_energy_param (str) – Name of the calibrated energy column used to count events near the FEP (2614 keV), SEP (2103 keV), and DEP (1592 keV).
selection_string (str) – Pandas query string identifying physics events.

Returns:

dict – Results dictionary containing total and passing event counts at FEP, SEP, and DEP; linear and quadratic FWHM curve parameters; fitted peak parameters; and the list of calibrated peak energies. Returns {} when the calibration failed (NaN parameters).

legenddataflowscripts.par.geds.hit.ecal.monitor_parameters(files, lh5_path, parameters)

Compute the mode and standard deviation of selected parameters from LH5 files.

Parameters:

files (list of str) – Paths to the LH5 files to read.
lh5_path (str) – LH5 table path within the files.
parameters (list of str) – Column names to compute statistics for.

Returns:

dict – Mapping of {parameter_name: {"mode": float, "stdev": float}}.

legenddataflowscripts.par.geds.hit.ecal.par_geds_hit_ecal()

Perform HPGe energy calibration and write hit-level calibration parameters.

CLI entry point registered as par-geds-hit-ecal. Loads DSP-level data for a single detector channel, applies charge-trapping corrections from the DSP parameter database, and runs HPGeCalibration to find and fit gamma peaks, compute FWHM curves (linear and quadratic), and generate per-energy-parameter calibration expressions.

The following gamma lines are used for calibration:

238.6 keV (²¹²Pb), 583.2 keV, 727.3 keV, 860.6 keV, 1592.5 keV, 1620.5 keV, 2103.5 keV, 2614.5 keV (²⁰⁸Tl)

Notes

Command-line arguments

--files (list of str) – DSP LH5 file(s) or file list(s).
--tcm-filelist (str, optional) – Unused placeholder.
--pulser-file (str, optional) – Path to the pulser mask file.
--ctc-dict (list of str) – JSON/YAML file(s) with charge-trapping correction parameters keyed by channel.
--in-hit-dict (str, optional) – Existing hit parameter file to extend (JSON/YAML).
--inplot-dict (str, optional) – Existing pickle plot file to extend.
--log (str, optional) – Path to the log file.
--log-config (str, optional) – Logging configuration file.
--config-file (list of str) – Energy calibration configuration file(s).
--det-status (str) – Detector status ("on" or other); affects peak-finding tolerances. Defaults to "on".
--table-name (str) – LH5 table path within the DSP files.
--channel (str) – Channel identifier used to look up CTC parameters.
--plot-path (str, optional) – Output path for calibration diagnostic plots (pickle).
--save-path (str) – Output path for the calibration hit parameters (JSON/YAML).
--results-path (str) – Output path for the serialised calibration objects (pickle).
-d / --debug – Enable debug mode for additional diagnostic output.

legenddataflowscripts.par.geds.hit.ecal.plot_2614_timemap(data, cal_energy_param, selection_string, figsize=(8, 6), fontsize=12, erange=(2580, 2630), dx=1, time_dx=180)

Plot a 2D time-vs-energy histogram centred on the 2614 keV line.

Parameters:

data (pandas.DataFrame) – Event-level data with columns cal_energy_param and timestamp.
cal_energy_param (str) – Name of the calibrated energy column.
selection_string (str) – Pandas query string applied to select physics events.
figsize (tuple of float) – Figure size in inches. Defaults to (8, 6).
fontsize (int) – Base font size. Defaults to 12.
erange (tuple of float) – Energy range (low, high) in keV. Defaults to (2580, 2630).
dx (float) – Energy bin width in keV. Defaults to 1.
time_dx (float) – Time bin width in seconds. Defaults to 180.

Returns:

matplotlib.figure.Figure – The generated figure.
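
The counts behind such a time-vs-energy map can be sketched with numpy.histogram2d. The helper `timemap_counts` is hypothetical, omits the selection and the matplotlib drawing, and assumes the timestamps span more than a single instant.

```python
import numpy as np
import pandas as pd


def timemap_counts(data, cal_energy_param, erange=(2580, 2630), dx=1, time_dx=180):
    """2D histogram of events in (time, energy) bins."""
    t = data["timestamp"].to_numpy()
    e = data[cal_energy_param].to_numpy()
    t_edges = np.arange(t.min(), t.max() + time_dx, time_dx)
    e_edges = np.arange(erange[0], erange[1] + dx, dx)
    counts, _, _ = np.histogram2d(t, e, bins=[t_edges, e_edges])
    return counts, t_edges, e_edges


df = pd.DataFrame({"timestamp": [0, 10, 200], "energy": [2614.2, 2614.8, 2590.5]})
counts, t_edges, e_edges = timemap_counts(df, "energy")
```

The resulting array has one row per time bin and one column per energy bin, so it could be passed to a plotting call such as matplotlib's pcolormesh after transposing.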

legenddataflowscripts.par.geds.hit.ecal.plot_baseline_timemap(data, figsize=(8, 6), fontsize=12, parameter='bl_mean', dx=1, n_spread=5, time_dx=180)

Plot a 2D time-vs-baseline histogram for monitoring detector baseline stability.

Parameters:

data (pandas.DataFrame) – Event-level data with columns parameter and timestamp.
figsize (tuple of float) – Figure size in inches. Defaults to (8, 6).
fontsize (int) – Base font size. Defaults to 12.
parameter (str) – Name of the baseline parameter to plot. Defaults to "bl_mean".
dx (float) – Baseline bin width (in ADC units). Defaults to 1.
n_spread (int) – Half-width of the baseline axis in multiples of the 10th-50th percentile spread. Defaults to 5.
time_dx (float) – Time bin width in seconds. Defaults to 180.

Returns:

matplotlib.figure.Figure – The generated figure.

legenddataflowscripts.par.geds.hit.ecal.plot_pulser_timemap(data, cal_energy_param, selection_string, pulser_field='is_pulser', figsize=(8, 6), fontsize=12, dx=0.2, time_dx=180, n_spread=3)

Plot a 2D time-vs-energy histogram for pulser events.

Parameters:

data (pandas.DataFrame) – Event-level data including the pulser flag column.
cal_energy_param (str) – Name of the calibrated energy column.
selection_string (str) – Unused; kept for a consistent plot-function signature.
pulser_field (str) – Boolean column name identifying pulser events. Defaults to "is_pulser".
figsize (tuple of float) – Figure size in inches.
fontsize (int) – Base font size.
dx (float) – Energy bin width in keV.
time_dx (float) – Time bin width in seconds.
n_spread (int) – Half-width of the energy axis in multiples of the 10th-50th percentile spread.

Returns:

matplotlib.figure.Figure – The generated figure.

legenddataflowscripts.par.geds.hit.lq module

legenddataflowscripts.par.geds.hit.lq.get_results_dict(lq_class)

Extract serialisable results from a calibrated LQCal object.

Parameters:

lq_class (pygama.pargen.lq_cal.LQCal) – Calibrated LQ object after calling calibrate().

Returns:

dict – Results dictionary containing the calibration energy parameter, DEP mean values per run, drift-time correction fit parameters, cut fit parameters, cut value, and survival fractions.

legenddataflowscripts.par.geds.hit.lq.lq_calibration(data, cal_dicts, energy_param, cal_energy_param, dt_param, eres_func, cdf=<pygama.math.functions.gauss.GaussianGen object>, selection_string='', plot_options=None, debug_mode=False)

Calibrate the LQ (late-charge) pulse-shape discriminant.

Constructs a LQCal instance, computes the energy-normalised LQ observable LQ_Ecorr = lq80 / energy_param, and calls calibrate(). The resulting cut expressions are appended to cal_dicts.

Parameters:

data (pandas.DataFrame) – Event-level data containing lq80, energy_param, cal_energy_param, dt_param, and any selection columns.
cal_dicts (dict) – Mapping of run timestamps → hit-level calibration operations. Updated in-place with the LQ cut expression.
energy_param (str) – Raw (uncalibrated) energy parameter name used for LQ normalisation.
cal_energy_param (str) – Calibrated energy parameter name.
dt_param (str) – Drift-time parameter name used for the energy-dependent LQ correction.
eres_func (callable) – Energy resolution function sigma(E) used to set cut windows.
cdf (callable) – Cumulative distribution function used for binned LQ fitting. Defaults to pygama.math.distributions.gaussian, the GaussianGen object shown in the signature.
selection_string (str) – Pandas query string applied before calibration. Defaults to "".
plot_options (dict, optional) – Mapping of {label: {"function": callable, "options": dict | None}} passed to fill_plot_dict().
debug_mode (bool) – Activates additional diagnostic output. Defaults to False.

Returns:

cal_dicts (dict) – Updated calibration operations mapping.
results_dict (dict) – LQ calibration results from get_results_dict().
plot_dict (dict) – Diagnostic figures (empty when plot_options is None).
lq (pygama.pargen.lq_cal.LQCal) – Calibrated LQ object.
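
The energy normalisation stated above, LQ_Ecorr = lq80 / energy_param, amounts to a column division. In this sketch the column name "cuspEmax" and the numbers are hypothetical; only lq80 and LQ_Ecorr are named on this page.

```python
import pandas as pd

energy_param = "cuspEmax"  # hypothetical name of the raw energy column

df = pd.DataFrame({"lq80": [10.0, 30.0], energy_param: [1000.0, 1500.0]})

# Energy-normalised LQ observable, as defined in the description above.
df["LQ_Ecorr"] = df["lq80"] / df[energy_param]
```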

legenddataflowscripts.par.geds.hit.lq.par_geds_hit_lq()

Calibrate the LQ pulse-shape discriminant and write hit-level parameters.

CLI entry point registered as par-geds-hit-lq. Loads DSP-level data for a single detector channel, applies energy threshold and pulser masks, and runs run_lq_calibration() to derive the energy-normalised LQ observable, its drift-time correction, and the DEP-based cut value.

Results are written to the --hit-pars file (JSON/YAML) and the calibration objects are serialised to the --lq-results file (pickle).

Notes

Command-line arguments

files (list of str) – One or more file lists (.filelist) containing DSP LH5 paths.
--pulser-file (str, optional) – Path to the pulser mask file.
--tcm-filelist (str, optional) – Unused placeholder.
--ecal-file (str) – Energy calibration output file (JSON/YAML with pars and results keys).
--eres-file (str) – Energy calibration pickle file containing calibration objects.
--inplots (str, optional) – Existing pickle plot file to merge with LQ plots.
--log (str, optional) – Path to the log file.
--log-config (str, optional) – Logging configuration file.
--config-file (list of str) – LQ calibration configuration file(s). Must contain run_lq (bool), energy_param, cal_energy_param, cut_field, dt_param, and threshold.
--table-name (str) – LH5 table path within the DSP files.
--timestamp (str) – Run timestamp label. Defaults to "20000101T000000Z".
--plot-file (str, optional) – Output path for diagnostic plots (pickle).
--hit-pars (str) – Output path for the LQ hit parameters (JSON/YAML).
--lq-results (str) – Output path for the serialised LQ calibration object (pickle).
-d / --debug – Enable debug mode for additional diagnostic output.

legenddataflowscripts.par.geds.hit.lq.run_lq_calibration(data, cal_dicts, results_dicts, object_dicts, plot_dicts, configs, debug_mode=False)

Run the LQ calibration and update all output dictionaries.

Wraps lq_calibration() to operate on timestamp-keyed dictionaries and merge LQ results into the shared output structures used by the dataflow.

Parameters:

data (pandas.DataFrame) – Event-level data with the LQ, energy, drift-time, and cut columns.
cal_dicts (dict) – {timestamp: operations_dict} mapping of existing hit-level calibration operations. Updated with the LQ cut expression.
results_dicts (dict) – {timestamp: results_dict} mapping of preceding calibration results (energy calibration and partition calibration).
object_dicts (dict) – {timestamp: object_dict} mapping of pickled calibration objects.
plot_dicts (dict) – {timestamp: plot_dict} mapping of existing diagnostic figures.
configs (dict or str or list) – LQ calibration configuration. Must contain run_lq (bool), energy_param, cal_energy_param, cut_field, and dt_param.
debug_mode (bool) – Activates additional diagnostic output. Defaults to False.

Returns:

cal_dicts (dict) – Updated calibration operations mappings.
out_result_dicts (dict) – Updated results mappings including LQ results.
out_object_dicts (dict) – Updated object mappings including the LQCal instance.
out_plot_dicts (dict) – Updated plot mappings including LQ diagnostic figures.

legenddataflowscripts.par.geds.hit.qc module

legenddataflowscripts.par.geds.hit.qc.build_qc(config, cal_files, fft_files, table_name, overwrite=None, pulser_file=None, build_plots=False)

Derive quality-cut classifiers from calibration and FFT data.

Generates data-driven cut classifiers using pygama.pargen.data_cleaning.generate_cut_classifiers(). The procedure is:

1. FFT cuts (if fft_files is non-empty): derive cuts on baseline and noise parameters from FFT-run data; discharge-recovery events are masked before fitting.
2. Initial calibration cuts (if "initial_cal_cuts" is in config): a coarse pre-selection applied before the main calibration cuts to remove grossly anomalous events.
3. Calibration cuts: the main quality cuts derived from a random subsample (up to 4000 events) of clean calibration physics data.

Survival fractions are computed for each cut on both calibration and FFT data and returned in the results dictionary.
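
Two ingredients of the procedure above, the capped random subsample and the survival fraction of a cut, can be sketched as follows. The helpers `subsample` and `survival_percent` are hypothetical; the sampling method, seed, and percentage convention are assumptions, with only the 4000-event cap and the default rounding of 4 taken from this page.

```python
import numpy as np
import pandas as pd


def subsample(data, max_events=4000, seed=0):
    """Return at most `max_events` randomly chosen rows, keeping their original order."""
    if len(data) <= max_events:
        return data
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(data), size=max_events, replace=False)
    return data.iloc[np.sort(idx)]


def survival_percent(mask, rounding=4):
    """Percentage of events for which the boolean cut `mask` is True."""
    return round(100 * float(np.mean(mask)), rounding)


large = pd.DataFrame({"bl_std": np.arange(10000, dtype=float)})
sample = subsample(large)
sf = survival_percent(np.array([True, True, True, False]))
```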

Parameters:

config (dict) – Quality-cut configuration. Expected keys: cal_fields (dict with cut_parameters), optionally fft_fields, initial_cal_cuts, and rounding (int, default 4).
cal_files (list of str) – Paths to calibration DSP LH5 files.
fft_files (list of str) – Paths to FFT run DSP LH5 files. Pass an empty list to skip FFT cuts.
table_name (str) – LH5 table path within the input files (e.g. ch1057600/dsp).
overwrite (dict, optional) – Cut-classifier dictionary that overrides automatically derived cuts by name.
pulser_file (str, optional) – Path to the pulser mask file. When None no pulser mask is applied.
build_plots (bool) – When True, diagnostic plots are generated by generate_cut_classifiers. Defaults to False.

Returns:

out_dict (dict) – Dictionary with keys "operations" (hit-level cut expressions) and "results" (survival fractions for each cut).
plot_dict (dict) – Diagnostic figures keyed by cut name (empty when build_plots is False).

legenddataflowscripts.par.geds.hit.qc.par_geds_hit_qc()

Generate quality-cut (QC) hit parameters for HPGe detectors.

CLI entry point registered as par-geds-hit-qc. Wraps build_qc() to provide a command-line interface for the quality-cut generation step. The derived cut classifiers are written to --save-path as JSON/YAML, and optional diagnostic plots are serialised to --plot-path.

Notes

Command-line arguments

--cal-files (list of str) – Calibration DSP LH5 files (or a single .filelist file).
--fft-files (list of str) – FFT DSP LH5 files (or a single .filelist file).
--tcm-filelist (str, optional) – Unused placeholder for TCM file list.
--pulser-file (str, optional) – Path to the pulser mask file.
--overwrite-files (list of str, optional) – JSON/YAML file(s) containing manual overrides for specific cuts.
--channel (str) – Channel identifier used to look up overrides.
--log (str, optional) – Path to the log file.
--log-config (str, optional) – Logging configuration file.
--config-file (list of str) – Quality-cut configuration file(s).
--table-name (str) – LH5 table path within the input files.
--plot-path (str, optional) – Output path for diagnostic plots (pickle).
--save-path (str) – Output path for the QC parameters (JSON/YAML).