legenddataflowscripts.par.geds.hit package
Submodules
legenddataflowscripts.par.geds.hit.aoe module
- legenddataflowscripts.par.geds.hit.aoe.get_results_dict(aoe_class)
Extract serialisable results from a calibrated CalAoE object.
- Parameters:
aoe_class (pygama.pargen.AoE_cal.CalAoE) – Calibrated A/E object after calling calibrate().
- Returns:
dict – Mapping of run timestamp → results dictionary. Each entry contains the calibration energy parameter, drift-time parameter, correction fit results, low-side and two-sided survival fractions (both overall and per-run), and the A/E cut values.
- legenddataflowscripts.par.geds.hit.aoe.par_geds_hit_aoe()
Calibrate the A/E (current-amplitude over energy) pulse-shape discriminant.
CLI entry point registered as par-geds-hit-aoe. Loads DSP data for a single detector channel, applies energy and pulser masks, and runs run_aoe_calibration() to derive the A/E mean, width, and cut values as a function of energy and (optionally) drift time. Results are written to hit-pars (JSON/YAML) and the calibration object is serialised to aoe-results (pickle).
Notes
Command-line arguments
files (list of str) – One or more file lists (.filelist) containing DSP LH5 paths.
--pulser-file (str, optional) – Path to the pulser mask file.
--tcm-filelist (str, optional) – Unused placeholder.
--ecal-file (str) – Energy calibration output file (JSON/YAML with pars and results keys).
--eres-file (str) – Energy calibration pickle file containing calibration objects.
--inplots (str, optional) – Existing pickle plot file to merge with A/E plots.
--timestamp (str) – Run timestamp label. Defaults to "20000101T000000Z".
--log (str, optional) – Path to the log file.
--log-config (str, optional) – Logging configuration file.
--config-file (list of str) – A/E calibration configuration file(s).
--table-name (str) – LH5 table path within the DSP files.
--detector (str, optional) – Detector name used to look up override parameters.
--override-files (list of str, optional) – JSON/YAML file(s) with detector-specific override parameters.
--plot-file (str, optional) – Output path for diagnostic plots (pickle).
--hit-pars (str) – Output path for the A/E hit parameters (JSON/YAML).
--aoe-results (str) – Output path for the serialised A/E calibration object (pickle).
-d/--debug – Enable debug mode for additional diagnostic output.
- legenddataflowscripts.par.geds.hit.aoe.run_aoe_calibration(data, cal_dicts, results_dicts, object_dicts, plot_dicts, config, debug_mode=False, override_dict=None)
Run the A/E calibration and update all output dictionaries.
Instantiates CalAoE and calls calibrate() on data. All input mapping arguments are keyed by run timestamp and the corresponding output mappings are returned with the A/E results merged in.
- Parameters:
data (pandas.DataFrame) – Event-level data containing the energy, current-amplitude, and drift-time parameters required by the A/E calibration.
cal_dicts (dict) – {timestamp: operations_dict} mapping of existing hit-level calibration operations. Updated in-place with the A/E cut expressions.
results_dicts (dict) – {timestamp: results_dict} mapping of calibration results from preceding steps (energy calibration, partition calibration).
object_dicts (dict) – {timestamp: object_dict} mapping of pickled calibration objects from preceding steps.
plot_dicts (dict) – {timestamp: plot_dict} mapping of existing plot dictionaries.
config (dict or str or list) – A/E calibration configuration. Must contain run_aoe (bool), current_param, energy_param, cal_energy_param, cut_field, and threshold.
debug_mode (bool) – Activates additional diagnostic output in CalAoE. Defaults to False.
override_dict (dict, optional) – Per-detector override parameters passed to calibrate().
- Returns:
cal_dicts (dict) – Updated calibration operations mappings.
out_result_dicts (dict) – Updated results mappings including A/E results.
out_object_dicts (dict) – Updated object mappings including the CalAoE instance.
out_plot_dicts (dict) – Updated plot mappings including A/E diagnostic figures.
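The timestamp-keyed merge described for run_aoe_calibration() can be pictured with a small sketch. It is not the package's implementation: the helper name merge_step_results, the "aoe" key, and the numbers are hypothetical, chosen only to show how per-run results from a new step might be folded into an existing {timestamp: results_dict} mapping without modifying the input.

```python
from copy import deepcopy


def merge_step_results(results_dicts, step_results, key="aoe"):
    """Return a new {timestamp: results} mapping with step_results added.

    `key` is a hypothetical name for the sub-dictionary holding the new step.
    """
    merged = {}
    for timestamp, previous in results_dicts.items():
        entry = deepcopy(previous)  # leave the caller's dictionaries untouched
        entry[key] = step_results.get(timestamp, {})
        merged[timestamp] = entry
    return merged


# Made-up inputs keyed by the default run timestamp label.
results_dicts = {"20000101T000000Z": {"ecal": {"fwhm": 2.5}}}
aoe_results = {"20000101T000000Z": {"cut_value": -1.5}}
out_result_dicts = merge_step_results(results_dicts, aoe_results)
```

Returning a fresh mapping mirrors the separate out_result_dicts, out_object_dicts, and out_plot_dicts return values listed above; cal_dicts, by contrast, is documented as updated in-place.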
legenddataflowscripts.par.geds.hit.ecal module
- legenddataflowscripts.par.geds.hit.ecal.baseline_tracking_plots(files, lh5_path, plot_options=None)
Generate baseline-tracking plots from a set of LH5 files.
Reads bl_mean, baseline, and timestamp from the LH5 files and calls each plot function specified in plot_options.
- Parameters:
- Returns:
dict – Plot dictionary with one entry per key in plot_options.
- legenddataflowscripts.par.geds.hit.ecal.bin_baseline(data, parameter='bl_mean-baseline', dx=1, bl_range=None)
Bin an evaluated baseline-residual parameter into a histogram.
- Parameters:
data (pandas.DataFrame) – Event-level data. The expression parameter is evaluated with pandas.DataFrame.eval().
parameter (str) – A pandas-eval expression for the baseline residual. Defaults to "bl_mean-baseline".
dx (float) – Bin width in ADC units. Defaults to 1.
bl_range (list of float, optional) – [low, high] range for the histogram. Defaults to [-500, 500].
- Returns:
dict – Dictionary with keys "bl_array" (histogram counts) and "bins" (bin centres).
- legenddataflowscripts.par.geds.hit.ecal.bin_bl_stability(data, time_slice=180, parameter='bl_mean')
Compute median baseline value and spread binned in time.
- Parameters:
data (pandas.DataFrame) – Full event-level data with a timestamp column.
time_slice (float) – Width of each time bin in seconds. Defaults to 180.
parameter (str) – Name of the baseline parameter. Defaults to "bl_mean".
- Returns:
dict – Dictionary with keys "time" (bin centres), "baseline" (median baseline per bin), and "spread" (variance / √N per bin).
- legenddataflowscripts.par.geds.hit.ecal.bin_pulser_stability(data, cal_energy_param, selection_string, pulser_field='is_pulser', time_slice=180)
Compute median pulser energy and its spread binned in time.
- Parameters:
data (pandas.DataFrame) – Full event-level data including the pulser flag.
cal_energy_param (str) – Name of the calibrated energy column.
selection_string (str) – Unused; kept for a consistent plot-function signature.
pulser_field (str) – Boolean column identifying pulser events. Defaults to "is_pulser".
time_slice (float) – Width of each time bin in seconds. Defaults to 180.
- Returns:
dict – Dictionary with keys "time" (bin centres), "energy" (median calibrated energy per bin), and "spread" (variance / √N per bin).
- legenddataflowscripts.par.geds.hit.ecal.bin_spectrum(data, cal_energy_param, selection_string, cut_field='is_valid_cal', pulser_field='is_pulser', erange=(0, 3000), dx=0.5)
Bin events into a calibrated energy spectrum with pass, cut, and pulser counts.
- Parameters:
data (pandas.DataFrame) – Event-level data.
cal_energy_param (str) – Name of the calibrated energy column.
selection_string (str) – Pandas query string identifying passing physics events.
cut_field (str) – Boolean column for the quality-cut acceptance flag. Defaults to "is_valid_cal".
pulser_field (str) – Boolean column for the pulser flag. Defaults to "is_pulser".
erange (tuple of float) – (low, high) energy range in keV. Defaults to (0, 3000).
dx (float) – Bin width in keV. Defaults to 0.5.
- Returns:
dict – Dictionary with keys "bins" (bin centres), "counts" (passing-event histogram), "cut_counts" (events rejected by the cut), and "pulser_counts" (pulser-event histogram).
- legenddataflowscripts.par.geds.hit.ecal.bin_stability(data, cal_energy_param, selection_string, time_slice=180, energy_range=(2585, 2660))
Compute median energy and spread near the 2614 keV line binned in time.
- Parameters:
data (pandas.DataFrame) – Event-level data with a timestamp column.
cal_energy_param (str) – Name of the calibrated energy column.
selection_string (str) – Pandas query string applied to select physics events.
time_slice (float) – Width of each time bin in seconds. Defaults to 180.
energy_range (tuple of float) – (low, high) keV range used to select events near the 2614 keV peak. Defaults to (2585, 2660).
- Returns:
dict – Dictionary with keys "time" (bin centres), "energy" (median calibrated energy per bin), and "spread" (variance / √N per bin).
- legenddataflowscripts.par.geds.hit.ecal.bin_survival_fraction(data, cal_energy_param, selection_string, cut_field='is_valid_cal', pulser_field='is_pulser', erange=(0, 3000), dx=6)
Compute the cut survival fraction as a function of calibrated energy.
- Parameters:
data (pandas.DataFrame) – Event-level data.
cal_energy_param (str) – Name of the calibrated energy column.
selection_string (str) – Pandas query string identifying passing physics events.
cut_field (str) – Boolean column for the quality-cut acceptance flag. Defaults to "is_valid_cal".
pulser_field (str) – Boolean column for the pulser flag. Defaults to "is_pulser".
erange (tuple of float) – (low, high) energy range in keV. Defaults to (0, 3000).
dx (float) – Bin width in keV. Defaults to 6.
- Returns:
dict – Dictionary with keys "bins" (bin centres in keV) and "sf" (survival fraction in percent, regularised to avoid division by zero).
- legenddataflowscripts.par.geds.hit.ecal.get_err(x)
- legenddataflowscripts.par.geds.hit.ecal.get_median(x)
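The time-binned helpers above (bin_bl_stability, bin_pulser_stability, bin_stability) each return a median and a spread per time bin. A minimal sketch of that pattern follows. It assumes timestamp is in seconds and reads "variance / √N" literally, using the population variance. get_err and get_median carry no description on this page, so the sketch does not rely on them, and time_binned_stability is a hypothetical name rather than a function of this module.

```python
import numpy as np
import pandas as pd


def time_binned_stability(data, parameter="bl_mean", time_slice=180):
    """Median and spread of `parameter` in consecutive time bins.

    Assumes `timestamp` is in seconds; spread is variance / sqrt(N) per bin.
    """
    t0 = data["timestamp"].min()
    # Integer index of the time bin each event falls into.
    bin_index = ((data["timestamp"] - t0) // time_slice).astype(int)
    grouped = data[parameter].groupby(bin_index)
    n = grouped.count()
    centres = t0 + (n.index.to_numpy() + 0.5) * time_slice
    spread = grouped.var(ddof=0) / np.sqrt(n)
    return {
        "time": centres,
        "baseline": grouped.median().to_numpy(),
        "spread": spread.to_numpy(),
    }


# Made-up example: five events spanning two 180 s bins.
data = pd.DataFrame(
    {"timestamp": [0, 10, 20, 200, 210], "bl_mean": [1.0, 2.0, 3.0, 10.0, 12.0]}
)
stability = time_binned_stability(data)
```

Only bins that contain events appear in the output, so gaps in data taking do not produce empty entries in this sketch.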
- legenddataflowscripts.par.geds.hit.ecal.get_results_dict(ecal_class, data, cal_energy_param, selection_string)
Extract serialisable calibration results from an HPGeCalibration object.
- Parameters:
ecal_class (pygama.pargen.energy_cal.HPGeCalibration) – Fitted calibration object.
data (pandas.DataFrame) – Event-level data containing cal_energy_param.
cal_energy_param (str) – Name of the calibrated energy column used to count events near the FEP (2614 keV), SEP (2103 keV), and DEP (1592 keV).
selection_string (str) – Pandas query string identifying physics events.
- Returns:
dict – Results dictionary containing total and passing event counts at FEP, SEP, and DEP; linear and quadratic FWHM curve parameters; fitted peak parameters; and the list of calibrated peak energies. Returns {} when the calibration failed (NaN parameters).
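The total and passing event counts at the FEP, SEP, and DEP could be obtained along the following lines. This is only a sketch: the ±10 keV window, the helper name count_near_peaks, the column name cal_energy, and the example events are assumptions, since the page does not state how wide the counting windows are.

```python
import pandas as pd

# Peak energies in keV as quoted above; the window half-width is an assumption.
PEAKS_KEV = {"FEP": 2614, "SEP": 2103, "DEP": 1592}


def count_near_peaks(data, cal_energy_param, selection_string, half_width=10.0):
    """Count all events and selected events in a window around each peak."""
    selected = data.eval(selection_string).astype(bool)
    counts = {}
    for name, energy in PEAKS_KEV.items():
        in_window = data[cal_energy_param].between(
            energy - half_width, energy + half_width
        )
        counts[name] = {
            "total": int(in_window.sum()),
            "pass": int((in_window & selected).sum()),
        }
    return counts


# Made-up events: two near the FEP (one failing the cut), one each at SEP and DEP.
data = pd.DataFrame(
    {
        "cal_energy": [2614.2, 2613.0, 2103.5, 1592.4, 500.0],
        "is_valid_cal": [True, False, True, True, True],
        "is_pulser": [False, False, False, False, True],
    }
)
counts = count_near_peaks(data, "cal_energy", "is_valid_cal & (~is_pulser)")
```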
- legenddataflowscripts.par.geds.hit.ecal.monitor_parameters(files, lh5_path, parameters)
Compute the mode and standard deviation of selected parameters from LH5 files.
- legenddataflowscripts.par.geds.hit.ecal.par_geds_hit_ecal()
Perform HPGe energy calibration and write hit-level calibration parameters.
CLI entry point registered as par-geds-hit-ecal. Loads DSP-level data for a single detector channel, applies charge-trapping corrections from the DSP parameter database, and runs HPGeCalibration to find and fit gamma peaks, compute FWHM curves (linear and quadratic), and generate per-energy-parameter calibration expressions.
The following gamma lines are used for calibration:
238.6 keV (²¹²Pb), 583.2 keV, 727.3 keV, 860.6 keV, 1592.5 keV, 1620.5 keV, 2103.5 keV, 2614.5 keV (²⁰⁸Tl)
Notes
Command-line arguments
--files (list of str) – DSP LH5 file(s) or file list(s).
--tcm-filelist (str, optional) – Unused placeholder.
--pulser-file (str, optional) – Path to the pulser mask file.
--ctc-dict (list of str) – JSON/YAML file(s) with charge-trapping correction parameters keyed by channel.
--in-hit-dict (str, optional) – Existing hit parameter file to extend (JSON/YAML).
--inplot-dict (str, optional) – Existing pickle plot file to extend.
--log (str, optional) – Path to the log file.
--log-config (str, optional) – Logging configuration file.
--config-file (list of str) – Energy calibration configuration file(s).
--det-status (str) – Detector status ("on" or other); affects peak-finding tolerances. Defaults to "on".
--table-name (str) – LH5 table path within the DSP files.
--channel (str) – Channel identifier used to look up CTC parameters.
--plot-path (str, optional) – Output path for calibration diagnostic plots (pickle).
--save-path (str) – Output path for the calibration hit parameters (JSON/YAML).
--results-path (str) – Output path for the serialised calibration objects (pickle).
-d/--debug – Enable debug mode for additional diagnostic output.
- legenddataflowscripts.par.geds.hit.ecal.plot_2614_timemap(data, cal_energy_param, selection_string, figsize=(8, 6), fontsize=12, erange=(2580, 2630), dx=1, time_dx=180)
Plot a 2D time-vs-energy histogram centred on the 2614 keV line.
- Parameters:
data (pandas.DataFrame) – Event-level data with columns cal_energy_param and timestamp.
cal_energy_param (str) – Name of the calibrated energy column.
selection_string (str) – Pandas query string applied to select physics events.
figsize (tuple of float) – Figure size in inches. Defaults to (8, 6).
fontsize (int) – Base font size. Defaults to 12.
erange (tuple of float) – Energy range (low, high) in keV. Defaults to (2580, 2630).
dx (float) – Energy bin width in keV. Defaults to 1.
time_dx (float) – Time bin width in seconds. Defaults to 180.
- Returns:
matplotlib.figure.Figure – The generated figure.
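The binning behind such a time-vs-energy map can be sketched with NumPy alone. The function name time_energy_histogram and the example events are hypothetical; the defaults for erange, dx, and time_dx are taken from the signature above, and the choice to start the time axis at the earliest event is an assumption.

```python
import numpy as np


def time_energy_histogram(timestamps, energies, erange=(2580, 2630), dx=1, time_dx=180):
    """2D counts with time bins of `time_dx` seconds and energy bins of `dx` keV."""
    timestamps = np.asarray(timestamps, dtype=float)
    energies = np.asarray(energies, dtype=float)
    # Time edges start at the first event; energy edges span erange inclusively.
    t_edges = np.arange(timestamps.min(), timestamps.max() + time_dx, time_dx)
    e_edges = np.arange(erange[0], erange[1] + dx, dx)
    counts, _, _ = np.histogram2d(timestamps, energies, bins=[t_edges, e_edges])
    return counts, t_edges, e_edges


# Made-up events: three in the first 180 s bin, one in the second.
counts, t_edges, e_edges = time_energy_histogram(
    [0, 30, 60, 200], [2614.2, 2614.7, 2590.1, 2613.5]
)
```

The resulting array has one row per time bin and one column per energy bin; passing t_edges, e_edges, and counts.T to matplotlib's pcolormesh would draw it as a map.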
- legenddataflowscripts.par.geds.hit.ecal.plot_baseline_timemap(data, figsize=(8, 6), fontsize=12, parameter='bl_mean', dx=1, n_spread=5, time_dx=180)
Plot a 2D time-vs-baseline histogram for monitoring detector baseline stability.
- Parameters:
data (pandas.DataFrame) – Event-level data with columns parameter and timestamp.
figsize (tuple of float) – Figure size in inches. Defaults to (8, 6).
fontsize (int) – Base font size. Defaults to 12.
parameter (str) – Name of the baseline parameter to plot. Defaults to "bl_mean".
dx (float) – Baseline bin width (in ADC units). Defaults to 1.
n_spread (int) – Half-width of the baseline axis in multiples of the 10th-50th percentile spread. Defaults to 5.
time_dx (float) – Time bin width in seconds. Defaults to 180.
- Returns:
matplotlib.figure.Figure – The generated figure.
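One way to read the n_spread parameter is as follows: take the gap between the 10th and 50th percentiles as the spread, and extend the axis n_spread spreads on either side of the median. The sketch below implements that reading; centring on the median and the name spread_axis_range are assumptions, not details confirmed by this page.

```python
import numpy as np


def spread_axis_range(values, n_spread=5):
    """Axis limits: median +/- n_spread * (50th percentile - 10th percentile)."""
    p10, p50 = np.nanpercentile(values, [10, 50])
    spread = p50 - p10
    return p50 - n_spread * spread, p50 + n_spread * spread


# Made-up values 0..100: 10th percentile 10, median 50, spread 40.
low, high = spread_axis_range(np.arange(101.0), n_spread=5)
```

Using the lower half of the distribution to set the scale keeps a high-side tail from inflating the plotted range.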
- legenddataflowscripts.par.geds.hit.ecal.plot_pulser_timemap(data, cal_energy_param, selection_string, pulser_field='is_pulser', figsize=(8, 6), fontsize=12, dx=0.2, time_dx=180, n_spread=3)
Plot a 2D time-vs-energy histogram for pulser events.
- Parameters:
data (pandas.DataFrame) – Event-level data including the pulser flag column.
cal_energy_param (str) – Name of the calibrated energy column.
selection_string (str) – Unused; kept for a consistent plot-function signature.
pulser_field (str) – Boolean column name identifying pulser events. Defaults to "is_pulser".
fontsize (int) – Base font size.
dx (float) – Energy bin width in keV.
time_dx (float) – Time bin width in seconds.
n_spread (int) – Half-width of the energy axis in multiples of the 10th-50th percentile spread.
- Returns:
matplotlib.figure.Figure – The generated figure.
legenddataflowscripts.par.geds.hit.lq module
- legenddataflowscripts.par.geds.hit.lq.get_results_dict(lq_class)
Extract serialisable results from a calibrated LQCal object.
- Parameters:
lq_class (pygama.pargen.lq_cal.LQCal) – Calibrated LQ object after calling calibrate().
- Returns:
dict – Results dictionary containing the calibration energy parameter, DEP mean values per run, drift-time correction fit parameters, cut fit parameters, cut value, and survival fractions.
- legenddataflowscripts.par.geds.hit.lq.lq_calibration(data, cal_dicts, energy_param, cal_energy_param, dt_param, eres_func, cdf=<pygama.math.functions.gauss.GaussianGen object>, selection_string='', plot_options=None, debug_mode=False)
Calibrate the LQ (late-charge) pulse-shape discriminant.
Constructs an LQCal instance, computes the energy-normalised LQ observable LQ_Ecorr = lq80 / energy_param, and calls calibrate(). The resulting cut expressions are appended to cal_dicts.
- Parameters:
data (pandas.DataFrame) – Event-level data containing lq80, energy_param, cal_energy_param, dt_param, and any selection columns.
cal_dicts (dict) – Mapping of run timestamps → hit-level calibration operations. Updated in-place with the LQ cut expression.
energy_param (str) – Raw (uncalibrated) energy parameter name used for LQ normalisation.
cal_energy_param (str) – Calibrated energy parameter name.
dt_param (str) – Drift-time parameter name used for the energy-dependent LQ correction.
eres_func (callable) – Energy resolution function sigma(E) used to set cut windows.
cdf (callable) – Cumulative distribution function used for binned LQ fitting. Defaults to pygama.math.distributions.gaussian().
selection_string (str) – Pandas query string applied before calibration. Defaults to "".
plot_options (dict, optional) – Mapping of {label: {"function": callable, "options": dict | None}} passed to fill_plot_dict().
debug_mode (bool) – Activates additional diagnostic output. Defaults to False.
- Returns:
cal_dicts (dict) – Updated calibration operations mapping.
results_dict (dict) – LQ calibration results from get_results_dict().
plot_dict (dict) – Diagnostic figures (empty when plot_options is None).
lq (pygama.pargen.lq_cal.LQCal) – Calibrated LQ object.
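The energy normalisation LQ_Ecorr = lq80 / energy_param amounts to a single column operation. The sketch below shows it on made-up numbers; the helper name add_lq_ecorr, the example column name cuspEmax, and the choice to map zero-energy rows to NaN are assumptions added for illustration, not behaviour documented for lq_calibration().

```python
import numpy as np
import pandas as pd


def add_lq_ecorr(data, energy_param):
    """Return a copy of `data` with LQ_Ecorr = lq80 / energy_param added."""
    out = data.copy()
    energy = out[energy_param].to_numpy(dtype=float)
    lq80 = out["lq80"].to_numpy(dtype=float)
    # Assumption: zero-energy rows become NaN instead of inf.
    out["LQ_Ecorr"] = np.divide(
        lq80, energy, out=np.full_like(lq80, np.nan), where=energy != 0
    )
    return out


# Made-up events; "cuspEmax" stands in for whatever energy_param is configured.
data = pd.DataFrame({"lq80": [10.0, 30.0, 5.0], "cuspEmax": [1000.0, 1500.0, 0.0]})
data = add_lq_ecorr(data, "cuspEmax")
```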
- legenddataflowscripts.par.geds.hit.lq.par_geds_hit_lq()
Calibrate the LQ pulse-shape discriminant and write hit-level parameters.
CLI entry point registered as par-geds-hit-lq. Loads DSP-level data for a single detector channel, applies energy threshold and pulser masks, and runs run_lq_calibration() to derive the energy-normalised LQ observable, its drift-time correction, and the DEP-based cut value.
Results are written to hit-pars (JSON/YAML) and the calibration objects are serialised to lq-results (pickle).
Notes
Command-line arguments
files (list of str) – One or more file lists (.filelist) containing DSP LH5 paths.
--pulser-file (str, optional) – Path to the pulser mask file.
--tcm-filelist (str, optional) – Unused placeholder.
--ecal-file (str) – Energy calibration output file (JSON/YAML with pars and results keys).
--eres-file (str) – Energy calibration pickle file containing calibration objects.
--inplots (str, optional) – Existing pickle plot file to merge with LQ plots.
--log (str, optional) – Path to the log file.
--log-config (str, optional) – Logging configuration file.
--config-file (list of str) – LQ calibration configuration file(s). Must contain run_lq (bool), energy_param, cal_energy_param, cut_field, dt_param, and threshold.
--table-name (str) – LH5 table path within the DSP files.
--timestamp (str) – Run timestamp label. Defaults to "20000101T000000Z".
--plot-file (str, optional) – Output path for diagnostic plots (pickle).
--hit-pars (str) – Output path for the LQ hit parameters (JSON/YAML).
--lq-results (str) – Output path for the serialised LQ calibration object (pickle).
-d/--debug – Enable debug mode for additional diagnostic output.
- legenddataflowscripts.par.geds.hit.lq.run_lq_calibration(data, cal_dicts, results_dicts, object_dicts, plot_dicts, configs, debug_mode=False)
Run the LQ calibration and update all output dictionaries.
Wraps lq_calibration() to operate on timestamp-keyed dictionaries and merge LQ results into the shared output structures used by the dataflow.
- Parameters:
data (pandas.DataFrame) – Event-level data with the LQ, energy, drift-time, and cut columns.
cal_dicts (dict) – {timestamp: operations_dict} mapping of existing hit-level calibration operations. Updated with the LQ cut expression.
results_dicts (dict) – {timestamp: results_dict} mapping of preceding calibration results (energy calibration and partition calibration).
object_dicts (dict) – {timestamp: object_dict} mapping of pickled calibration objects.
plot_dicts (dict) – {timestamp: plot_dict} mapping of existing diagnostic figures.
configs (dict or str or list) – LQ calibration configuration. Must contain run_lq (bool), energy_param, cal_energy_param, cut_field, and dt_param.
debug_mode (bool) – Activates additional diagnostic output. Defaults to False.
- Returns:
cal_dicts (dict) – Updated calibration operations mappings.
out_result_dicts (dict) – Updated results mappings including LQ results.
out_object_dicts (dict) – Updated object mappings including the LQCal instance.
out_plot_dicts (dict) – Updated plot mappings including LQ diagnostic figures.
legenddataflowscripts.par.geds.hit.qc module
- legenddataflowscripts.par.geds.hit.qc.build_qc(config, cal_files, fft_files, table_name, overwrite=None, pulser_file=None, build_plots=False)
Derive quality-cut classifiers from calibration and FFT data.
Generates data-driven cut classifiers using pygama.pargen.data_cleaning.generate_cut_classifiers(). The procedure is:
1. FFT cuts (if fft_files is non-empty) - derive cuts on baseline and noise parameters from FFT-run data; discharge-recovery events are masked before fitting.
2. Initial calibration cuts (if "initial_cal_cuts" is in config) - a coarse pre-selection applied before the main calibration cuts to remove grossly anomalous events.
3. Calibration cuts - the main quality cuts derived from a random subsample (up to 4000 events) of clean calibration physics data.
Survival fractions are computed for each cut on both calibration and FFT data and returned in the results dictionary.
- Parameters:
config (dict) – Quality-cut configuration. Expected keys: cal_fields (dict with cut_parameters), optionally fft_fields, initial_cal_cuts, and rounding (int, default 4).
cal_files (list of str) – Paths to calibration DSP LH5 files.
fft_files (list of str) – Paths to FFT run DSP LH5 files. Pass an empty list to skip FFT cuts.
table_name (str) – LH5 table path within the input files (e.g. ch1057600/dsp).
overwrite (dict, optional) – Cut-classifier dictionary that overrides automatically derived cuts by name.
pulser_file (str, optional) – Path to the pulser mask file. When None, no pulser mask is applied.
build_plots (bool) – When True, diagnostic plots are generated by generate_cut_classifiers. Defaults to False.
- Returns:
out_dict (dict) – Dictionary with keys "operations" (hit-level cut expressions) and "results" (survival fractions for each cut).
plot_dict (dict) – Diagnostic figures keyed by cut name (empty when build_plots is False).
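Two pieces of the build_qc() procedure lend themselves to a short illustration: drawing a random subsample of at most 4000 events, and computing the survival fraction of a cut. The sketch below is not the package's code. The helper names, the fixed seed, the bl_std column, the cut value, the use of percent, and the idea that rounding applies to the survival fraction are all assumptions.

```python
import numpy as np
import pandas as pd


def subsample_events(data, max_events=4000, seed=0):
    """Return at most `max_events` rows drawn at random without replacement."""
    if len(data) <= max_events:
        return data
    rng = np.random.default_rng(seed)
    rows = np.sort(rng.choice(len(data), size=max_events, replace=False))
    return data.iloc[rows]


def survival_fraction(data, cut_expression, rounding=4):
    """Percentage of rows passing `cut_expression`, rounded to `rounding` places."""
    passed = data.eval(cut_expression)
    return round(100.0 * float(passed.mean()), rounding)


# Made-up data: 10000 events of a hypothetical noise parameter, cut at 7500.
data = pd.DataFrame({"bl_std": np.arange(10_000)})
sample = subsample_events(data)
sf = survival_fraction(data, "bl_std < 7500")
```

Deriving the cut on the subsample while evaluating the survival fraction on the full data set keeps the fit cheap without biasing the reported fraction.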
- legenddataflowscripts.par.geds.hit.qc.par_geds_hit_qc()
Generate quality-cut (QC) hit parameters for HPGe detectors.
CLI entry point registered as par-geds-hit-qc. Wraps build_qc() to provide a command-line interface for the quality-cut generation step. The derived cut classifiers are written to save-path as JSON/YAML, and optional diagnostic plots are serialised to plot-path.
Notes
Command-line arguments
--cal-files (list of str) – Calibration DSP LH5 files (or a single .filelist file).
--fft-files (list of str) – FFT DSP LH5 files (or a single .filelist file).
--tcm-filelist (str, optional) – Unused placeholder for TCM file list.
--pulser-file (str, optional) – Path to the pulser mask file.
--overwrite-files (list of str, optional) – JSON/YAML file(s) containing manual overrides for specific cuts.
--channel (str) – Channel identifier used to look up overrides.
--log (str, optional) – Path to the log file.
--log-config (str, optional) – Logging configuration file.
--config-file (list of str) – Quality-cut configuration file(s).
--table-name (str) – LH5 table path within the input files.
--plot-path (str, optional) – Output path for diagnostic plots (pickle).
--save-path (str) – Output path for the QC parameters (JSON/YAML).
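As an illustration of how the options above fit together, the following sketch assembles a par-geds-hit-qc command line from Python. Every file name is a placeholder, the channel value is a guess modelled on the ch1057600/dsp table-path example, and the page does not say which options are mandatory, so the set shown is an assumption.

```python
import shlex

# All file names below are placeholders, not paths from a real production.
command = [
    "par-geds-hit-qc",
    "--cal-files", "cal.filelist",
    "--fft-files", "fft.filelist",
    "--channel", "ch1057600",
    "--table-name", "ch1057600/dsp",
    "--config-file", "qc_config.yaml",
    "--save-path", "qc_pars.yaml",
    "--plot-path", "qc_plots.pkl",
]

# Show the shell-quoted command; subprocess.run(command, check=True) would run it.
print(shlex.join(command))
```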