Pipeline Module

pipeline.py

The SampleDye class is the central object for processing one sample+dye combination through the full exo2micro pipeline.

Each instance manages:
  • Current parameter state

  • Checkpoint file discovery and resume logic

  • Stage execution with automatic save/load

  • Directory structure creation

  • Plot generation and caching

Usage

import exo2micro as e2m

run = e2m.SampleDye('CD070', 'SybrGld_microbe', output_dir='processed')
run.set_params(boundary_width=20)
run.run()                                    # resumes from latest checkpoint
run.compare('boundary_width', [10, 15, 20])  # grid comparison

class exo2micro.pipeline.SampleDye(sample, dye, output_dir='processed', raw_dir='raw', checkpoint_format='tiff')

Bases: object

Pipeline controller for a single sample + dye combination.

Manages parameter state, checkpoint files, and stage execution. Images are loaded from disk on demand and released after processing to preserve RAM.

Parameters:
  • sample (str) – Sample name, e.g. ‘CD070’.

  • dye (str) – Dye/channel name, e.g. ‘SybrGld_microbe’.

  • output_dir (str) – Root output directory (default ‘processed’).

  • raw_dir (str) – Root directory containing raw images (default ‘raw’).

  • checkpoint_format ({'tiff', 'fits', 'both'}) –

    Which file formats to write for each pipeline checkpoint (default 'tiff'). TIFF is the most widely supported format for downstream inspection; FITS adds a copy of roughly equal size with metadata in the header for provenance. 'both' writes both.

    This setting governs only what gets written; it does not restrict loading (resume). Reads are format-agnostic and use whichever file exists. If both exist, TIFF is preferred for speed. A warning is printed when the loaded format differs from the configured save format, because the output directory will end up with mixed formats.
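The read-side behaviour described for checkpoint_format can be sketched as below. This is a minimal illustration under stated assumptions, not exo2micro's actual code: the function name resolve_checkpoint, the '.tiff' and '.fits' file extensions, and the use of warnings.warn for the mixed-format warning are all hypothetical.

```python
import warnings
from pathlib import Path

# Hypothetical extensions; the real pipeline may name its files differently.
EXTENSIONS = {"tiff": ".tiff", "fits": ".fits"}


def resolve_checkpoint(stem, checkpoint_format="tiff"):
    """Return (path, format) of an existing checkpoint, or (None, None).

    `stem` is the checkpoint path without extension. TIFF is tried first,
    so it wins when both files exist.
    """
    stem = Path(stem)
    # Formats that the configured setting would write.
    if checkpoint_format == "both":
        written = ("tiff", "fits")
    else:
        written = (checkpoint_format,)
    for fmt in ("tiff", "fits"):
        candidate = stem.parent / (stem.name + EXTENSIONS[fmt])
        if candidate.exists():
            if fmt not in written:
                # Loading a format we would not write leads to a mixed directory.
                warnings.warn(
                    f"Loaded {fmt.upper()} checkpoint but saving as "
                    f"{checkpoint_format!r}; output directory will mix formats."
                )
            return candidate, fmt
    return None, None
```

Trying TIFF first keeps the lookup to a single pass and makes the stated preference explicit in the loop order.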

compare(param_name, values, save=False)

Compare the effect of varying a single parameter.

Generates a grid of plots showing the effect of each value on either the alignment or the final difference image.

Parameters:
  • param_name (str) – Parameter to vary.

  • values (list) – Values to try.

  • save (bool) – If True, save each variant’s checkpoints. If False, display only (default False).

Returns:

Results for each parameter value.

Return type:

list of dict
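The general shape of a single-parameter sweep that returns one dict per value might look like the sketch below. It is an assumption-laden illustration rather than the library's implementation: the function sweep, the run_fn callable, and the result keys 'param', 'value' and 'result' are hypothetical, and no plotting is modelled.

```python
def sweep(base_params, param_name, values, run_fn):
    """Run `run_fn` once per value of `param_name`; return one dict per value."""
    if param_name not in base_params:
        raise ValueError(f"Unknown parameter: {param_name!r}")
    results = []
    for value in values:
        # Copy so the caller's parameter state is not modified by the sweep.
        params = dict(base_params)
        params[param_name] = value
        results.append(
            {"param": param_name, "value": value, "result": run_fn(params)}
        )
    return results


# Toy stand-in for a pipeline run; the real stages are not modelled here.
demo = sweep(
    {"boundary_width": 15},
    "boundary_width",
    [10, 15, 20],
    run_fn=lambda p: p["boundary_width"] * 2,
)
```

Working on a copy of the parameters means the sweep leaves the base state unchanged, which matches the idea of a display-only comparison when save is False.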

non_default_params(stage=None)

Return a dict of parameters that differ from their defaults.

Parameters:

stage (int or None) – If set, only return non-defaults for this stage and upstream.

Return type:

dict
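One way to picture the stage filter is the sketch below. Everything except the name boundary_width is hypothetical: the default values, the extra parameters pad and sift_ratio, and the mapping of parameters to stages are invented for illustration and may not match exo2micro.

```python
# Hypothetical defaults and stage assignments, for illustration only.
DEFAULTS = {"pad": 100, "boundary_width": 15, "sift_ratio": 0.75}
PARAM_STAGE = {"pad": 1, "boundary_width": 2, "sift_ratio": 3}


def non_default_params(params, stage=None):
    """Return parameters whose values differ from DEFAULTS.

    If `stage` is given, keep only parameters belonging to that stage
    or an earlier (upstream) one.
    """
    return {
        name: value
        for name, value in params.items()
        if value != DEFAULTS[name]
        and (stage is None or PARAM_STAGE[name] <= stage)
    }


current = {"pad": 100, "boundary_width": 20, "sift_ratio": 0.8}
print(non_default_params(current))           # {'boundary_width': 20, 'sift_ratio': 0.8}
print(non_default_params(current, stage=2))  # {'boundary_width': 20}
```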

property params

Current parameter state as a dict.

reset_params()

Reset all parameters to their default values.

run(from_stage=None, to_stage=None, force=False, force_run=False)

Run the pipeline, resuming from the latest available checkpoint.

Stages

  1. Padding — load raw images onto a common padded canvas

  2. Coarse alignment — boundary correlation + ICP

  3. Interior alignment — SIFT feature matching refinement

  4. Diagnostics & subtraction — scale estimation, plots, difference image

Parameters:
  • from_stage (int or None) – Force re-run from this stage onward. If None, auto-detect.

  • to_stage (int or None) – Stop after this stage. If None, run to completion.

  • force (bool) – If True, re-run all stages even if checkpoints exist.

  • force_run (bool) – If True, downgrade any hard-fail pre-flight check (RAM or disk estimate exceeds available) to a warning. Default False. See exo2micro.utils.preflight_check().

Returns:

Results dict with status.

Return type:

dict
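How from_stage, to_stage and force could combine to select the stages to execute is sketched below. This is a hedged illustration only: the helper plan_stages is hypothetical, it assumes that force takes precedence over from_stage, and it assumes that auto-detection resumes at the stage after the highest-numbered existing checkpoint without verifying earlier ones.

```python
def plan_stages(existing, from_stage=None, to_stage=None, force=False, n_stages=4):
    """Return the list of stage numbers to execute.

    `existing` is the set of stage numbers that already have a checkpoint.
    """
    if force:
        # Assumption: force ignores checkpoints and restarts at stage 1.
        start = 1
    elif from_stage is not None:
        start = from_stage
    else:
        # Auto-detect: resume right after the latest available checkpoint.
        start = max(existing) + 1 if existing else 1
    stop = n_stages if to_stage is None else to_stage
    return list(range(start, stop + 1))


print(plan_stages({1, 2}))                            # [3, 4]
print(plan_stages({1, 2}, force=True))                # [1, 2, 3, 4]
print(plan_stages({1, 2, 3}, from_stage=2, to_stage=3))  # [2, 3]
```

An empty list means every requested stage already has a checkpoint, so there is nothing to run.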

set_params(**kwargs)

Set one or more pipeline parameters.

Only known parameter names are accepted. Unknown names raise ValueError.

Parameters:

**kwargs – Parameter name=value pairs, e.g. boundary_width=20.
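The reject-unknown-names behaviour can be illustrated with a toy parameter holder. This is a sketch, not the SampleDye implementation: the class Params, the parameter n_iterations, and both default values are hypothetical, and the choice to validate all names before changing any state is an assumption.

```python
class Params:
    """Toy parameter holder that rejects unknown names."""

    # Hypothetical defaults; only boundary_width appears in this reference.
    DEFAULTS = {"boundary_width": 15, "n_iterations": 50}

    def __init__(self):
        self._params = dict(self.DEFAULTS)

    def set_params(self, **kwargs):
        unknown = sorted(set(kwargs) - set(self.DEFAULTS))
        if unknown:
            # Validate everything before changing state, so a bad call
            # leaves the current parameters untouched.
            raise ValueError(f"Unknown parameter(s): {', '.join(unknown)}")
        self._params.update(kwargs)

    def reset_params(self):
        self._params = dict(self.DEFAULTS)

    @property
    def params(self):
        # Return a copy so callers cannot mutate internal state.
        return dict(self._params)
```

Raising on unknown names catches typos such as boundry_width early, instead of silently running with the default.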

status()

Print a summary of which checkpoints exist for current parameters.