Extending exo2micro
This page covers how to modify exo2micro for uses the shipping pipeline doesn’t cover — custom plots, new workflows, integration with other tools.
Adding a new diagnostic plot
Every stage-4 diagnostic plot lives as a function in
exo2micro.plotting that takes (post_im, pre_im, ..., save_path=None)
and is called from SampleDye._run_stage_4_diagnostics.
To add a new plot:
1. Write the plotting function. Put it in exo2micro/plotting.py. Follow the existing signature pattern: take post_im, pre_im, any method-specific kwargs, and sample, dye, save_path. Return the matplotlib figure. Close the figure if save_path is given; call plt.show() otherwise.
2. Export it from exo2micro/__init__.py:

       from .plotting import plot_my_new_diagnostic

3. Call it from stage 4. In _run_stage_4_diagnostics:

       my_path = self._check_path('my_new_diagnostic')
       if force or not os.path.exists(my_path):
           plotting.plot_my_new_diagnostic(
               post_full, pre_aligned,
               sample=self.sample, dye=self.dye,
               save_path=my_path)

4. Add it to the inline preview in gui.py under _show_inline_results so the GUI displays it after a run:

       plots_to_show.append(
           ('my_new_diagnostic', 'Short caption explaining it.'))

5. Add it to ``SampleDye.status()`` in pipeline.py so status checks include it.
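As a rough illustration of the plotting function itself, the sketch below follows the signature pattern and the save-or-show behaviour described in this section. The body (a side-by-side view of the two images) is purely hypothetical, and it assumes post_im and pre_im are 2-D arrays; the existing functions in plotting.py may be organised differently.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_my_new_diagnostic(post_im, pre_im, sample=None, dye=None,
                           save_path=None):
    """Hypothetical diagnostic: show the post and pre images side by side."""
    fig, axes = plt.subplots(1, 2, figsize=(8, 4))
    for ax, im, title in zip(axes, (post_im, pre_im), ("post", "pre")):
        ax.imshow(np.asarray(im), cmap="gray")
        ax.set_title(title)
        ax.axis("off")
    fig.suptitle(f"{sample} / {dye}")

    if save_path is not None:
        # Saving to disk: write the file, then close so figures do not
        # accumulate over a long batch run.
        fig.savefig(save_path, dpi=150, bbox_inches="tight")
        plt.close(fig)
    else:
        # Interactive use: display the figure instead of saving it.
        plt.show()
    return fig
```

Returning the figure in both branches lets a caller add annotations or save it again under a different name.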
Adding a new parameter
Parameters live in exo2micro/defaults.py in
PARAMETER_REGISTRY. Each entry is:
('param_name', (default_value, 'abbrev', stage_number, 'description')),
Steps:
1. Add to the registry with a short abbreviation that doesn't collide with any existing one. Check ABBREVIATIONS first. The abbreviation appears in checkpoint filenames as {abbrev}{value}.
2. Consume it in the relevant stage method. Access via self._params['param_name']. Parameters are type-checked by set_params against their default value's type.
3. Choose the type carefully. Booleans, ints, floats, and None round-trip through filename suffixes via params_from_suffix. Strings work but are rarely needed.
4. Update the parameter reference docs in docs/source/developers/parameters.rst.
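The collision check in step 1 can be sketched as below. EXAMPLE_REGISTRY is a made-up stand-in that only mirrors the entry layout shown above (the parameter names, stage numbers and descriptions are invented), and abbreviation_is_free is a hypothetical helper, not part of exo2micro. In the real package, ABBREVIATIONS is the thing to consult.

```python
# Made-up stand-in for PARAMETER_REGISTRY, using the documented layout:
# ('param_name', (default_value, 'abbrev', stage_number, 'description'))
EXAMPLE_REGISTRY = [
    ('example_float', (1.42, 'msc', 2, 'Invented float parameter.')),
    ('example_flag', (False, 'iecc', 3, 'Invented boolean parameter.')),
    ('example_optional', (None, 'sp', 4, 'Invented None-able parameter.')),
]


def abbreviation_is_free(registry, abbrev):
    """Return True if no existing entry already uses this abbreviation."""
    used = {entry[1][1] for entry in registry}
    return abbrev not in used


# A new parameter must pick an abbreviation that is not yet taken.
print(abbreviation_is_free(EXAMPLE_REGISTRY, 'msc'))   # already used
print(abbreviation_is_free(EXAMPLE_REGISTRY, 'nd'))    # available
```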
Boolean parameters serialize as 0 / 1 in filenames (e.g.
_iecc0). None-able parameters serialize as none (e.g.
_sp_none). Floats use :g format, so 1.42 →
msc1.42, 1.0 → msc1.
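The serialization rules above can be mimicked in a few lines. This is a sketch of the rules as stated, not the package's own code: format_suffix_token is a hypothetical name, and the underscore before none is inferred from the _sp_none example.

```python
def format_suffix_token(abbrev, value):
    """Build one filename token from an abbreviation and a value."""
    if value is None:
        # Assumption: None is written with a separating underscore,
        # as in the `_sp_none` example.
        return f"{abbrev}_none"
    if isinstance(value, bool):
        # bool is checked before numbers because bool subclasses int.
        return f"{abbrev}{int(value)}"
    if isinstance(value, float):
        # :g drops trailing zeros, so 1.0 becomes "1".
        return f"{abbrev}{value:g}"
    return f"{abbrev}{value}"


for abbrev, value in [('iecc', False), ('sp', None),
                      ('msc', 1.42), ('msc', 1.0)]:
    print(format_suffix_token(abbrev, value))
```

Because :g discards trailing zeros, 1.0 and 1 produce the same token, which is worth keeping in mind when choosing between an int and a float default.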
Adding a new pipeline stage
This is a deeper change and touches several files.
1. Update ``STAGE_NAMES`` and ``MAX_STAGE`` in defaults.py. Stage numbers are 1-indexed; filenames embed the number zero-padded to two digits.
2. Add a ``_run_stage_N_…`` method on SampleDye following the pattern of the existing ones:

   - Take force=False as the only argument.
   - Check for existing checkpoints via self._has_checkpoint and skip if present (unless force).
   - Auto-run missing upstream stages.
   - Load inputs via self._load_image.
   - Save outputs via self._save_image.

3. Wire it into ``run()``. Add the stage guard block to SampleDye.run:

       if from_stage <= N and to_stage >= N:
           self._run_stage_N_mystage(force)

4. Add its files to ``_check_upstream`` so downstream stages can detect when it's missing.
5. Add its files to ``status()`` so users can see its checkpoint status.
6. Check ``gui.py``. The from_stage/to_stage dropdowns are built from STAGE_NAMES automatically, so they'll pick up the new stage without code changes.
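The control flow of a stage method and its guard in run() can be sketched with a self-contained stand-in class. StageSketch is hypothetical: it keeps checkpoints in a dict instead of on disk, its helpers are simplified versions of the ones named above (for example, _save_image here takes no extra_headers), and the stage names are invented.

```python
class StageSketch:
    """Hypothetical stand-in for SampleDye with in-memory checkpoints."""

    def __init__(self):
        self._checkpoints = {}   # (stage, name) -> data
        self.log = []            # stages that actually did work

    # Simplified stand-ins for the real helper methods.
    def _has_checkpoint(self, stage, name):
        return (stage, name) in self._checkpoints

    def _load_image(self, stage, name):
        return self._checkpoints.get((stage, name))

    def _save_image(self, image, stage, name):
        self._checkpoints[(stage, name)] = image

    def _run_stage_1_input(self, force=False):
        if self._has_checkpoint(1, 'input') and not force:
            return
        self.log.append(1)
        self._save_image([1, 2, 3], 1, 'input')

    def _run_stage_2_mystage(self, force=False):
        # Skip when the checkpoint exists, unless forced.
        if self._has_checkpoint(2, 'mystage') and not force:
            return
        # Auto-run the missing upstream stage.
        if not self._has_checkpoint(1, 'input'):
            self._run_stage_1_input()
        # Load inputs, compute, save outputs.
        data = self._load_image(1, 'input')
        self.log.append(2)
        self._save_image([2 * x for x in data], 2, 'mystage')

    def run(self, from_stage=1, to_stage=2, force=False):
        # One guard block per stage, as in step 3.
        if from_stage <= 1 and to_stage >= 1:
            self._run_stage_1_input(force)
        if from_stage <= 2 and to_stage >= 2:
            self._run_stage_2_mystage(force)


s = StageSketch()
s.run(from_stage=2, to_stage=2)   # stage 1 is pulled in automatically
s.run()                           # everything cached, nothing reruns
print(s.log)
```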
Custom image loaders
If your raw data isn’t in the TIFF-with-PreStain/PostStain
naming convention exo2micro expects, you have two options:
Option 1: Write files in the expected format before running. Simplest — just rename or copy your files into the structure exo2micro wants.
Option 2: Replace ``load_image_pair``. It lives in
exo2micro.utils and returns a 4-tuple of
(post_im, pre_im, post_path, pre_path). Write your own
function with the same signature, then patch it:
import exo2micro.utils as u
import exo2micro.pipeline as p

def my_loader(sample, dye, raw_dir='...'):
    # your logic here
    return post, pre, post_path, pre_path

u.load_image_pair = my_loader
p.load_image_pair = my_loader
Ugly but effective. A cleaner alternative is to subclass
SampleDye and override _run_stage_1_padding.
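Why the patch touches both modules can be shown with two throwaway module objects. This sketch assumes pipeline.py imports the function by name (as in from .utils import load_image_pair), which the two-line patch suggests but this page does not state. The fake_utils and fake_pipeline modules are stand-ins, not the real package.

```python
import types

# Stand-in for exo2micro.utils.
utils = types.ModuleType('fake_utils')
exec("def load_image_pair():\n    return 'original'", utils.__dict__)

# Stand-in for exo2micro.pipeline. Copying the attribute mimics
# `from .utils import load_image_pair`, which binds a second name.
pipeline = types.ModuleType('fake_pipeline')
pipeline.load_image_pair = utils.load_image_pair
exec("def run():\n    return load_image_pair()", pipeline.__dict__)


def my_loader():
    return 'custom'


# Patching only the utils module leaves pipeline's own binding untouched.
utils.load_image_pair = my_loader
after_utils_only = pipeline.run()

# Patching the pipeline module as well makes run() see the new loader.
pipeline.load_image_pair = my_loader
after_both = pipeline.run()

print(after_utils_only, after_both)
```

The subclassing route avoids this issue entirely because it never rebinds module-level names.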
Accessing pipeline internals
The stage methods are prefixed with underscores
(_run_stage_1_padding, etc.) to signal that they’re internal
implementation details. However, they’re stable across minor
releases within a major version, and sometimes you need them.
Common things you might want:
- SampleDye._load_image(stage, name): load any checkpoint as a numpy array. Returns None if not found.
- SampleDye._save_image(image, stage, name, extra_headers=None): save an array as TIFF + FITS with the usual metadata.
- SampleDye._has_checkpoint(stage, name): check if a specific checkpoint file exists for the current parameters.
- SampleDye._tiff_path(stage, name) / SampleDye._fits_path(stage, name) / SampleDye._check_path(name): construct the full on-disk paths for checkpoints and check plots.
- SampleDye._results: transient dict of in-memory results from the current run. Keys include warp_matrix, debug_data, scale_estimate, scale_percentile_value. Not persisted across instances.
Integration with downstream analysis
A typical downstream analysis reads the difference image and segments the positive signal:
from astropy.io import fits
from skimage import morphology, measure

hdul = fits.open(
    'processed/CD070/SybrGld_microbe/fits/04_difference_difference.fits')
diff = hdul[0].data
scale = hdul[0].header['SCALE']

# Threshold at e.g. 5 sigma above background noise
bg_std = diff[diff != 0].std()
mask = diff > 5 * bg_std

# Clean up small artefacts
mask = morphology.remove_small_objects(mask, min_size=20)

# Label connected components
labels = measure.label(mask)
props = measure.regionprops(labels, intensity_image=diff)
for p in props:
    print(f"feature {p.label}: centroid {p.centroid} "
          f"area {p.area} mean intensity {p.mean_intensity:.1f}")
The FITS header carries the scale factor and all non-default
parameters, so this downstream script has full provenance
information for every feature it detects. For fully reproducible
pipelines, log hdul[0].header alongside your segmentation
results.
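One way to log the header next to the results is a small JSON sidecar file. The sketch below is an assumption about how that could look, not something exo2micro provides. save_provenance is a hypothetical helper, and it expects the header as a plain mapping, so converting hdul[0].header into a dict is assumed to happen before the call.

```python
import json


def save_provenance(header, features, path):
    """Write header values and per-feature results to one JSON file."""
    record = {
        'header': {str(key): header[key] for key in header.keys()},
        'features': features,
    }
    with open(path, 'w') as fh:
        # default=str is a fallback for values JSON cannot encode directly.
        json.dump(record, fh, indent=2, default=str)


# Illustrative values only; real ones come from the FITS header and
# from the regionprops loop.
example_header = {'SCALE': 1.5}
example_features = [{'label': 1, 'centroid': (10.0, 12.5), 'area': 20}]
save_provenance(example_header, example_features, 'provenance_example.json')
```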
Where to look in the code
The package is organised around modules with clear responsibilities:
- pipeline.py: the SampleDye class, parameter management, stage orchestration, file I/O routing.
- alignment.py: image registration. register_highorder is the main entry point and does boundary correlation + ICP in one call. refine_interior_ecc is the SIFT refinement.
- plotting.py: active visualization functions (diagnostics, zoom, registration plots).
- utils.py: file I/O, preprocessing, masking, channel detection. Most things in here are used by pipeline.py but are also exported for standalone use.
- defaults.py: parameter registry and filename conventions.
- parallel.py: batch processing (serial and parallel).
- gui.py: ipywidgets GUI, layered on top of SampleDye.
- legacy.py: deprecated functions kept for back-compat.
For autodoc-generated details on every public function, see API Reference.