CLI Reference: Preprocessing
The PrismToolBox CLI provides preprocessing capabilities for whole slide images through the ptb preprocessing command.
Overview
The preprocessing module includes two main commands:
contouring
: Extract tissue contours from whole slide images
patching
: Extract patches from slides using tissue contours
Installation
Make sure you have PrismToolBox installed before running these commands.
Global Options
All preprocessing commands support these global options:
--verbose, -v
: Increase verbosity (can be used multiple times: -v, -vv)
--help
: Show help message
Commands
ptb preprocessing contouring
Extract tissue contours from whole slide images.
Usage
Arguments
SLIDE_DIRECTORY
: Path to the directory containing the slide files
RESULTS_DIRECTORY
: Path to the directory where the results will be saved
Options
Option | Type | Description | Default |
---|---|---|---|
--engine | str | Engine for reading slides (openslide, tiffslide) | openslide |
--annotations-directory | str or None | Path to annotations directory | None |
--contours-exts | list[str] | File extensions for contour annotations (geojson, pickle) | [pickle] |
--config-file | str | Path to configuration file | None |
--visualize | bool | Visualize the extracted contours | False |
Configuration File
You can use a YAML configuration file to specify tissue extraction and visualization parameters:
# Default configuration for PrismToolBox contouring
# Tissue contour extraction parameters
contouring:
  seg_level: 2 # (int) Segmentation level for the tissue contour extraction.
  window_avg: 30 # (int) Size of the window average for tissue extraction.
  window_eng: 3 # (int) Size of the window to use for computing energy for tissue extraction.
  thresh: 120 # (int) Threshold for the tissue extraction algorithm.
  area_min: 6000 # (int) Minimum area for the tissue contour.
# Tissue visualization parameters
visualizing:
  vis_level: 2 # (int) Visualization level for the tissue contour extraction.
  number_contours: false # (bool) Plot the id number for each contour.
  line_thickness: 50 # (int) Line thickness for the contour visualization.
Examples
# Basic contour extraction
ptb preprocessing contouring slides/ results/
# With visualization
ptb preprocessing contouring slides/ results/ --visualize
# Using custom configuration
ptb preprocessing contouring slides/ results/ --config-file custom_config.yaml
# With annotations and multiple output formats
ptb preprocessing contouring slides/ results/ --annotations-directory annotations/ --contours-exts pickle geojson --visualize
ptb preprocessing patching
Extract patches from slides using tissue contours.
Usage
Arguments
SLIDE_DIRECTORY
: Path to the directory containing the slide files
RESULTS_DIRECTORY
: Path to the directory where the results will be saved
Options
Option | Type | Description | Default |
---|---|---|---|
--contours-directory | str or None | Path to directory containing contour annotations | None |
--engine | str | Engine for reading slides | openslide |
--mode | str | Extraction mode (contours, roi, all) | contours |
--patch-exts | list[str] | File extensions for patches (h5, geojson) | [h5] |
--config-file | str or None | Path to configuration file | None |
Configuration File
Example configuration for patch extraction:
# Default configuration for PrismToolBox patching
# Patch extraction parameters
patching:
  patch_level: 0 # (float) Level of the slide to extract patches from.
  patch_size: 256 # (float) Size of the patches to extract.
  overlap: 0 # (float) Overlap between the patches.
  units: ["px", "px"] # (str, str) Units for the patch size and overlap. Options are 'px' for pixels or 'micro' for micrometers.
  contours_mode: "four_pt" # (str) The mode to use for the contour checking. Possible values are center, four_pt, and four_pt_hard.
  rgb_threshs: [2, 240] # (int, int) The thresholds for the RGB channels (black threshold, white threshold).
  percentages: [0.6, 0.9] # (float, float) The percentages of pixels below/above the thresholds to consider the patch as black/white.
# Patch stitching parameters
stitching:
  vis_level: 2 # (int) Level of the slide to stitch the patches at.
  draw_grid: false # (bool) Whether to draw a grid on the stitched image.
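To make the rgb_threshs and percentages parameters more concrete, here is a minimal Python sketch of how a black/white patch filter of this kind could work. It is not taken from the PrismToolBox source: the function name classify_patch, the rule that all three channels must cross the threshold, and the strict comparisons are assumptions made for illustration only.

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]


def classify_patch(
    pixels: Sequence[Pixel],
    rgb_threshs: Tuple[int, int] = (2, 240),
    percentages: Tuple[float, float] = (0.6, 0.9),
) -> str:
    """Return "black", "white", or "tissue" for a flat list of RGB pixels.

    Hypothetical helper: a pixel counts as black when every channel is below
    the black threshold, and as white when every channel is above the white
    threshold (an assumption, not the documented behavior).
    """
    if not pixels:
        raise ValueError("patch contains no pixels")
    black_thresh, white_thresh = rgb_threshs
    black_pct, white_pct = percentages

    n_black = sum(1 for p in pixels if all(c < black_thresh for c in p))
    n_white = sum(1 for p in pixels if all(c > white_thresh for c in p))

    # Compare the fraction of black/white pixels with the configured limits.
    if n_black / len(pixels) > black_pct:
        return "black"
    if n_white / len(pixels) > white_pct:
        return "white"
    return "tissue"
```

Under these assumptions, a patch that is 95% pure white background would be labeled "white" with the default values, while a patch that is half background and half stained tissue would be kept.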
Examples
# Basic patch extraction
ptb preprocessing patching slides/ results/ --contours-directory results/contours/
# With custom configuration
ptb preprocessing patching slides/ results/ --contours-directory results/contours/ --config-file patch_config.yaml
# Extract patches in multiple formats
ptb preprocessing patching slides/ results/ --contours-directory results/contours/ --patch-exts h5 geojson
Complete Workflow Example
Here's a complete example of processing a dataset:
# Step 1: Extract tissue contours with visualization
ptb preprocessing contouring slides/ results/ --visualize --config-file tissue_config.yaml
# Step 2: Extract patches from the contours
ptb preprocessing patching slides/ results/ --contours-directory results/contours/ --config-file patch_config.yaml --patch-exts geojson
# Results will be saved in:
# - results/contours/ (tissue contours)
# - results/contoured_images/ (visualizations)
# - results/patches_256_ovelap_0/ (extracted patches)
# - results/stitched_images_256_ovelap_0/ (patch visualizations)
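The two steps above can also be driven from a script. The following is a sketch of a thin Python wrapper that assembles the same command lines and runs them in order; the helper names (contouring_cmd, patching_cmd, run_workflow) are hypothetical, and only flags documented on this page are used. It assumes the ptb executable is available on the PATH.

```python
import subprocess
from typing import List, Optional, Sequence


def contouring_cmd(
    slides: str, results: str, config_file: Optional[str] = None, visualize: bool = False
) -> List[str]:
    """Build the argument list for `ptb preprocessing contouring`."""
    cmd = ["ptb", "preprocessing", "contouring", slides, results]
    if config_file:
        cmd += ["--config-file", config_file]
    if visualize:
        cmd.append("--visualize")
    return cmd


def patching_cmd(
    slides: str,
    results: str,
    contours_directory: str,
    config_file: Optional[str] = None,
    patch_exts: Sequence[str] = (),
) -> List[str]:
    """Build the argument list for `ptb preprocessing patching`."""
    cmd = ["ptb", "preprocessing", "patching", slides, results,
           "--contours-directory", contours_directory]
    if config_file:
        cmd += ["--config-file", config_file]
    if patch_exts:
        cmd += ["--patch-exts", *patch_exts]
    return cmd


def run_workflow(slides: str, results: str) -> None:
    """Run contouring, then patching; stop at the first failing step."""
    steps = [
        contouring_cmd(slides, results, "tissue_config.yaml", visualize=True),
        patching_cmd(slides, results, f"{results.rstrip('/')}/contours/",
                     "patch_config.yaml", ["geojson"]),
    ]
    for cmd in steps:
        # check=True raises CalledProcessError if ptb exits with a non-zero status.
        subprocess.run(cmd, check=True)
```

Calling run_workflow("slides/", "results/") would issue the same two commands as the shell example. Passing the arguments as a list rather than a single string avoids shell quoting problems with paths that contain spaces.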
Error Handling
Common issues and solutions:
Missing Dependencies
Error: Segmentation features require additional dependencies.
Please install with: pip install prismtoolbox[seg]
Solution: Install the required dependencies with pip install prismtoolbox[seg].
Configuration File Issues
Solution: Ensure your configuration file contains all required parameters for each section.
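One way to catch an incomplete configuration before launching a long run is to compare it against the expected keys. The sketch below works on an already-loaded configuration dictionary (for instance, the result of parsing the YAML file with a YAML library). The helper name missing_parameters is hypothetical, and the set of required keys is an assumption: it simply mirrors the default contouring configuration shown earlier on this page.

```python
from typing import Dict, List, Mapping, Optional, Set

# Assumed to be required: the keys of the default contouring configuration.
REQUIRED_KEYS: Dict[str, Set[str]] = {
    "contouring": {"seg_level", "window_avg", "window_eng", "thresh", "area_min"},
    "visualizing": {"vis_level", "number_contours", "line_thickness"},
}


def missing_parameters(
    config: Mapping[str, Optional[Mapping[str, object]]],
    required: Mapping[str, Set[str]] = REQUIRED_KEYS,
) -> Dict[str, List[str]]:
    """Return the missing keys per section; an empty dict means nothing is missing."""
    report: Dict[str, List[str]] = {}
    for section, keys in required.items():
        # A section that is absent or empty is treated as having no keys at all.
        present = set(config.get(section) or {})
        missing = sorted(keys - present)
        if missing:
            report[section] = missing
    return report
```

For a configuration whose contouring section only sets seg_level and thresh, the report would list area_min, window_avg, and window_eng under "contouring".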
File Path Issues
Solution: Check that your configuration file path is correct and the file exists.
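A quick pre-flight check can surface path mistakes before the CLI is invoked. This is a minimal sketch using only the Python standard library; the helper name path_problems is hypothetical and not part of PrismToolBox.

```python
from pathlib import Path
from typing import List, Optional


def path_problems(slide_directory: str, config_file: Optional[str] = None) -> List[str]:
    """Return human-readable problems with the given paths (empty list if none)."""
    problems: List[str] = []
    if not Path(slide_directory).is_dir():
        problems.append(f"slide directory not found: {slide_directory}")
    # The configuration file is optional, so it is only checked when provided.
    if config_file is not None and not Path(config_file).is_file():
        problems.append(f"configuration file not found: {config_file}")
    return problems
```

Printing the returned list before calling ptb preprocessing gives a clear message for each missing path instead of a failure partway through processing.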
Tips and Best Practices
- Start with visualization: Use the --visualize flag to check that tissue detection works correctly
- Test with small datasets: Process a few slides first to validate your parameters
- Use configuration files: Store your parameters in YAML files for reproducibility
- Monitor output: Use verbose mode (-v or -vv) to see detailed processing information