sopa.io
Notes
Because vendors frequently update their data formats, you might have issues loading your data. In this case, consider opening an issue detailing the version of the machine you used, the error log, and an example of the file names you are trying to read.
Related to spatialdata-io
A library called spatialdata-io already contains a lot of readers. Here, we updated some readers already existing in spatialdata-io, and added a few others. In the future, we will completely rely on spatialdata-io.
Readers
sopa.io.xenium(path, image_models_kwargs=None, imread_kwargs=None, **kwargs)
Read Xenium data as a SpatialData object. For more information, refer to spatialdata-io.
This function reads the following files:
- transcripts.parquet: transcripts locations and names
- experiment.xenium: metadata file
- morphology_focus.ome.tif: morphology image (or a directory, for recent versions of the Xenium)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str \| Path | Path to the Xenium directory containing all the experiment files | required |
image_models_kwargs | dict \| None | Keyword arguments passed to | None |
imread_kwargs | dict \| None | Keyword arguments passed to | None |
Returns:
Type | Description |
---|---|
SpatialData | A |
Source code in sopa/io/reader/xenium.py
sopa.io.merscope(path, backend=None, z_layers=3, region_name=None, slide_name=None, image_models_kwargs=None, imread_kwargs=None, **kwargs)
Read MERSCOPE data as a SpatialData object. For more information, refer to spatialdata-io.
This function reads the following files:
- detected_transcripts.csv: transcripts locations and names
- all the images under the images directory
- images/micron_to_mosaic_pixel_transform.csv: affine transformation
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str \| Path | Path to the MERSCOPE directory containing all the experiment files | required |
backend | Literal['dask_image', 'rioxarray'] \| None | Either | None |
z_layers | int \| list[int] \| None | Indices of the z-layers to consider. Either one | 3 |
region_name | str \| None | Name of the region of interest, e.g., | None |
slide_name | str \| None | Name of the slide/run. If | None |
image_models_kwargs | dict \| None | Keyword arguments passed to | None |
imread_kwargs | dict \| None | Keyword arguments passed to | None |
Returns:
Type | Description |
---|---|
SpatialData | A |
Source code in sopa/io/reader/merscope.py
sopa.io.cosmx(path, dataset_id=None, fov=None, read_proteins=False, image_models_kwargs=None, imread_kwargs=None)
Read Cosmx Nanostring data. The fields of view are stitched together, except if fov is provided.
This function reads the following files:
- *_fov_positions_file.csv or *_fov_positions_file.csv.gz: FOV locations
- Morphology2D directory: all the FOVs morphology images
- Morphology_ChannelID_Dictionary.txt: Morphology channels names
- *_tx_file.csv.gz or *_tx_file.csv: Transcripts location and names
- If read_proteins is True, all the images under the nested ProteinImages directories will be read
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str \| Path | Path to the root directory containing Nanostring files. | required |
dataset_id | Optional[str] | Optional name of the dataset (needs to be provided if it cannot be inferred). | None |
fov | int \| str \| None | Name or number of one single field of view to be read. If a string is provided, an example of correct syntax is "F008". By default, reads all FOVs. | None |
read_proteins | bool | Whether to read the proteins or the transcripts. | False |
image_models_kwargs | dict \| None | Keyword arguments passed to | None |
imread_kwargs | dict \| None | Keyword arguments passed to | None |
Returns:
Type | Description |
---|---|
SpatialData | A |
Source code in sopa/io/reader/cosmx.py
sopa.io.macsima(path, **kwargs)
Read MACSIMA data as a SpatialData object
Notes
For all duplicated names, their index is added in brackets after the name; for instance, you may find DAPI (1).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | Path | Path to the directory containing the MACSIMA | required |
kwargs | int | Kwargs for the | {} |
Returns:
Type | Description |
---|---|
SpatialData | A |
Source code in sopa/io/reader/macsima.py
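The renaming of duplicated channel names described in the note can be illustrated with a small standalone sketch. It is not the code used by sopa: the function name `deduplicate_names` is hypothetical, and it assumes that the first occurrence of a name is left unchanged while later occurrences receive a bracketed index starting at 1.

```python
from collections import Counter


def deduplicate_names(names: list[str]) -> list[str]:
    """Append ' (k)' to the k-th repeat of a name (assumed convention)."""
    seen: Counter[str] = Counter()  # how many times each name was met so far
    result = []
    for name in names:
        if seen[name] > 0:
            # Repeated name: suffix it with its repeat index in brackets
            result.append(f"{name} ({seen[name]})")
        else:
            # First occurrence: kept as-is (assumption)
            result.append(name)
        seen[name] += 1
    return result


channels = deduplicate_names(["DAPI", "CD3", "DAPI", "DAPI"])
```

Under these assumptions, a second DAPI channel would be exposed as DAPI (1), which matches the example given in the note.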
sopa.io.phenocycler(path, channels_renaming=None, image_models_kwargs=None)
Read Phenocycler data as a SpatialData object
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str \| Path | Path to a | required |
channels_renaming | dict \| None | A dictionary whose keys are channel names and whose values are their new names. Not all channels need to be renamed. | None |
image_models_kwargs | dict \| None | Keyword arguments passed to | None |
Returns:
Type | Description |
---|---|
SpatialData | A |
Source code in sopa/io/reader/phenocycler.py
sopa.io.hyperion(path, image_models_kwargs=None, imread_kwargs=None)
Read Hyperion data as a SpatialData object
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | Path | Path to the directory containing the Hyperion | required |
image_models_kwargs | dict \| None | Keyword arguments passed to | None |
imread_kwargs | dict \| None | Keyword arguments passed to | None |
Returns:
Type | Description |
---|---|
SpatialData | A |
Source code in sopa/io/reader/hyperion.py
sopa.io.aicsimageio(path, z_stack=0, image_models_kwargs=None, aics_kwargs=None)
Read an image using AICSImageIO. It supports special formats such as ND2, CZI, LIF, or DV.
Extra dependencies
To use this reader, you'll need the aicsimageio dependency (pip install aicsimageio). To read .czi images, you'll also need to install aicspylibczi (for instance pip install aicspylibczi).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | Path | Path to the image file | required |
z_stack | int | (Only for 3D images) Index of the stack in the z-axis to use. | 0 |
image_models_kwargs | dict \| None | Keyword arguments passed to | None |
aics_kwargs | dict \| None | Keyword arguments passed to | None |
Returns:
Type | Description |
---|---|
SpatialData | A |
Source code in sopa/io/reader/aics.py
sopa.io.ome_tif(path, as_image=False)
Read an .ome.tif image. This image should be a 2D image (with possibly multiple channels).
Typically, this function can be used to open Xenium IF images.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | Path | Path to the | required |
as_image | bool | If | False |
Returns:
Type | Description |
---|---|
DataArray \| SpatialData | A |
Source code in sopa/io/reader/utils.py
sopa.io.wsi(path, chunks=(3, 256, 256), as_image=False, backend='tiffslide')
Read a WSI into a SpatialData object
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str \| Path | Path to the WSI | required |
chunks | tuple[int, int, int] | Tuple representing the chunksize for the dimensions | (3, 256, 256) |
as_image | bool | If | False |
backend | str | The library to use as a backend in order to load the WSI. One of: | 'tiffslide' |
Returns:
Type | Description |
---|---|
SpatialData \| DataTree | A |
Source code in sopa/io/reader/wsi.py
sopa.io.uniform(*_, length=2048, cell_density=0.0001, n_points_per_cell=100, c_coords=['DAPI', 'CK', 'CD3', 'CD20'], genes=['EPCAM', 'CD3E', 'CD20', 'CXCL4', 'CXCL10'], sigma_factor=0.05, pixel_size=0.1, seed=0, include_vertices=False, include_image=True, apply_blur=True, as_output=False, transcript_cell_id_as_merscope=False)
Generate a dummy dataset composed of cells generated uniformly in a square. It also has transcripts.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
length | int | Size of the square, in pixels | 2048 |
cell_density | float | Density of cells per pixel^2 | 0.0001 |
n_points_per_cell | int | Mean number of transcripts per cell | 100 |
c_coords | list[str] | Channel names | ['DAPI', 'CK', 'CD3', 'CD20'] |
genes | int \| list[str] | Number of different genes, or list of gene names | ['EPCAM', 'CD3E', 'CD20', 'CXCL4', 'CXCL10'] |
sigma_factor | float | Factor used to determine | 0.05 |
pixel_size | float | Number of microns in one pixel. | 0.1 |
seed | int | Numpy random seed | 0 |
include_vertices | bool | Whether to include the vertices of the cells (as points) in the spatialdata object | False |
include_image | bool | Whether to include the image in the spatialdata object | True |
apply_blur | bool | Whether to apply gaussian blur on the image (without blur, cells are just one pixel) | True |
as_output | bool | If | False |
Returns:
Type | Description |
---|---|
SpatialData | A SpatialData object with a 2D image ( |
Source code in sopa/utils/data.py
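To get a feel for the default values, the arithmetic below estimates the size of the generated dataset. It is only a back-of-the-envelope sketch: it assumes that the expected number of cells is the density multiplied by the area in pixels, which the parameter descriptions suggest but do not state, and the helper name `expected_sizes` is hypothetical.

```python
def expected_sizes(length=2048, cell_density=1e-4, n_points_per_cell=100, pixel_size=0.1):
    """Rough expected sizes for a square dataset (assumed: cells = density * area)."""
    area_px = length**2  # area of the square, in pixel^2
    n_cells = cell_density * area_px  # expected number of cells
    n_transcripts = n_cells * n_points_per_cell  # expected number of transcripts
    side_um = length * pixel_size  # side of the square, in microns
    return n_cells, n_transcripts, side_um


n_cells, n_transcripts, side_um = expected_sizes()
print(f"~{n_cells:.0f} cells, ~{n_transcripts:.0f} transcripts, {side_um:.1f} microns per side")
```

With the defaults, this gives roughly 419 cells and 42,000 transcripts on a square of about 205 microns per side.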
sopa.io.blobs(*_, length=1024, n_points=10000, c_coords=['DAPI', 'CK', 'CD3', 'CD20'], **kwargs)
Adapts the blobs dataset from SpatialData for sopa. Please refer to the SpatialData documentation
Source code in sopa/utils/data.py
Xenium Explorer
sopa.io.write(path, sdata, image_key=None, shapes_key=None, points_key=None, gene_column=None, pixel_size=0.2125, layer=None, polygon_max_vertices=13, lazy=True, ram_threshold_gb=4, mode=None, save_h5ad=False)
Transform a SpatialData object into inputs for the Xenium Explorer.
After running this function, double-click on the experiment.xenium file to open it.
Software download
Make sure you have the latest version of the Xenium Explorer
Note
This function will create up to 7 files, depending on the SpatialData object and the arguments:
- experiment.xenium contains some experiment metadata. Double-click on this file to open the Xenium Explorer. This file can also be created with write_metadata.
- morphology.ome.tif is the primary image. This file can also be created with write_image. Add more images with align.
- analysis.zarr.zip contains the cells categories (or clusters), i.e. adata.obs. This file can also be created with write_cell_categories.
- cell_feature_matrix.zarr.zip contains the cell-by-gene counts. This file can also be created with write_gene_counts.
- cells.zarr.zip contains the cells polygon boundaries. This file can also be created with write_polygons.
- transcripts.zarr.zip contains transcripts locations. This file can also be created with write_transcripts.
- adata.h5ad is the AnnData object from the SpatialData. This is not used by the Explorer, but only saved for convenience.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | Path to the directory where files will be saved. | required |
sdata | SpatialData | SpatialData object. | required |
image_key | str \| None | Name of the image of interest (key of | None |
shapes_key | str \| None | Name of the cell shapes (key of | None |
points_key | str \| None | Name of the transcripts (key of | None |
gene_column | str \| None | Column name of the points dataframe containing the gene names. | None |
pixel_size | float | Number of microns in a pixel. An invalid value can lead to inconsistent scales in the Explorer. | 0.2125 |
layer | str \| None | Layer of the AnnData table where the gene counts are saved. If | None |
polygon_max_vertices | int | Maximum number of vertices for the cell polygons. | 13 |
lazy | bool | If | True |
ram_threshold_gb | int \| None | Threshold (in gigabytes) from which image can be loaded in memory. If | 4 |
mode | str | String that indicates which files should be created. "-ib" means everything except images and boundaries, while "+tocm" means only transcripts/observations/counts/metadata (each letter corresponds to one explorer file). By default, keeps everything. | None |
save_h5ad | bool | Whether to save the adata as h5ad in the explorer directory (for convenience only, since h5ad is faster to open than the original .zarr table) | False |
Source code in sopa/io/explorer/converter.py
sopa.io.align(sdata, image, transformation_matrix_path, image_key=None, overwrite=False)
Add an image to the SpatialData object after alignment with the Xenium Explorer.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sdata | SpatialData | A | required |
image | DataArray | A | required |
transformation_matrix_path | str | Path to the | required |
image_key | str | Optional name of the image on which it has been aligned. Required if multiple images in the | None |
overwrite | bool | Whether to overwrite the image, if already existing. | False |
Source code in sopa/io/explorer/images.py
sopa.io.add_explorer_selection(sdata, path, shapes_key, image_key=None, pixel_size=0.2125)
After saving a selection in the Xenium Explorer, this function adds all the selected polygons to sdata.shapes[shapes_key].
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sdata | SpatialData | A | required |
path | str | The path to the | required |
shapes_key | str | The name to provide to the shapes | required |
image_key | str \| None | The original image name | None |
pixel_size | float | Number of microns in a pixel. It must be the same value as the one used in | 0.2125 |
Source code in sopa/io/explorer/utils.py
sopa.io.int_cell_id(explorer_cell_id)
Transforms an alphabetical cell id from the Xenium Explorer to an integer ID
E.g., int_cell_id('aaaachba-1') = 10000
Source code in sopa/io/explorer/utils.py
sopa.io.str_cell_id(cell_id)
Transforms an integer cell ID into a Xenium Explorer alphabetical cell ID
E.g., str_cell_id(10000) = 'aaaachba-1'
Source code in sopa/io/explorer/utils.py
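The two examples above are consistent with a scheme in which the integer is written as eight hexadecimal digits, each digit 0 to f is shifted to a letter a to p, and a "-1" suffix is appended. The sketch below implements that scheme as a standalone illustration. It is an assumption derived from the single documented example, not sopa's source code, and the `_sketch` function names are hypothetical.

```python
def str_cell_id_sketch(cell_id: int, suffix: int = 1) -> str:
    """Encode an integer as 8 letters (hex digits 0-f mapped to a-p) plus a suffix."""
    if not 0 <= cell_id < 16**8:
        raise ValueError("cell_id must fit in 8 hexadecimal digits")
    hex_digits = f"{cell_id:08x}"  # e.g. 10000 -> '00002710'
    letters = "".join(chr(ord("a") + int(digit, 16)) for digit in hex_digits)
    return f"{letters}-{suffix}"


def int_cell_id_sketch(explorer_cell_id: str) -> int:
    """Decode the alphabetical part of the ID back to an integer."""
    letters = explorer_cell_id.split("-")[0]  # drop the '-1' suffix
    hex_digits = "".join(f"{ord(letter) - ord('a'):x}" for letter in letters)
    return int(hex_digits, 16)


print(str_cell_id_sketch(10000))
print(int_cell_id_sketch("aaaachba-1"))
```

For the documented pair, 10000 is 0x2710 in hexadecimal, which maps to the letters c, h, b, a after four leading a's.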
sopa.io.write_image(path, image, lazy=True, tile_width=TILE_SIZE, n_subscales=5, pixel_size=0.2125, ram_threshold_gb=4, is_dir=True)
Convert an image into a morphology.ome.tif file that can be read by the Xenium Explorer
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | Path to the Xenium Explorer directory where the image will be written | required |
image | DataTree \| DataArray \| ndarray | Image of shape | required |
lazy | bool | If | True |
tile_width | int | Xenium tile width (do not update). | TILE_SIZE |
n_subscales | int | Number of sub-scales in the pyramidal image. | 5 |
pixel_size | float | Xenium pixel size (do not update). | 0.2125 |
ram_threshold_gb | int \| None | If an image (of any level of the pyramid) is below this threshold, it will be loaded in-memory. | 4 |
is_dir | bool | If | True |
Source code in sopa/io/explorer/images.py
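The effect of ram_threshold_gb on a pyramidal image can be estimated by hand. The sketch below is a hypothetical illustration, not sopa's logic: it assumes that the size of a level is the number of elements times the bytes per element, that one gigabyte is 1024^3 bytes, and that a level is loaded in memory when its size is strictly below the threshold. The function names are made up for the example.

```python
import math

BYTES_PER_GB = 1024**3  # assumption: binary gigabytes


def image_size_gb(shape: tuple[int, ...], itemsize: int) -> float:
    """Uncompressed size of an array with this shape, in gigabytes."""
    return math.prod(shape) * itemsize / BYTES_PER_GB


def fits_in_memory(shape: tuple[int, ...], itemsize: int, ram_threshold_gb: float) -> bool:
    """True if the array is below the threshold (assumed comparison)."""
    return image_size_gb(shape, itemsize) < ram_threshold_gb


# A 4-channel uint8 image (1 byte per element) and its first sub-scale
full_level = (4, 40_000, 40_000)
half_level = (4, 20_000, 20_000)

for shape in (full_level, half_level):
    print(shape, round(image_size_gb(shape, 1), 2), fits_in_memory(shape, 1, 4))
```

In this example, the full-resolution level (about 5.96 GB) stays above the default threshold of 4, while the first sub-scale (about 1.49 GB) falls below it.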
sopa.io.save_column_csv(path, adata, key)
Save one column of the AnnData object as a CSV that can be opened interactively in the explorer, under the "cell" panel.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | Path where to write the CSV that will be opened in the Xenium Explorer | required |
adata | AnnData | An | required |
key | str | Key of | required |
Source code in sopa/io/explorer/table.py
Report
sopa.io.write_report(path, sdata)
Create an HTML report (or web report) after running Sopa.
Note
This report is automatically generated based on a custom python-to-html engine
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | Path to the | required |
sdata | SpatialData | A | required |