Visium HD tutorial¶

We can run Sopa on Visium HD, as the 2 micron bins are subcellular. You can follow the "normal" API tutorial, or continue below to get exemples more specific to Visium HD data.

For this tutorial, we use the mouse small intestine public dataset from 10X Genomics.

In [ ]:

Copied!

import spatialdata

import sopa
import spatialdata

import sopa

Reading the data¶

In [ ]:

Copied!

sdata = sopa.io.visium_hd("data/visium_hd/Visium_HD_Mouse_Small_Intestine")
sdata
sdata = sopa.io.visium_hd("data/visium_hd/Visium_HD_Mouse_Small_Intestine")
sdata

Then, we save it on-disk:

In [ ]:

Copied!

sdata.write("mouse_small_intestine.zarr")  # save it

sdata = spatialdata.read_zarr("mouse_small_intestine.zarr")  # open-it back
sdata.write("mouse_small_intestine.zarr")  # save it

sdata = spatialdata.read_zarr("mouse_small_intestine.zarr")  # open-it back

Usual pipeline¶

Now, we run Sopa as usual. Although, since we don't have transcripts, we can't run Baysor. Therefore, we will run Stardist on the H&E image.

First, we create the patches for the cell segmentation.

In [5]:

Copied!

sopa.make_image_patches(sdata)
sopa.make_image_patches(sdata)

[INFO] (sopa.patches._patches) Added 156 patche(s) to sdata['image_patches']

Now we can run stardist on the H&E image.

In [6]:

Copied!

sopa.segmentation.stardist(sdata, min_area=30)
sopa.segmentation.stardist(sdata, min_area=30)

[WARNING] (sopa._settings) Running without parallelization backend can be slow. Consider using a backend, e.g. via `sopa.settings.parallelization_backend = 'dask'`, or `export SOPA_PARALLELIZATION_BACKEND=dask`.
  3%|▎         | 4/156 [00:34<22:03,  8.71s/it]

WARNING:tensorflow:5 out of the last 5 calls to <function TensorFlowTrainer.make_predict_function.<locals>.one_step_on_data_distributed at 0x5638ebb50> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has reduce_retracing=True option that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.

  3%|▎         | 5/156 [00:36<16:07,  6.40s/it]

WARNING:tensorflow:6 out of the last 6 calls to <function TensorFlowTrainer.make_predict_function.<locals>.one_step_on_data_distributed at 0x563ab4550> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has reduce_retracing=True option that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.

100%|██████████| 156/156 [16:27<00:00,  6.33s/it]
[INFO] (sopa.segmentation._stainings) Found 430536 total cells
Resolving conflicts: 100%|██████████| 63782/63782 [00:23<00:00, 2714.30it/s]
[INFO] (sopa.segmentation._stainings) Added 410196 cell boundaries in sdata['stardist_boundaries']

Aggregation¶

Now, we need to run sopa.aggregate to aggregate the bins into the cells. For each cell, we expand their radius, and then Sopa will sum the transcript counts of all 2-microns-bins touching or included within the cell.

Here, expand_radius_ratio = 1, which means that the cells will be expanded a value of 1 * mean_radius before aggregating the means. You can choose any positive float value.

There is an argument bins_key, but by default Sopa will understand that it's Visium HD data and that it should use the 2-microns bins. Also, on the example below, we only aggregate the bins, not the H&E channels.

In [9]:

Copied!

sopa.aggregate(sdata, aggregate_channels=False, expand_radius_ratio=1)
sopa.aggregate(sdata, aggregate_channels=False, expand_radius_ratio=1)

aligned_df.py (68): Transforming to str index.
_elements.py (104): Key `stardist_boundaries` already exists. Overwriting it in-memory.

Single-cell table¶

Now, we have an AnnData object with the gene expression per cell.

In [10]:

Copied!

adata = sdata["table"]
adata
adata = sdata["table"]
adata

Out[10]:

AnnData object with n_obs × n_vars = 408458 × 19059
    obs: 'region', 'slide', 'cell_id', 'area'
    var: 'gene_ids', 'feature_types', 'genome'
    uns: 'sopa_attrs', 'spatialdata_attrs'
    obsm: 'spatial'

For instance, we can now use Scanpy to plot gene expression.

In [11]:

Copied!

import scanpy as sc
import scanpy as sc

In [12]:

Copied!





# basic preprocessing
sc.pp.filter_genes(adata, min_cells=10)
sc.pp.filter_cells(adata, min_counts=20)
sc.pp.normalize_total(adata)
sc.pp.log1p(adata)
# basic preprocessing
sc.pp.filter_genes(adata, min_cells=10)
sc.pp.filter_cells(adata, min_counts=20)
sc.pp.normalize_total(adata)
sc.pp.log1p(adata)

We can then use sc.pl.spatial to show the gene expression per cells. Note that, here, we show cells, not bins.

In [13]:

Copied!

sc.pl.spatial(adata, color="Tpm2", spot_size=80, vmax="p98")
sc.pl.spatial(adata, color="Tpm2", spot_size=80, vmax="p98")

No description has been provided for this image

Bins visualization¶

The 2-micron bins are arranged in a grid, so they can be visualized as an image of G channels, where G is the number of genes.

Creating the image would be massive, so we need to create it lazily. This can be done with spatialdata.rasterize_bins.

In [ ]:

Copied!





sdata["square_002um"].X = sdata["square_002um"].X.tocsc()  # optimisation with the csc format

lazy_bins_image = spatialdata.rasterize_bins(
    sdata,
    bins="Visium_HD_Mouse_Small_Intestine_square_002um",  # key of the bins shapes
    table_name="square_002um",  # key of the table with the bins gene expression
    row_key="array_row",
    col_key="array_col",
)
sdata["square_002um"].X = sdata["square_002um"].X.tocsc()  # optimisation with the csc format

lazy_bins_image = spatialdata.rasterize_bins(
    sdata,
    bins="Visium_HD_Mouse_Small_Intestine_square_002um",  # key of the bins shapes
    table_name="square_002um",  # key of the table with the bins gene expression
    row_key="array_row",
    col_key="array_col",
)

Note that lazy_bins_image is an image of size (19059, 690, 690), that is G=19059 genes, and 690x690 bins. This would correspond to a 33.80GB image in memory, if it wasn't lazy.

In [ ]:

Copied!

lazy_bins_image
lazy_bins_image

We can save this image in the sdata object.

In [13]:

Copied!

sdata["gene_expression_2_um"] = lazy_bins_image
sdata["gene_expression_2_um"] = lazy_bins_image

Then, we can visualize this image with Napari. When showing a gene, it will compute the corresponding layer of the lazy image, and it will be displayed in milliseconds, i.e. looking instantaneous.

In [ ]:

Copied!

from napari_spatialdata import Interactive

Interactive(sdata)
from napari_spatialdata import Interactive

Interactive(sdata)

You'll be able to display cells and the bins expression. It should look like that:

napari_visium_hd

Xenium Explorer¶

Although the Xenium Explorer can be used (as below), it will not display the bins. If you want to see the bins, use Napari as detailed above.

In [15]:

Copied!

sopa.io.explorer.write("mouse_small_intestine.explorer", sdata)
sopa.io.explorer.write("mouse_small_intestine.explorer", sdata)

[INFO] (sopa.io.explorer.table) Writing table with 18166 columns
[INFO] (sopa.io.explorer.table) Writing 2 cell categories: region, slide
[INFO] (sopa.io.explorer.shapes) Writing 332556 cell polygons
[INFO] (sopa.io.explorer.images) Writing multiscale image with procedure=semi-lazy (load in memory when possible)
[INFO] (sopa.io.explorer.images)    (Loading image of shape (3, 21943, 23618)) in memory
[INFO] (sopa.io.explorer.images)    > Image of shape (3, 21943, 23618)
[INFO] (sopa.io.explorer.images)    > Image of shape (3, 10971, 11809)
[INFO] (sopa.io.explorer.images)    > Image of shape (3, 5485, 5904)
[INFO] (sopa.io.explorer.images)    > Image of shape (3, 2742, 2952)
[INFO] (sopa.io.explorer.images)    > Image of shape (3, 1371, 1476)
[INFO] (sopa.io.explorer.images)    > Image of shape (3, 685, 738)
[INFO] (sopa.io.explorer.converter) Saved files in the following directory: mouse_small_intestine.explorer
[INFO] (sopa.io.explorer.converter) You can open the experiment with 'open mouse_small_intestine.explorer/experiment.xenium'