Aggregation
Recommendation
We recommend using the `sopa.aggregate` function below, which is a wrapper for all types of aggregation. Internally, it uses `aggregate_channels`, `count_transcripts`, and/or `aggregate_bins`, which are also documented below.
sopa.aggregate(sdata, aggregate_genes=None, aggregate_channels=True, image_key=None, points_key=None, gene_column=None, shapes_key=None, bins_key=None, expand_radius_ratio=None, min_transcripts=0, min_intensity_ratio=0.1, key_added='table')
Aggregate gene counts and/or channel intensities over a `SpatialData` object to create an `AnnData` table (saved in `sdata["table"]`).
Info
The main arguments are `sdata`, `aggregate_genes`, and `aggregate_channels`. The remaining arguments are optional and will be inferred from the data if not provided.
- If channels are aggregated and not genes, then `sdata['table'].X` will contain the mean channel intensities per cell.
- If genes are aggregated and not channels, then `sdata['table'].X` will contain the gene counts per cell.
- If both genes and channels are aggregated, then `sdata['table'].X` will contain the gene counts per cell and `sdata['table'].obsm['intensities']` will contain the mean channel intensities per cell.
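The three cases above can be summarized as a small lookup. This is only a sketch of the documented outcome, not sopa's code: `table_layout` is a hypothetical helper, and the error raised when both flags are off is an assumption.

```python
def table_layout(aggregate_genes: bool, aggregate_channels: bool) -> dict:
    """Describe where each result is stored, following the three documented cases.

    Hypothetical helper for illustration; it is not part of sopa.
    """
    if not (aggregate_genes or aggregate_channels):
        # Assumption: aggregating nothing is treated as an error.
        raise ValueError("Nothing to aggregate: enable genes and/or channels.")

    layout = {}
    if aggregate_genes:
        # Gene counts always take the main matrix when genes are aggregated.
        layout["X"] = "gene counts per cell"
        if aggregate_channels:
            # Intensities are moved to obsm when both are requested.
            layout["obsm['intensities']"] = "mean channel intensities per cell"
    else:
        layout["X"] = "mean channel intensities per cell"
    return layout


for flags in [(False, True), (True, False), (True, True)]:
    print(flags, table_layout(*flags))
```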
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`sdata` | `SpatialData` | A `SpatialData` object | required |
`aggregate_genes` | `bool \| None` | Whether to aggregate gene counts. If None, it will be inferred from the data. | `None` |
`aggregate_channels` | `bool` | Whether to aggregate channel intensities inside cells. | `True` |
`image_key` | `str \| None` | Key of | `None` |
`points_key` | `str \| None` | Key of | `None` |
`gene_column` | `str \| None` | Key of | `None` |
`shapes_key` | `str \| None` | Key of | `None` |
`bins_key` | `str \| None` | Key of | `None` |
`expand_radius_ratio` | `float \| None` | Ratio by which to expand the cell polygons for channel averaging. For instance, a ratio of 0.5 expands the shape radius by 50%. If | `None` |
`min_transcripts` | `int` | Minimum number of transcripts required to keep a cell. | `0` |
`min_intensity_ratio` | `float` | Minimum ratio of the 90th percentile of the mean channel intensity required to keep a cell. | `0.1` |
`key_added` | `str \| None` | Key to save the table in | `'table'` |
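The two filtering parameters can be illustrated with a small sketch. The exact rule used by sopa is not spelled out here, so this is one plausible reading, stated as an assumption: a cell is kept if it has at least `min_transcripts` transcripts and if its mean intensity (averaged over channels) is at least `min_intensity_ratio` times the 90th percentile of that value across all cells. The helpers `quantile` and `cells_to_keep` are hypothetical.

```python
def quantile(values, q):
    """Linearly interpolated quantile of a non-empty list, with 0 <= q <= 1."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def cells_to_keep(transcript_counts, mean_intensities,
                  min_transcripts=0, min_intensity_ratio=0.1):
    """Return one boolean per cell (True = keep).

    Assumed rule, not taken from sopa's source: the intensity threshold is
    min_intensity_ratio times the 90th percentile of the per-cell means.
    """
    threshold = min_intensity_ratio * quantile(mean_intensities, 0.9)
    return [
        count >= min_transcripts and intensity >= threshold
        for count, intensity in zip(transcript_counts, mean_intensities)
    ]


counts = [3, 0, 12, 30]                 # transcripts per cell
intensities = [0.5, 10.0, 20.0, 40.0]   # mean intensity per cell
keep = cells_to_keep(counts, intensities, min_transcripts=1)
print(keep)
```

With the default `min_transcripts=0`, the transcript criterion never removes a cell, so only the intensity criterion applies.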
Source code in sopa/aggregation/aggregation.py
sopa.aggregation.aggregate_channels(sdata, image_key=None, shapes_key=None, expand_radius_ratio=0, mode='average')
Aggregate the channel intensities per cell (either `"average"`, or take the `"min"` / `"max"`).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`sdata` | `SpatialData` | A `SpatialData` object | required |
`image_key` | `str \| None` | Key of | `None` |
`shapes_key` | `str \| None` | Key of | `None` |
`expand_radius_ratio` | `float` | Cell polygons will be expanded by | `0` |
`mode` | `str` | Aggregation mode. One of `"average"`, `"min"`, `"max"`. | `'average'` |
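To give a feel for what a ratio-based expansion means, the sketch below converts `expand_radius_ratio` into a buffer distance. It assumes that the "radius" of a cell is the radius of the circle with the same area as the polygon; that definition is an assumption for illustration, and `polygon_area` and `expansion_distance` are hypothetical helpers, not sopa functions.

```python
import math


def polygon_area(vertices):
    """Area of a simple polygon given as [(x, y), ...], via the shoelace formula."""
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2


def expansion_distance(vertices, expand_radius_ratio):
    """Distance by which the polygon boundary would be pushed outward.

    Assumption: the cell radius is the radius of the equal-area circle.
    """
    equivalent_radius = math.sqrt(polygon_area(vertices) / math.pi)
    return expand_radius_ratio * equivalent_radius


square = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(expansion_distance(square, 0.5))  # half of the equivalent radius
print(expansion_distance(square, 0))    # a ratio of 0 leaves the cell unchanged
```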
Returns:
Type | Description |
---|---|
`ndarray` | A numpy `ndarray` |
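The three aggregation modes can be sketched in plain Python. This is an illustration of the reduction only, assuming the pixels of each cell have already been extracted; `aggregate_pixels` is a hypothetical helper and does not reflect how sopa reads the image.

```python
from statistics import mean


def aggregate_pixels(pixels_per_cell, mode="average"):
    """Reduce the pixels of each cell to one value per channel.

    pixels_per_cell: list of cells, each a list of pixels, each pixel a tuple
    with one value per channel. Returns a cells-by-channels list of lists.
    """
    reducers = {"average": mean, "min": min, "max": max}
    if mode not in reducers:
        raise ValueError(f"mode must be one of {sorted(reducers)}")
    reduce_channel = reducers[mode]

    result = []
    for pixels in pixels_per_cell:
        # zip(*pixels) regroups the values channel by channel.
        result.append([reduce_channel(channel) for channel in zip(*pixels)])
    return result


cells = [
    [(1.0, 10.0), (3.0, 30.0)],  # cell 0: two pixels, two channels
    [(5.0, 50.0)],               # cell 1: one pixel
]
for mode in ("average", "min", "max"):
    print(mode, aggregate_pixels(cells, mode))
```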
Source code in sopa/aggregation/channels.py
sopa.aggregation.count_transcripts(sdata, gene_column=None, shapes_key=None, points_key=None, geo_df=None)
Counts transcripts per cell.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`sdata` | `SpatialData` | A `SpatialData` object | required |
`gene_column` | `str \| None` | Column of the transcript dataframe containing the gene names | `None` |
`shapes_key` | `str \| None` | Key of | `None` |
`points_key` | `str \| None` | Key of | `None` |
`geo_df` | `GeoDataFrame \| None` | If the cell boundaries are not yet in | `None` |
Returns:
Type | Description |
---|---|
`AnnData` | An `AnnData` object |
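Conceptually, counting transcripts per cell is a point-in-polygon problem. The sketch below uses a brute-force ray-casting test to show the idea; it is not how sopa implements it, and `point_in_polygon` and `count_per_cell` are hypothetical helpers. It assumes each transcript belongs to at most one cell.

```python
from collections import Counter


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon [(x, y), ...]."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_per_cell(transcripts, cells):
    """transcripts: iterable of (x, y, gene); cells: dict of cell_id -> polygon."""
    counts = {cell_id: Counter() for cell_id in cells}
    for x, y, gene in transcripts:
        for cell_id, polygon in cells.items():
            if point_in_polygon(x, y, polygon):
                counts[cell_id][gene] += 1
                break  # assumption: a transcript is assigned to one cell only
    return counts


cells = {
    "cell_0": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "cell_1": [(10, 0), (14, 0), (14, 4), (10, 4)],
}
transcripts = [(1, 1, "A"), (2, 3, "A"), (3, 2, "B"), (11, 1, "B"), (20, 20, "A")]
counts = count_per_cell(transcripts, cells)
print(counts)
```

The transcript at `(20, 20)` falls outside every cell and is therefore not counted.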
Source code in sopa/aggregation/transcripts.py
sopa.aggregation.aggregate_bins(sdata, shapes_key, bins_key, expand_radius_ratio=0)
Aggregate bins (for instance, from Visium HD data) into cells.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`sdata` | `SpatialData` | The `SpatialData` object | required |
`shapes_key` | `str` | Key of the shapes containing the cell boundaries | required |
`bins_key` | `str` | Key of the table containing the bin-by-gene counts | required |
`expand_radius_ratio` | `float` | Cell polygons will be expanded by | `0` |
Returns:
Type | Description |
---|---|
`AnnData` | An `AnnData` object |
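The idea of aggregating bins into cells can be sketched as summing the gene counts of every bin that falls within a cell. For brevity, this illustration assumes that cells are axis-aligned rectangles and that a bin belongs to a cell when its center lies inside it; both are simplifying assumptions, and `sum_bins_per_cell` is a hypothetical helper rather than sopa's implementation.

```python
from collections import Counter


def sum_bins_per_cell(bins, cells):
    """Sum bin-level gene counts into cell-level gene counts.

    bins: list of (center_x, center_y, {gene: count}).
    cells: dict of cell_id -> (xmin, ymin, xmax, ymax).
    """
    table = {cell_id: Counter() for cell_id in cells}
    for center_x, center_y, gene_counts in bins:
        for cell_id, (xmin, ymin, xmax, ymax) in cells.items():
            if xmin <= center_x <= xmax and ymin <= center_y <= ymax:
                # A bin may contribute to several cells if they overlap,
                # for instance after the cells have been expanded.
                table[cell_id].update(gene_counts)
    return table


bins = [
    (1, 1, {"A": 2}),
    (3, 1, {"A": 1, "B": 4}),
    (9, 9, {"B": 7}),
    (20, 20, {"A": 5}),  # outside every cell, ignored
]
cells = {"cell_0": (0, 0, 4, 4), "cell_1": (8, 8, 12, 12)}
table = sum_bins_per_cell(bins, cells)
print(table)
```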