Clustering
spacedeconv_clustering.Rmd
To gain more insights into a tissues composition the clustering function can be applied. It is possible to cluster by expression values, deconvolution results or pathway and transcription factor activities.
library(spacedeconv)
#> → checking spacedeconv environment and dependencies
#> Configuring package 'spacedeconv': please wait ...
#> Done!
library(SpatialExperiment)
#> Loading required package: SingleCellExperiment
#> Loading required package: SummarizedExperiment
#> Loading required package: MatrixGenerics
#> Loading required package: matrixStats
#>
#> Attaching package: 'MatrixGenerics'
#> The following objects are masked from 'package:matrixStats':
#>
#> colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
#> colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
#> colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
#> colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
#> colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
#> colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
#> colWeightedMeans, colWeightedMedians, colWeightedSds,
#> colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
#> rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
#> rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
#> rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
#> rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
#> rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
#> rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
#> rowWeightedSds, rowWeightedVars
#> Loading required package: GenomicRanges
#> Loading required package: stats4
#> Loading required package: BiocGenerics
#>
#> Attaching package: 'BiocGenerics'
#> The following object is masked from 'package:spacedeconv':
#>
#> normalize
#> The following objects are masked from 'package:stats':
#>
#> IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#>
#> anyDuplicated, aperm, append, as.data.frame, basename, cbind,
#> colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
#> get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
#> match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
#> Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort,
#> table, tapply, union, unique, unsplit, which.max, which.min
#> Loading required package: S4Vectors
#>
#> Attaching package: 'S4Vectors'
#> The following object is masked from 'package:utils':
#>
#> findMatches
#> The following objects are masked from 'package:base':
#>
#> expand.grid, I, unname
#> Loading required package: IRanges
#> Loading required package: GenomeInfoDb
#> Loading required package: Biobase
#> Welcome to Bioconductor
#>
#> Vignettes contain introductory material; view with
#> 'browseVignettes()'. To cite Bioconductor, see
#> 'citation("Biobase")', and for packages 'citation("pkgname")'.
#>
#> Attaching package: 'Biobase'
#> The following object is masked from 'package:MatrixGenerics':
#>
#> rowMedians
#> The following objects are masked from 'package:matrixStats':
#>
#> anyMissing, rowMedians
data("single_cell_data_3")
data("spatial_data_3")
single_cell_data_3 <- spacedeconv::preprocess(single_cell_data_3)
spatial_data_3 <- spacedeconv::preprocess(spatial_data_3)
single_cell_data_3 <- spacedeconv::normalize(single_cell_data_3, method = "cpm")
spatial_data_3 <- spacedeconv::normalize(spatial_data_3, method = "cpm")
deconv <- deconvolute(spe, method = "epic", assay_sc = "cpm")
signature <- spacedeconv::build_model(
single_cell_obj = single_cell_data_3,
cell_type_col = "celltype_major",
method = "dwls", verbose = T, dwls_method = "mast_optimized", ncores = 10
)
deconv <- spacedeconv::deconvolute(
spatial_obj = spatial_data_3,
single_cell_obj = single_cell_data_3,
cell_type_col = "celltype_major",
method = "dwls",
signature = signature,
assay_sp = "cpm"
)
First we show how to cluster deconvolution data. Set the data parameter to “deconvolution” and provide the deconvolution tool you used. You can further set the following parameters:
- nclusters: Number of clusters you want, can be a range
- spmethod: should be the deconvolution tool used, or progeny/dorothea when clustering decoupleR results
- method: kmeans or hclust
- dist_method: for hclust, which distance method to use (“correlation”, “euclidean”, “maximum”, “manhattan”, “canberra”, “binary”, “minkowski”)
- hclust_method: for hclust, agglomeration method to us (“complete”, “ward.D”, “ward.D2”, “single”, “average”, “mcquitty”, “median”, “centroid”)
This function applies the Seurat clustering approach in the background. Set data to “expression”, this will use “counts” values for clustering. You can further set the following parameters:
- clusres: Cluster resolution, check the Seurat Vignette for details.
- pca_dim: Number of PCA dimensions to use
cluster <- spacedeconv::cluster(deconv, data = "expression", clusres = 0.5)
cluster <- readRDS(system.file("extdata", "cluster.rds", package = "spacedeconv"))
plot_celltype(cluster, "cluster", density = F) # plot the clustering stored in this object
With an available clustering you can exract the top features for each cluster. Here we extract the top features for each cluster based on expression, but we want the top features from the deconvolution results from this area. See the associated clusters in the plot above.
get_cluster_features(cluster, clusterid = "cluster_expression_0.5", spmethod = "dwls")
#> $`0`
#> dwls_Cancer.Epithelial dwls_PVL dwls_B.cells
#> 1.48622765 -0.09684142 -0.20176392
#>
#> $`1`
#> dwls_T.cells dwls_Endothelial dwls_PVL
#> 0.7706055 0.6349723 0.5992391
#>
#> $`2`
#> dwls_CAFs dwls_Normal.Epithelial dwls_Endothelial
#> 0.8553145 0.4136625 0.3882446
#>
#> $`3`
#> dwls_Myeloid dwls_Plasmablasts dwls_CAFs
#> 1.4167673 1.0846283 0.6236146
#>
#> $`4`
#> dwls_Normal.Epithelial dwls_PVL dwls_Endothelial
#> 2.02138850 0.08999605 -0.05634981
#>
#> $`5`
#> dwls_Cancer.Epithelial dwls_Myeloid dwls_T.cells
#> 0.83406399 0.72173387 0.06103056