Skip to contents
#tools::R_user_dir("mcRigor", which="cache")
library(mcRigor)
library(Seurat)
library(ggplot2)

Introduction

In this tutorial, we will show how to use mcRigor to detect dubious metacells for a given metacell partition. We will demonstrate this functionality of mcRigor on a semi-synthetic single cell RNA sequencing (scRNA-seq) dataset with known ground truth trustworthiness of metacells.

Input preparation

Two main inputs are required for this functionality: 1. the raw scRNA-seq data and 2. a given metacell partition generated by either existing metacell partitioning methods or ad hoc approaches. The raw scRNA-seq data needs to be provided as a Seurat object, obj_singlecell. The semi-synthetic scRNA-seq data, whose generation process is described in Liu and Li, 2024, stored as a rds file syn.rds, is available with the mcRigor package as an example. We first load the data.

sc_dir = system.file('extdata', 'syn.rds', package = 'mcRigor')
obj_singlecell= readRDS(file = sc_dir)
obj_singlecell
#> An object of class Seurat 
#> 2000 features across 13400 samples within 1 assay 
#> Active assay: RNA (2000 features, 2000 variable features)
#>  3 layers present: counts, data, scale.data
#>  2 dimensional reductions calculated: pca, umap

The metacell partition should be provided as a dataframe, cell_membership, showing the assignment of single cells to metacells. Specifically, this dataframe has only one column and each row of the dataframe represents one single cell. The metacell partitions for the semi-synthetic scRNA-seq data generated by the SEACells method (Persad et al., 2023), stored as a csv file seacells_cell_membership_rna_syn.csv, is available with the mcRigor package as an example. This csv file contains series of metacell partitions, which were generated under different granularity levels. Note that granularity level, γ\gamma, is a key parameter for metacell partitioning and is defined as the ratio of the number of single cells to the number of metacells. In this tutorial, we focus on the metacell partition given by granularity level γ=50\gamma = 50.

membership_dir = system.file('extdata', 'seacells_cell_membership_rna_syn.csv', package = 'mcRigor')
cell_membership_all <- read.csv(file = membership_dir, check.names = F, row.names = 1)
cell_membership <- cell_membership_all['50']
head(cell_membership)
#>                               50
#> 1_Cell1 mc50-allcells-SEACell-98
#> 2_Cell1 mc50-allcells-SEACell-98
#> 3_Cell1 mc50-allcells-SEACell-98
#> 4_Cell1 mc50-allcells-SEACell-98
#> 5_Cell1 mc50-allcells-SEACell-98
#> 6_Cell1 mc50-allcells-SEACell-98

Detection of dubious metacells

We call the function mcRigor_DETECT to detect dubious metacells for the metacell partition represented by cell_membership.

detect_res = mcRigor_DETECT(obj_singlecell = obj_singlecell, cell_membership = cell_membership)

The Seurat object of metacells are stored in the obj_metacell field of the output detect_res.The mcRigor detection results are recorded in the mc_res field of the output detect_res as well as the metadata of the Seurat object with name mcRigor.

table(detect_res$mc_res)
#> 
#>     dubious trustworthy 
#>          57         211
obj_metacell = detect_res$obj_metacell
head(obj_metacell$mcRigor)
#>   mc50-allcells-SEACell-0   mc50-allcells-SEACell-1  mc50-allcells-SEACell-10 
#>             "trustworthy"                 "dubious"                 "dubious" 
#> mc50-allcells-SEACell-100 mc50-allcells-SEACell-101 mc50-allcells-SEACell-102 
#>             "trustworthy"             "trustworthy"             "trustworthy"

Visualization

The function mcRigor_projection can draw the metacells projected to the two-dimensional embedding space of single cells and mark the detected dubious metacells

sc_membership = obj_metacell@misc$cell_membership$Metacell
names(sc_membership) = rownames(obj_metacell@misc$cell_membership)

plot = mcRigor_projection(obj_singlecell = obj_singlecell, sc_membership = sc_membership,
                           color_field = 'celltype.l1',
                           dub_mc_test.label = T, test_stats = detect_res$TabMC, Thre = detect_res$thre)
plot

The dubious metacells are marked by red circles while the trustworthy metacells are with black circles.

Session information

sessionInfo()
#> R version 4.4.2 (2024-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 22.04.5 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
#> 
#> locale:
#>  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
#>  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
#>  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
#> [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
#> 
#> time zone: UTC
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] ggplot2_3.5.1      Seurat_5.1.0       SeuratObject_5.0.2 sp_2.1-4          
#> [5] mcRigor_1.0        BiocStyle_2.34.0  
#> 
#> loaded via a namespace (and not attached):
#>   [1] RColorBrewer_1.1-3     jsonlite_1.8.9         magrittr_2.0.3        
#>   [4] spatstat.utils_3.1-1   farver_2.1.2           rmarkdown_2.29        
#>   [7] fs_1.6.5               ragg_1.3.3             vctrs_0.6.5           
#>  [10] ROCR_1.0-11            spatstat.explore_3.3-3 htmltools_0.5.8.1     
#>  [13] sass_0.4.9             sctransform_0.4.1      parallelly_1.39.0     
#>  [16] KernSmooth_2.23-24     bslib_0.8.0            htmlwidgets_1.6.4     
#>  [19] desc_1.4.3             ica_1.0-3              plyr_1.8.9            
#>  [22] plotly_4.10.4          zoo_1.8-12             cachem_1.1.0          
#>  [25] igraph_2.1.1           mime_0.12              lifecycle_1.0.4       
#>  [28] pkgconfig_2.0.3        Matrix_1.7-1           R6_2.5.1              
#>  [31] fastmap_1.2.0          fitdistrplus_1.2-1     future_1.34.0         
#>  [34] shiny_1.9.1            digest_0.6.37          colorspace_2.1-1      
#>  [37] patchwork_1.3.0        tensor_1.5             RSpectra_0.16-2       
#>  [40] irlba_2.3.5.1          textshaping_0.4.0      labeling_0.4.3        
#>  [43] progressr_0.15.1       fansi_1.0.6            spatstat.sparse_3.1-0 
#>  [46] httr_1.4.7             polyclip_1.10-7        abind_1.4-8           
#>  [49] compiler_4.4.2         withr_3.0.2            fastDummies_1.7.4     
#>  [52] MASS_7.3-61            tools_4.4.2            lmtest_0.9-40         
#>  [55] httpuv_1.6.15          future.apply_1.11.3    goftest_1.2-3         
#>  [58] glue_1.8.0             nlme_3.1-166           promises_1.3.0        
#>  [61] grid_4.4.2             Rtsne_0.17             cluster_2.1.6         
#>  [64] reshape2_1.4.4         generics_0.1.3         gtable_0.3.6          
#>  [67] spatstat.data_3.1-4    tidyr_1.3.1            data.table_1.16.2     
#>  [70] utf8_1.2.4             spatstat.geom_3.3-4    RcppAnnoy_0.0.22      
#>  [73] ggrepel_0.9.6          RANN_2.6.2             pillar_1.9.0          
#>  [76] stringr_1.5.1          spam_2.11-0            RcppHNSW_0.6.0        
#>  [79] later_1.3.2            splines_4.4.2          dplyr_1.1.4           
#>  [82] lattice_0.22-6         survival_3.7-0         deldir_2.0-4          
#>  [85] tidyselect_1.2.1       miniUI_0.1.1.1         pbapply_1.7-2         
#>  [88] knitr_1.49             gridExtra_2.3          bookdown_0.41         
#>  [91] scattermore_1.2        xfun_0.49              matrixStats_1.4.1     
#>  [94] stringi_1.8.4          lazyeval_0.2.2         yaml_2.3.10           
#>  [97] evaluate_1.0.1         codetools_0.2-20       tibble_3.2.1          
#> [100] BiocManager_1.30.25    cli_3.6.3              uwot_0.2.2            
#> [103] xtable_1.8-4           reticulate_1.40.0      systemfonts_1.1.0     
#> [106] munsell_0.5.1          jquerylib_0.1.4        Rcpp_1.0.13-1         
#> [109] globals_0.16.3         spatstat.random_3.3-2  png_0.1-8             
#> [112] spatstat.univar_3.1-1  parallel_4.4.2         pkgdown_2.1.1         
#> [115] dotCall64_1.2          listenv_0.9.1          viridisLite_0.4.2     
#> [118] scales_1.3.0           ggridges_0.5.6         leiden_0.4.3.1        
#> [121] purrr_1.0.2            rlang_1.1.4            cowplot_1.1.3