Chapter 5 Data filtration

This chapter describes how scMINER assesses scRNA-seq data quality, estimates the cutoffs for data filtration, and removes low-quality cells and features from the SparseEset object.

## QC metrics in scMINER

As mentioned earlier, scMINER can automatically generate five metadata statistics and add them to the SparseEset object. These five statistics are the metrics scMINER uses to assess the quality of cells and features:

  • For cell quality assessment, scMINER provides four metrics that are commonly used by the community:

    • nUMI: total number of UMIs in each cell. An abnormally high nUMI usually indicates a doublet, while an abnormally low nUMI usually indicates a poorly sequenced cell or an empty droplet.
    • nFeature: number of expressed features/genes in each cell. It is interpreted like nUMI: abnormally high values suggest doublets, and abnormally low values suggest poorly sequenced cells or empty droplets.
    • pctMito: percentage of UMIs from mitochondrial genes (defined by “mt-|MT-”) in each cell. An aberrantly high pctMito usually indicates a dying cell.
    • pctSpikeIn: percentage of UMIs from spike-in RNAs (defined by “ERCC-|Ercc-”) in each cell. This is used to estimate the normalization factor. Cells with extremely high or low pctSpikeIn should be removed.
  • For feature quality assessment, scMINER provides one metric:

    • nCell: number of cells expressing the feature/gene. Genes with extremely low nCell are typically poorly sequenced and usually show low variance.
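
To make the five definitions above concrete, here is a minimal sketch that computes them from a small gene-by-cell count matrix. It is written in plain Python for illustration and is not scMINER's implementation. The toy data, the function names `cell_metrics` and `feature_metrics`, the choice to match the patterns at the start of the gene name, and the choice to report pctMito and pctSpikeIn as fractions between 0 and 1 are all assumptions made for this example.

```python
import re

# Toy gene-by-cell UMI count matrix (hypothetical data, for illustration only).
# Rows are genes, columns are cells.
genes = ["Actb", "mt-Co1", "ERCC-00002", "Gapdh"]
cells = ["cell_1", "cell_2", "cell_3"]
counts = [
    [10, 0, 4],  # Actb
    [5, 1, 0],   # mt-Co1 (mitochondrial)
    [5, 0, 0],   # ERCC-00002 (spike-in)
    [0, 2, 4],   # Gapdh
]

# Patterns quoted in the text. Matching at the start of the gene name
# is an assumption of this sketch.
MITO = re.compile(r"mt-|MT-")
SPIKE_IN = re.compile(r"ERCC-|Ercc-")


def cell_metrics(genes, cells, counts):
    """Return nUMI, nFeature, pctMito and pctSpikeIn for every cell."""
    metrics = {}
    for j, cell in enumerate(cells):
        column = [row[j] for row in counts]
        n_umi = sum(column)
        mito = sum(c for g, c in zip(genes, column) if MITO.match(g))
        spike = sum(c for g, c in zip(genes, column) if SPIKE_IN.match(g))
        metrics[cell] = {
            "nUMI": n_umi,
            "nFeature": sum(1 for c in column if c > 0),
            # Fractions (0 to 1) are an assumption; multiply by 100 for percentages.
            "pctMito": mito / n_umi if n_umi else 0.0,
            "pctSpikeIn": spike / n_umi if n_umi else 0.0,
        }
    return metrics


def feature_metrics(genes, counts):
    """Return nCell (number of cells with a non-zero count) for every gene."""
    return {g: {"nCell": sum(1 for c in row if c > 0)} for g, row in zip(genes, counts)}


per_cell = cell_metrics(genes, cells, counts)
per_gene = feature_metrics(genes, counts)

for cell, values in per_cell.items():
    print(cell, values)
for gene, values in per_gene.items():
    print(gene, values)
```

In this toy matrix, cell_1 has 20 UMIs spread over 3 expressed genes, a quarter of which come from the mitochondrial gene, while the spike-in gene is detected in only one cell. Real data sets are far larger and stored as sparse matrices, so treat this only as a way to read the definitions.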