Chapter 6 Data normalization
This chapter introduces data normalization in scMINER.
We recommend the log2CPM method for normalization: the raw counts in each cell are scaled to a library size of 1 million, then log2-transformed.
pbmc14k_log2cpm.eset <- normalizeSparseEset(pbmc14k_filtered.eset, scale_factor = 1000000, log_base = 2, log_pseudoCount = 1)
## Done! The data matrix of eset has been normalized and log-transformed!
## The returned eset contains: 8846 genes, 13605 cells.
## 5 x 5 Matrix of class "dgeMatrix"
##           CACTTTGACGCAAT GTTACGGAAACGAA CACTTATGAGTCGT GCATGTGATTCTGT
## LINC00115              0        0.00000              0              0
## NOC2L                  0        0.00000              0              0
## HES4                   0        0.00000              0              0
## ISG15                  0       10.05794              0              0
## C1orf159               0        0.00000              0              0
##           TAGAATACGTATCG
## LINC00115              0
## NOC2L                  0
## HES4                   0
## ISG15                  0
## C1orf159               0
This normalized and log-transformed SparseEset object can be used directly for mutual information-based clustering, network inference, and other downstream analyses.
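To make the log2CPM transformation concrete, here is a minimal base-R sketch on a toy count matrix. It is not scMINER's internal implementation: the variable names are illustrative, and it assumes that the pseudocount is added after scaling to the library size, as the parameter names `scale_factor`, `log_base`, and `log_pseudoCount` suggest.

```r
# Toy raw count matrix: 3 genes (rows) x 2 cells (columns).
# Values are made up for illustration only.
counts <- matrix(
  c(1, 0, 3,   # cell1
    0, 2, 2),  # cell2
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("cell1", "cell2"))
)

# These mirror the arguments passed to normalizeSparseEset() in this chapter.
scale_factor    <- 1e6
log_base        <- 2
log_pseudocount <- 1

# Library size = total raw counts per cell.
lib_size <- colSums(counts)

# Scale each cell (column) so that its counts sum to scale_factor (CPM).
cpm <- sweep(counts, 2, lib_size, "/") * scale_factor

# Assumed formula: log2(CPM + pseudocount). Zero counts stay at zero
# because log2(0 + 1) = 0.
log_cpm <- log(cpm + log_pseudocount, base = log_base)

print(log_cpm)
```

Under this assumed formula, zero counts remain zero after the transformation, which matches the zeros in the normalized matrix shown above.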
Don’t forget to save the SparseEset object after data normalization.
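One common way to save an R object is base R's `saveRDS()`. The sketch below is only an illustration: the output path is hypothetical, and a small stand-in matrix is used so that the snippet runs on its own. In a real session, pass `pbmc14k_log2cpm.eset` instead.

```r
# Hypothetical output location; adjust to your own project layout.
out_file <- file.path(tempdir(), "pbmc14k_log2cpm.rds")

# Stand-in for the normalized object so this sketch is self-contained.
# In practice, replace normalized_obj with pbmc14k_log2cpm.eset.
normalized_obj <- matrix(c(0, 10.05794, 0, 0), nrow = 2)

# Serialize the object to disk.
saveRDS(normalized_obj, file = out_file)

# Read it back to confirm that the round trip preserves the object.
restored <- readRDS(out_file)
stopifnot(identical(normalized_obj, restored))
```

Saving to an `.rds` file keeps a single object per file, so it can be reloaded under any variable name in a later session.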