4.2 Using self-customized meta data
In some cases, you may have additional meta data for either cells (e.g. sample ID, treatment condition) or features (e.g. gene full name, gene type, genome location) that will be used in downstream analysis, and you may want to add it to the sparse eSet object. The createSparseEset() function provides two more arguments, cellData and featureData, to take this self-customized meta data. For the PBMC14k dataset, we have the true cell type labels and would like to add them to the sparse eSet object.
## read the true labels of cell type for PBMC14k dataset
true_label <- read.table(system.file("extdata/demo_pbmc14k/PBMC14k_trueLabel.txt.gz", package = "scMINER"), header = TRUE, row.names = 1, sep = "\t", quote = "", stringsAsFactors = FALSE)
head(true_label)
## trueLabel_full trueLabel
## CACTTTGACGCAAT CD14+ Monocyte Monocyte
## GTTACGGAAACGAA CD14+ Monocyte Monocyte
## AGTCACGACAGGAG CD14+ Monocyte Monocyte
## TTCGAGGACCAGTA CD14+ Monocyte Monocyte
## CACTTATGAGTCGT CD14+ Monocyte Monocyte
## GCATGTGATTCTGT CD14+ Monocyte Monocyte
table(true_label$trueLabel_full)
##
## CD14+ Monocyte CD19+ B
## 2000 2000
## CD4+/CD25 T Reg CD4+/CD45RA+/CD25- Naive T
## 2000 2000
## CD4+/CD45RO+ Memory CD56+ NK
## 2000 2000
## CD8+/CD45RA+ Naive Cytotoxic
## 2000
## the true_label must cover all cells in the expression matrix
table(colnames(pbmc14k_rawCount) %in% row.names(true_label))
##
## TRUE
## 14000
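If this check reports any FALSE values, or if the rows of the meta data table are in a different order than the columns of the matrix, the table can be aligned to the matrix before it is passed to cellData. The following is a minimal base-R sketch on toy data. The objects toy_counts and toy_label are hypothetical stand-ins for pbmc14k_rawCount and true_label, and nothing is assumed here about whether createSparseEset() reorders rows on its own.

```r
## toy stand-ins (hypothetical) for the count matrix and the label table
toy_counts <- matrix(1:6, nrow = 2,
                     dimnames = list(c("geneA", "geneB"), c("cell1", "cell2", "cell3")))
toy_label <- data.frame(trueLabel = c("B", "Monocyte", "NK"),
                        row.names = c("cell3", "cell1", "cell2"),
                        stringsAsFactors = FALSE)

## stop early if any cell in the matrix has no row in the label table
missing_cells <- setdiff(colnames(toy_counts), row.names(toy_label))
if (length(missing_cells) > 0) {
  stop("Cells without meta data: ", paste(missing_cells, collapse = ", "))
}

## reorder the label table so that its rows follow the matrix columns
toy_label <- toy_label[colnames(toy_counts), , drop = FALSE]
stopifnot(identical(row.names(toy_label), colnames(toy_counts)))
```

Indexing by colnames() also drops any extra rows in the label table that have no matching cell in the matrix.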
## create the sparse eSet object using the true_label
pbmc14k_raw.eset <- createSparseEset(input_matrix = pbmc14k_rawCount, cellData = true_label, featureData = NULL, projectID = "PBMC14k", addMetaData = TRUE)
## Creating sparse eset from the input_matrix ...
## Adding meta data based on input_matrix ...
## Done! The sparse eset has been generated: 17986 genes, 14000 cells.
## check the cell meta data stored in the sparse eSet object
head(pData(pbmc14k_raw.eset))
## trueLabel_full trueLabel projectID nUMI nFeature pctMito
## CACTTTGACGCAAT CD14+ Monocyte Monocyte PBMC14k 764 354 0.01832461
## GTTACGGAAACGAA CD14+ Monocyte Monocyte PBMC14k 956 442 0.01569038
## AGTCACGACAGGAG CD14+ Monocyte Monocyte PBMC14k 7940 2163 0.01977330
## TTCGAGGACCAGTA CD14+ Monocyte Monocyte PBMC14k 4177 1277 0.01149150
## CACTTATGAGTCGT CD14+ Monocyte Monocyte PBMC14k 629 323 0.02066773
## GCATGTGATTCTGT CD14+ Monocyte Monocyte PBMC14k 875 427 0.02628571
## pctSpikeIn CellID
## CACTTTGACGCAAT 0 CACTTTGACGCAAT
## GTTACGGAAACGAA 0 GTTACGGAAACGAA
## AGTCACGACAGGAG 0 AGTCACGACAGGAG
## TTCGAGGACCAGTA 0 TTCGAGGACCAGTA
## CACTTATGAGTCGT 0 CACTTATGAGTCGT
## GCATGTGATTCTGT 0 GCATGTGATTCTGT
table(pData(pbmc14k_raw.eset)$trueLabel_full)
##
## CD14+ Monocyte CD19+ B
## 2000 2000
## CD4+/CD25 T Reg CD4+/CD45RA+/CD25- Naive T
## 2000 2000
## CD4+/CD45RO+ Memory CD56+ NK
## 2000 2000
## CD8+/CD45RA+ Naive Cytotoxic
## 2000
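The example above passes only cellData. Feature-level meta data can presumably be supplied through featureData in the same way. The sketch below builds a small annotation table on toy data and aligns it to the rows of the matrix. The object names and the columns geneType and chromosome are hypothetical, and the final createSparseEset() call is left commented out because it requires scMINER and has not been verified here. It assumes, by analogy with cellData, that the row names of featureData should match the row names of the matrix.

```r
## toy stand-ins (hypothetical) for a count matrix and a gene annotation table
toy_counts <- matrix(c(0, 3, 1, 0, 2, 5), nrow = 3,
                     dimnames = list(c("geneA", "geneB", "geneC"), c("cell1", "cell2")))
toy_feature <- data.frame(geneType = c("lncRNA", "protein_coding", "protein_coding"),
                          chromosome = c("chr2", "chr1", "chr3"),
                          row.names = c("geneB", "geneA", "geneC"),
                          stringsAsFactors = FALSE)

## the annotation table must cover every feature in the matrix
stopifnot(all(row.names(toy_counts) %in% row.names(toy_feature)))

## reorder the annotation table so that its rows follow the matrix rows
toy_feature <- toy_feature[row.names(toy_counts), , drop = FALSE]

## with scMINER installed, the table could then be passed along (assumed usage):
## toy.eset <- createSparseEset(input_matrix = toy_counts, cellData = NULL, featureData = toy_feature, projectID = "toy", addMetaData = TRUE)
```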