4.1 Solely from the gene expression matrix
This is the most commonly used way to create the sparse eSet object with scMINER:
pbmc14k_raw.eset <- createSparseEset(input_matrix = pbmc14k_rawCount, projectID = "PBMC14k", addMetaData = TRUE)
## Creating sparse eset from the input_matrix ...
## Adding meta data based on input_matrix ...
## Done! The sparse eset has been generated: 17986 genes, 14000 cells.
## SparseExpressionSet (storageMode: environment)
## assayData: 17986 features, 14000 samples
## element names: exprs
## protocolData: none
## phenoData
## sampleNames: CACTTTGACGCAAT GTTACGGAAACGAA ... ACGTGCCTTAAAGG (14000
## total)
## varLabels: CellID projectID ... pctSpikeIn (6 total)
## varMetadata: labelDescription
## featureData
## featureNames: AL627309.1 AP006222.2 ... SRSF10.1 (17986 total)
## fvarLabels: GeneSymbol nCell
## fvarMetadata: labelDescription
## experimentData: use 'experimentData(object)'
## Annotation:
- input_matrix: usually, but not necessarily, a sparse matrix of raw UMI counts.
  - Data format: it accepts dgCMatrix, dgTMatrix, dgeMatrix, matrix and data.frame.
  - Quantification measures: it takes raw counts, normalized counts (e.g. CPM or CP10k), TPM (Transcripts Per Million), FPKM/RPKM (Fragments/Reads Per Kilobase of transcript per Million) and others.
  - What if a data frame is given? When a non-matrix table is passed to the input_matrix argument, createSparseEset() automatically converts it to a matrix. If the matrix, whether converted from another format or passed directly by the user, is not sparse, createSparseEset() converts it to a sparse matrix by default. This is controlled by the do.sparseConversion argument, whose default is TRUE. It is not recommended, but users can set it to FALSE to disable the conversion, in which case createSparseEset() creates the eSet from the regular matrix.
- addMetaData: when this argument is set to TRUE (the default), createSparseEset() automatically generates 5 statistics, 4 for cells (nUMI, nFeature, pctMito, pctSpikeIn) and 1 for features (nCell), and adds them to the phenoData and featureData slots. These 5 statistics are used in quality control and data filtration.
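The conversion steps described for input_matrix can be sketched with base R and the Matrix package. This is a minimal illustration of the documented behaviour (data frame to matrix, then dense matrix to sparse unless conversion is disabled), not scMINER's internal code. The toy data frame `toy_df`, the helper `to_sparse()` and its `do_sparse_conversion` argument are hypothetical names introduced only for this example.

```r
library(Matrix)

# Hypothetical toy input: 3 genes x 4 cells stored as a data frame.
toy_df <- data.frame(
  cell1 = c(0, 2, 0),
  cell2 = c(1, 0, 0),
  cell3 = c(0, 0, 5),
  cell4 = c(0, 3, 0),
  row.names = c("geneA", "geneB", "geneC")
)

# Hypothetical helper mimicking the documented behaviour; not scMINER's code.
to_sparse <- function(x, do_sparse_conversion = TRUE) {
  # Step 1: a non-matrix table is first converted to a regular matrix.
  if (is.data.frame(x)) x <- as.matrix(x)
  # Step 2: a non-sparse matrix is converted to a sparse one unless disabled.
  if (do_sparse_conversion && !is(x, "sparseMatrix")) {
    x <- as(x, "CsparseMatrix")
  }
  x
}

toy_sparse <- to_sparse(toy_df)
toy_dense <- to_sparse(toy_df, do_sparse_conversion = FALSE)

class(toy_sparse)  # a column-compressed sparse class from Matrix
class(toy_dense)   # a regular base R matrix
```

Keeping the conversion switchable mirrors the do.sparseConversion argument: the sparse form stores only non-zero entries, which is why it is the recommended default for count matrices that are mostly zeros.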
First rows of phenoData (one row per cell):
##                        CellID projectID nUMI nFeature    pctMito pctSpikeIn
## CACTTTGACGCAAT CACTTTGACGCAAT   PBMC14k  764      354 0.01832461          0
## GTTACGGAAACGAA GTTACGGAAACGAA   PBMC14k  956      442 0.01569038          0
## AGTCACGACAGGAG AGTCACGACAGGAG   PBMC14k 7940     2163 0.01977330          0
## TTCGAGGACCAGTA TTCGAGGACCAGTA   PBMC14k 4177     1277 0.01149150          0
## CACTTATGAGTCGT CACTTATGAGTCGT   PBMC14k  629      323 0.02066773          0
## GCATGTGATTCTGT GCATGTGATTCTGT   PBMC14k  875      427 0.02628571          0
First rows of featureData (one row per gene):
##                  GeneSymbol nCell
## AL627309.1       AL627309.1    50
## AP006222.2       AP006222.2     2
## RP11-206L10.3 RP11-206L10.3     1
## RP11-206L10.2 RP11-206L10.2    33
## RP11-206L10.9 RP11-206L10.9    17
## LINC00115         LINC00115   115
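What these columns measure can be illustrated with a small sketch on a toy matrix. The definitions used here are assumptions based on the column names and common single-cell practice, not a copy of scMINER's implementation: nUMI as total counts per cell, nFeature as the number of genes detected per cell, pctMito and pctSpikeIn as the fraction of a cell's counts coming from mitochondrial and spike-in genes, and nCell as the number of cells in which a gene is detected. The gene-symbol prefixes "MT-" and "ERCC-" and all values in the toy matrix are illustrative.

```r
library(Matrix)

# Hypothetical toy count matrix: 4 genes x 3 cells (values are made up).
counts <- Matrix(
  c(3, 0, 1,
    0, 0, 2,
    1, 4, 0,
    0, 1, 1),
  nrow = 4, byrow = TRUE, sparse = TRUE,
  dimnames = list(c("GENE1", "GENE2", "MT-CO1", "ERCC-00002"),
                  c("cellA", "cellB", "cellC"))
)

# Assumed gene-symbol prefixes for mitochondrial and spike-in genes.
is_mito <- grepl("^MT-", rownames(counts))
is_spike <- grepl("^ERCC-", rownames(counts))

# Cell-level statistics (one value per column of the matrix).
n_umi <- colSums(counts)
cell_stats <- data.frame(
  CellID = colnames(counts),
  nUMI = n_umi,
  nFeature = colSums(counts > 0),
  pctMito = colSums(counts[is_mito, , drop = FALSE]) / n_umi,
  pctSpikeIn = colSums(counts[is_spike, , drop = FALSE]) / n_umi
)

# Feature-level statistic (one value per row of the matrix).
gene_stats <- data.frame(
  GeneSymbol = rownames(counts),
  nCell = rowSums(counts > 0)
)

cell_stats
gene_stats
```

Under these assumed definitions, the pctMito values in the output above (around 0.01 to 0.03) would be fractions rather than percentages, which is worth keeping in mind when choosing filtering cutoffs.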