preprocess
- preprocess.gene_align(adata, obj)
Align the genes of the input dataset with those of the PanglaoDB dataset.
- adata (AnnData):
the dataset used to fine-tune the model, or the dataset to be annotated
- obj (Array):
gene short names of this dataset
- Returns:
an AnnData object whose genes are aligned to the PanglaoDB gene set. Entries for genes that are absent from the original dataset are set to zero.
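The zero-filling behaviour described above can be illustrated with a minimal NumPy sketch. This is not the implementation of `preprocess.gene_align`. The helper `align_genes`, its arguments, and the toy gene names are hypothetical. They only show how columns might be reordered to follow a reference gene list, with missing genes left at zero.

```python
import numpy as np

def align_genes(expr, genes, reference_genes):
    """Hypothetical helper: reorder the columns of `expr` to follow
    `reference_genes`, leaving zeros for genes the input lacks.

    expr: (n_cells, n_genes) expression matrix
    genes: gene names for the columns of `expr`
    reference_genes: target gene order (e.g. a reference gene list)
    """
    expr = np.asarray(expr, dtype=float)
    aligned = np.zeros((expr.shape[0], len(reference_genes)), dtype=float)
    # Map each input gene name to its column (last one wins on duplicates).
    col_of = {g: j for j, g in enumerate(genes)}
    for j, g in enumerate(reference_genes):
        src = col_of.get(g)
        if src is not None:
            aligned[:, j] = expr[:, src]
    return aligned

# Toy data: the reference contains gene "C", which the input lacks.
expr = np.array([[1.0, 2.0],
                 [3.0, 4.0]])
aligned = align_genes(expr, ["B", "A"], ["A", "B", "C"])
print(aligned)
```

In this toy case the column for "C" stays all zeros, and the columns for "A" and "B" are swapped to follow the reference order.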
- preprocess.normalize(adata_aligned, experiments, dir, min_genes=200, highly_gene_num=1000)
Normalize the aligned scRNA-seq dataset. This includes quality control, log-normalization, and highly variable gene selection.
- adata_aligned (AnnData):
the scRNA-seq dataset after gene alignment
- experiments (str):
name of the experiment
- dir (str):
path where the indices of the selected highly variable genes are saved
- min_genes (int):
the minimum number of expressed genes per cell
- highly_gene_num (int):
the number of highly variable genes to keep
- Returns:
an AnnData object of the preprocessed scRNA-seq dataset
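The three steps named above (quality control, log-normalization, highly variable gene selection) can be sketched in plain NumPy. This is an illustration under stated assumptions, not the code behind `preprocess.normalize`. The function `normalize_sketch`, the `target_sum` scaling of 1e4, and the ranking of genes by plain variance are all assumptions made for the example. Saving the gene indices to `dir` is omitted.

```python
import numpy as np

def normalize_sketch(counts, min_genes=200, highly_gene_num=1000, target_sum=1e4):
    """Hypothetical sketch of QC, log-normalization and gene selection."""
    counts = np.asarray(counts, dtype=float)

    # Quality control: keep cells that express at least `min_genes` genes.
    keep = (counts > 0).sum(axis=1) >= min_genes
    counts = counts[keep]

    # Scale each cell to the same total count (assumed: 1e4), then log1p.
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # avoid division by zero for empty cells
    logged = np.log1p(counts / totals * target_sum)

    # Assumed selection rule: keep the genes with the largest variance.
    n_top = min(highly_gene_num, logged.shape[1])
    order = np.argsort(logged.var(axis=0))[::-1]
    hvg_idx = np.sort(order[:n_top])
    return logged[:, hvg_idx], hvg_idx

# Toy data: 3 cells x 4 genes; the second cell expresses only one gene.
counts = np.array([[5, 0, 1, 0],
                   [0, 0, 0, 3],
                   [2, 2, 2, 2]])
out, hvg_idx = normalize_sketch(counts, min_genes=2, highly_gene_num=2)
print(out.shape, hvg_idx)
```

With these toy values the second cell is dropped by the `min_genes` filter, and the two returned columns are the genes whose log-normalized values vary most across the remaining cells.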