preprocess

preprocess.gene_align(adata, obj)

Align the genes of the input dataset with those of the PanglaoDB dataset.

adata (AnnData):

the dataset used to fine-tune the model, or the dataset to be annotated

obj (Array):

gene short names of this dataset

Returns:

an AnnData object whose genes match those of the PanglaoDB dataset. Expression values for genes absent from the original dataset are set to zero.
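
The alignment described above can be illustrated with a minimal NumPy sketch. It is not the implementation behind preprocess.gene_align: the function name align_genes_sketch, the plain-array inputs, and the three-gene reference list are assumptions made for illustration. It shows only the stated behaviour, namely that columns are reordered to follow a reference gene list and genes missing from the input are filled with zeros.

```python
import numpy as np


def align_genes_sketch(counts, gene_names, reference_genes):
    """Hypothetical helper: map a cells x genes matrix onto reference_genes.

    Genes present in reference_genes but absent from gene_names become
    all-zero columns; genes not in reference_genes are dropped.
    """
    counts = np.asarray(counts, dtype=float)
    # Look up the column of each input gene by name.
    index_of = {gene: i for i, gene in enumerate(gene_names)}
    aligned = np.zeros((counts.shape[0], len(reference_genes)), dtype=float)
    for j, gene in enumerate(reference_genes):
        i = index_of.get(gene)
        if i is not None:
            aligned[:, j] = counts[:, i]
    return aligned


# Toy input: 2 cells x 3 genes.
counts = np.array([[1, 0, 3],
                   [0, 2, 5]])
gene_names = ["CD3E", "MS4A1", "ACTB"]
# Assumed stand-in for the reference gene list; the real list is much longer.
reference_genes = ["ACTB", "CD3E", "GAPDH"]

aligned = align_genes_sketch(counts, gene_names, reference_genes)
```

Here MS4A1 is dropped because it is not in the reference list, and GAPDH becomes a zero column because it is not in the input.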

preprocess.normalize(adata_aligned, experiments, dir, min_genes=200, highly_gene_num=1000)

Normalize the aligned scRNA-seq dataset. This includes quality control, log-normalization, and highly variable gene selection.

adata_aligned (AnnData):

the scRNA-seq dataset after gene alignment

experiments (str):

name of the experiment

dir (str):

path where the indices of the selected highly variable genes are saved

min_genes (int):

the minimum number of expressed genes per cell

highly_gene_num (int):

the number of highly variable genes to retain

Returns:

an AnnData object of the preprocessed scRNA-seq dataset
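
The three stages named above (quality control, log-normalization, highly variable gene selection) can be sketched with plain NumPy. This is an illustration under stated assumptions, not the code behind preprocess.normalize: the function name normalize_sketch, the target_sum scaling factor, and the use of plain variance to rank genes are all assumptions, and the sketch returns the gene indices rather than saving them to dir.

```python
import numpy as np


def normalize_sketch(counts, min_genes=200, highly_gene_num=1000, target_sum=1e4):
    """Hypothetical helper mirroring the three described stages on a
    cells x genes count matrix."""
    counts = np.asarray(counts, dtype=float)

    # 1. Quality control: keep cells expressing at least min_genes genes.
    genes_per_cell = (counts > 0).sum(axis=1)
    kept = counts[genes_per_cell >= min_genes]

    # 2. Scale each cell to the same total count, then apply log(1 + x).
    totals = kept.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # guard against division by zero
    logged = np.log1p(kept / totals * target_sum)

    # 3. Keep the highly_gene_num genes with the largest variance
    #    (a simplification of dispersion-based selection).
    k = min(highly_gene_num, logged.shape[1])
    variances = logged.var(axis=0)
    hvg_index = np.sort(np.argsort(variances)[::-1][:k])
    return logged[:, hvg_index], hvg_index


# Toy input: 3 cells x 4 genes; the second cell expresses only one gene.
toy_counts = np.array([[5, 0, 1, 0],
                       [0, 0, 0, 3],
                       [2, 2, 0, 4]])
processed, hvg_index = normalize_sketch(toy_counts, min_genes=2, highly_gene_num=2)
```

With min_genes=2, the second toy cell is removed, and two of the four genes are retained.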