Tutorial 3: run CANAL on cross-tissue experiments

[1]:
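import os
import sys

os.chdir('/data/wanh/CANAL/')
sys.path.append('/data/wanh/CANAL/')

# imported explicitly rather than relying on `from model import *`
import csv
import numpy as np
import scanpy as sc
from model import *  # provides CANAL_model
import argparse
import time
import resource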

load the highly variable gene indices and construct the CANAL model

[2]:
experiments = 'Cross_tissue'
seed = 1
with open(f'./data/{experiments}/{experiments}_highly_gene_idx.csv') as csvfile:
    reader = csv.reader(csvfile, delimiter=' ', quotechar='|')
    # the first field of each row is one gene index
    highly_variable_idx = np.array([int(row[0]) for row in reader])

[3]:
CANAL = CANAL_model(gpu_option = '1')

stage 1

load the pre-processed dataset "Lung"

[4]:
dataset_stage_1 = "TabulaMuris_Lung_10X"
data_path_stage_1 = './data/{}/{}_train.h5ad'.format(experiments, dataset_stage_1)
adata_stage_1 = sc.read_h5ad(data_path_stage_1)
cell_type_stage_1 = adata_stage_1.obs['cell_type1']
print(adata_stage_1)
print(np.unique(np.array(cell_type_stage_1), return_counts=True))
AnnData object with n_obs × n_vars = 4122 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p'
(array(['B cell', 'T cell', 'endothelial cell', 'macrophage',
       'natural killer cell', 'stromal cell'], dtype=object), array([ 187,  212,  412,  314,  721, 2276]))

fine-tune the CANAL model for the first stage

[5]:
CANAL.train(experiments=experiments, pre_dataset="None", dataset=dataset_stage_1,
            adata=adata_stage_1, cell_type=cell_type_stage_1, current_stage=1,
            is_final_stage=False, ckpt_dir='./ckpts/', rehearsal_size=2000,
            highly_variable_idx=highly_variable_idx, SEED=seed)
current data: AnnData object with n_obs × n_vars = 4122 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p'


(array(['B cell', 'T cell', 'endothelial cell', 'macrophage',
       'natural killer cell', 'stromal cell'], dtype=object), array([ 187,  212,  412,  314,  721, 2276]))
model constructing begin!

/data/wanh/CANAL/performer_pytorch/performer_pytorch.py:115: UserWarning: torch.qr is deprecated in favor of torch.linalg.qr and will be removed in a future PyTorch release.
The boolean parameter 'some' has been replaced with a string parameter 'mode'.
Q, R = torch.qr(A, some)
should be replaced with
Q, R = torch.linalg.qr(A, 'reduced' if some else 'complete') (Triggered internally at  ../aten/src/ATen/native/BatchLinearAlgebra.cpp:2497.)
  q, r = torch.qr(unstructured_block.cpu(), some = True)
/data/wanh/ENTER/envs/pytorch/lib/python3.7/site-packages/torch/nn/_reduction.py:42: UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
  warnings.warn(warning.format(ret))
model constructing finished!

label train: [0 1 2 3 4 5] 6
label val: [0 1 2 3 4 5] 6
  ==  Begin finetuning: | Initial stage | Current stage: 1 | CANAL | Dataset: Cross_tissue TabulaMuris_Lung_10X ==
    ==  Epoch: 1 | Classification Loss: 1.796210 | Representation DL Loss: 0.000000  | Accuracy: 7.8419%  ==
    ==  Epoch: 1 | Validation CLS Loss: 1.795722 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.021864  == | Current accuracy: 0.070197  ==
[[  0   0   0  31   0   0]
 [  0   0   0  38   0   0]
 [  0   0   0  70   0   0]
 [  0   0   0  57   0   0]
 [  0   0   0 163   0   0]
 [  0   0   0 453   0   0]]
Patience: 0 / 10
    ==  Epoch: 2 | Classification Loss: 0.988520 | Representation DL Loss: 0.000000  | Accuracy: 71.1246%  ==
    ==  Epoch: 2 | Validation CLS Loss: 2.602915 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.860588  == | Current accuracy: 0.950739  ==
[[ 34   0   0   3   2   0]
 [  2  11   6   0  22   0]
 [  0   0  73   0   0   2]
 [  0   0   0  57   0   0]
 [  0   0   0   0 149   2]
 [  0   0   0   0   1 448]]
Patience: 0 / 10
    ==  Epoch: 3 | Classification Loss: 0.507423 | Representation DL Loss: 0.000000  | Accuracy: 98.3283%  ==
    ==  Epoch: 3 | Validation CLS Loss: 3.431119 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.992711  == | Current accuracy: 0.996305  ==
[[ 39   0   0   0   0   0]
 [  0  37   0   0   1   0]
 [  0   0  68   0   0   0]
 [  0   0   0  57   0   0]
 [  0   0   0   0 156   0]
 [  2   0   0   0   0 452]]
Patience: 0 / 10
    ==  Epoch: 4 | Classification Loss: 0.457459 | Representation DL Loss: 0.000000  | Accuracy: 99.5137%  ==
    ==  Epoch: 4 | Validation CLS Loss: 2.616953 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.997402  == | Current accuracy: 0.998768  ==
[[ 34   0   0   0   0   0]
 [  0  38   0   0   0   0]
 [  0   0  89   0   0   0]
 [  0   0   0  56   0   0]
 [  0   0   0   0 139   0]
 [  1   0   0   0   0 455]]
Patience: 0 / 10
    ==  Epoch: 5 | Classification Loss: 0.445244 | Representation DL Loss: 0.000000  | Accuracy: 99.6657%  ==
    ==  Epoch: 5 | Validation CLS Loss: 2.627202 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.992965  == | Current accuracy: 0.997537  ==
[[ 24   0   0   0   0   0]
 [  0  32   0   0   0   0]
 [  0   0  92   0   0   0]
 [  0   0   0  66   0   0]
 [  0   0   0   0 145   0]
 [  2   0   0   0   0 451]]
Patience: 1 / 10
    ==  Epoch: 6 | Classification Loss: 0.437198 | Representation DL Loss: 0.000000  | Accuracy: 99.8784%  ==
    ==  Epoch: 6 | Validation CLS Loss: 3.163036 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.991406  == | Current accuracy: 0.995074  ==
[[ 32   0   0   0   0   0]
 [  0  41   0   0   0   0]
 [  0   0  68   0   0   1]
 [  0   0   0  71   0   0]
 [  0   0   1   0 144   0]
 [  2   0   0   0   0 452]]
Patience: 2 / 10
    ==  Epoch: 7 | Classification Loss: 0.433138 | Representation DL Loss: 0.000000  | Accuracy: 99.9088%  ==
    ==  Epoch: 7 | Validation CLS Loss: 3.196552 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.990963  == | Current accuracy: 0.992611  ==
[[ 44   0   0   0   0   0]
 [  0  38   0   0   0   0]
 [  0   0  75   0   0   2]
 [  0   0   0  59   0   0]
 [  0   0   4   0 146   0]
 [  0   0   0   0   0 444]]
Patience: 3 / 10
    ==  Epoch: 8 | Classification Loss: 0.431549 | Representation DL Loss: 0.000000  | Accuracy: 100.0000%  ==
    ==  Epoch: 8 | Validation CLS Loss: 2.691682 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.997326  == | Current accuracy: 0.998768  ==
[[ 33   0   0   0   0   0]
 [  0  43   0   0   0   0]
 [  0   0  91   0   0   0]
 [  0   0   0  69   0   0]
 [  0   0   0   0 129   0]
 [  1   0   0   0   0 446]]
Patience: 4 / 10
    ==  Epoch: 9 | Classification Loss: 0.428413 | Representation DL Loss: 0.000000  | Accuracy: 100.0000%  ==
    ==  Epoch: 9 | Validation CLS Loss: 2.826878 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.995109  == | Current accuracy: 0.997537  ==
[[ 36   0   0   0   0   0]
 [  0  51   0   0   0   0]
 [  0   0  82   0   0   0]
 [  0   0   0  69   0   0]
 [  0   0   0   0 142   0]
 [  2   0   0   0   0 430]]
Patience: 5 / 10
    ==  Epoch: 10 | Classification Loss: 0.427367 | Representation DL Loss: 0.000000  | Accuracy: 100.0000%  ==
    ==  Epoch: 10 | Validation CLS Loss: 2.444757 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.996158  == | Current accuracy: 0.997537  ==
[[ 39   0   0   0   0   0]
 [  0  37   0   0   0   0]
 [  0   0  82   0   0   0]
 [  0   0   0  50   0   0]
 [  0   0   1   0 155   0]
 [  1   0   0   0   0 447]]
Patience: 6 / 10
    ==  Epoch: 11 | Classification Loss: 0.426508 | Representation DL Loss: 0.000000  | Accuracy: 100.0000%  ==
    ==  Epoch: 11 | Validation CLS Loss: 2.471816 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.998359  == | Current accuracy: 0.998768  ==
[[ 41   0   0   0   0   0]
 [  0  46   0   0   0   0]
 [  0   0  79   0   0   0]
 [  0   0   0  59   0   0]
 [  0   0   1   0 140   0]
 [  0   0   0   0   0 446]]
Patience: 7 / 10
    ==  Epoch: 12 | Classification Loss: 0.426080 | Representation DL Loss: 0.000000  | Accuracy: 100.0000%  ==
    ==  Epoch: 12 | Validation CLS Loss: 2.682750 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.991984  == | Current accuracy: 0.995074  ==
[[ 35   0   0   0   0   0]
 [  0  36   0   0   0   0]
 [  0   0  73   0   0   1]
 [  0   0   0  55   0   0]
 [  0   0   1   0 137   0]
 [  2   0   0   0   0 472]]
Patience: 8 / 10
    ==  Epoch: 13 | Classification Loss: 0.426046 | Representation DL Loss: 0.000000  | Accuracy: 100.0000%  ==
    ==  Epoch: 13 | Validation CLS Loss: 2.868453 | Validation Representation DL Loss: 0.000000 | F1 Score: 1.000000  == | Current accuracy: 1.000000  ==
[[ 40   0   0   0   0   0]
 [  0  40   0   0   0   0]
 [  0   0  87   0   0   0]
 [  0   0   0  67   0   0]
 [  0   0   0   0 143   0]
 [  0   0   0   0   0 435]]
Patience: 0 / 10
    ==  Epoch: 14 | Classification Loss: 0.425465 | Representation DL Loss: 0.000000  | Accuracy: 100.0000%  ==
    ==  Epoch: 14 | Validation CLS Loss: 2.835544 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.995565  == | Current accuracy: 0.997537  ==
[[ 32   0   0   0   0   0]
 [  0  35   0   0   0   0]
 [  0   0  76   0   0   0]
 [  0   0   0  65   0   0]
 [  0   0   1   0 138   0]
 [  1   0   0   0   0 464]]
Patience: 1 / 10
    ==  Epoch: 15 | Classification Loss: 0.425373 | Representation DL Loss: 0.000000  | Accuracy: 100.0000%  ==
    ==  Epoch: 15 | Validation CLS Loss: 2.831876 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.997797  == | Current accuracy: 0.998768  ==
[[ 41   0   0   0   0   0]
 [  0  40   0   0   0   0]
 [  0   0  91   0   0   0]
 [  0   0   0  64   0   0]
 [  0   0   0   0 149   0]
 [  1   0   0   0   0 426]]
Patience: 2 / 10
    ==  Epoch: 16 | Classification Loss: 0.425474 | Representation DL Loss: 0.000000  | Accuracy: 100.0000%  ==
    ==  Epoch: 16 | Validation CLS Loss: 2.901624 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.992787  == | Current accuracy: 0.996305  ==
[[ 36   0   0   0   0   0]
 [  0  48   0   0   0   0]
 [  0   0  69   0   0   0]
 [  0   0   0  70   0   0]
 [  0   0   0   0 130   0]
 [  3   0   0   0   0 456]]
Patience: 3 / 10
    ==  Epoch: 17 | Classification Loss: 0.425706 | Representation DL Loss: 0.000000  | Accuracy: 100.0000%  ==
    ==  Epoch: 17 | Validation CLS Loss: 2.852321 | Validation Representation DL Loss: 0.000000 | F1 Score: 1.000000  == | Current accuracy: 1.000000  ==
[[ 41   0   0   0   0   0]
 [  0  39   0   0   0   0]
 [  0   0  84   0   0   0]
 [  0   0   0  77   0   0]
 [  0   0   0   0 139   0]
 [  0   0   0   0   0 432]]
Patience: 4 / 10
    ==  Epoch: 18 | Classification Loss: 0.425208 | Representation DL Loss: 0.000000  | Accuracy: 100.0000%  ==
    ==  Epoch: 18 | Validation CLS Loss: 2.825035 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.994437  == | Current accuracy: 0.996305  ==
[[ 39   0   0   0   0   0]
 [  0  42   0   0   0   0]
 [  0   0  84   0   0   0]
 [  0   0   0  57   0   0]
 [  0   0   2   0 126   0]
 [  1   0   0   0   0 461]]
Patience: 5 / 10
    ==  Epoch: 19 | Classification Loss: 0.424822 | Representation DL Loss: 0.000000  | Accuracy: 100.0000%  ==
    ==  Epoch: 19 | Validation CLS Loss: 2.915099 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.995456  == | Current accuracy: 0.997537  ==
[[ 39   0   0   0   0   0]
 [  0  45   0   0   0   0]
 [  0   0  84   0   0   0]
 [  0   0   0  60   0   0]
 [  0   0   0   0 141   0]
 [  2   0   0   0   0 441]]
Patience: 6 / 10
    ==  Epoch: 20 | Classification Loss: 0.425196 | Representation DL Loss: 0.000000  | Accuracy: 100.0000%  ==
    ==  Epoch: 20 | Validation CLS Loss: 2.823490 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.992012  == | Current accuracy: 0.995074  ==
[[ 42   0   0   0   0   0]
 [  0  35   0   0   0   0]
 [  0   0  75   0   0   0]
 [  0   0   0  63   0   0]
 [  0   0   1   0 142   0]
 [  3   0   0   0   0 451]]
Patience: 7 / 10
    ==  Epoch: 21 | Classification Loss: 0.424549 | Representation DL Loss: 0.000000  | Accuracy: 100.0000%  ==
    ==  Epoch: 21 | Validation CLS Loss: 2.730419 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.996778  == | Current accuracy: 0.998768  ==
[[ 27   0   0   0   0   0]
 [  0  39   0   0   0   0]
 [  0   0  87   0   0   0]
 [  0   0   0  65   0   0]
 [  0   0   0   0 159   0]
 [  1   0   0   0   0 434]]
Patience: 8 / 10
    ==  Epoch: 22 | Classification Loss: 0.423875 | Representation DL Loss: 0.000000  | Accuracy: 100.0000%  ==
    ==  Epoch: 22 | Validation CLS Loss: 2.812331 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.997473  == | Current accuracy: 0.998768  ==
[[ 35   0   0   0   0   0]
 [  0  36   0   0   0   0]
 [  0   0  95   0   0   0]
 [  0   0   0  49   0   0]
 [  0   0   0   0 132   0]
 [  1   0   0   0   0 464]]
Patience: 9 / 10
    ==  Epoch: 23 | Classification Loss: 0.423604 | Representation DL Loss: 0.000000  | Accuracy: 100.0000%  ==
    ==  Epoch: 23 | Validation CLS Loss: 2.896816 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.998409  == | Current accuracy: 0.998768  ==
[[ 29   0   0   0   0   0]
 [  0  37   0   0   0   0]
 [  0   0  87   0   0   0]
 [  0   0   0  58   0   0]
 [  0   0   1   0 130   0]
 [  0   0   0   0   0 470]]
Patience: 10 / 10
    ==  Epoch: 24 | Classification Loss: 0.423487 | Representation DL Loss: 0.000000  | Accuracy: 100.0000%  ==
    ==  Epoch: 24 | Validation CLS Loss: 2.216436 | Validation Representation DL Loss: 0.000000 | F1 Score: 1.000000  == | Current accuracy: 1.000000  ==
[[ 43   0   0   0   0   0]
 [  0  36   0   0   0   0]
 [  0   0  91   0   0   0]
 [  0   0   0  60   0   0]
 [  0   0   0   0 127   0]
 [  0   0   0   0   0 455]]
Patience: 11 / 10
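
The accuracy and F1 values printed with each confusion matrix can be checked by hand. The sketch below assumes that rows are true labels and columns are predictions, and that the reported F1 is the unweighted (macro) average over classes. Both are inferences from the log rather than something this tutorial states, and `accuracy_and_macro_f1` is a hypothetical helper, not part of CANAL. Under those assumptions, the epoch 1 matrix of stage 1 reproduces the printed values.

```python
import numpy as np


def accuracy_and_macro_f1(cm):
    """Accuracy and macro-averaged F1 from a confusion matrix.

    Assumes cm[i, j] counts cells with true label i predicted as label j.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp  # predicted as the class, but belonging to another
    fn = cm.sum(axis=1) - tp  # belonging to the class, but predicted as another
    denom = 2 * tp + fp + fn
    # per-class F1 = 2TP / (2TP + FP + FN); a class with an empty denominator gets 0
    f1 = np.divide(2 * tp, denom, out=np.zeros_like(tp), where=denom > 0)
    return tp.sum() / cm.sum(), f1.mean()


# Validation confusion matrix printed after epoch 1 of stage 1
cm_epoch_1 = [
    [0, 0, 0, 31, 0, 0],
    [0, 0, 0, 38, 0, 0],
    [0, 0, 0, 70, 0, 0],
    [0, 0, 0, 57, 0, 0],
    [0, 0, 0, 163, 0, 0],
    [0, 0, 0, 453, 0, 0],
]

acc, macro_f1 = accuracy_and_macro_f1(cm_epoch_1)
print(f"accuracy = {acc:.6f}, macro F1 = {macro_f1:.6f}")
# accuracy = 0.070197, macro F1 = 0.021864
```

Every cell is predicted as class 3 at this point, so only that class contributes a non-zero F1, which is why the macro average is far below the accuracy.
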
/data/wanh/CANAL/model.py:105: ImplicitModificationWarning: Trying to modify attribute `.obs` of view, initializing view as actual.
  adata_i.obs['celltype'] = current_label_dict[i]
/data/wanh/ENTER/envs/pytorch/lib/python3.7/site-packages/anndata/_core/anndata.py:1785: FutureWarning: X.dtype being converted to np.float32 from float64. In the next version of anndata (0.9) conversion will not be automatic. Pass dtype explicitly to avoid this warning. Pass `AnnData(X, dtype=X.dtype, ...)` to get the future behavour.
  [AnnData(sparse.csr_matrix(a.shape), obs=a.obs) for a in all_adatas],


example bank after updating:

AnnData object with n_obs × n_vars = 1712 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch', 'celltype', 'rank', 'stage'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
cell type composition of this example bank:

(array(['B cell', 'T cell', 'endothelial cell', 'macrophage',
       'natural killer cell', 'stromal cell'], dtype=object), array([187, 212, 333, 314, 333, 333]))
dataset composition from each stage of this example bank:

(array([1]), array([1712]))
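
The bank holds 1712 cells rather than the requested `rehearsal_size = 2000`. One allocation rule that reproduces the printed composition is an equal per-class quota of `rehearsal_size // n_classes`, with rarer cell types kept in full. The sketch below illustrates that arithmetic only. The rule is inferred from the output, `class_balanced_quota` is a hypothetical helper, and CANAL's actual selection (which also writes a `rank` column) may work differently.

```python
def class_balanced_quota(counts, rehearsal_size):
    """Cap every class at an equal share of the rehearsal budget (assumed rule)."""
    quota = rehearsal_size // len(counts)
    return {cell_type: min(n, quota) for cell_type, n in counts.items()}


# Cell-type counts of the stage 1 (Lung) training data, copied from the output above
lung_counts = {
    "B cell": 187,
    "T cell": 212,
    "endothelial cell": 412,
    "macrophage": 314,
    "natural killer cell": 721,
    "stromal cell": 2276,
}

bank = class_balanced_quota(lung_counts, rehearsal_size=2000)
print(list(bank.values()))  # [187, 212, 333, 314, 333, 333]
print(sum(bank.values()))   # 1712
```

Because three cell types have fewer than 333 cells, the bank ends up smaller than the budget.
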


stage 2

load the pre-processed dataset "Mammary_Gland"

[5]:
dataset_stage_2 = "TabulaMuris_Mammary_Gland_10X"
data_path_stage_2 = './data/{}/{}_train.h5ad'.format(experiments, dataset_stage_2)
adata_stage_2 = sc.read_h5ad(data_path_stage_2)
cell_type_stage_2 = adata_stage_2.obs['cell_type1']
print(adata_stage_2)
print(np.unique(np.array(cell_type_stage_2), return_counts=True))
AnnData object with n_obs × n_vars = 3981 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p'
(array(['B cell', 'T cell', 'basal cell', 'endothelial cell',
       'luminal epithelial cell of mammary gland', 'macrophage',
       'stromal cell'], dtype=object), array([ 665, 1547,  341,  223,  411,  164,  630]))
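
Comparing this label set with the stage 1 labels shows which cell types the model has not seen yet. The sketch below does that comparison with plain Python sets, using the label arrays printed for the two datasets. It is an illustration, not CANAL's internal code.

```python
# Cell types printed for the stage 1 (Lung) and stage 2 (Mammary_Gland) datasets
lung_types = {
    "B cell", "T cell", "endothelial cell", "macrophage",
    "natural killer cell", "stromal cell",
}
mammary_types = {
    "B cell", "T cell", "basal cell", "endothelial cell",
    "luminal epithelial cell of mammary gland", "macrophage", "stromal cell",
}

new_types = sorted(mammary_types - lung_types)      # unseen before stage 2
absent_types = sorted(lung_types - mammary_types)   # seen in stage 1 only
n_total = len(lung_types | mammary_types)           # classes known after stage 2

print(new_types)     # ['basal cell', 'luminal epithelial cell of mammary gland']
print(absent_types)  # ['natural killer cell']
print(n_total)       # 8
```

The union contains eight cell types, which matches the eight label indices reported in the stage 2 training log. Natural killer cells do not occur in the stage 2 data, so at this stage they are represented only through the example bank.
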

fine-tune the CANAL model for the second stage

[7]:
CANAL.train(experiments=experiments, pre_dataset=dataset_stage_1, dataset=dataset_stage_2,
            adata=adata_stage_2, cell_type=cell_type_stage_2, current_stage=2,
            is_final_stage=False, ckpt_dir='./ckpts/', rehearsal_size=1000, SEED=seed)
current data: AnnData object with n_obs × n_vars = 3981 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p'


(array(['B cell', 'T cell', 'basal cell', 'endothelial cell',
       'luminal epithelial cell of mammary gland', 'macrophage',
       'stromal cell'], dtype=object), array([ 665, 1547,  341,  223,  411,  164,  630]))
model constructing begin!

new cell types: ['basal cell', 'luminal epithelial cell of mammary gland']
example bank for experience replay: AnnData object with n_obs × n_vars = 1712 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch', 'celltype', 'rank', 'stage'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
model constructing finished!

label train: [0 1 2 3 4 5 6 7] 8
label val: [0 1 2 3 4 5 6 7] 8
  ==  Begin finetuning: | Incrmental stage | Current stage: 2 | CANAL | Dataset: Cross_tissue TabulaMuris_Mammary_Gland_10X  ==
/data/wanh/ENTER/envs/pytorch/lib/python3.7/site-packages/torch/nn/_reduction.py:42: UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
  warnings.warn(warning.format(ret))
    ==  Epoch: 1 | Classification Loss: 0.943613 | Representation DL Loss: 0.004290  | Accuracy: 83.0989%  ==
    ==  Epoch: 1 | Validation CLS Loss: 3.049186 | Validation Representation DL Loss: 0.008296 | F1 Score: 0.678923  == | Current accuracy: 0.837743  ==
[[160   0   0   0   0   0   0   0]
 [  6 338   0   0   3   0   0   0]
 [  0   0 118   0   0   0   0   0]
 [ 20   1   0  80   1   4   0   0]
 [  0   0   0   0  76   0   0   0]
 [  2   1   0   0   0 178   0   0]
 [  0   0   0   0   0  71   0   0]
 [  5   4   0   0   0  66   0   0]]
Patience: 0 / 10
    ==  Epoch: 2 | Classification Loss: 0.637936 | Representation DL Loss: 0.247472  | Accuracy: 95.6264%  ==
    ==  Epoch: 2 | Validation CLS Loss: 3.465677 | Validation Representation DL Loss: 0.275875 | F1 Score: 0.980252  == | Current accuracy: 0.984127  ==
[[191   4   0   0   0   0   0   0]
 [  0 343   0   0   4   0   0   0]
 [  0   0 101   0   0   0   0   0]
 [  2   2   0  99   0   0   0   0]
 [  0   0   0   0  56   0   0   0]
 [  0   0   0   2   0 193   0   0]
 [  0   0   0   0   0   0  59   1]
 [  0   0   0   0   0   0   3  74]]
Patience: 0 / 10
    ==  Epoch: 3 | Classification Loss: 0.529432 | Representation DL Loss: 0.269740  | Accuracy: 98.7033%  ==
    ==  Epoch: 3 | Validation CLS Loss: 3.574537 | Validation Representation DL Loss: 0.260100 | F1 Score: 0.992246  == | Current accuracy: 0.992945  ==
[[149   0   0   0   0   0   0   0]
 [  3 361   0   0   1   0   0   0]
 [  0   0  97   0   0   0   0   0]
 [  0   0   0 103   0   0   0   0]
 [  0   0   0   0  71   0   0   0]
 [  0   0   0   0   0 205   0   0]
 [  0   0   0   0   0   0  72   0]
 [  0   3   0   0   0   0   1  68]]
Patience: 0 / 10
    ==  Epoch: 4 | Classification Loss: 0.504755 | Representation DL Loss: 0.261056  | Accuracy: 99.4066%  ==
    ==  Epoch: 4 | Validation CLS Loss: 3.403684 | Validation Representation DL Loss: 0.240331 | F1 Score: 0.992208  == | Current accuracy: 0.992945  ==
[[161   2   0   0   0   0   0   0]
 [  0 356   0   0   2   0   0   0]
 [  0   0 104   0   0   0   0   0]
 [  0   0   0  99   0   0   0   0]
 [  0   2   0   0  63   0   0   0]
 [  0   0   0   0   0 205   0   0]
 [  0   0   0   0   0   0  69   0]
 [  0   2   0   0   0   0   0  69]]
Patience: 1 / 10
    ==  Epoch: 5 | Classification Loss: 0.495891 | Representation DL Loss: 0.239483  | Accuracy: 99.5824%  ==
    ==  Epoch: 5 | Validation CLS Loss: 3.789664 | Validation Representation DL Loss: 0.243155 | F1 Score: 0.994531  == | Current accuracy: 0.995591  ==
[[170   1   0   0   0   0   0   0]
 [  0 319   0   0   0   0   0   0]
 [  0   0 110   0   0   0   0   0]
 [  1   0   0  95   0   0   0   0]
 [  0   1   0   0  75   0   0   0]
 [  0   0   0   0   0 187   0   0]
 [  0   0   0   0   0   0  91   0]
 [  0   0   0   0   0   0   2  82]]
Patience: 0 / 10
    ==  Epoch: 6 | Classification Loss: 0.490662 | Representation DL Loss: 0.220643  | Accuracy: 99.6264%  ==
    ==  Epoch: 6 | Validation CLS Loss: 3.460462 | Validation Representation DL Loss: 0.212096 | F1 Score: 0.990578  == | Current accuracy: 0.990300  ==
[[164   5   0   0   0   0   0   0]
 [  0 343   0   0   3   0   0   0]
 [  0   0 113   0   0   0   0   0]
 [  0   0   0  91   0   0   0   0]
 [  0   2   0   0  73   0   0   0]
 [  0   0   0   0   0 184   0   0]
 [  0   0   0   0   0   0  72   0]
 [  0   0   0   0   0   0   1  83]]
Patience: 1 / 10
    ==  Epoch: 7 | Classification Loss: 0.484313 | Representation DL Loss: 0.205188  | Accuracy: 99.8462%  ==
    ==  Epoch: 7 | Validation CLS Loss: 3.764229 | Validation Representation DL Loss: 0.231196 | F1 Score: 0.996811  == | Current accuracy: 0.997354  ==
[[169   0   0   0   0   0   0   0]
 [  0 360   0   0   0   0   0   0]
 [  0   0 109   0   0   0   0   0]
 [  0   0   0  83   0   0   0   0]
 [  0   2   0   0  61   0   0   0]
 [  0   0   0   0   0 176   0   0]
 [  0   0   0   0   0   0  78   0]
 [  0   1   0   0   0   0   0  95]]
Patience: 0 / 10
    ==  Epoch: 8 | Classification Loss: 0.482291 | Representation DL Loss: 0.195704  | Accuracy: 99.9121%  ==
    ==  Epoch: 8 | Validation CLS Loss: 3.383599 | Validation Representation DL Loss: 0.159232 | F1 Score: 0.993031  == | Current accuracy: 0.992063  ==
[[162   5   0   0   0   0   0   0]
 [  0 356   0   0   1   0   0   0]
 [  0   0 120   0   0   0   0   0]
 [  0   0   0 101   0   0   0   0]
 [  0   2   0   0  70   0   0   0]
 [  0   0   0   0   0 179   0   0]
 [  0   0   0   0   0   0  67   0]
 [  0   1   0   0   0   0   0  70]]
Patience: 1 / 10
    ==  Epoch: 9 | Classification Loss: 0.480391 | Representation DL Loss: 0.177330  | Accuracy: 99.8462%  ==
    ==  Epoch: 9 | Validation CLS Loss: 3.714155 | Validation Representation DL Loss: 0.151181 | F1 Score: 0.996509  == | Current accuracy: 0.995591  ==
[[157   3   0   0   0   0   0   0]
 [  0 352   0   0   1   0   0   0]
 [  0   0 110   0   0   0   0   0]
 [  0   0   0  83   0   0   0   0]
 [  0   0   0   0  93   0   0   0]
 [  0   0   0   0   0 191   0   0]
 [  0   0   0   0   0   0  61   0]
 [  0   1   0   0   0   0   0  82]]
Patience: 2 / 10
    ==  Epoch: 10 | Classification Loss: 0.478689 | Representation DL Loss: 0.175591  | Accuracy: 99.9121%  ==
    ==  Epoch: 10 | Validation CLS Loss: 3.540380 | Validation Representation DL Loss: 0.168283 | F1 Score: 0.987462  == | Current accuracy: 0.988536  ==
[[183   5   0   0   0   0   0   0]
 [  0 333   0   0   1   0   0   0]
 [  0   0 106   0   0   0   0   0]
 [  0   0   0 109   0   0   0   0]
 [  0   3   0   0  72   0   0   0]
 [  0   0   0   0   0 185   0   0]
 [  0   0   0   0   0   0  64   0]
 [  0   2   0   0   0   0   2  69]]
Patience: 3 / 10
    ==  Epoch: 11 | Classification Loss: 0.476955 | Representation DL Loss: 0.175457  | Accuracy: 99.9341%  ==
    ==  Epoch: 11 | Validation CLS Loss: 3.778132 | Validation Representation DL Loss: 0.163178 | F1 Score: 0.995504  == | Current accuracy: 0.993827  ==
[[153   4   0   0   0   0   0   0]
 [  2 381   0   0   0   0   0   0]
 [  0   0 111   0   0   0   0   0]
 [  0   0   0  90   0   0   0   0]
 [  0   1   0   0  65   0   0   0]
 [  0   0   0   0   0 187   0   0]
 [  0   0   0   0   0   0  66   0]
 [  0   0   0   0   0   0   0  74]]
Patience: 4 / 10
    ==  Epoch: 12 | Classification Loss: 0.476649 | Representation DL Loss: 0.176870  | Accuracy: 99.9121%  ==
    ==  Epoch: 12 | Validation CLS Loss: 3.764977 | Validation Representation DL Loss: 0.168706 | F1 Score: 0.993996  == | Current accuracy: 0.994709  ==
[[183   1   0   0   0   0   0   0]
 [  0 329   0   0   1   0   0   0]
 [  0   0 119   0   0   0   0   0]
 [  0   0   0  98   0   0   0   0]
 [  0   1   0   0  59   0   0   0]
 [  0   0   0   0   0 187   0   0]
 [  0   0   0   0   0   0  78   0]
 [  0   3   0   0   0   0   0  75]]
Patience: 5 / 10
    ==  Epoch: 13 | Classification Loss: 0.475814 | Representation DL Loss: 0.172002  | Accuracy: 99.9560%  ==
    ==  Epoch: 13 | Validation CLS Loss: 3.900694 | Validation Representation DL Loss: 0.164699 | F1 Score: 0.993045  == | Current accuracy: 0.992945  ==
[[172   3   0   0   0   0   0   0]
 [  0 352   0   0   2   0   0   0]
 [  0   0 116   0   0   0   0   0]
 [  0   0   0  95   0   0   0   0]
 [  0   0   0   0  63   0   0   0]
 [  0   0   0   0   0 181   0   0]
 [  0   0   0   0   0   0  74   0]
 [  0   3   0   0   0   0   0  73]]
Patience: 6 / 10
    ==  Epoch: 14 | Classification Loss: 0.475027 | Representation DL Loss: 0.171514  | Accuracy: 100.0000%  ==
    ==  Epoch: 14 | Validation CLS Loss: 3.630681 | Validation Representation DL Loss: 0.162798 | F1 Score: 0.995602  == | Current accuracy: 0.995591  ==
[[203   2   0   0   0   0   0   0]
 [  0 323   0   0   2   0   0   0]
 [  0   0 104   0   0   0   0   0]
 [  0   0   0  92   0   0   0   0]
 [  0   0   0   0  60   0   0   0]
 [  0   0   0   0   0 198   0   0]
 [  0   0   0   0   0   0  69   0]
 [  0   1   0   0   0   0   0  80]]
Patience: 7 / 10
    ==  Epoch: 15 | Classification Loss: 0.475448 | Representation DL Loss: 0.171358  | Accuracy: 99.9780%  ==
    ==  Epoch: 15 | Validation CLS Loss: 3.577396 | Validation Representation DL Loss: 0.160683 | F1 Score: 0.994676  == | Current accuracy: 0.993827  ==
[[160   4   0   0   0   0   0   0]
 [  0 336   0   0   0   0   0   0]
 [  0   0 111   0   0   0   0   0]
 [  0   0   0  89   0   0   0   0]
 [  0   2   0   0  69   0   0   0]
 [  0   0   0   0   0 213   0   0]
 [  0   0   0   0   0   0  61   0]
 [  0   1   0   0   0   0   0  88]]
Patience: 8 / 10
    ==  Epoch: 16 | Classification Loss: 0.475443 | Representation DL Loss: 0.164815  | Accuracy: 99.9560%  ==
    ==  Epoch: 16 | Validation CLS Loss: 3.531243 | Validation Representation DL Loss: 0.164516 | F1 Score: 0.997349  == | Current accuracy: 0.996473  ==
[[159   3   0   0   0   0   0   0]
 [  0 352   0   0   0   0   0   0]
 [  0   0 100   0   0   0   0   0]
 [  0   0   0 112   0   0   0   0]
 [  0   0   0   0  63   0   0   0]
 [  0   0   0   0   0 191   0   0]
 [  0   0   0   0   0   0  73   0]
 [  0   1   0   0   0   0   0  80]]
Patience: 9 / 10
    ==  Epoch: 17 | Classification Loss: 0.475245 | Representation DL Loss: 0.169262  | Accuracy: 99.9560%  ==
    ==  Epoch: 17 | Validation CLS Loss: 3.615410 | Validation Representation DL Loss: 0.154457 | F1 Score: 0.997205  == | Current accuracy: 0.997354  ==
[[151   1   0   0   0   0   0   0]
 [  0 337   0   0   1   0   0   0]
 [  0   0 123   0   0   0   0   0]
 [  0   0   0  93   0   0   0   0]
 [  0   0   0   0  65   0   0   0]
 [  0   0   0   0   0 219   0   0]
 [  0   0   0   0   0   0  72   0]
 [  0   1   0   0   0   0   0  71]]
Patience: 10 / 10
    ==  Epoch: 18 | Classification Loss: 0.475913 | Representation DL Loss: 0.161787  | Accuracy: 99.9341%  ==
    ==  Epoch: 18 | Validation CLS Loss: 3.638239 | Validation Representation DL Loss: 0.167357 | F1 Score: 0.992559  == | Current accuracy: 0.992945  ==
[[167   3   0   0   0   0   0   0]
 [  0 381   0   0   3   0   0   0]
 [  0   0  96   0   0   0   0   0]
 [  0   0   0 101   0   0   0   0]
 [  0   0   0   0  50   0   0   0]
 [  0   0   0   0   0 174   0   0]
 [  0   0   0   0   0   0  68   0]
 [  0   2   0   0   0   0   0  89]]
Patience: 11 / 10
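
The `Patience` counter in the training logs behaves like a standard early-stopping rule: it resets to 0 when the validation accuracy strictly exceeds the best value so far, increases by 1 otherwise, and training ends once it exceeds 10. This reading is inferred from the printed values, not from CANAL's source. The sketch below replays the stage 2 validation accuracies under that assumption; `replay_patience` is a hypothetical helper.

```python
def replay_patience(val_accuracies, max_patience=10):
    """Replay an early-stopping counter over a sequence of validation accuracies."""
    best = float("-inf")
    patience = 0
    history = []
    for acc in val_accuracies:
        if acc > best:        # strict improvement resets the counter
            best, patience = acc, 0
        else:                 # a tie or a drop counts against the budget
            patience += 1
        history.append(patience)
        if patience > max_patience:
            break             # stop once the counter exceeds the budget
    return history


# "Current accuracy" values of stage 2, epochs 1 to 18, copied from the log
stage_2_val_acc = [
    0.837743, 0.984127, 0.992945, 0.992945, 0.995591, 0.990300,
    0.997354, 0.992063, 0.995591, 0.988536, 0.993827, 0.994709,
    0.992945, 0.995591, 0.993827, 0.996473, 0.997354, 0.992945,
]

print(replay_patience(stage_2_val_acc))
# [0, 0, 0, 1, 0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
```

Note that epoch 17 ties the best accuracy from epoch 7 and still increments the counter, so a tie does not count as an improvement under this rule.
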
/data/wanh/CANAL/model.py:105: ImplicitModificationWarning: Trying to modify attribute `.obs` of view, initializing view as actual.
  adata_i.obs['celltype'] = current_label_dict[i]
/data/wanh/ENTER/envs/pytorch/lib/python3.7/site-packages/anndata/_core/anndata.py:1785: FutureWarning: X.dtype being converted to np.float32 from float64. In the next version of anndata (0.9) conversion will not be automatic. Pass dtype explicitly to avoid this warning. Pass `AnnData(X, dtype=X.dtype, ...)` to get the future behavour.
  [AnnData(sparse.csr_matrix(a.shape), obs=a.obs) for a in all_adatas],


example bank after updating:

AnnData object with n_obs × n_vars = 995 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch', 'celltype', 'rank', 'stage'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
cell type composition of this example bank:

(array(['B cell', 'T cell', 'basal cell', 'endothelial cell',
       'luminal epithelial cell of mammary gland', 'macrophage',
       'natural killer cell', 'stromal cell'], dtype=object), array([124, 124, 125, 124, 125, 124, 125, 124]))
dataset composition from each stage of this example bank:

(array([1, 2]), array([435, 560]))
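
With `rehearsal_size = 1000` and eight known cell types, an equal quota is 1000 // 8 = 125 cells per type. The cell types that occur in both the old bank and the new data end up with 124 cells, which is what an even split of the quota between the two sources would give (125 // 2 = 62 from each). The sketch below checks that this assumed rule reproduces both the per-type and the per-stage totals printed above. It is inferred from the output and is not taken from CANAL's source.

```python
rehearsal_size = 1000

# Stage 1 example bank and stage 2 (Mammary_Gland) training counts, copied from the outputs
old_bank = {
    "B cell": 187, "T cell": 212, "endothelial cell": 333,
    "macrophage": 314, "natural killer cell": 333, "stromal cell": 333,
}
new_data = {
    "B cell": 665, "T cell": 1547, "basal cell": 341, "endothelial cell": 223,
    "luminal epithelial cell of mammary gland": 411, "macrophage": 164,
    "stromal cell": 630,
}
pools = {1: old_bank, 2: new_data}

all_types = sorted(set(old_bank) | set(new_data))
quota = rehearsal_size // len(all_types)  # 125 cells per cell type

per_type = {}
per_stage = {1: 0, 2: 0}
for cell_type in all_types:
    # stages that can contribute cells of this type
    stages = [s for s, pool in pools.items() if cell_type in pool]
    share = quota // len(stages)  # assumed: even split across contributing stages
    kept = 0
    for s in stages:
        n = min(pools[s][cell_type], share)
        per_stage[s] += n
        kept += n
    per_type[cell_type] = kept

print(list(per_type.values()))  # [124, 124, 125, 124, 125, 124, 125, 124]
print(per_stage)                # {1: 435, 2: 560}
print(sum(per_type.values()))   # 995
```
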


stage 3

load the pre-processed dataset "Limb_Muscle"

[6]:
dataset_stage_3 = "TabulaMuris_Limb_Muscle_10X"
data_path_stage_3 = './data/{}/{}_train.h5ad'.format(experiments, dataset_stage_3)
adata_stage_3 = sc.read_h5ad(data_path_stage_3)
cell_type_stage_3 = adata_stage_3.obs['cell_type1']
print(adata_stage_3)
print(np.unique(np.array(cell_type_stage_3), return_counts=True))
AnnData object with n_obs × n_vars = 3409 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p'
(array(['B cell', 'T cell', 'endothelial cell', 'macrophage',
       'mesenchymal stem cell', 'skeletal muscle satellite cell'],
      dtype=object), array([ 398,  269, 1157,  274,  998,  313]))

fine-tune the CANAL model for the third stage

[7]:
CANAL.train(experiments = experiments, pre_dataset = dataset_stage_2, dataset = dataset_stage_3,
            adata = adata_stage_3, cell_type = cell_type_stage_3, current_stage = 3, is_final_stage = False,
            ckpt_dir = './ckpts/', rehearsal_size = 1000, SEED = seed)
current data: AnnData object with n_obs × n_vars = 3409 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p'


(array(['B cell', 'T cell', 'endothelial cell', 'macrophage',
       'mesenchymal stem cell', 'skeletal muscle satellite cell'],
      dtype=object), array([ 398,  269, 1157,  274,  998,  313]))
model constructing begin!

new cell types: ['mesenchymal stem cell', 'skeletal muscle satellite cell']
example bank for experience replay: AnnData object with n_obs × n_vars = 995 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch', 'celltype', 'rank', 'stage'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
/data/wanh/CANAL/performer_pytorch/performer_pytorch.py:115: UserWarning: torch.qr is deprecated in favor of torch.linalg.qr and will be removed in a future PyTorch release.
The boolean parameter 'some' has been replaced with a string parameter 'mode'.
Q, R = torch.qr(A, some)
should be replaced with
Q, R = torch.linalg.qr(A, 'reduced' if some else 'complete') (Triggered internally at  ../aten/src/ATen/native/BatchLinearAlgebra.cpp:2497.)
  q, r = torch.qr(unstructured_block.cpu(), some = True)
model constructing finished!

label train: [0 1 2 3 4 5 6 7 8 9] 10
label val: [0 1 2 3 4 5 6 7 8 9] 10
  ==  Begin finetuning: | Incrmental stage | Current stage: 3 | CANAL | Dataset: Cross_tissue TabulaMuris_Limb_Muscle_10X  ==
/data/wanh/ENTER/envs/pytorch/lib/python3.7/site-packages/torch/nn/_reduction.py:42: UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
  warnings.warn(warning.format(ret))
    ==  Epoch: 1 | Classification Loss: 1.269150 | Representation DL Loss: 0.014623  | Accuracy: 67.7291%  ==
    ==  Epoch: 1 | Validation CLS Loss: 3.424092 | Validation Representation DL Loss: 0.033891 | F1 Score: 0.700836  == | Current accuracy: 0.686636  ==
[[ 98   0   0   0   0   0   0   0   0   0]
 [  0  73   1   0   1   0   0   0   0   0]
 [  0   0 254   0   0   0   0   0   0   0]
 [  0   0   0  67   0   1   0   0   0   0]
 [  0   0   0   0  18   0   0   0   0   0]
 [  0   0   0   0   0  26   0   0   0   0]
 [  0   0   0   0   0   0  29   0   0   0]
 [  0   0   0   0   0   0   0  31   0   0]
 [  0   0   2   0   0 200   0   0   0   0]
 [  2  17   0   0   0  46   0   0   2   0]]
Patience: 0 / 10
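Each epoch block prints a confusion matrix over the 10 classes seen so far, and the two validation metrics on the line above it can be recomputed from that matrix. The sketch below uses only the matrix printed for epoch 1. It assumes that `Current accuracy` is overall accuracy and that `F1 Score` is a macro average over classes. Both quantities are unchanged when the matrix is transposed, so the check does not depend on whether rows hold true or predicted labels. `metrics_from_confusion` is a hypothetical helper, not part of CANAL.

```python
def metrics_from_confusion(matrix):
    """Return (accuracy, macro_f1) for a square confusion matrix given as a list of rows."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    f1_scores = []
    for i in range(n):
        tp = matrix[i][i]
        row_sum = sum(matrix[i])
        col_sum = sum(matrix[r][i] for r in range(n))
        denom = row_sum + col_sum  # equals 2*TP + FP + FN
        # A class with no true and no predicted cells contributes an F1 of 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return correct / total, sum(f1_scores) / n


# Validation confusion matrix printed for epoch 1 of stage 3.
epoch_1 = [
    [98, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 73, 1, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 254, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 67, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 18, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 26, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 29, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 31, 0, 0],
    [0, 0, 2, 0, 0, 200, 0, 0, 0, 0],
    [2, 17, 0, 0, 0, 46, 0, 0, 2, 0],
]

accuracy, macro_f1 = metrics_from_confusion(epoch_1)
print(f"{accuracy:.6f} {macro_f1:.6f}")
# 0.686636 0.700836
```

Both values match the epoch 1 validation line, which supports reading the printed F1 score as a macro average.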
    ==  Epoch: 2 | Classification Loss: 0.700298 | Representation DL Loss: 0.393118  | Accuracy: 93.5970%  ==
    ==  Epoch: 2 | Validation CLS Loss: 3.925053 | Validation Representation DL Loss: 0.438186 | F1 Score: 0.963895  == | Current accuracy: 0.981567  ==
[[110   0   0   0   0   0   0   0   0   0]
 [  0  71   0   0   0   0   0   0   0   0]
 [  0   0 238   0   0   0   0   0   0   0]
 [  0   0   0  69   0   1   0   0   0   0]
 [  0   0   0   0  27   0   0   0   0   0]
 [  0   0   0   0   0  17   0   0  15   0]
 [  0   0   0   0   0   0  29   0   0   0]
 [  0   0   0   0   0   0   0  21   0   0]
 [  0   0   0   0   0   0   0   0 214   0]
 [  0   0   0   0   0   0   0   0   0  56]]
Patience: 0 / 10
    ==  Epoch: 3 | Classification Loss: 0.571220 | Representation DL Loss: 0.381694  | Accuracy: 98.6056%  ==
    ==  Epoch: 3 | Validation CLS Loss: 3.659403 | Validation Representation DL Loss: 0.350093 | F1 Score: 0.970678  == | Current accuracy: 0.981567  ==
[[ 98   0   0   0   0   0   0   0   0   0]
 [  0  82   0   0   1   0   0   0   0   0]
 [  0   0 250   0   0   0   0   0   0   0]
 [  0   0   2  61   0   0   0   0   0   0]
 [  0   0   0   0  22   0   0   0   0   0]
 [  0   0   0   0   0  16   0   0   7   0]
 [  0   0   0   0   0   0  25   0   0   0]
 [  0   0   0   0   0   0   0  23   0   0]
 [  0   0   4   1   0   1   0   0 217   0]
 [  0   0   0   0   0   0   0   0   0  58]]
Patience: 1 / 10
    ==  Epoch: 4 | Classification Loss: 0.547451 | Representation DL Loss: 0.349536  | Accuracy: 98.9186%  ==
    ==  Epoch: 4 | Validation CLS Loss: 4.033183 | Validation Representation DL Loss: 0.387530 | F1 Score: 0.976299  == | Current accuracy: 0.983871  ==
[[110   0   0   0   0   0   0   0   0   0]
 [  0  86   0   0   1   0   0   0   0   0]
 [  0   0 223   0   0   0   0   0   0   0]
 [  0   0   0  64   0   0   0   0   0   0]
 [  0   0   0   0  30   0   0   0   0   0]
 [  0   0   0   0   0  22   0   0   9   0]
 [  0   0   0   0   0   0  25   0   0   0]
 [  0   0   0   0   0   0   0  21   0   0]
 [  0   0   3   0   0   0   0   0 199   0]
 [  0   0   0   0   0   0   0   0   1  74]]
Patience: 0 / 10
    ==  Epoch: 5 | Classification Loss: 0.536278 | Representation DL Loss: 0.325964  | Accuracy: 99.2032%  ==
    ==  Epoch: 5 | Validation CLS Loss: 3.911783 | Validation Representation DL Loss: 0.308773 | F1 Score: 0.983484  == | Current accuracy: 0.987327  ==
[[115   0   0   0   0   0   0   0   0   0]
 [  0  75   0   0   1   0   0   0   0   0]
 [  0   0 236   0   0   0   0   0   0   0]
 [  0   0   1  85   0   0   0   0   0   0]
 [  0   0   0   0  23   0   0   0   0   0]
 [  0   0   0   0   0  24   0   0   3   0]
 [  0   0   0   0   0   0  20   0   0   0]
 [  0   0   0   0   0   0   0  21   0   0]
 [  0   0   1   2   0   1   0   0 199   0]
 [  0   0   0   0   0   0   0   0   2  59]]
Patience: 0 / 10
    ==  Epoch: 6 | Classification Loss: 0.524765 | Representation DL Loss: 0.294313  | Accuracy: 99.7154%  ==
    ==  Epoch: 6 | Validation CLS Loss: 3.702503 | Validation Representation DL Loss: 0.267107 | F1 Score: 0.991185  == | Current accuracy: 0.991935  ==
[[100   0   0   0   0   0   0   0   0   0]
 [  0  66   0   0   1   0   0   0   0   0]
 [  0   0 257   0   0   0   0   0   0   0]
 [  0   0   2  86   0   0   0   0   0   0]
 [  0   0   0   0  34   0   0   0   0   0]
 [  0   0   0   0   0  20   0   0   1   0]
 [  0   0   0   0   0   0  29   0   0   0]
 [  0   0   0   0   0   0   0  25   0   0]
 [  0   0   1   1   0   0   0   0 181   0]
 [  0   0   0   0   0   0   0   0   1  63]]
Patience: 0 / 10
    ==  Epoch: 7 | Classification Loss: 0.519414 | Representation DL Loss: 0.273046  | Accuracy: 99.8577%  ==
    ==  Epoch: 7 | Validation CLS Loss: 3.830669 | Validation Representation DL Loss: 0.249767 | F1 Score: 0.986853  == | Current accuracy: 0.989631  ==
[[ 91   0   0   0   0   0   0   0   0   0]
 [  0  95   0   0   1   0   0   0   0   0]
 [  0   0 244   0   0   0   0   0   0   0]
 [  0   0   1  81   0   0   0   0   0   0]
 [  0   0   0   0  28   0   0   0   0   0]
 [  0   0   0   0   0  22   0   0   3   0]
 [  0   0   0   0   0   0  20   0   0   0]
 [  0   0   0   0   0   0   0  24   0   0]
 [  0   0   2   1   0   0   0   0 202   0]
 [  0   0   0   0   0   0   0   0   1  52]]
Patience: 1 / 10
    ==  Epoch: 8 | Classification Loss: 0.516827 | Representation DL Loss: 0.252431  | Accuracy: 99.8862%  ==
    ==  Epoch: 8 | Validation CLS Loss: 4.209375 | Validation Representation DL Loss: 0.231310 | F1 Score: 0.987909  == | Current accuracy: 0.990783  ==
[[109   0   0   0   0   0   0   0   0   0]
 [  0  74   0   0   0   0   0   0   0   0]
 [  0   0 259   0   0   0   0   0   0   0]
 [  0   0   1  70   0   0   0   0   0   0]
 [  0   0   0   0  28   0   0   0   0   0]
 [  0   0   0   0   0  22   0   0   3   0]
 [  0   0   0   0   0   0  32   0   0   0]
 [  0   0   0   0   0   0   0  27   0   0]
 [  0   0   0   3   0   0   0   0 176   0]
 [  0   0   0   0   0   0   0   0   1  63]]
Patience: 2 / 10
    ==  Epoch: 9 | Classification Loss: 0.514036 | Representation DL Loss: 0.241894  | Accuracy: 100.0000%  ==
    ==  Epoch: 9 | Validation CLS Loss: 3.690444 | Validation Representation DL Loss: 0.191970 | F1 Score: 0.983331  == | Current accuracy: 0.990783  ==
[[ 97   0   0   0   0   0   0   0   0   0]
 [  0  87   0   0   2   0   0   0   0   0]
 [  0   0 243   0   0   0   0   0   0   0]
 [  0   0   2  91   0   0   0   0   0   0]
 [  0   0   0   0  27   0   0   0   0   0]
 [  0   0   2   0   0  16   0   0   0   0]
 [  0   0   0   0   0   0  18   0   0   0]
 [  0   0   0   0   0   0   0  23   0   0]
 [  0   0   0   0   0   1   0   0 209   0]
 [  0   0   0   0   0   0   0   0   1  49]]
Patience: 3 / 10
    ==  Epoch: 10 | Classification Loss: 0.512463 | Representation DL Loss: 0.219028  | Accuracy: 99.9715%  ==
    ==  Epoch: 10 | Validation CLS Loss: 3.722399 | Validation Representation DL Loss: 0.214890 | F1 Score: 0.989406  == | Current accuracy: 0.991935  ==
[[100   0   0   0   0   0   0   0   0   0]
 [  0  72   0   0   1   0   0   0   0   0]
 [  0   0 226   0   0   0   0   0   0   0]
 [  0   0   2  77   0   0   0   0   0   0]
 [  0   0   0   0  23   0   0   0   0   0]
 [  0   0   0   0   0  29   0   0   3   0]
 [  0   0   0   0   0   0  24   0   0   0]
 [  0   0   0   0   0   0   0  30   0   0]
 [  0   0   1   0   0   0   0   0 216   0]
 [  0   0   0   0   0   0   0   0   0  64]]
Patience: 4 / 10
    ==  Epoch: 11 | Classification Loss: 0.510772 | Representation DL Loss: 0.208613  | Accuracy: 100.0000%  ==
    ==  Epoch: 11 | Validation CLS Loss: 3.784714 | Validation Representation DL Loss: 0.201576 | F1 Score: 0.983138  == | Current accuracy: 0.991935  ==
[[101   0   0   0   0   0   0   0   0   0]
 [  0  78   0   0   1   0   0   0   0   0]
 [  0   0 247   0   0   0   0   0   0   0]
 [  0   0   1  79   0   0   0   0   0   0]
 [  0   0   0   0  26   0   0   0   0   0]
 [  0   0   0   0   0  18   0   0   5   0]
 [  0   0   0   0   0   0  21   0   0   0]
 [  0   0   0   0   0   0   0  26   0   0]
 [  0   0   0   0   0   0   0   0 188   0]
 [  0   0   0   0   0   0   0   0   0  77]]
Patience: 5 / 10
    ==  Epoch: 12 | Classification Loss: 0.509933 | Representation DL Loss: 0.193282  | Accuracy: 100.0000%  ==
    ==  Epoch: 12 | Validation CLS Loss: 3.688917 | Validation Representation DL Loss: 0.188578 | F1 Score: 0.988748  == | Current accuracy: 0.993088  ==
[[112   0   0   0   0   0   0   0   0   0]
 [  0  70   0   0   1   0   0   0   0   0]
 [  0   0 222   0   0   0   0   0   0   0]
 [  0   0   2  70   0   0   0   0   0   0]
 [  0   0   0   0  17   0   0   0   0   0]
 [  0   0   0   0   0  28   0   0   3   0]
 [  0   0   0   0   0   0  31   0   0   0]
 [  0   0   0   0   0   0   0  37   0   0]
 [  0   0   0   0   0   0   0   0 200   0]
 [  0   0   0   0   0   0   0   0   0  75]]
Patience: 0 / 10
    ==  Epoch: 13 | Classification Loss: 0.509605 | Representation DL Loss: 0.187941  | Accuracy: 100.0000%  ==
    ==  Epoch: 13 | Validation CLS Loss: 3.599265 | Validation Representation DL Loss: 0.181978 | F1 Score: 0.978821  == | Current accuracy: 0.988479  ==
[[ 82   0   0   0   0   0   0   0   0   0]
 [  0  71   0   0   1   0   0   0   0   0]
 [  0   0 272   0   0   0   0   0   0   0]
 [  0   0   1  68   0   0   0   0   0   0]
 [  0   0   0   0  19   0   0   0   0   0]
 [  0   0   1   0   0  17   0   0   4   0]
 [  0   0   0   0   0   0  22   0   0   0]
 [  0   0   0   0   0   0   0  25   0   0]
 [  0   0   0   1   0   0   0   0 226   0]
 [  0   0   0   0   0   0   0   0   2  56]]
Patience: 1 / 10
    ==  Epoch: 14 | Classification Loss: 0.509407 | Representation DL Loss: 0.181221  | Accuracy: 100.0000%  ==
    ==  Epoch: 14 | Validation CLS Loss: 3.799653 | Validation Representation DL Loss: 0.157986 | F1 Score: 0.976329  == | Current accuracy: 0.983871  ==
[[105   0   0   0   0   0   0   0   0   0]
 [  0  79   0   0   2   0   0   0   0   0]
 [  0   0 266   0   0   0   0   0   0   0]
 [  0   0   2  74   0   0   0   0   0   0]
 [  0   0   0   0  24   0   0   0   0   0]
 [  0   0   1   0   0  30   0   0   7   0]
 [  0   0   0   0   0   0  24   0   0   0]
 [  0   0   0   0   0   0   0  26   0   0]
 [  0   0   0   0   0   1   0   0 176   0]
 [  0   0   0   0   0   0   0   0   1  50]]
Patience: 2 / 10
    ==  Epoch: 15 | Classification Loss: 0.509605 | Representation DL Loss: 0.179538  | Accuracy: 100.0000%  ==
    ==  Epoch: 15 | Validation CLS Loss: 3.713005 | Validation Representation DL Loss: 0.158739 | F1 Score: 0.984616  == | Current accuracy: 0.990783  ==
[[100   0   0   0   0   0   0   0   0   0]
 [  0  91   0   0   0   0   0   0   0   0]
 [  0   0 245   0   0   0   0   0   0   0]
 [  0   0   0  82   0   0   0   0   0   0]
 [  0   0   0   0  25   0   0   0   0   0]
 [  0   0   1   0   0  24   0   0   3   0]
 [  0   0   0   0   0   0  36   0   0   0]
 [  0   0   0   0   0   0   0  22   0   0]
 [  0   0   0   1   0   3   0   0 186   0]
 [  0   0   0   0   0   0   0   0   0  49]]
Patience: 3 / 10
    ==  Epoch: 16 | Classification Loss: 0.510118 | Representation DL Loss: 0.202223  | Accuracy: 100.0000%  ==
    ==  Epoch: 16 | Validation CLS Loss: 3.923636 | Validation Representation DL Loss: 0.204042 | F1 Score: 0.971375  == | Current accuracy: 0.983871  ==
[[ 99   0   0   0   0   0   0   0   0   0]
 [  0  58   0   0   2   0   0   0   0   0]
 [  0   0 266   0   0   0   0   0   0   0]
 [  0   0   1  65   0   0   0   0   0   0]
 [  0   0   0   0  28   0   0   0   0   0]
 [  0   0   4   0   0  17   0   0   1   0]
 [  0   0   0   0   0   0  26   0   0   0]
 [  0   0   0   0   0   0   0  33   0   0]
 [  0   0   0   2   0   2   0   0 194   0]
 [  0   0   0   0   0   0   0   0   2  68]]
Patience: 4 / 10
    ==  Epoch: 17 | Classification Loss: 0.509321 | Representation DL Loss: 0.192837  | Accuracy: 100.0000%  ==
    ==  Epoch: 17 | Validation CLS Loss: 3.836862 | Validation Representation DL Loss: 0.181496 | F1 Score: 0.996914  == | Current accuracy: 0.997696  ==
[[103   0   0   0   0   0   0   0   0   0]
 [  0  86   0   0   0   0   0   0   0   0]
 [  0   0 265   0   0   0   0   0   0   0]
 [  0   0   0  68   0   0   0   0   0   0]
 [  0   0   0   0  18   0   0   0   0   0]
 [  0   0   0   0   0  27   0   0   1   0]
 [  0   0   0   0   0   0  24   0   0   0]
 [  0   0   0   0   0   0   0  18   0   0]
 [  0   0   0   1   0   0   0   0 185   0]
 [  0   0   0   0   0   0   0   0   0  72]]
Patience: 0 / 10
    ==  Epoch: 18 | Classification Loss: 0.508944 | Representation DL Loss: 0.184233  | Accuracy: 100.0000%  ==
    ==  Epoch: 18 | Validation CLS Loss: 3.757966 | Validation Representation DL Loss: 0.172289 | F1 Score: 0.985945  == | Current accuracy: 0.991935  ==
[[ 88   0   0   0   0   0   0   0   0   0]
 [  0  94   0   0   1   0   0   0   0   0]
 [  0   0 261   0   0   0   0   0   0   0]
 [  0   0   1  75   0   0   0   0   0   0]
 [  0   0   0   0  26   0   0   0   0   0]
 [  0   0   1   0   0  21   0   0   3   0]
 [  0   0   0   0   0   0  32   0   0   0]
 [  0   0   0   0   0   0   0  21   0   0]
 [  0   0   0   0   0   0   0   0 182   0]
 [  0   0   0   0   0   0   0   0   1  61]]
Patience: 1 / 10
    ==  Epoch: 19 | Classification Loss: 0.508631 | Representation DL Loss: 0.180292  | Accuracy: 100.0000%  ==
    ==  Epoch: 19 | Validation CLS Loss: 3.911557 | Validation Representation DL Loss: 0.169076 | F1 Score: 0.970801  == | Current accuracy: 0.986175  ==
[[109   0   0   0   0   0   0   0   0   0]
 [  0  71   0   0   3   0   0   0   0   0]
 [  0   0 247   0   0   0   0   0   0   0]
 [  0   0   0  78   0   0   0   0   0   0]
 [  0   0   0   0  32   0   0   0   0   0]
 [  0   0   1   0   0  15   0   0   4   0]
 [  0   0   0   0   0   0  20   0   0   0]
 [  0   0   0   0   0   0   0  30   0   0]
 [  0   0   0   1   0   2   0   0 202   0]
 [  0   0   0   0   0   0   0   0   1  52]]
Patience: 2 / 10
    ==  Epoch: 20 | Classification Loss: 0.508338 | Representation DL Loss: 0.175783  | Accuracy: 100.0000%  ==
    ==  Epoch: 20 | Validation CLS Loss: 3.691231 | Validation Representation DL Loss: 0.149613 | F1 Score: 0.983890  == | Current accuracy: 0.990783  ==
[[108   0   0   0   0   0   0   0   0   0]
 [  0  64   0   0   3   0   0   0   0   0]
 [  0   0 252   0   0   0   0   0   0   0]
 [  0   0   0  82   0   0   0   0   0   0]
 [  0   0   0   0  33   0   0   0   0   0]
 [  0   0   0   0   0  26   0   0   4   0]
 [  0   0   0   0   0   0  36   0   0   0]
 [  0   0   0   0   0   0   0  21   0   0]
 [  0   0   0   0   0   0   0   0 190   0]
 [  0   0   0   0   0   0   0   0   1  48]]
Patience: 3 / 10
    ==  Epoch: 21 | Classification Loss: 0.507941 | Representation DL Loss: 0.168473  | Accuracy: 100.0000%  ==
    ==  Epoch: 21 | Validation CLS Loss: 3.721588 | Validation Representation DL Loss: 0.146946 | F1 Score: 0.981061  == | Current accuracy: 0.988479  ==
[[100   0   0   0   0   0   0   0   0   0]
 [  0  91   0   0   2   0   0   0   0   0]
 [  0   0 252   0   0   0   0   0   0   0]
 [  0   0   1  81   0   0   0   0   0   0]
 [  0   0   0   0  21   0   0   0   0   0]
 [  0   0   0   0   0  24   0   0   2   0]
 [  0   0   0   0   0   0  22   0   0   0]
 [  0   0   0   0   0   0   0  22   0   0]
 [  0   0   0   2   0   3   0   0 185   0]
 [  0   0   0   0   0   0   0   0   0  60]]
Patience: 4 / 10
    ==  Epoch: 22 | Classification Loss: 0.507223 | Representation DL Loss: 0.155871  | Accuracy: 100.0000%  ==
    ==  Epoch: 22 | Validation CLS Loss: 3.725919 | Validation Representation DL Loss: 0.140846 | F1 Score: 0.981810  == | Current accuracy: 0.989631  ==
[[119   0   0   0   0   0   0   0   0   0]
 [  0  74   0   0   3   0   0   0   0   0]
 [  0   0 256   0   0   0   0   0   0   0]
 [  0   0   0  59   0   0   0   0   0   0]
 [  0   0   0   0  38   0   0   0   0   0]
 [  0   0   0   0   0  20   0   0   3   0]
 [  0   0   0   0   0   0  15   0   0   0]
 [  0   0   0   0   0   0   0  23   0   0]
 [  0   0   0   0   0   1   0   0 202   0]
 [  0   0   0   0   0   0   0   0   2  53]]
Patience: 5 / 10
    ==  Epoch: 23 | Classification Loss: 0.506500 | Representation DL Loss: 0.144887  | Accuracy: 100.0000%  ==
    ==  Epoch: 23 | Validation CLS Loss: 3.806984 | Validation Representation DL Loss: 0.135816 | F1 Score: 0.981420  == | Current accuracy: 0.986175  ==
[[109   0   0   0   0   0   0   0   0   0]
 [  0  62   0   0   1   0   0   0   0   0]
 [  0   0 220   0   0   0   0   0   0   0]
 [  0   0   2  94   0   0   0   0   0   0]
 [  0   0   0   0  30   0   0   0   0   0]
 [  0   0   2   0   0  25   0   0   2   0]
 [  0   0   0   0   0   0  25   0   0   0]
 [  0   0   0   0   0   0   0  21   0   0]
 [  0   0   0   2   0   2   0   0 205   0]
 [  0   0   0   0   0   0   0   0   1  65]]
Patience: 6 / 10
    ==  Epoch: 24 | Classification Loss: 0.506209 | Representation DL Loss: 0.136717  | Accuracy: 100.0000%  ==
    ==  Epoch: 24 | Validation CLS Loss: 3.772007 | Validation Representation DL Loss: 0.135068 | F1 Score: 0.980062  == | Current accuracy: 0.988479  ==
[[119   0   0   0   0   0   0   0   0   0]
 [  0  63   0   0   1   0   0   0   0   0]
 [  0   0 236   0   0   0   0   0   0   0]
 [  0   0   2  79   0   0   0   0   0   0]
 [  0   0   0   0  23   0   0   0   0   0]
 [  0   0   2   0   0  15   0   0   1   0]
 [  0   0   0   0   0   0  27   0   0   0]
 [  0   0   0   0   0   0   0  28   0   0]
 [  0   0   0   2   0   1   0   0 196   0]
 [  0   0   0   0   0   0   0   0   1  72]]
Patience: 7 / 10
    ==  Epoch: 25 | Classification Loss: 0.506000 | Representation DL Loss: 0.136285  | Accuracy: 100.0000%  ==
    ==  Epoch: 25 | Validation CLS Loss: 3.553699 | Validation Representation DL Loss: 0.117606 | F1 Score: 0.981745  == | Current accuracy: 0.986175  ==
[[ 91   0   0   0   0   0   0   0   0   0]
 [  0  66   0   0   0   0   0   0   0   0]
 [  0   0 261   0   0   0   0   0   0   0]
 [  0   0   2  84   0   0   0   0   0   0]
 [  0   0   0   0  29   0   0   0   0   0]
 [  0   0   3   0   0  26   0   0   4   0]
 [  0   0   0   0   0   0  24   0   0   0]
 [  0   0   0   0   0   0   0  20   0   0]
 [  0   0   0   2   0   1   0   0 209   0]
 [  0   0   0   0   0   0   0   0   0  46]]
Patience: 8 / 10
    ==  Epoch: 26 | Classification Loss: 0.505666 | Representation DL Loss: 0.139120  | Accuracy: 100.0000%  ==
    ==  Epoch: 26 | Validation CLS Loss: 3.644901 | Validation Representation DL Loss: 0.135973 | F1 Score: 0.984632  == | Current accuracy: 0.990783  ==
[[ 94   0   0   0   0   0   0   0   0   0]
 [  0  77   0   0   2   0   0   0   0   0]
 [  0   0 258   0   0   0   0   0   0   0]
 [  0   0   0  84   0   0   0   0   0   0]
 [  0   0   0   0  26   0   0   0   0   0]
 [  0   0   1   0   0  25   0   0   2   0]
 [  0   0   0   0   0   0  16   0   0   0]
 [  0   0   0   0   0   0   0  20   0   0]
 [  0   0   0   0   0   1   0   0 195   0]
 [  0   0   0   0   0   0   0   0   2  65]]
Patience: 9 / 10
    ==  Epoch: 27 | Classification Loss: 0.505538 | Representation DL Loss: 0.139681  | Accuracy: 100.0000%  ==
    ==  Epoch: 27 | Validation CLS Loss: 3.967983 | Validation Representation DL Loss: 0.126298 | F1 Score: 0.984596  == | Current accuracy: 0.990783  ==
[[118   0   0   0   0   0   0   0   0   0]
 [  0  74   0   0   2   0   0   0   0   0]
 [  0   0 238   0   0   0   0   0   0   0]
 [  0   0   1  79   0   0   0   0   0   0]
 [  0   0   0   0  22   0   0   0   0   0]
 [  0   0   1   0   0  22   0   0   2   0]
 [  0   0   0   0   0   0  24   0   0   0]
 [  0   0   0   0   0   0   0  31   0   0]
 [  0   0   0   2   0   0   0   0 187   0]
 [  0   0   0   0   0   0   0   0   0  65]]
Patience: 10 / 10
    ==  Epoch: 28 | Classification Loss: 0.505209 | Representation DL Loss: 0.133734  | Accuracy: 100.0000%  ==
    ==  Epoch: 28 | Validation CLS Loss: 3.805728 | Validation Representation DL Loss: 0.117855 | F1 Score: 0.985936  == | Current accuracy: 0.991935  ==
[[116   0   0   0   0   0   0   0   0   0]
 [  0  85   0   0   3   0   0   0   0   0]
 [  0   0 240   0   0   0   0   0   0   0]
 [  0   0   1  78   0   0   0   0   0   0]
 [  0   0   0   0  24   0   0   0   0   0]
 [  0   0   0   0   0  25   0   0   1   0]
 [  0   0   0   0   0   0  33   0   0   0]
 [  0   0   0   0   0   0   0  23   0   0]
 [  0   0   0   0   0   1   0   0 185   0]
 [  0   0   0   0   0   0   0   0   1  52]]
Patience: 11 / 10
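The `Patience` counter explains why fine-tuning stopped after epoch 28. Judging only from this log, the counter resets to 0 whenever `Current accuracy` strictly exceeds the best value so far, increases by 1 otherwise, and training ends once it exceeds the limit of 10. The sketch below replays that rule on the accuracies printed above. The rule is inferred from the output rather than taken from CANAL's source, and `patience_trace` is a hypothetical helper.

```python
def patience_trace(accuracies, max_patience=10):
    """Replay an early-stopping counter and return its value after each epoch."""
    best = float("-inf")
    patience = 0
    trace = []
    for acc in accuracies:
        if acc > best:       # strict improvement resets the counter
            best = acc
            patience = 0
        else:                # ties and regressions both count against patience
            patience += 1
        trace.append(patience)
        if patience > max_patience:
            break            # stop once the counter exceeds the limit
    return trace


# "Current accuracy" values for stage 3, epochs 1 to 28, copied from the log.
stage_3_accuracy = [
    0.686636, 0.981567, 0.981567, 0.983871, 0.987327, 0.991935, 0.989631,
    0.990783, 0.990783, 0.991935, 0.991935, 0.993088, 0.988479, 0.983871,
    0.990783, 0.983871, 0.997696, 0.991935, 0.986175, 0.990783, 0.988479,
    0.989631, 0.986175, 0.988479, 0.986175, 0.990783, 0.990783, 0.991935,
]

trace = patience_trace(stage_3_accuracy)
print(trace)
# [0, 0, 1, 0, 0, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

best_epoch = stage_3_accuracy.index(max(stage_3_accuracy)) + 1
print(best_epoch)
# 17
```

The replayed counter reproduces every `Patience` line in the log. The best validation accuracy (0.997696) was reached at epoch 17, and the following 11 epochs brought no improvement.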
/data/wanh/CANAL/model.py:105: ImplicitModificationWarning: Trying to modify attribute `.obs` of view, initializing view as actual.
  adata_i.obs['celltype'] = current_label_dict[i]
/data/wanh/ENTER/envs/pytorch/lib/python3.7/site-packages/anndata/_core/anndata.py:1785: FutureWarning: X.dtype being converted to np.float32 from float64. In the next version of anndata (0.9) conversion will not be automatic. Pass dtype explicitly to avoid this warning. Pass `AnnData(X, dtype=X.dtype, ...)` to get the future behavour.
  [AnnData(sparse.csr_matrix(a.shape), obs=a.obs) for a in all_adatas],


example bank after updating:
AnnData object with n_obs × n_vars = 996 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch', 'celltype', 'rank', 'stage'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'


cell type composition of this example bank:
(array(['B cell', 'T cell', 'basal cell', 'endothelial cell',
       'luminal epithelial cell of mammary gland', 'macrophage',
       'mesenchymal stem cell', 'natural killer cell',
       'skeletal muscle satellite cell', 'stromal cell'], dtype=object), array([ 99,  99, 100,  99, 100,  99, 100, 100, 100, 100]))


dataset composition from each stage of this example bank:
(array([1, 2, 3]), array([282, 382, 332]))
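
The two summaries above describe the same example bank from different angles, so their totals should agree with each other and with the reported `n_obs` of 996. A quick arithmetic check in plain Python follows. The numbers are copied from the output, and nothing here calls CANAL.

```python
rehearsal_size = 1000  # value passed to CANAL.train at this stage

# Cells per cell type in the updated example bank.
per_class = [99, 99, 100, 99, 100, 99, 100, 100, 100, 100]
# Cells contributed by each stage's dataset.
per_stage = {1: 282, 2: 382, 3: 332}

bank_size = sum(per_class)
print(bank_size, sum(per_stage.values()))
# 996 996

# The bank fits the rehearsal budget, and class sizes differ by at most one cell.
print(bank_size <= rehearsal_size, max(per_class) - min(per_class))
# True 1
```

The bank stays within the `rehearsal_size` budget and is nearly class-balanced: every cell type contributes 99 or 100 cells, drawn from all three stages seen so far.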


stage 4

load the pre-processed dataset "Spleen"

[7]:
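# Stage 4 training data (Tabula Muris spleen, 10X)
dataset_stage_4 = "TabulaMuris_Spleen_10X"
data_path_stage_4 = './data/{}/{}_train.h5ad'.format(experiments, dataset_stage_4)
adata_stage_4 = sc.read_h5ad(data_path_stage_4)
# Cell-type annotations, passed to CANAL.train as labels
cell_type_stage_4 = adata_stage_4.obs['cell_type1']
print(adata_stage_4)
# Cell types in this dataset and the number of cells per type
print(np.unique(np.array(cell_type_stage_4), return_counts=True))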
AnnData object with n_obs × n_vars = 9010 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p'
(array(['B cell', 'T cell', 'macrophage', 'natural killer cell'],
      dtype=object), array([6541, 1816,  431,  222]))

fine-tune the CANAL model for the fourth stage

[8]:
CANAL.train(experiments = experiments, pre_dataset = dataset_stage_3, dataset = dataset_stage_4,
            adata = adata_stage_4, cell_type = cell_type_stage_4, current_stage = 4, is_final_stage = True,
            ckpt_dir = './ckpts/', rehearsal_size = 1000, SEED = seed)
current data: AnnData object with n_obs × n_vars = 9010 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p'


(array(['B cell', 'T cell', 'macrophage', 'natural killer cell'],
      dtype=object), array([6541, 1816,  431,  222]))
model constructing begin!

new cell types: []
example bank for experience replay: AnnData object with n_obs × n_vars = 996 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch', 'celltype', 'rank', 'stage'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
/data/wanh/CANAL/performer_pytorch/performer_pytorch.py:115: UserWarning: torch.qr is deprecated in favor of torch.linalg.qr and will be removed in a future PyTorch release.
The boolean parameter 'some' has been replaced with a string parameter 'mode'.
Q, R = torch.qr(A, some)
should be replaced with
Q, R = torch.linalg.qr(A, 'reduced' if some else 'complete') (Triggered internally at  ../aten/src/ATen/native/BatchLinearAlgebra.cpp:2497.)
  q, r = torch.qr(unstructured_block.cpu(), some = True)
/data/wanh/ENTER/envs/pytorch/lib/python3.7/site-packages/torch/nn/_reduction.py:42: UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
  warnings.warn(warning.format(ret))
model constructing finished!

label train: [0 1 2 3 4 5 6 7 8 9] 10
label val: [0 1 2 3 4 5 6 7 8 9] 10
  ==  Begin finetuning: | Final stage | Current stage: 4 | CANAL | Dataset: Cross_tissue TabulaMuris_Spleen_10X ==
    ==  Epoch: 1 | Classification Loss: 0.566440 | Representation DL Loss: 0.005053  | Accuracy: 97.8359%  ==
    ==  Epoch: 1 | Validation CLS Loss: 2.590173 | Validation Representation DL Loss: 0.018307 | F1 Score: 0.969873  == | Current accuracy: 0.979520  ==
[[1292   16    0    1    0    0    0    7    0    0]
 [   6  364    0    1    0    0    0    0    0    0]
 [   0    0   23    0    0    0    0    0    0    0]
 [   4    0    0  108    0    3    0    0    0    0]
 [   0    3    0    0   64    0    0    0    0    0]
 [   0    0    0    0    0   24    0    0    0    0]
 [   0    0    0    0    0    0   33    0    0    0]
 [   0    0    0    0    0    0    0   23    0    0]
 [   0    0    0    0    0    0    0    0   18    0]
 [   0    0    0    0    0    0    0    0    0   12]]
Patience: 0 / 10
    ==  Epoch: 2 | Classification Loss: 0.554336 | Representation DL Loss: 0.032166  | Accuracy: 97.9610%  ==
    ==  Epoch: 2 | Validation CLS Loss: 2.036643 | Validation Representation DL Loss: 0.039645 | F1 Score: 0.989236  == | Current accuracy: 0.985514  ==
[[1309    6    0    4    0    0    0    0    0    0]
 [   7  372    0    2    0    0    0    0    0    0]
 [   0    0   19    0    0    0    0    0    0    0]
 [  10    0    0   95    0    0    0    0    0    0]
 [   0    0    0    0   54    0    0    0    0    0]
 [   0    0    0    0    0   24    0    0    0    0]
 [   0    0    0    0    0    0   23    0    0    0]
 [   0    0    0    0    0    0    0   24    0    0]
 [   0    0    0    0    0    0    0    0   31    0]
 [   0    0    0    0    0    0    0    0    0   22]]
Patience: 0 / 10
    ==  Epoch: 3 | Classification Loss: 0.541880 | Representation DL Loss: 0.025442  | Accuracy: 98.4989%  ==
    ==  Epoch: 3 | Validation CLS Loss: 2.205218 | Validation Representation DL Loss: 0.024612 | F1 Score: 0.991630  == | Current accuracy: 0.985514  ==
[[1306   11    0    1    0    0    0    0    0    0]
 [   9  383    0    1    1    0    0    0    0    0]
 [   0    0   24    0    0    0    0    0    0    0]
 [   5    1    0  105    0    0    0    0    0    0]
 [   0    0    0    0   62    0    0    0    0    0]
 [   0    0    0    0    0   17    0    0    0    0]
 [   0    0    0    0    0    0   17    0    0    0]
 [   0    0    0    0    0    0    0   13    0    0]
 [   0    0    0    0    0    0    0    0   20    0]
 [   0    0    0    0    0    0    0    0    0   26]]
Patience: 1 / 10
    ==  Epoch: 4 | Classification Loss: 0.531079 | Representation DL Loss: 0.028071  | Accuracy: 99.0368%  ==
    ==  Epoch: 4 | Validation CLS Loss: 2.666437 | Validation Representation DL Loss: 0.032383 | F1 Score: 0.989042  == | Current accuracy: 0.986014  ==
[[1315   11    0    1    0    0    0    0    0    0]
 [   5  386    0    1    5    0    0    0    0    0]
 [   0    0   18    0    0    0    0    0    0    0]
 [   5    0    0  105    0    0    0    0    0    0]
 [   0    0    0    0   58    0    0    0    0    0]
 [   0    0    0    0    0   20    0    0    0    0]
 [   0    0    0    0    0    0   21    0    0    0]
 [   0    0    0    0    0    0    0   18    0    0]
 [   0    0    0    0    0    0    0    0   16    0]
 [   0    0    0    0    0    0    0    0    0   17]]
Patience: 0 / 10
    ==  Epoch: 5 | Classification Loss: 0.526279 | Representation DL Loss: 0.033105  | Accuracy: 99.1243%  ==
    ==  Epoch: 5 | Validation CLS Loss: 2.989892 | Validation Representation DL Loss: 0.030650 | F1 Score: 0.989795  == | Current accuracy: 0.987512  ==
[[1291   14    0    3    0    0    0    0    0    0]
 [   0  410    0    2    4    0    0    0    0    0]
 [   0    0   18    0    0    0    0    0    0    0]
 [   2    0    0   87    0    0    0    0    0    0]
 [   0    0    0    0   60    0    0    0    0    0]
 [   0    0    0    0    0   25    0    0    0    0]
 [   0    0    0    0    0    0   18    0    0    0]
 [   0    0    0    0    0    0    0   19    0    0]
 [   0    0    0    0    0    0    0    0   25    0]
 [   0    0    0    0    0    0    0    0    0   24]]
Patience: 0 / 10
    ==  Epoch: 6 | Classification Loss: 0.520931 | Representation DL Loss: 0.033596  | Accuracy: 99.4996%  ==
    ==  Epoch: 6 | Validation CLS Loss: 2.708388 | Validation Representation DL Loss: 0.031730 | F1 Score: 0.989429  == | Current accuracy: 0.984515  ==
[[1323   15    0    1    0    0    0    0    0    0]
 [   6  360    0    2    3    0    0    0    0    0]
 [   0    0   25    0    0    0    0    0    0    0]
 [   2    2    0  105    0    0    0    0    0    0]
 [   0    0    0    0   54    0    0    0    0    0]
 [   0    0    0    0    0   24    0    0    0    0]
 [   0    0    0    0    0    0   28    0    0    0]
 [   0    0    0    0    0    0    0   17    0    0]
 [   0    0    0    0    0    0    0    0   18    0]
 [   0    0    0    0    0    0    0    0    0   17]]
Patience: 1 / 10
    ==  Epoch: 7 | Classification Loss: 0.516694 | Representation DL Loss: 0.037306  | Accuracy: 99.6122%  ==
    ==  Epoch: 7 | Validation CLS Loss: 2.837335 | Validation Representation DL Loss: 0.034606 | F1 Score: 0.990969  == | Current accuracy: 0.983516  ==
[[1325   16    0    0    0    0    0    0    0    0]
 [  10  369    0    1    1    0    0    0    0    0]
 [   0    0   23    0    0    0    0    0    0    0]
 [   3    1    0  104    0    0    0    0    0    0]
 [   0    1    0    0   58    0    0    0    0    0]
 [   0    0    0    0    0   16    0    0    0    0]
 [   0    0    0    0    0    0   18    0    0    0]
 [   0    0    0    0    0    0    0   18    0    0]
 [   0    0    0    0    0    0    0    0   21    0]
 [   0    0    0    0    0    0    0    0    0   17]]
Patience: 2 / 10
    ==  Epoch: 8 | Classification Loss: 0.512596 | Representation DL Loss: 0.036774  | Accuracy: 99.7123%  ==
    ==  Epoch: 8 | Validation CLS Loss: 2.370672 | Validation Representation DL Loss: 0.034355 | F1 Score: 0.987327  == | Current accuracy: 0.985015  ==
[[1311    8    0    3    0    0    0    0    0    0]
 [   3  343    0    0    5    0    0    0    0    0]
 [   0    0   22    0    0    0    0    0    0    0]
 [   7    4    0  108    0    0    0    0    0    0]
 [   0    0    0    0   82    0    0    0    0    0]
 [   0    0    0    0    0   20    0    0    0    0]
 [   0    0    0    0    0    0   20    0    0    0]
 [   0    0    0    0    0    0    0   21    0    0]
 [   0    0    0    0    0    0    0    0   24    0]
 [   0    0    0    0    0    0    0    0    0   21]]
Patience: 3 / 10
    ==  Epoch: 9 | Classification Loss: 0.513157 | Representation DL Loss: 0.037117  | Accuracy: 99.6748%  ==
    ==  Epoch: 9 | Validation CLS Loss: 2.588495 | Validation Representation DL Loss: 0.033176 | F1 Score: 0.988307  == | Current accuracy: 0.982018  ==
[[1301   19    0    2    0    0    0    0    0    0]
 [   2  371    0    1    2    0    0    0    0    0]
 [   0    0   18    0    0    0    0    0    0    0]
 [   9    1    0  105    0    0    0    0    0    0]
 [   0    0    0    0   71    0    0    0    0    0]
 [   0    0    0    0    0   22    0    0    0    0]
 [   0    0    0    0    0    0   22    0    0    0]
 [   0    0    0    0    0    0    0   16    0    0]
 [   0    0    0    0    0    0    0    0   16    0]
 [   0    0    0    0    0    0    0    0    0   24]]
Patience: 4 / 10
    ==  Epoch: 10 | Classification Loss: 0.510636 | Representation DL Loss: 0.036718  | Accuracy: 99.7623%  ==
    ==  Epoch: 10 | Validation CLS Loss: 2.359062 | Validation Representation DL Loss: 0.035854 | F1 Score: 0.992940  == | Current accuracy: 0.988511  ==
[[1328   12    0    2    0    0    0    0    0    0]
 [   4  357    0    2    2    0    0    0    0    0]
 [   0    0   13    0    0    0    0    0    0    0]
 [   1    0    0  116    0    0    0    0    0    0]
 [   0    0    0    0   65    0    0    0    0    0]
 [   0    0    0    0    0   15    0    0    0    0]
 [   0    0    0    0    0    0   25    0    0    0]
 [   0    0    0    0    0    0    0   22    0    0]
 [   0    0    0    0    0    0    0    0   24    0]
 [   0    0    0    0    0    0    0    0    0   14]]
Patience: 0 / 10
    ==  Epoch: 11 | Classification Loss: 0.508377 | Representation DL Loss: 0.036215  | Accuracy: 99.8374%  ==
    ==  Epoch: 11 | Validation CLS Loss: 2.127766 | Validation Representation DL Loss: 0.039598 | F1 Score: 0.993257  == | Current accuracy: 0.991009  ==
[[1331    8    0    1    0    0    0    0    0    0]
 [   3  360    0    0    3    0    0    0    0    0]
 [   0    0   25    0    0    0    0    0    0    0]
 [   3    0    0  112    0    0    0    0    0    0]
 [   0    0    0    0   58    0    0    0    0    0]
 [   0    0    0    0    0   18    0    0    0    0]
 [   0    0    0    0    0    0   21    0    0    0]
 [   0    0    0    0    0    0    0   27    0    0]
 [   0    0    0    0    0    0    0    0   16    0]
 [   0    0    0    0    0    0    0    0    0   16]]
Patience: 0 / 10
    ==  Epoch: 12 | Classification Loss: 0.508645 | Representation DL Loss: 0.041663  | Accuracy: 99.8374%  ==
    ==  Epoch: 12 | Validation CLS Loss: 3.278699 | Validation Representation DL Loss: 0.034119 | F1 Score: 0.990750  == | Current accuracy: 0.987013  ==
[[1339   14    0    0    0    0    0    0    0    0]
 [   3  354    0    1    3    0    0    0    0    0]
 [   0    0   14    0    0    0    0    0    0    0]
 [   2    3    0  108    0    0    0    0    0    0]
 [   0    0    0    0   57    0    0    0    0    0]
 [   0    0    0    0    0   19    0    0    0    0]
 [   0    0    0    0    0    0   18    0    0    0]
 [   0    0    0    0    0    0    0   29    0    0]
 [   0    0    0    0    0    0    0    0   25    0]
 [   0    0    0    0    0    0    0    0    0   13]]
Patience: 1 / 10
    ==  Epoch: 13 | Classification Loss: 0.508340 | Representation DL Loss: 0.035979  | Accuracy: 99.8624%  ==
    ==  Epoch: 13 | Validation CLS Loss: 1.934230 | Validation Representation DL Loss: 0.032762 | F1 Score: 0.989765  == | Current accuracy: 0.986513  ==
[[1334    9    0    1    0    0    0    0    0    0]
 [   7  355    0    3    4    0    0    0    0    0]
 [   0    0   29    0    0    0    0    0    0    0]
 [   2    1    0   93    0    0    0    0    0    0]
 [   0    0    0    0   74    0    0    0    0    0]
 [   0    0    0    0    0   18    0    0    0    0]
 [   0    0    0    0    0    0   15    0    0    0]
 [   0    0    0    0    0    0    0   16    0    0]
 [   0    0    0    0    0    0    0    0   24    0]
 [   0    0    0    0    0    0    0    0    0   17]]
Patience: 2 / 10
    ==  Epoch: 14 | Classification Loss: 0.508412 | Representation DL Loss: 0.034812  | Accuracy: 99.8749%  ==
    ==  Epoch: 14 | Validation CLS Loss: 2.697604 | Validation Representation DL Loss: 0.034268 | F1 Score: 0.990647  == | Current accuracy: 0.987013  ==
[[1285   13    0    0    0    0    0    0    0    0]
 [   4  426    0    1    4    0    0    0    0    0]
 [   0    0   15    0    0    0    0    0    0    0]
 [   3    1    0   91    0    0    0    0    0    0]
 [   0    0    0    0   59    0    0    0    0    0]
 [   0    0    0    0    0   17    0    0    0    0]
 [   0    0    0    0    0    0   17    0    0    0]
 [   0    0    0    0    0    0    0   29    0    0]
 [   0    0    0    0    0    0    0    0   20    0]
 [   0    0    0    0    0    0    0    0    0   17]]
Patience: 3 / 10
    ==  Epoch: 15 | Classification Loss: 0.506742 | Representation DL Loss: 0.034353  | Accuracy: 99.9375%  ==
    ==  Epoch: 15 | Validation CLS Loss: 2.750621 | Validation Representation DL Loss: 0.032656 | F1 Score: 0.987614  == | Current accuracy: 0.985015  ==
[[1297   10    0    3    0    0    0    0    0    0]
 [   6  369    0    2    8    0    0    0    0    0]
 [   0    0   20    0    0    0    0    0    0    0]
 [   1    0    0  101    0    0    0    0    0    0]
 [   0    0    0    0   71    0    0    0    0    0]
 [   0    0    0    0    0   18    0    0    0    0]
 [   0    0    0    0    0    0   28    0    0    0]
 [   0    0    0    0    0    0    0   20    0    0]
 [   0    0    0    0    0    0    0    0   19    0]
 [   0    0    0    0    0    0    0    0    0   29]]
Patience: 4 / 10
    ==  Epoch: 16 | Classification Loss: 0.507812 | Representation DL Loss: 0.035951  | Accuracy: 99.8749%  ==
    ==  Epoch: 16 | Validation CLS Loss: 2.147116 | Validation Representation DL Loss: 0.033321 | F1 Score: 0.989715  == | Current accuracy: 0.986513  ==
[[1323   12    0    2    0    0    0    0    0    0]
 [   5  377    0    0    6    0    0    0    0    0]
 [   0    0   21    0    0    0    0    0    0    0]
 [   1    1    0  107    0    0    0    0    0    0]
 [   0    0    0    0   62    0    0    0    0    0]
 [   0    0    0    0    0   17    0    0    0    0]
 [   0    0    0    0    0    0   19    0    0    0]
 [   0    0    0    0    0    0    0   16    0    0]
 [   0    0    0    0    0    0    0    0   16    0]
 [   0    0    0    0    0    0    0    0    0   17]]
Patience: 5 / 10
    ==  Epoch: 17 | Classification Loss: 0.508008 | Representation DL Loss: 0.035319  | Accuracy: 99.8749%  ==
    ==  Epoch: 17 | Validation CLS Loss: 2.506884 | Validation Representation DL Loss: 0.034275 | F1 Score: 0.989352  == | Current accuracy: 0.984515  ==
[[1320    9    0    4    0    0    0    0    0    0]
 [   8  364    0    3    1    0    0    0    0    0]
 [   0    0   17    0    0    0    0    0    0    0]
 [   4    2    0  108    0    0    0    0    0    0]
 [   0    0    0    0   51    0    0    0    0    0]
 [   0    0    0    0    0   23    0    0    0    0]
 [   0    0    0    0    0    0   15    0    0    0]
 [   0    0    0    0    0    0    0   24    0    0]
 [   0    0    0    0    0    0    0    0   24    0]
 [   0    0    0    0    0    0    0    0    0   25]]
Patience: 6 / 10
    ==  Epoch: 18 | Classification Loss: 0.506743 | Representation DL Loss: 0.034827  | Accuracy: 99.9249%  ==
    ==  Epoch: 18 | Validation CLS Loss: 2.453036 | Validation Representation DL Loss: 0.030274 | F1 Score: 0.988141  == | Current accuracy: 0.986014  ==
[[1314   10    0    2    0    0    0    0    0    0]
 [   5  376    0    1    5    0    0    0    0    0]
 [   0    0   30    0    0    0    0    0    0    0]
 [   3    1    0   86    0    0    0    0    0    0]
 [   0    1    0    0   68    0    0    0    0    0]
 [   0    0    0    0    0   17    0    0    0    0]
 [   0    0    0    0    0    0   18    0    0    0]
 [   0    0    0    0    0    0    0   19    0    0]
 [   0    0    0    0    0    0    0    0   26    0]
 [   0    0    0    0    0    0    0    0    0   20]]
Patience: 7 / 10
    ==  Epoch: 19 | Classification Loss: 0.506775 | Representation DL Loss: 0.033257  | Accuracy: 99.9249%  ==
    ==  Epoch: 19 | Validation CLS Loss: 2.646988 | Validation Representation DL Loss: 0.030506 | F1 Score: 0.990151  == | Current accuracy: 0.986513  ==
[[1291   12    0    4    0    0    0    0    0    0]
 [   3  395    0    0    4    0    0    0    0    0]
 [   0    0   23    0    0    0    0    0    0    0]
 [   3    1    0  106    0    0    0    0    0    0]
 [   0    0    0    0   67    0    0    0    0    0]
 [   0    0    0    0    0   19    0    0    0    0]
 [   0    0    0    0    0    0   18    0    0    0]
 [   0    0    0    0    0    0    0   23    0    0]
 [   0    0    0    0    0    0    0    0   17    0]
 [   0    0    0    0    0    0    0    0    0   16]]
Patience: 8 / 10
    ==  Epoch: 20 | Classification Loss: 0.506588 | Representation DL Loss: 0.035759  | Accuracy: 99.9500%  ==
    ==  Epoch: 20 | Validation CLS Loss: 2.489999 | Validation Representation DL Loss: 0.031861 | F1 Score: 0.987091  == | Current accuracy: 0.979021  ==
[[1310   24    0    1    0    0    0    0    0    0]
 [   6  389    0    2    4    0    0    0    0    0]
 [   0    0   12    0    0    0    0    0    0    0]
 [   3    2    0  100    0    0    0    0    0    0]
 [   0    0    0    0   62    0    0    0    0    0]
 [   0    0    0    0    0   24    0    0    0    0]
 [   0    0    0    0    0    0   15    0    0    0]
 [   0    0    0    0    0    0    0   17    0    0]
 [   0    0    0    0    0    0    0    0   14    0]
 [   0    0    0    0    0    0    0    0    0   17]]
Patience: 9 / 10
    ==  Epoch: 21 | Classification Loss: 0.506617 | Representation DL Loss: 0.037148  | Accuracy: 99.9124%  ==
    ==  Epoch: 21 | Validation CLS Loss: 2.641935 | Validation Representation DL Loss: 0.032302 | F1 Score: 0.991922  == | Current accuracy: 0.987512  ==
[[1322    7    0    2    0    0    0    0    0    0]
 [   8  362    0    0    2    0    0    0    0    0]
 [   0    0   11    0    0    0    0    0    0    0]
 [   5    1    0  113    0    0    0    0    0    0]
 [   0    0    0    0   70    0    0    0    0    0]
 [   0    0    0    0    0   18    0    0    0    0]
 [   0    0    0    0    0    0   25    0    0    0]
 [   0    0    0    0    0    0    0   17    0    0]
 [   0    0    0    0    0    0    0    0   16    0]
 [   0    0    0    0    0    0    0    0    0   23]]
Patience: 10 / 10
    ==  Epoch: 22 | Classification Loss: 0.506135 | Representation DL Loss: 0.036922  | Accuracy: 99.9249%  ==
    ==  Epoch: 22 | Validation CLS Loss: 2.578008 | Validation Representation DL Loss: 0.031964 | F1 Score: 0.996073  == | Current accuracy: 0.990509  ==
[[1295   12    0    0    0    0    0    0    0    0]
 [   5  385    0    0    0    0    0    0    0    0]
 [   0    0   14    0    0    0    0    0    0    0]
 [   1    1    0  104    0    0    0    0    0    0]
 [   0    0    0    0   61    0    0    0    0    0]
 [   0    0    0    0    0   30    0    0    0    0]
 [   0    0    0    0    0    0   21    0    0    0]
 [   0    0    0    0    0    0    0   25    0    0]
 [   0    0    0    0    0    0    0    0   30    0]
 [   0    0    0    0    0    0    0    0    0   18]]
Patience: 11 / 10
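
The patience counter in the log above behaves like accuracy-based early stopping: it resets to 0 whenever "Current accuracy" exceeds the best value so far, increases by 1 otherwise, and the log ends once it passes 10. The sketch below replays that rule on the accuracies printed for epochs 5 to 22. It is an illustration only: `replay_patience` is a hypothetical helper, not part of CANAL, and taking the epoch-4 accuracy as the starting best is an assumption (the counter reads 0 at that epoch).

```python
def replay_patience(accuracies, best_so_far, max_patience=10):
    """Replay a patience counter over a sequence of validation accuracies."""
    patience = 0
    history = []
    for acc in accuracies:
        if acc > best_so_far:
            # New best accuracy: remember it and reset the counter.
            best_so_far = acc
            patience = 0
        else:
            patience += 1
        history.append(patience)
        if patience > max_patience:
            # Counter exceeded the limit: stop, as the log does after epoch 22.
            break
    return history

# "Current accuracy" values for epochs 5-22, copied from the log above.
val_acc = [0.987512, 0.984515, 0.983516, 0.985015, 0.982018, 0.988511,
           0.991009, 0.987013, 0.986513, 0.987013, 0.985015, 0.986513,
           0.984515, 0.986014, 0.986513, 0.979021, 0.987512, 0.990509]

# Assumption: the running best before epoch 5 is the epoch-4 accuracy.
history = replay_patience(val_acc, best_so_far=0.986014)
print(history)
```

The replayed counter matches the "Patience: k / 10" lines for epochs 5 to 22. Note that epoch 22 has the highest F1 score in this span but not the highest accuracy, so the counter still increases there.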

load the unlabeled 10X test data and apply CANAL to predict cell types

[8]:
# Load the held-out 10X test split of each tissue and print its cell-type counts.
adata_test1_10X = sc.read_h5ad("/data/wanh/DA/scBERT-master/data/TabulaMuris/TabulaMuris_Lung_10X_test.h5ad")
print(adata_test1_10X, np.unique(np.array(adata_test1_10X.obs['cell_type1']), return_counts=True))

adata_test2_10X = sc.read_h5ad("/data/wanh/DA/scBERT-master/data/TabulaMuris/TabulaMuris_Mammary_Gland_10X_test.h5ad")
print(adata_test2_10X, np.unique(np.array(adata_test2_10X.obs['cell_type1']), return_counts=True))

adata_test3_10X = sc.read_h5ad("/data/wanh/DA/scBERT-master/data/TabulaMuris/TabulaMuris_Limb_Muscle_10X_test.h5ad")
print(adata_test3_10X, np.unique(np.array(adata_test3_10X.obs['cell_type1']), return_counts=True))

adata_test4_10X = sc.read_h5ad("/data/wanh/DA/scBERT-master/data/TabulaMuris/TabulaMuris_Spleen_10X_test.h5ad")
print(adata_test4_10X, np.unique(np.array(adata_test4_10X.obs['cell_type1']), return_counts=True))

# Merge the four tissues into a single test set.
adata_test_10X = sc.AnnData.concatenate(adata_test1_10X, adata_test2_10X, adata_test3_10X, adata_test4_10X)
print(adata_test_10X, np.unique(np.array(adata_test_10X.obs['cell_type1']), return_counts=True))

# Annotate the merged test set with the checkpoint saved after the fourth stage.
pred_cell_type_10X = CANAL.predict(adata_predict=adata_test_10X, ckpt_dir='./ckpts/', experiments=experiments,
                                   stage_num=4, dataset=dataset_stage_4)
AnnData object with n_obs × n_vars = 500 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p' (array(['B cell', 'T cell', 'endothelial cell', 'macrophage',
       'natural killer cell', 'stromal cell'], dtype=object), array([ 17,  35,  50,  31, 103, 264]))
AnnData object with n_obs × n_vars = 500 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p' (array(['B cell', 'T cell', 'basal cell', 'endothelial cell',
       'luminal epithelial cell of mammary gland', 'macrophage',
       'stromal cell'], dtype=object), array([ 78, 203,  51,  28,  48,  22,  70]))
AnnData object with n_obs × n_vars = 500 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p' (array(['B cell', 'T cell', 'endothelial cell', 'macrophage',
       'mesenchymal stem cell', 'skeletal muscle satellite cell'],
      dtype=object), array([ 63,  51, 173,  34, 138,  41]))
AnnData object with n_obs × n_vars = 500 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p' (array(['B cell', 'T cell', 'macrophage', 'natural killer cell'],
      dtype=object), array([345, 114,  33,   8]))
/data/wanh/ENTER/envs/pytorch/lib/python3.7/site-packages/anndata/_core/anndata.py:1785: FutureWarning: X.dtype being converted to np.float32 from float64. In the next version of anndata (0.9) conversion will not be automatic. Pass dtype explicitly to avoid this warning. Pass `AnnData(X, dtype=X.dtype, ...)` to get the future behavour.
  [AnnData(sparse.csr_matrix(a.shape), obs=a.obs) for a in all_adatas],
AnnData object with n_obs × n_vars = 2000 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm' (array(['B cell', 'T cell', 'basal cell', 'endothelial cell',
       'luminal epithelial cell of mammary gland', 'macrophage',
       'mesenchymal stem cell', 'natural killer cell',
       'skeletal muscle satellite cell', 'stromal cell'], dtype=object), array([503, 403,  51, 251,  48, 120, 138, 111,  41, 334]))
    ==  Begin predicting after 4 fine-tuning stages: | Experiments: Cross_tissue ==
Annotation: ['B cell' 'T cell' 'endothelial cell' 'macrophage' 'natural killer cell'
 'stromal cell' 'basal cell' 'luminal epithelial cell of mammary gland'
 'mesenchymal stem cell' 'skeletal muscle satellite cell']

/data/wanh/CANAL/performer_pytorch/performer_pytorch.py:115: UserWarning: torch.qr is deprecated in favor of torch.linalg.qr and will be removed in a future PyTorch release.
The boolean parameter 'some' has been replaced with a string parameter 'mode'.
Q, R = torch.qr(A, some)
should be replaced with
Q, R = torch.linalg.qr(A, 'reduced' if some else 'complete') (Triggered internally at  ../aten/src/ATen/native/BatchLinearAlgebra.cpp:2497.)
  q, r = torch.qr(unstructured_block.cpu(), some = True)

evaluate the annotation performance

[9]:
true_celltype_10X = np.array(adata_test_10X.obs['cell_type1'])
CANAL.evaluation(pred_cell_type=pred_cell_type_10X, true_celltype=true_celltype_10X)
  ==  Predict total accuracy: 0.969000 ==|== F1 Score: 0.971300  ==|==  ARI: 0.938800 ==

Confusion matrix:
[[501   2   0   0   0   0   0   0   0   0]
 [ 10 393   0   0   0   0   0   0   0   0]
 [  0   0  51   0   0   0   0   0   0   0]
 [  0   0   0 251   0   0   0   0   0   0]
 [  0   0   0   0  48   0   0   0   0   0]
 [  2   1   0   0   0 116   1   0   0   0]
 [  0   0   0   0   0   0 134   0   0   4]
 [  0   0   0   0   0   0   0 111   0   0]
 [  0   0   0   0   0   0   0   0  40   1]
 [  0   0   0   2   0   8  31   0   0 293]]
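
The summary line can be checked against the confusion matrix itself. The row sums of the matrix equal the per-type counts printed for the merged 10X test set, which suggests that rows are true labels in sorted order; the sketch below additionally assumes that columns are predictions in the same order. Under that assumption, plain Python is enough to recover the overall accuracy and the macro-averaged F1 score. `accuracy_and_macro_f1` is a hypothetical helper written for this check, not a CANAL function.

```python
# Confusion matrix copied from the 10X evaluation output above.
# Assumption: rows are true labels, columns are predictions, same sorted label order.
confusion_10x = [
    [501, 2, 0, 0, 0, 0, 0, 0, 0, 0],
    [10, 393, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 51, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 251, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 48, 0, 0, 0, 0, 0],
    [2, 1, 0, 0, 0, 116, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 134, 0, 0, 4],
    [0, 0, 0, 0, 0, 0, 0, 111, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 40, 1],
    [0, 0, 0, 2, 0, 8, 31, 0, 0, 293],
]

def accuracy_and_macro_f1(matrix):
    """Return (accuracy, macro F1) for a square confusion matrix."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    f1_scores = []
    for i in range(n):
        true_count = sum(matrix[i])                 # cells whose true label is class i
        pred_count = sum(row[i] for row in matrix)  # cells predicted as class i
        denom = true_count + pred_count
        # F1 = 2*TP / (true count + predicted count); 0 if the class never appears.
        f1_scores.append(2 * matrix[i][i] / denom if denom else 0.0)
    return correct / total, sum(f1_scores) / n

acc_10x, macro_f1_10x = accuracy_and_macro_f1(confusion_10x)
print(f"accuracy = {acc_10x:.4f}, macro F1 = {macro_f1_10x:.4f}")
```

This gives an accuracy of 0.9690 (1938 of 2000 cells on the diagonal) and a macro F1 of 0.9713, in agreement with the reported values, so the "F1 Score" printed by the evaluation appears to be macro-averaged.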

load the unlabeled SS2 test data and apply CANAL to predict cell types

[10]:
# Load the Smart-seq2 (SS2) data of each tissue and print its cell-type counts.
adata_test_Lung_SS2 = sc.read_h5ad("/data/wanh/DA/scBERT-master/data/TabulaMuris/TabulaMuris_Lung_SS2.h5ad")
print(adata_test_Lung_SS2, np.unique(np.array(adata_test_Lung_SS2.obs['cell_type1']), return_counts=True))

adata_test_Mammary_Gland_SS2 = sc.read_h5ad("/data/wanh/DA/scBERT-master/data/TabulaMuris/TabulaMuris_Mammary_Gland_SS2.h5ad")
print(adata_test_Mammary_Gland_SS2, np.unique(np.array(adata_test_Mammary_Gland_SS2.obs['cell_type1']), return_counts=True))

adata_test_Limb_Muscle_SS2 = sc.read_h5ad("/data/wanh/DA/scBERT-master/data/TabulaMuris/TabulaMuris_Limb_Muscle_SS2.h5ad")
print(adata_test_Limb_Muscle_SS2, np.unique(np.array(adata_test_Limb_Muscle_SS2.obs['cell_type1']), return_counts=True))

adata_test_Spleen_SS2 = sc.read_h5ad("/data/wanh/DA/scBERT-master/data/TabulaMuris/TabulaMuris_Spleen_SS2.h5ad")
print(adata_test_Spleen_SS2, np.unique(np.array(adata_test_Spleen_SS2.obs['cell_type1']), return_counts=True))

# Merge the four tissues into a single test set.
adata_test_SS2 = sc.AnnData.concatenate(adata_test_Lung_SS2, adata_test_Mammary_Gland_SS2,
                                        adata_test_Limb_Muscle_SS2, adata_test_Spleen_SS2)
print(adata_test_SS2, np.unique(np.array(adata_test_SS2.obs['cell_type1']), return_counts=True))

# Annotate the merged test set with the checkpoint saved after the fourth stage.
pred_cell_type_SS2 = CANAL.predict(adata_predict=adata_test_SS2, ckpt_dir='./ckpts/', experiments=experiments,
                                   stage_num=4, dataset=dataset_stage_4)
AnnData object with n_obs × n_vars = 1263 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p' (array(['B cell', 'T cell', 'endothelial cell', 'natural killer cell',
       'stromal cell'], dtype=object), array([ 57,  53, 693,  37, 423]))
AnnData object with n_obs × n_vars = 2405 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p' (array(['basal cell', 'endothelial cell',
       'luminal epithelial cell of mammary gland', 'stromal cell'],
      dtype=object), array([1340,   47,  578,  440]))
AnnData object with n_obs × n_vars = 1090 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p' (array(['B cell', 'T cell', 'endothelial cell', 'macrophage',
       'mesenchymal stem cell', 'skeletal muscle satellite cell'],
      dtype=object), array([ 71,  35, 141,  45, 258, 540]))
AnnData object with n_obs × n_vars = 1697 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p' (array(['B cell', 'T cell', 'macrophage'], dtype=object), array([1297,  352,   48]))
/data/wanh/ENTER/envs/pytorch/lib/python3.7/site-packages/anndata/_core/anndata.py:1785: FutureWarning: X.dtype being converted to np.float32 from float64. In the next version of anndata (0.9) conversion will not be automatic. Pass dtype explicitly to avoid this warning. Pass `AnnData(X, dtype=X.dtype, ...)` to get the future behavour.
  [AnnData(sparse.csr_matrix(a.shape), obs=a.obs) for a in all_adatas],
AnnData object with n_obs × n_vars = 6455 × 1000
    obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm' (array(['B cell', 'T cell', 'basal cell', 'endothelial cell',
       'luminal epithelial cell of mammary gland', 'macrophage',
       'mesenchymal stem cell', 'natural killer cell',
       'skeletal muscle satellite cell', 'stromal cell'], dtype=object), array([1425,  440, 1340,  881,  578,   93,  258,   37,  540,  863]))
    ==  Begin predicting after 4 fine-tuning stages: | Experiments: Cross_tissue ==
Annotation: ['B cell' 'T cell' 'endothelial cell' 'macrophage' 'natural killer cell'
 'stromal cell' 'basal cell' 'luminal epithelial cell of mammary gland'
 'mesenchymal stem cell' 'skeletal muscle satellite cell']

evaluate the annotation performance

[11]:
true_celltype_SS2 = np.array(adata_test_SS2.obs['cell_type1'])
CANAL.evaluation(pred_cell_type=pred_cell_type_SS2, true_celltype=true_celltype_SS2)
  ==  Predict total accuracy: 0.936000 ==|== F1 Score: 0.842000  ==|==  ARI: 0.919200 ==

Confusion matrix:
[[1365    8    0    0    1   49    0    2    0    0]
 [   0  438    0    0    0    1    0    1    0    0]
 [   4    1 1295    3   15    5    0    0    4   13]
 [   0    0    1  877    0    0    1    0    1    1]
 [   0    0    0    0  578    0    0    0    0    0]
 [   0    0    0    0    0   45    0   48    0    0]
 [   0    0    0    0    0    0  250    0    0    8]
 [   0    4    0    0    0    0    0   33    0    0]
 [   0    1    0    2    0    3    3    0  530    1]
 [   0    0    0    8    0    0  224    0    0  631]]
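
On the SS2 data the accuracy (0.936) is much higher than the F1 score (0.842), which points to errors concentrated in a few small classes. The sketch below ranks the classes by per-class F1 and finds the largest off-diagonal entry of the confusion matrix. It assumes that rows are true labels and columns are predictions, both in the sorted label order printed for the merged SS2 set (the row sums match those counts). The variable names are illustrative and not part of CANAL.

```python
# Sorted label order, as printed by np.unique for the merged SS2 test set.
labels = [
    "B cell", "T cell", "basal cell", "endothelial cell",
    "luminal epithelial cell of mammary gland", "macrophage",
    "mesenchymal stem cell", "natural killer cell",
    "skeletal muscle satellite cell", "stromal cell",
]

# Confusion matrix copied from the SS2 evaluation output above.
# Assumption: rows are true labels, columns are predictions, same label order.
confusion_ss2 = [
    [1365, 8, 0, 0, 1, 49, 0, 2, 0, 0],
    [0, 438, 0, 0, 0, 1, 0, 1, 0, 0],
    [4, 1, 1295, 3, 15, 5, 0, 0, 4, 13],
    [0, 0, 1, 877, 0, 0, 1, 0, 1, 1],
    [0, 0, 0, 0, 578, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 45, 0, 48, 0, 0],
    [0, 0, 0, 0, 0, 0, 250, 0, 0, 8],
    [0, 4, 0, 0, 0, 0, 0, 33, 0, 0],
    [0, 1, 0, 2, 0, 3, 3, 0, 530, 1],
    [0, 0, 0, 8, 0, 0, 224, 0, 0, 631],
]

n = len(labels)

# Per-class F1 = 2*TP / (number of true cells + number of predicted cells).
f1_by_class = {}
for i, name in enumerate(labels):
    true_count = sum(confusion_ss2[i])
    pred_count = sum(row[i] for row in confusion_ss2)
    f1_by_class[name] = 2 * confusion_ss2[i][i] / (true_count + pred_count)

# The three classes with the lowest F1 pull the macro average down the most.
worst_three = sorted(f1_by_class, key=f1_by_class.get)[:3]

# Largest off-diagonal entry: (count, true label, predicted label).
largest_confusion = max(
    (confusion_ss2[i][j], labels[i], labels[j])
    for i in range(n) for j in range(n) if i != j
)

for name in worst_three:
    print(f"{name}: F1 = {f1_by_class[name]:.3f}")
print("largest confusion:", largest_confusion)
```

Under these assumptions the weakest classes are macrophage, natural killer cell and mesenchymal stem cell, and the single largest error is 224 stromal cells annotated as mesenchymal stem cells. Because macro F1 weights every class equally, these few classes lower it far more than they lower the overall accuracy.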