Tutorial 3: run CANAL on cross-tissue experiments
[1]:
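import os
import sys
import csv
import argparse
import time
import resource

import numpy as np
import scanpy as sc

# Run from the CANAL repository root so the relative data and checkpoint paths resolve.
os.chdir('/data/wanh/CANAL/')
sys.path.append('/data/wanh/CANAL/')
from model import *  # provides CANAL_model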
construct the CANAL model
[2]:
experiments = 'Cross_tissue'
seed = 1
with open(f'./data/{experiments}/{experiments}_highly_gene_idx.csv') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
    highly_variable_idx = []
    for row in spamreader:
        highly_variable_idx.append(row[0])
highly_variable_idx = [int(i) for i in highly_variable_idx]
highly_variable_idx = np.array(highly_variable_idx)
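The cell above reads the first space-delimited field of each row as an integer gene index. A more compact equivalent is sketched below, assuming every non-empty row begins with an integer. The helper name `load_gene_indices` and the temporary demo file are hypothetical and not part of CANAL.

```python
import csv
import os
import tempfile


def load_gene_indices(path):
    """Return the first space-delimited field of every non-empty row as an int."""
    with open(path, newline='') as handle:
        reader = csv.reader(handle, delimiter=' ', quotechar='|')
        return [int(row[0]) for row in reader if row]


# Demo on a throwaway file with made-up indices (not the real CANAL index file).
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False) as tmp:
    tmp.write('3\n17\n42\n')
    demo_path = tmp.name

demo_idx = load_gene_indices(demo_path)
os.remove(demo_path)
print(demo_idx)
```

Wrapping the returned list in `np.array(...)` gives the same kind of array that is later passed to `CANAL.train` as `highly_variable_idx`.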
[3]:
CANAL = CANAL_model(gpu_option = '1')
stage 1
load the pre-processed dataset "Lung"
[4]:
dataset_stage_1 = "TabulaMuris_Lung_10X"
data_path_stage_1 = './data/{}/{}_train.h5ad'.format(experiments, dataset_stage_1)
adata_stage_1 = sc.read_h5ad(data_path_stage_1)
cell_type_stage_1 = adata_stage_1.obs['cell_type1']
print(adata_stage_1)
print(np.unique(np.array(cell_type_stage_1), return_counts=True))
AnnData object with n_obs × n_vars = 4122 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
uns: 'hvg', 'log1p'
(array(['B cell', 'T cell', 'endothelial cell', 'macrophage',
'natural killer cell', 'stromal cell'], dtype=object), array([ 187, 212, 412, 314, 721, 2276]))
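The Lung labels are imbalanced: stromal cells make up 2276 of the 4122 training cells. The sketch below uses the counts printed above to show why a class-averaged F1 score is worth reading alongside accuracy in the training log. The always-predict-the-majority classifier is a hypothetical baseline for illustration, not something CANAL does, and macro averaging is an assumption here.

```python
# Cell type counts copied from the output above.
counts = {
    'B cell': 187,
    'T cell': 212,
    'endothelial cell': 412,
    'macrophage': 314,
    'natural killer cell': 721,
    'stromal cell': 2276,
}
total = sum(counts.values())
majority = max(counts, key=counts.get)

# Hypothetical baseline: label every cell as the majority type.
accuracy = counts[majority] / total

# Per-class F1 = 2*TP / (2*TP + FP + FN). Only the majority class has TP > 0;
# every other cell is a false positive for it, so 2*TP + FP + FN = TP + total.
f1_majority = 2 * counts[majority] / (counts[majority] + total)
macro_f1 = f1_majority / len(counts)  # the other five classes score 0

print(f'majority class: {majority}')
print(f'accuracy: {accuracy:.4f}, macro F1: {macro_f1:.4f}')
```

Under this baseline the accuracy stays above one half while the macro F1 is far lower, so a high F1 in the log indicates that the rare types are being classified correctly too.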
fine-tune the CANAL model for the first stage
[5]:
CANAL.train(experiments=experiments, pre_dataset="None", dataset=dataset_stage_1,
            adata=adata_stage_1, cell_type=cell_type_stage_1, current_stage=1,
            is_final_stage=False, ckpt_dir='./ckpts/', rehearsal_size=2000,
            highly_variable_idx=highly_variable_idx, SEED=seed)
current data: AnnData object with n_obs × n_vars = 4122 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
uns: 'hvg', 'log1p'
(array(['B cell', 'T cell', 'endothelial cell', 'macrophage',
'natural killer cell', 'stromal cell'], dtype=object), array([ 187, 212, 412, 314, 721, 2276]))
model constructing begin!
/data/wanh/CANAL/performer_pytorch/performer_pytorch.py:115: UserWarning: torch.qr is deprecated in favor of torch.linalg.qr and will be removed in a future PyTorch release.
The boolean parameter 'some' has been replaced with a string parameter 'mode'.
Q, R = torch.qr(A, some)
should be replaced with
Q, R = torch.linalg.qr(A, 'reduced' if some else 'complete') (Triggered internally at ../aten/src/ATen/native/BatchLinearAlgebra.cpp:2497.)
q, r = torch.qr(unstructured_block.cpu(), some = True)
/data/wanh/ENTER/envs/pytorch/lib/python3.7/site-packages/torch/nn/_reduction.py:42: UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
warnings.warn(warning.format(ret))
model constructing finished!
label train: [0 1 2 3 4 5] 6
label val: [0 1 2 3 4 5] 6
== Begin finetuning: | Initial stage | Current stage: 1 | CANAL | Dataset: Cross_tissue TabulaMuris_Lung_10X ==
== Epoch: 1 | Classification Loss: 1.796210 | Representation DL Loss: 0.000000 | Accuracy: 7.8419% ==
== Epoch: 1 | Validation CLS Loss: 1.795722 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.021864 == | Current accuracy: 0.070197 ==
[[ 0 0 0 31 0 0]
[ 0 0 0 38 0 0]
[ 0 0 0 70 0 0]
[ 0 0 0 57 0 0]
[ 0 0 0 163 0 0]
[ 0 0 0 453 0 0]]
Patience: 0 / 10
== Epoch: 2 | Classification Loss: 0.988520 | Representation DL Loss: 0.000000 | Accuracy: 71.1246% ==
== Epoch: 2 | Validation CLS Loss: 2.602915 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.860588 == | Current accuracy: 0.950739 ==
[[ 34 0 0 3 2 0]
[ 2 11 6 0 22 0]
[ 0 0 73 0 0 2]
[ 0 0 0 57 0 0]
[ 0 0 0 0 149 2]
[ 0 0 0 0 1 448]]
Patience: 0 / 10
== Epoch: 3 | Classification Loss: 0.507423 | Representation DL Loss: 0.000000 | Accuracy: 98.3283% ==
== Epoch: 3 | Validation CLS Loss: 3.431119 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.992711 == | Current accuracy: 0.996305 ==
[[ 39 0 0 0 0 0]
[ 0 37 0 0 1 0]
[ 0 0 68 0 0 0]
[ 0 0 0 57 0 0]
[ 0 0 0 0 156 0]
[ 2 0 0 0 0 452]]
Patience: 0 / 10
== Epoch: 4 | Classification Loss: 0.457459 | Representation DL Loss: 0.000000 | Accuracy: 99.5137% ==
== Epoch: 4 | Validation CLS Loss: 2.616953 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.997402 == | Current accuracy: 0.998768 ==
[[ 34 0 0 0 0 0]
[ 0 38 0 0 0 0]
[ 0 0 89 0 0 0]
[ 0 0 0 56 0 0]
[ 0 0 0 0 139 0]
[ 1 0 0 0 0 455]]
Patience: 0 / 10
== Epoch: 5 | Classification Loss: 0.445244 | Representation DL Loss: 0.000000 | Accuracy: 99.6657% ==
== Epoch: 5 | Validation CLS Loss: 2.627202 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.992965 == | Current accuracy: 0.997537 ==
[[ 24 0 0 0 0 0]
[ 0 32 0 0 0 0]
[ 0 0 92 0 0 0]
[ 0 0 0 66 0 0]
[ 0 0 0 0 145 0]
[ 2 0 0 0 0 451]]
Patience: 1 / 10
== Epoch: 6 | Classification Loss: 0.437198 | Representation DL Loss: 0.000000 | Accuracy: 99.8784% ==
== Epoch: 6 | Validation CLS Loss: 3.163036 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.991406 == | Current accuracy: 0.995074 ==
[[ 32 0 0 0 0 0]
[ 0 41 0 0 0 0]
[ 0 0 68 0 0 1]
[ 0 0 0 71 0 0]
[ 0 0 1 0 144 0]
[ 2 0 0 0 0 452]]
Patience: 2 / 10
== Epoch: 7 | Classification Loss: 0.433138 | Representation DL Loss: 0.000000 | Accuracy: 99.9088% ==
== Epoch: 7 | Validation CLS Loss: 3.196552 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.990963 == | Current accuracy: 0.992611 ==
[[ 44 0 0 0 0 0]
[ 0 38 0 0 0 0]
[ 0 0 75 0 0 2]
[ 0 0 0 59 0 0]
[ 0 0 4 0 146 0]
[ 0 0 0 0 0 444]]
Patience: 3 / 10
== Epoch: 8 | Classification Loss: 0.431549 | Representation DL Loss: 0.000000 | Accuracy: 100.0000% ==
== Epoch: 8 | Validation CLS Loss: 2.691682 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.997326 == | Current accuracy: 0.998768 ==
[[ 33 0 0 0 0 0]
[ 0 43 0 0 0 0]
[ 0 0 91 0 0 0]
[ 0 0 0 69 0 0]
[ 0 0 0 0 129 0]
[ 1 0 0 0 0 446]]
Patience: 4 / 10
== Epoch: 9 | Classification Loss: 0.428413 | Representation DL Loss: 0.000000 | Accuracy: 100.0000% ==
== Epoch: 9 | Validation CLS Loss: 2.826878 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.995109 == | Current accuracy: 0.997537 ==
[[ 36 0 0 0 0 0]
[ 0 51 0 0 0 0]
[ 0 0 82 0 0 0]
[ 0 0 0 69 0 0]
[ 0 0 0 0 142 0]
[ 2 0 0 0 0 430]]
Patience: 5 / 10
== Epoch: 10 | Classification Loss: 0.427367 | Representation DL Loss: 0.000000 | Accuracy: 100.0000% ==
== Epoch: 10 | Validation CLS Loss: 2.444757 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.996158 == | Current accuracy: 0.997537 ==
[[ 39 0 0 0 0 0]
[ 0 37 0 0 0 0]
[ 0 0 82 0 0 0]
[ 0 0 0 50 0 0]
[ 0 0 1 0 155 0]
[ 1 0 0 0 0 447]]
Patience: 6 / 10
== Epoch: 11 | Classification Loss: 0.426508 | Representation DL Loss: 0.000000 | Accuracy: 100.0000% ==
== Epoch: 11 | Validation CLS Loss: 2.471816 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.998359 == | Current accuracy: 0.998768 ==
[[ 41 0 0 0 0 0]
[ 0 46 0 0 0 0]
[ 0 0 79 0 0 0]
[ 0 0 0 59 0 0]
[ 0 0 1 0 140 0]
[ 0 0 0 0 0 446]]
Patience: 7 / 10
== Epoch: 12 | Classification Loss: 0.426080 | Representation DL Loss: 0.000000 | Accuracy: 100.0000% ==
== Epoch: 12 | Validation CLS Loss: 2.682750 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.991984 == | Current accuracy: 0.995074 ==
[[ 35 0 0 0 0 0]
[ 0 36 0 0 0 0]
[ 0 0 73 0 0 1]
[ 0 0 0 55 0 0]
[ 0 0 1 0 137 0]
[ 2 0 0 0 0 472]]
Patience: 8 / 10
== Epoch: 13 | Classification Loss: 0.426046 | Representation DL Loss: 0.000000 | Accuracy: 100.0000% ==
== Epoch: 13 | Validation CLS Loss: 2.868453 | Validation Representation DL Loss: 0.000000 | F1 Score: 1.000000 == | Current accuracy: 1.000000 ==
[[ 40 0 0 0 0 0]
[ 0 40 0 0 0 0]
[ 0 0 87 0 0 0]
[ 0 0 0 67 0 0]
[ 0 0 0 0 143 0]
[ 0 0 0 0 0 435]]
Patience: 0 / 10
== Epoch: 14 | Classification Loss: 0.425465 | Representation DL Loss: 0.000000 | Accuracy: 100.0000% ==
== Epoch: 14 | Validation CLS Loss: 2.835544 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.995565 == | Current accuracy: 0.997537 ==
[[ 32 0 0 0 0 0]
[ 0 35 0 0 0 0]
[ 0 0 76 0 0 0]
[ 0 0 0 65 0 0]
[ 0 0 1 0 138 0]
[ 1 0 0 0 0 464]]
Patience: 1 / 10
== Epoch: 15 | Classification Loss: 0.425373 | Representation DL Loss: 0.000000 | Accuracy: 100.0000% ==
== Epoch: 15 | Validation CLS Loss: 2.831876 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.997797 == | Current accuracy: 0.998768 ==
[[ 41 0 0 0 0 0]
[ 0 40 0 0 0 0]
[ 0 0 91 0 0 0]
[ 0 0 0 64 0 0]
[ 0 0 0 0 149 0]
[ 1 0 0 0 0 426]]
Patience: 2 / 10
== Epoch: 16 | Classification Loss: 0.425474 | Representation DL Loss: 0.000000 | Accuracy: 100.0000% ==
== Epoch: 16 | Validation CLS Loss: 2.901624 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.992787 == | Current accuracy: 0.996305 ==
[[ 36 0 0 0 0 0]
[ 0 48 0 0 0 0]
[ 0 0 69 0 0 0]
[ 0 0 0 70 0 0]
[ 0 0 0 0 130 0]
[ 3 0 0 0 0 456]]
Patience: 3 / 10
== Epoch: 17 | Classification Loss: 0.425706 | Representation DL Loss: 0.000000 | Accuracy: 100.0000% ==
== Epoch: 17 | Validation CLS Loss: 2.852321 | Validation Representation DL Loss: 0.000000 | F1 Score: 1.000000 == | Current accuracy: 1.000000 ==
[[ 41 0 0 0 0 0]
[ 0 39 0 0 0 0]
[ 0 0 84 0 0 0]
[ 0 0 0 77 0 0]
[ 0 0 0 0 139 0]
[ 0 0 0 0 0 432]]
Patience: 4 / 10
== Epoch: 18 | Classification Loss: 0.425208 | Representation DL Loss: 0.000000 | Accuracy: 100.0000% ==
== Epoch: 18 | Validation CLS Loss: 2.825035 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.994437 == | Current accuracy: 0.996305 ==
[[ 39 0 0 0 0 0]
[ 0 42 0 0 0 0]
[ 0 0 84 0 0 0]
[ 0 0 0 57 0 0]
[ 0 0 2 0 126 0]
[ 1 0 0 0 0 461]]
Patience: 5 / 10
== Epoch: 19 | Classification Loss: 0.424822 | Representation DL Loss: 0.000000 | Accuracy: 100.0000% ==
== Epoch: 19 | Validation CLS Loss: 2.915099 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.995456 == | Current accuracy: 0.997537 ==
[[ 39 0 0 0 0 0]
[ 0 45 0 0 0 0]
[ 0 0 84 0 0 0]
[ 0 0 0 60 0 0]
[ 0 0 0 0 141 0]
[ 2 0 0 0 0 441]]
Patience: 6 / 10
== Epoch: 20 | Classification Loss: 0.425196 | Representation DL Loss: 0.000000 | Accuracy: 100.0000% ==
== Epoch: 20 | Validation CLS Loss: 2.823490 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.992012 == | Current accuracy: 0.995074 ==
[[ 42 0 0 0 0 0]
[ 0 35 0 0 0 0]
[ 0 0 75 0 0 0]
[ 0 0 0 63 0 0]
[ 0 0 1 0 142 0]
[ 3 0 0 0 0 451]]
Patience: 7 / 10
== Epoch: 21 | Classification Loss: 0.424549 | Representation DL Loss: 0.000000 | Accuracy: 100.0000% ==
== Epoch: 21 | Validation CLS Loss: 2.730419 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.996778 == | Current accuracy: 0.998768 ==
[[ 27 0 0 0 0 0]
[ 0 39 0 0 0 0]
[ 0 0 87 0 0 0]
[ 0 0 0 65 0 0]
[ 0 0 0 0 159 0]
[ 1 0 0 0 0 434]]
Patience: 8 / 10
== Epoch: 22 | Classification Loss: 0.423875 | Representation DL Loss: 0.000000 | Accuracy: 100.0000% ==
== Epoch: 22 | Validation CLS Loss: 2.812331 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.997473 == | Current accuracy: 0.998768 ==
[[ 35 0 0 0 0 0]
[ 0 36 0 0 0 0]
[ 0 0 95 0 0 0]
[ 0 0 0 49 0 0]
[ 0 0 0 0 132 0]
[ 1 0 0 0 0 464]]
Patience: 9 / 10
== Epoch: 23 | Classification Loss: 0.423604 | Representation DL Loss: 0.000000 | Accuracy: 100.0000% ==
== Epoch: 23 | Validation CLS Loss: 2.896816 | Validation Representation DL Loss: 0.000000 | F1 Score: 0.998409 == | Current accuracy: 0.998768 ==
[[ 29 0 0 0 0 0]
[ 0 37 0 0 0 0]
[ 0 0 87 0 0 0]
[ 0 0 0 58 0 0]
[ 0 0 1 0 130 0]
[ 0 0 0 0 0 470]]
Patience: 10 / 10
== Epoch: 24 | Classification Loss: 0.423487 | Representation DL Loss: 0.000000 | Accuracy: 100.0000% ==
== Epoch: 24 | Validation CLS Loss: 2.216436 | Validation Representation DL Loss: 0.000000 | F1 Score: 1.000000 == | Current accuracy: 1.000000 ==
[[ 43 0 0 0 0 0]
[ 0 36 0 0 0 0]
[ 0 0 91 0 0 0]
[ 0 0 0 60 0 0]
[ 0 0 0 0 127 0]
[ 0 0 0 0 0 455]]
Patience: 11 / 10
/data/wanh/CANAL/model.py:105: ImplicitModificationWarning: Trying to modify attribute `.obs` of view, initializing view as actual.
adata_i.obs['celltype'] = current_label_dict[i]
/data/wanh/ENTER/envs/pytorch/lib/python3.7/site-packages/anndata/_core/anndata.py:1785: FutureWarning: X.dtype being converted to np.float32 from float64. In the next version of anndata (0.9) conversion will not be automatic. Pass dtype explicitly to avoid this warning. Pass `AnnData(X, dtype=X.dtype, ...)` to get the future behavour.
[AnnData(sparse.csr_matrix(a.shape), obs=a.obs) for a in all_adatas],
example bank after updating:
AnnData object with n_obs × n_vars = 1712 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch', 'celltype', 'rank', 'stage'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
cell type composition of this example bank:
(array(['B cell', 'T cell', 'endothelial cell', 'macrophage',
'natural killer cell', 'stromal cell'], dtype=object), array([187, 212, 333, 314, 333, 333]))
dataset composition from each stage of this example bank:
(array([1]), array([1712]))
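The bank composition printed above is consistent with a per-class cap of `rehearsal_size // n_classes` (2000 // 6 = 333), with rarer types kept in full. The sketch below reproduces those counts under that assumption. It samples uniformly at random, whereas the `rank` column in the bank suggests that CANAL selects cells by some ranking, so the hypothetical function `build_bank` is illustrative only and is not CANAL's implementation.

```python
import random
from collections import Counter


def build_bank(labels, rehearsal_size, seed=1):
    """Keep at most rehearsal_size // n_classes randomly chosen cells per class."""
    rng = random.Random(seed)
    by_class = {}
    for cell_idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(cell_idx)
    quota = rehearsal_size // len(by_class)
    kept = []
    for label, members in by_class.items():
        if len(members) > quota:
            members = rng.sample(members, quota)
        kept.extend(members)
    return sorted(kept)


# Stage 1 label counts copied from the output above.
stage_1_counts = {
    'B cell': 187, 'T cell': 212, 'endothelial cell': 412,
    'macrophage': 314, 'natural killer cell': 721, 'stromal cell': 2276,
}
labels = [name for name, n in stage_1_counts.items() for _ in range(n)]

bank = build_bank(labels, rehearsal_size=2000)
bank_counts = Counter(labels[i] for i in bank)
print(len(bank), dict(bank_counts))
```

A flat cap of this kind keeps the bank from being dominated by stromal cells. The stage 2 bank reported later (995 cells, 124 or 125 per type) does not follow from this rule alone, so the actual update evidently involves more than a flat cap.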
stage 2
load the pre-processed dataset "Mammary_Gland"
[5]:
dataset_stage_2 = "TabulaMuris_Mammary_Gland_10X"
data_path_stage_2 = './data/{}/{}_train.h5ad'.format(experiments, dataset_stage_2)
adata_stage_2 = sc.read_h5ad(data_path_stage_2)
cell_type_stage_2 = adata_stage_2.obs['cell_type1']
print(adata_stage_2)
print(np.unique(np.array(cell_type_stage_2), return_counts=True))
AnnData object with n_obs × n_vars = 3981 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
uns: 'hvg', 'log1p'
(array(['B cell', 'T cell', 'basal cell', 'endothelial cell',
'luminal epithelial cell of mammary gland', 'macrophage',
'stromal cell'], dtype=object), array([ 665, 1547, 341, 223, 411, 164, 630]))
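Comparing these labels with the stage 1 labels shows which cell types the model has not seen before. A minimal sketch follows, with both label lists copied from the outputs above. How CANAL detects new types internally is not shown in this tutorial, so the set comparison is only an illustration.

```python
stage_1_types = ['B cell', 'T cell', 'endothelial cell', 'macrophage',
                 'natural killer cell', 'stromal cell']
stage_2_types = ['B cell', 'T cell', 'basal cell', 'endothelial cell',
                 'luminal epithelial cell of mammary gland', 'macrophage',
                 'stromal cell']

known = set(stage_1_types)
current = set(stage_2_types)

# Types that first appear in stage 2, in the order they are listed.
new_types = [t for t in stage_2_types if t not in known]
# Types learned in stage 1 that the stage 2 dataset does not contain.
absent_types = [t for t in stage_1_types if t not in current]

print('new in stage 2:', new_types)
print('seen before but absent from stage 2:', absent_types)
```

Natural killer cells do not occur in the Mammary Gland data, so during stage 2 that type is represented only by the cells replayed from the example bank.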
fine-tune the CANAL model for the second stage
[7]:
CANAL.train(experiments=experiments, pre_dataset=dataset_stage_1, dataset=dataset_stage_2,
            adata=adata_stage_2, cell_type=cell_type_stage_2, current_stage=2,
            is_final_stage=False, ckpt_dir='./ckpts/', rehearsal_size=1000, SEED=seed)
current data: AnnData object with n_obs × n_vars = 3981 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
uns: 'hvg', 'log1p'
(array(['B cell', 'T cell', 'basal cell', 'endothelial cell',
'luminal epithelial cell of mammary gland', 'macrophage',
'stromal cell'], dtype=object), array([ 665, 1547, 341, 223, 411, 164, 630]))
model constructing begin!
new cell types: ['basal cell', 'luminal epithelial cell of mammary gland']
example bank for experience replay: AnnData object with n_obs × n_vars = 1712 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch', 'celltype', 'rank', 'stage'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
model constructing finished!
label train: [0 1 2 3 4 5 6 7] 8
label val: [0 1 2 3 4 5 6 7] 8
== Begin finetuning: | Incrmental stage | Current stage: 2 | CANAL | Dataset: Cross_tissue TabulaMuris_Mammary_Gland_10X ==
/data/wanh/ENTER/envs/pytorch/lib/python3.7/site-packages/torch/nn/_reduction.py:42: UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
warnings.warn(warning.format(ret))
== Epoch: 1 | Classification Loss: 0.943613 | Representation DL Loss: 0.004290 | Accuracy: 83.0989% ==
== Epoch: 1 | Validation CLS Loss: 3.049186 | Validation Representation DL Loss: 0.008296 | F1 Score: 0.678923 == | Current accuracy: 0.837743 ==
[[160 0 0 0 0 0 0 0]
[ 6 338 0 0 3 0 0 0]
[ 0 0 118 0 0 0 0 0]
[ 20 1 0 80 1 4 0 0]
[ 0 0 0 0 76 0 0 0]
[ 2 1 0 0 0 178 0 0]
[ 0 0 0 0 0 71 0 0]
[ 5 4 0 0 0 66 0 0]]
Patience: 0 / 10
== Epoch: 2 | Classification Loss: 0.637936 | Representation DL Loss: 0.247472 | Accuracy: 95.6264% ==
== Epoch: 2 | Validation CLS Loss: 3.465677 | Validation Representation DL Loss: 0.275875 | F1 Score: 0.980252 == | Current accuracy: 0.984127 ==
[[191 4 0 0 0 0 0 0]
[ 0 343 0 0 4 0 0 0]
[ 0 0 101 0 0 0 0 0]
[ 2 2 0 99 0 0 0 0]
[ 0 0 0 0 56 0 0 0]
[ 0 0 0 2 0 193 0 0]
[ 0 0 0 0 0 0 59 1]
[ 0 0 0 0 0 0 3 74]]
Patience: 0 / 10
== Epoch: 3 | Classification Loss: 0.529432 | Representation DL Loss: 0.269740 | Accuracy: 98.7033% ==
== Epoch: 3 | Validation CLS Loss: 3.574537 | Validation Representation DL Loss: 0.260100 | F1 Score: 0.992246 == | Current accuracy: 0.992945 ==
[[149 0 0 0 0 0 0 0]
[ 3 361 0 0 1 0 0 0]
[ 0 0 97 0 0 0 0 0]
[ 0 0 0 103 0 0 0 0]
[ 0 0 0 0 71 0 0 0]
[ 0 0 0 0 0 205 0 0]
[ 0 0 0 0 0 0 72 0]
[ 0 3 0 0 0 0 1 68]]
Patience: 0 / 10
== Epoch: 4 | Classification Loss: 0.504755 | Representation DL Loss: 0.261056 | Accuracy: 99.4066% ==
== Epoch: 4 | Validation CLS Loss: 3.403684 | Validation Representation DL Loss: 0.240331 | F1 Score: 0.992208 == | Current accuracy: 0.992945 ==
[[161 2 0 0 0 0 0 0]
[ 0 356 0 0 2 0 0 0]
[ 0 0 104 0 0 0 0 0]
[ 0 0 0 99 0 0 0 0]
[ 0 2 0 0 63 0 0 0]
[ 0 0 0 0 0 205 0 0]
[ 0 0 0 0 0 0 69 0]
[ 0 2 0 0 0 0 0 69]]
Patience: 1 / 10
== Epoch: 5 | Classification Loss: 0.495891 | Representation DL Loss: 0.239483 | Accuracy: 99.5824% ==
== Epoch: 5 | Validation CLS Loss: 3.789664 | Validation Representation DL Loss: 0.243155 | F1 Score: 0.994531 == | Current accuracy: 0.995591 ==
[[170 1 0 0 0 0 0 0]
[ 0 319 0 0 0 0 0 0]
[ 0 0 110 0 0 0 0 0]
[ 1 0 0 95 0 0 0 0]
[ 0 1 0 0 75 0 0 0]
[ 0 0 0 0 0 187 0 0]
[ 0 0 0 0 0 0 91 0]
[ 0 0 0 0 0 0 2 82]]
Patience: 0 / 10
== Epoch: 6 | Classification Loss: 0.490662 | Representation DL Loss: 0.220643 | Accuracy: 99.6264% ==
== Epoch: 6 | Validation CLS Loss: 3.460462 | Validation Representation DL Loss: 0.212096 | F1 Score: 0.990578 == | Current accuracy: 0.990300 ==
[[164 5 0 0 0 0 0 0]
[ 0 343 0 0 3 0 0 0]
[ 0 0 113 0 0 0 0 0]
[ 0 0 0 91 0 0 0 0]
[ 0 2 0 0 73 0 0 0]
[ 0 0 0 0 0 184 0 0]
[ 0 0 0 0 0 0 72 0]
[ 0 0 0 0 0 0 1 83]]
Patience: 1 / 10
== Epoch: 7 | Classification Loss: 0.484313 | Representation DL Loss: 0.205188 | Accuracy: 99.8462% ==
== Epoch: 7 | Validation CLS Loss: 3.764229 | Validation Representation DL Loss: 0.231196 | F1 Score: 0.996811 == | Current accuracy: 0.997354 ==
[[169 0 0 0 0 0 0 0]
[ 0 360 0 0 0 0 0 0]
[ 0 0 109 0 0 0 0 0]
[ 0 0 0 83 0 0 0 0]
[ 0 2 0 0 61 0 0 0]
[ 0 0 0 0 0 176 0 0]
[ 0 0 0 0 0 0 78 0]
[ 0 1 0 0 0 0 0 95]]
Patience: 0 / 10
== Epoch: 8 | Classification Loss: 0.482291 | Representation DL Loss: 0.195704 | Accuracy: 99.9121% ==
== Epoch: 8 | Validation CLS Loss: 3.383599 | Validation Representation DL Loss: 0.159232 | F1 Score: 0.993031 == | Current accuracy: 0.992063 ==
[[162 5 0 0 0 0 0 0]
[ 0 356 0 0 1 0 0 0]
[ 0 0 120 0 0 0 0 0]
[ 0 0 0 101 0 0 0 0]
[ 0 2 0 0 70 0 0 0]
[ 0 0 0 0 0 179 0 0]
[ 0 0 0 0 0 0 67 0]
[ 0 1 0 0 0 0 0 70]]
Patience: 1 / 10
== Epoch: 9 | Classification Loss: 0.480391 | Representation DL Loss: 0.177330 | Accuracy: 99.8462% ==
== Epoch: 9 | Validation CLS Loss: 3.714155 | Validation Representation DL Loss: 0.151181 | F1 Score: 0.996509 == | Current accuracy: 0.995591 ==
[[157 3 0 0 0 0 0 0]
[ 0 352 0 0 1 0 0 0]
[ 0 0 110 0 0 0 0 0]
[ 0 0 0 83 0 0 0 0]
[ 0 0 0 0 93 0 0 0]
[ 0 0 0 0 0 191 0 0]
[ 0 0 0 0 0 0 61 0]
[ 0 1 0 0 0 0 0 82]]
Patience: 2 / 10
== Epoch: 10 | Classification Loss: 0.478689 | Representation DL Loss: 0.175591 | Accuracy: 99.9121% ==
== Epoch: 10 | Validation CLS Loss: 3.540380 | Validation Representation DL Loss: 0.168283 | F1 Score: 0.987462 == | Current accuracy: 0.988536 ==
[[183 5 0 0 0 0 0 0]
[ 0 333 0 0 1 0 0 0]
[ 0 0 106 0 0 0 0 0]
[ 0 0 0 109 0 0 0 0]
[ 0 3 0 0 72 0 0 0]
[ 0 0 0 0 0 185 0 0]
[ 0 0 0 0 0 0 64 0]
[ 0 2 0 0 0 0 2 69]]
Patience: 3 / 10
== Epoch: 11 | Classification Loss: 0.476955 | Representation DL Loss: 0.175457 | Accuracy: 99.9341% ==
== Epoch: 11 | Validation CLS Loss: 3.778132 | Validation Representation DL Loss: 0.163178 | F1 Score: 0.995504 == | Current accuracy: 0.993827 ==
[[153 4 0 0 0 0 0 0]
[ 2 381 0 0 0 0 0 0]
[ 0 0 111 0 0 0 0 0]
[ 0 0 0 90 0 0 0 0]
[ 0 1 0 0 65 0 0 0]
[ 0 0 0 0 0 187 0 0]
[ 0 0 0 0 0 0 66 0]
[ 0 0 0 0 0 0 0 74]]
Patience: 4 / 10
== Epoch: 12 | Classification Loss: 0.476649 | Representation DL Loss: 0.176870 | Accuracy: 99.9121% ==
== Epoch: 12 | Validation CLS Loss: 3.764977 | Validation Representation DL Loss: 0.168706 | F1 Score: 0.993996 == | Current accuracy: 0.994709 ==
[[183 1 0 0 0 0 0 0]
[ 0 329 0 0 1 0 0 0]
[ 0 0 119 0 0 0 0 0]
[ 0 0 0 98 0 0 0 0]
[ 0 1 0 0 59 0 0 0]
[ 0 0 0 0 0 187 0 0]
[ 0 0 0 0 0 0 78 0]
[ 0 3 0 0 0 0 0 75]]
Patience: 5 / 10
== Epoch: 13 | Classification Loss: 0.475814 | Representation DL Loss: 0.172002 | Accuracy: 99.9560% ==
== Epoch: 13 | Validation CLS Loss: 3.900694 | Validation Representation DL Loss: 0.164699 | F1 Score: 0.993045 == | Current accuracy: 0.992945 ==
[[172 3 0 0 0 0 0 0]
[ 0 352 0 0 2 0 0 0]
[ 0 0 116 0 0 0 0 0]
[ 0 0 0 95 0 0 0 0]
[ 0 0 0 0 63 0 0 0]
[ 0 0 0 0 0 181 0 0]
[ 0 0 0 0 0 0 74 0]
[ 0 3 0 0 0 0 0 73]]
Patience: 6 / 10
== Epoch: 14 | Classification Loss: 0.475027 | Representation DL Loss: 0.171514 | Accuracy: 100.0000% ==
== Epoch: 14 | Validation CLS Loss: 3.630681 | Validation Representation DL Loss: 0.162798 | F1 Score: 0.995602 == | Current accuracy: 0.995591 ==
[[203 2 0 0 0 0 0 0]
[ 0 323 0 0 2 0 0 0]
[ 0 0 104 0 0 0 0 0]
[ 0 0 0 92 0 0 0 0]
[ 0 0 0 0 60 0 0 0]
[ 0 0 0 0 0 198 0 0]
[ 0 0 0 0 0 0 69 0]
[ 0 1 0 0 0 0 0 80]]
Patience: 7 / 10
== Epoch: 15 | Classification Loss: 0.475448 | Representation DL Loss: 0.171358 | Accuracy: 99.9780% ==
== Epoch: 15 | Validation CLS Loss: 3.577396 | Validation Representation DL Loss: 0.160683 | F1 Score: 0.994676 == | Current accuracy: 0.993827 ==
[[160 4 0 0 0 0 0 0]
[ 0 336 0 0 0 0 0 0]
[ 0 0 111 0 0 0 0 0]
[ 0 0 0 89 0 0 0 0]
[ 0 2 0 0 69 0 0 0]
[ 0 0 0 0 0 213 0 0]
[ 0 0 0 0 0 0 61 0]
[ 0 1 0 0 0 0 0 88]]
Patience: 8 / 10
== Epoch: 16 | Classification Loss: 0.475443 | Representation DL Loss: 0.164815 | Accuracy: 99.9560% ==
== Epoch: 16 | Validation CLS Loss: 3.531243 | Validation Representation DL Loss: 0.164516 | F1 Score: 0.997349 == | Current accuracy: 0.996473 ==
[[159 3 0 0 0 0 0 0]
[ 0 352 0 0 0 0 0 0]
[ 0 0 100 0 0 0 0 0]
[ 0 0 0 112 0 0 0 0]
[ 0 0 0 0 63 0 0 0]
[ 0 0 0 0 0 191 0 0]
[ 0 0 0 0 0 0 73 0]
[ 0 1 0 0 0 0 0 80]]
Patience: 9 / 10
== Epoch: 17 | Classification Loss: 0.475245 | Representation DL Loss: 0.169262 | Accuracy: 99.9560% ==
== Epoch: 17 | Validation CLS Loss: 3.615410 | Validation Representation DL Loss: 0.154457 | F1 Score: 0.997205 == | Current accuracy: 0.997354 ==
[[151 1 0 0 0 0 0 0]
[ 0 337 0 0 1 0 0 0]
[ 0 0 123 0 0 0 0 0]
[ 0 0 0 93 0 0 0 0]
[ 0 0 0 0 65 0 0 0]
[ 0 0 0 0 0 219 0 0]
[ 0 0 0 0 0 0 72 0]
[ 0 1 0 0 0 0 0 71]]
Patience: 10 / 10
== Epoch: 18 | Classification Loss: 0.475913 | Representation DL Loss: 0.161787 | Accuracy: 99.9341% ==
== Epoch: 18 | Validation CLS Loss: 3.638239 | Validation Representation DL Loss: 0.167357 | F1 Score: 0.992559 == | Current accuracy: 0.992945 ==
[[167 3 0 0 0 0 0 0]
[ 0 381 0 0 3 0 0 0]
[ 0 0 96 0 0 0 0 0]
[ 0 0 0 101 0 0 0 0]
[ 0 0 0 0 50 0 0 0]
[ 0 0 0 0 0 174 0 0]
[ 0 0 0 0 0 0 68 0]
[ 0 2 0 0 0 0 0 89]]
Patience: 11 / 10
/data/wanh/CANAL/model.py:105: ImplicitModificationWarning: Trying to modify attribute `.obs` of view, initializing view as actual.
adata_i.obs['celltype'] = current_label_dict[i]
/data/wanh/ENTER/envs/pytorch/lib/python3.7/site-packages/anndata/_core/anndata.py:1785: FutureWarning: X.dtype being converted to np.float32 from float64. In the next version of anndata (0.9) conversion will not be automatic. Pass dtype explicitly to avoid this warning. Pass `AnnData(X, dtype=X.dtype, ...)` to get the future behavour.
[AnnData(sparse.csr_matrix(a.shape), obs=a.obs) for a in all_adatas],
example bank after updating:
AnnData object with n_obs × n_vars = 995 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch', 'celltype', 'rank', 'stage'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
cell type composition of this example bank:
(array(['B cell', 'T cell', 'basal cell', 'endothelial cell',
'luminal epithelial cell of mammary gland', 'macrophage',
'natural killer cell', 'stromal cell'], dtype=object), array([124, 124, 125, 124, 125, 124, 125, 124]))
dataset composition from each stage of this example bank:
(array([1, 2]), array([435, 560]))
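In both training logs above, the patience counter resets only when the validation accuracy strictly exceeds its previous best, and training stops once the counter passes 10. The sketch below replays the stage 2 validation accuracies under that rule. The rule is inferred from the log rather than taken from CANAL's source, and the function name `patience_trace` is hypothetical.

```python
def patience_trace(val_accuracies, max_patience=10):
    """Replay early stopping: reset on strict improvement, stop when patience > max."""
    best = float('-inf')
    patience = 0
    trace = []
    for acc in val_accuracies:
        if acc > best:
            best = acc
            patience = 0
        else:
            patience += 1
        trace.append(patience)
        if patience > max_patience:
            break
    return trace


# "Current accuracy" values from the stage 2 log, epochs 1 to 18.
stage_2_acc = [
    0.837743, 0.984127, 0.992945, 0.992945, 0.995591, 0.990300,
    0.997354, 0.992063, 0.995591, 0.988536, 0.993827, 0.994709,
    0.992945, 0.995591, 0.993827, 0.996473, 0.997354, 0.992945,
]

trace = patience_trace(stage_2_acc)
print(trace)
```

The replayed counter matches the `Patience: k / 10` lines of the stage 2 log. Because ties do not count as improvement, epoch 17 equals the best accuracy from epoch 7 yet still increments the counter.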
stage 3
load the pre-processed dataset "Limb_Muscle"
[6]:
dataset_stage_3 = "TabulaMuris_Limb_Muscle_10X"
data_path_stage_3 = './data/{}/{}_train.h5ad'.format(experiments, dataset_stage_3)
adata_stage_3 = sc.read_h5ad(data_path_stage_3)
cell_type_stage_3 = adata_stage_3.obs['cell_type1']
print(adata_stage_3)
print(np.unique(np.array(cell_type_stage_3), return_counts=True))
AnnData object with n_obs × n_vars = 3409 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
uns: 'hvg', 'log1p'
(array(['B cell', 'T cell', 'endothelial cell', 'macrophage',
'mesenchymal stem cell', 'skeletal muscle satellite cell'],
dtype=object), array([ 398, 269, 1157, 274, 998, 313]))
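With Limb Muscle, two more types join the label set, which matches the ten class indices reported in the stage 3 training log. The sketch below appends unseen types to a running label map stage by stage. Assigning indices in order of first appearance is an assumption, so the specific index given to each type may differ from CANAL's.

```python
# Cell types per stage, copied from the dataset outputs in this tutorial.
stage_types = {
    1: ['B cell', 'T cell', 'endothelial cell', 'macrophage',
        'natural killer cell', 'stromal cell'],
    2: ['B cell', 'T cell', 'basal cell', 'endothelial cell',
        'luminal epithelial cell of mammary gland', 'macrophage',
        'stromal cell'],
    3: ['B cell', 'T cell', 'endothelial cell', 'macrophage',
        'mesenchymal stem cell', 'skeletal muscle satellite cell'],
}

label_map = {}
sizes = []
for stage in sorted(stage_types):
    for cell_type in stage_types[stage]:
        # Known types keep their index; unseen types get the next free one.
        label_map.setdefault(cell_type, len(label_map))
    sizes.append(len(label_map))

print(sizes)
print(label_map['mesenchymal stem cell'], label_map['skeletal muscle satellite cell'])
```

The label set grows from 6 to 8 to 10 classes across the three stages, in line with the `label train` lines printed at the start of each fine-tuning run.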
fine-tune the CANAL model for the third stage
[7]:
CANAL.train(experiments=experiments, pre_dataset=dataset_stage_2, dataset=dataset_stage_3,
            adata=adata_stage_3, cell_type=cell_type_stage_3, current_stage=3,
            is_final_stage=False, ckpt_dir='./ckpts/', rehearsal_size=1000, SEED=seed)
current data: AnnData object with n_obs × n_vars = 3409 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
uns: 'hvg', 'log1p'
(array(['B cell', 'T cell', 'endothelial cell', 'macrophage',
'mesenchymal stem cell', 'skeletal muscle satellite cell'],
dtype=object), array([ 398, 269, 1157, 274, 998, 313]))
model constructing begin!
new cell types: ['mesenchymal stem cell', 'skeletal muscle satellite cell']
example bank for experience replay: AnnData object with n_obs × n_vars = 995 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch', 'celltype', 'rank', 'stage'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
/data/wanh/CANAL/performer_pytorch/performer_pytorch.py:115: UserWarning: torch.qr is deprecated in favor of torch.linalg.qr and will be removed in a future PyTorch release.
The boolean parameter 'some' has been replaced with a string parameter 'mode'.
Q, R = torch.qr(A, some)
should be replaced with
Q, R = torch.linalg.qr(A, 'reduced' if some else 'complete') (Triggered internally at ../aten/src/ATen/native/BatchLinearAlgebra.cpp:2497.)
q, r = torch.qr(unstructured_block.cpu(), some = True)
model constructing finished!
label train: [0 1 2 3 4 5 6 7 8 9] 10
label val: [0 1 2 3 4 5 6 7 8 9] 10
== Begin finetuning: | Incrmental stage | Current stage: 3 | CANAL | Dataset: Cross_tissue TabulaMuris_Limb_Muscle_10X ==
/data/wanh/ENTER/envs/pytorch/lib/python3.7/site-packages/torch/nn/_reduction.py:42: UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
warnings.warn(warning.format(ret))
== Epoch: 1 | Classification Loss: 1.269150 | Representation DL Loss: 0.014623 | Accuracy: 67.7291% ==
== Epoch: 1 | Validation CLS Loss: 3.424092 | Validation Representation DL Loss: 0.033891 | F1 Score: 0.700836 == | Current accuracy: 0.686636 ==
[[ 98 0 0 0 0 0 0 0 0 0]
[ 0 73 1 0 1 0 0 0 0 0]
[ 0 0 254 0 0 0 0 0 0 0]
[ 0 0 0 67 0 1 0 0 0 0]
[ 0 0 0 0 18 0 0 0 0 0]
[ 0 0 0 0 0 26 0 0 0 0]
[ 0 0 0 0 0 0 29 0 0 0]
[ 0 0 0 0 0 0 0 31 0 0]
[ 0 0 2 0 0 200 0 0 0 0]
[ 2 17 0 0 0 46 0 0 2 0]]
Patience: 0 / 10
== Epoch: 2 | Classification Loss: 0.700298 | Representation DL Loss: 0.393118 | Accuracy: 93.5970% ==
== Epoch: 2 | Validation CLS Loss: 3.925053 | Validation Representation DL Loss: 0.438186 | F1 Score: 0.963895 == | Current accuracy: 0.981567 ==
[[110 0 0 0 0 0 0 0 0 0]
[ 0 71 0 0 0 0 0 0 0 0]
[ 0 0 238 0 0 0 0 0 0 0]
[ 0 0 0 69 0 1 0 0 0 0]
[ 0 0 0 0 27 0 0 0 0 0]
[ 0 0 0 0 0 17 0 0 15 0]
[ 0 0 0 0 0 0 29 0 0 0]
[ 0 0 0 0 0 0 0 21 0 0]
[ 0 0 0 0 0 0 0 0 214 0]
[ 0 0 0 0 0 0 0 0 0 56]]
Patience: 0 / 10
== Epoch: 3 | Classification Loss: 0.571220 | Representation DL Loss: 0.381694 | Accuracy: 98.6056% ==
== Epoch: 3 | Validation CLS Loss: 3.659403 | Validation Representation DL Loss: 0.350093 | F1 Score: 0.970678 == | Current accuracy: 0.981567 ==
[[ 98 0 0 0 0 0 0 0 0 0]
[ 0 82 0 0 1 0 0 0 0 0]
[ 0 0 250 0 0 0 0 0 0 0]
[ 0 0 2 61 0 0 0 0 0 0]
[ 0 0 0 0 22 0 0 0 0 0]
[ 0 0 0 0 0 16 0 0 7 0]
[ 0 0 0 0 0 0 25 0 0 0]
[ 0 0 0 0 0 0 0 23 0 0]
[ 0 0 4 1 0 1 0 0 217 0]
[ 0 0 0 0 0 0 0 0 0 58]]
Patience: 1 / 10
== Epoch: 4 | Classification Loss: 0.547451 | Representation DL Loss: 0.349536 | Accuracy: 98.9186% ==
== Epoch: 4 | Validation CLS Loss: 4.033183 | Validation Representation DL Loss: 0.387530 | F1 Score: 0.976299 == | Current accuracy: 0.983871 ==
[[110 0 0 0 0 0 0 0 0 0]
[ 0 86 0 0 1 0 0 0 0 0]
[ 0 0 223 0 0 0 0 0 0 0]
[ 0 0 0 64 0 0 0 0 0 0]
[ 0 0 0 0 30 0 0 0 0 0]
[ 0 0 0 0 0 22 0 0 9 0]
[ 0 0 0 0 0 0 25 0 0 0]
[ 0 0 0 0 0 0 0 21 0 0]
[ 0 0 3 0 0 0 0 0 199 0]
[ 0 0 0 0 0 0 0 0 1 74]]
Patience: 0 / 10
== Epoch: 5 | Classification Loss: 0.536278 | Representation DL Loss: 0.325964 | Accuracy: 99.2032% ==
== Epoch: 5 | Validation CLS Loss: 3.911783 | Validation Representation DL Loss: 0.308773 | F1 Score: 0.983484 == | Current accuracy: 0.987327 ==
[[115 0 0 0 0 0 0 0 0 0]
[ 0 75 0 0 1 0 0 0 0 0]
[ 0 0 236 0 0 0 0 0 0 0]
[ 0 0 1 85 0 0 0 0 0 0]
[ 0 0 0 0 23 0 0 0 0 0]
[ 0 0 0 0 0 24 0 0 3 0]
[ 0 0 0 0 0 0 20 0 0 0]
[ 0 0 0 0 0 0 0 21 0 0]
[ 0 0 1 2 0 1 0 0 199 0]
[ 0 0 0 0 0 0 0 0 2 59]]
Patience: 0 / 10
== Epoch: 6 | Classification Loss: 0.524765 | Representation DL Loss: 0.294313 | Accuracy: 99.7154% ==
== Epoch: 6 | Validation CLS Loss: 3.702503 | Validation Representation DL Loss: 0.267107 | F1 Score: 0.991185 == | Current accuracy: 0.991935 ==
[[100 0 0 0 0 0 0 0 0 0]
[ 0 66 0 0 1 0 0 0 0 0]
[ 0 0 257 0 0 0 0 0 0 0]
[ 0 0 2 86 0 0 0 0 0 0]
[ 0 0 0 0 34 0 0 0 0 0]
[ 0 0 0 0 0 20 0 0 1 0]
[ 0 0 0 0 0 0 29 0 0 0]
[ 0 0 0 0 0 0 0 25 0 0]
[ 0 0 1 1 0 0 0 0 181 0]
[ 0 0 0 0 0 0 0 0 1 63]]
Patience: 0 / 10
== Epoch: 7 | Classification Loss: 0.519414 | Representation DL Loss: 0.273046 | Accuracy: 99.8577% ==
== Epoch: 7 | Validation CLS Loss: 3.830669 | Validation Representation DL Loss: 0.249767 | F1 Score: 0.986853 == | Current accuracy: 0.989631 ==
[[ 91 0 0 0 0 0 0 0 0 0]
[ 0 95 0 0 1 0 0 0 0 0]
[ 0 0 244 0 0 0 0 0 0 0]
[ 0 0 1 81 0 0 0 0 0 0]
[ 0 0 0 0 28 0 0 0 0 0]
[ 0 0 0 0 0 22 0 0 3 0]
[ 0 0 0 0 0 0 20 0 0 0]
[ 0 0 0 0 0 0 0 24 0 0]
[ 0 0 2 1 0 0 0 0 202 0]
[ 0 0 0 0 0 0 0 0 1 52]]
Patience: 1 / 10
== Epoch: 8 | Classification Loss: 0.516827 | Representation DL Loss: 0.252431 | Accuracy: 99.8862% ==
== Epoch: 8 | Validation CLS Loss: 4.209375 | Validation Representation DL Loss: 0.231310 | F1 Score: 0.987909 == | Current accuracy: 0.990783 ==
[[109 0 0 0 0 0 0 0 0 0]
[ 0 74 0 0 0 0 0 0 0 0]
[ 0 0 259 0 0 0 0 0 0 0]
[ 0 0 1 70 0 0 0 0 0 0]
[ 0 0 0 0 28 0 0 0 0 0]
[ 0 0 0 0 0 22 0 0 3 0]
[ 0 0 0 0 0 0 32 0 0 0]
[ 0 0 0 0 0 0 0 27 0 0]
[ 0 0 0 3 0 0 0 0 176 0]
[ 0 0 0 0 0 0 0 0 1 63]]
Patience: 2 / 10
== Epoch: 9 | Classification Loss: 0.514036 | Representation DL Loss: 0.241894 | Accuracy: 100.0000% ==
== Epoch: 9 | Validation CLS Loss: 3.690444 | Validation Representation DL Loss: 0.191970 | F1 Score: 0.983331 == | Current accuracy: 0.990783 ==
[[ 97 0 0 0 0 0 0 0 0 0]
[ 0 87 0 0 2 0 0 0 0 0]
[ 0 0 243 0 0 0 0 0 0 0]
[ 0 0 2 91 0 0 0 0 0 0]
[ 0 0 0 0 27 0 0 0 0 0]
[ 0 0 2 0 0 16 0 0 0 0]
[ 0 0 0 0 0 0 18 0 0 0]
[ 0 0 0 0 0 0 0 23 0 0]
[ 0 0 0 0 0 1 0 0 209 0]
[ 0 0 0 0 0 0 0 0 1 49]]
Patience: 3 / 10
== Epoch: 10 | Classification Loss: 0.512463 | Representation DL Loss: 0.219028 | Accuracy: 99.9715% ==
== Epoch: 10 | Validation CLS Loss: 3.722399 | Validation Representation DL Loss: 0.214890 | F1 Score: 0.989406 == | Current accuracy: 0.991935 ==
[[100 0 0 0 0 0 0 0 0 0]
[ 0 72 0 0 1 0 0 0 0 0]
[ 0 0 226 0 0 0 0 0 0 0]
[ 0 0 2 77 0 0 0 0 0 0]
[ 0 0 0 0 23 0 0 0 0 0]
[ 0 0 0 0 0 29 0 0 3 0]
[ 0 0 0 0 0 0 24 0 0 0]
[ 0 0 0 0 0 0 0 30 0 0]
[ 0 0 1 0 0 0 0 0 216 0]
[ 0 0 0 0 0 0 0 0 0 64]]
Patience: 4 / 10
== Epoch: 11 | Classification Loss: 0.510772 | Representation DL Loss: 0.208613 | Accuracy: 100.0000% ==
== Epoch: 11 | Validation CLS Loss: 3.784714 | Validation Representation DL Loss: 0.201576 | F1 Score: 0.983138 == | Current accuracy: 0.991935 ==
[[101 0 0 0 0 0 0 0 0 0]
[ 0 78 0 0 1 0 0 0 0 0]
[ 0 0 247 0 0 0 0 0 0 0]
[ 0 0 1 79 0 0 0 0 0 0]
[ 0 0 0 0 26 0 0 0 0 0]
[ 0 0 0 0 0 18 0 0 5 0]
[ 0 0 0 0 0 0 21 0 0 0]
[ 0 0 0 0 0 0 0 26 0 0]
[ 0 0 0 0 0 0 0 0 188 0]
[ 0 0 0 0 0 0 0 0 0 77]]
Patience: 5 / 10
== Epoch: 12 | Classification Loss: 0.509933 | Representation DL Loss: 0.193282 | Accuracy: 100.0000% ==
== Epoch: 12 | Validation CLS Loss: 3.688917 | Validation Representation DL Loss: 0.188578 | F1 Score: 0.988748 == | Current accuracy: 0.993088 ==
[[112 0 0 0 0 0 0 0 0 0]
[ 0 70 0 0 1 0 0 0 0 0]
[ 0 0 222 0 0 0 0 0 0 0]
[ 0 0 2 70 0 0 0 0 0 0]
[ 0 0 0 0 17 0 0 0 0 0]
[ 0 0 0 0 0 28 0 0 3 0]
[ 0 0 0 0 0 0 31 0 0 0]
[ 0 0 0 0 0 0 0 37 0 0]
[ 0 0 0 0 0 0 0 0 200 0]
[ 0 0 0 0 0 0 0 0 0 75]]
Patience: 0 / 10
== Epoch: 13 | Classification Loss: 0.509605 | Representation DL Loss: 0.187941 | Accuracy: 100.0000% ==
== Epoch: 13 | Validation CLS Loss: 3.599265 | Validation Representation DL Loss: 0.181978 | F1 Score: 0.978821 == | Current accuracy: 0.988479 ==
[[ 82 0 0 0 0 0 0 0 0 0]
[ 0 71 0 0 1 0 0 0 0 0]
[ 0 0 272 0 0 0 0 0 0 0]
[ 0 0 1 68 0 0 0 0 0 0]
[ 0 0 0 0 19 0 0 0 0 0]
[ 0 0 1 0 0 17 0 0 4 0]
[ 0 0 0 0 0 0 22 0 0 0]
[ 0 0 0 0 0 0 0 25 0 0]
[ 0 0 0 1 0 0 0 0 226 0]
[ 0 0 0 0 0 0 0 0 2 56]]
Patience: 1 / 10
== Epoch: 14 | Classification Loss: 0.509407 | Representation DL Loss: 0.181221 | Accuracy: 100.0000% ==
== Epoch: 14 | Validation CLS Loss: 3.799653 | Validation Representation DL Loss: 0.157986 | F1 Score: 0.976329 == | Current accuracy: 0.983871 ==
[[105 0 0 0 0 0 0 0 0 0]
[ 0 79 0 0 2 0 0 0 0 0]
[ 0 0 266 0 0 0 0 0 0 0]
[ 0 0 2 74 0 0 0 0 0 0]
[ 0 0 0 0 24 0 0 0 0 0]
[ 0 0 1 0 0 30 0 0 7 0]
[ 0 0 0 0 0 0 24 0 0 0]
[ 0 0 0 0 0 0 0 26 0 0]
[ 0 0 0 0 0 1 0 0 176 0]
[ 0 0 0 0 0 0 0 0 1 50]]
Patience: 2 / 10
== Epoch: 15 | Classification Loss: 0.509605 | Representation DL Loss: 0.179538 | Accuracy: 100.0000% ==
== Epoch: 15 | Validation CLS Loss: 3.713005 | Validation Representation DL Loss: 0.158739 | F1 Score: 0.984616 == | Current accuracy: 0.990783 ==
[[100 0 0 0 0 0 0 0 0 0]
[ 0 91 0 0 0 0 0 0 0 0]
[ 0 0 245 0 0 0 0 0 0 0]
[ 0 0 0 82 0 0 0 0 0 0]
[ 0 0 0 0 25 0 0 0 0 0]
[ 0 0 1 0 0 24 0 0 3 0]
[ 0 0 0 0 0 0 36 0 0 0]
[ 0 0 0 0 0 0 0 22 0 0]
[ 0 0 0 1 0 3 0 0 186 0]
[ 0 0 0 0 0 0 0 0 0 49]]
Patience: 3 / 10
== Epoch: 16 | Classification Loss: 0.510118 | Representation DL Loss: 0.202223 | Accuracy: 100.0000% ==
== Epoch: 16 | Validation CLS Loss: 3.923636 | Validation Representation DL Loss: 0.204042 | F1 Score: 0.971375 == | Current accuracy: 0.983871 ==
[[ 99 0 0 0 0 0 0 0 0 0]
[ 0 58 0 0 2 0 0 0 0 0]
[ 0 0 266 0 0 0 0 0 0 0]
[ 0 0 1 65 0 0 0 0 0 0]
[ 0 0 0 0 28 0 0 0 0 0]
[ 0 0 4 0 0 17 0 0 1 0]
[ 0 0 0 0 0 0 26 0 0 0]
[ 0 0 0 0 0 0 0 33 0 0]
[ 0 0 0 2 0 2 0 0 194 0]
[ 0 0 0 0 0 0 0 0 2 68]]
Patience: 4 / 10
== Epoch: 17 | Classification Loss: 0.509321 | Representation DL Loss: 0.192837 | Accuracy: 100.0000% ==
== Epoch: 17 | Validation CLS Loss: 3.836862 | Validation Representation DL Loss: 0.181496 | F1 Score: 0.996914 == | Current accuracy: 0.997696 ==
[[103 0 0 0 0 0 0 0 0 0]
[ 0 86 0 0 0 0 0 0 0 0]
[ 0 0 265 0 0 0 0 0 0 0]
[ 0 0 0 68 0 0 0 0 0 0]
[ 0 0 0 0 18 0 0 0 0 0]
[ 0 0 0 0 0 27 0 0 1 0]
[ 0 0 0 0 0 0 24 0 0 0]
[ 0 0 0 0 0 0 0 18 0 0]
[ 0 0 0 1 0 0 0 0 185 0]
[ 0 0 0 0 0 0 0 0 0 72]]
Patience: 0 / 10
== Epoch: 18 | Classification Loss: 0.508944 | Representation DL Loss: 0.184233 | Accuracy: 100.0000% ==
== Epoch: 18 | Validation CLS Loss: 3.757966 | Validation Representation DL Loss: 0.172289 | F1 Score: 0.985945 == | Current accuracy: 0.991935 ==
[[ 88 0 0 0 0 0 0 0 0 0]
[ 0 94 0 0 1 0 0 0 0 0]
[ 0 0 261 0 0 0 0 0 0 0]
[ 0 0 1 75 0 0 0 0 0 0]
[ 0 0 0 0 26 0 0 0 0 0]
[ 0 0 1 0 0 21 0 0 3 0]
[ 0 0 0 0 0 0 32 0 0 0]
[ 0 0 0 0 0 0 0 21 0 0]
[ 0 0 0 0 0 0 0 0 182 0]
[ 0 0 0 0 0 0 0 0 1 61]]
Patience: 1 / 10
== Epoch: 19 | Classification Loss: 0.508631 | Representation DL Loss: 0.180292 | Accuracy: 100.0000% ==
== Epoch: 19 | Validation CLS Loss: 3.911557 | Validation Representation DL Loss: 0.169076 | F1 Score: 0.970801 == | Current accuracy: 0.986175 ==
[[109 0 0 0 0 0 0 0 0 0]
[ 0 71 0 0 3 0 0 0 0 0]
[ 0 0 247 0 0 0 0 0 0 0]
[ 0 0 0 78 0 0 0 0 0 0]
[ 0 0 0 0 32 0 0 0 0 0]
[ 0 0 1 0 0 15 0 0 4 0]
[ 0 0 0 0 0 0 20 0 0 0]
[ 0 0 0 0 0 0 0 30 0 0]
[ 0 0 0 1 0 2 0 0 202 0]
[ 0 0 0 0 0 0 0 0 1 52]]
Patience: 2 / 10
== Epoch: 20 | Classification Loss: 0.508338 | Representation DL Loss: 0.175783 | Accuracy: 100.0000% ==
== Epoch: 20 | Validation CLS Loss: 3.691231 | Validation Representation DL Loss: 0.149613 | F1 Score: 0.983890 == | Current accuracy: 0.990783 ==
[[108 0 0 0 0 0 0 0 0 0]
[ 0 64 0 0 3 0 0 0 0 0]
[ 0 0 252 0 0 0 0 0 0 0]
[ 0 0 0 82 0 0 0 0 0 0]
[ 0 0 0 0 33 0 0 0 0 0]
[ 0 0 0 0 0 26 0 0 4 0]
[ 0 0 0 0 0 0 36 0 0 0]
[ 0 0 0 0 0 0 0 21 0 0]
[ 0 0 0 0 0 0 0 0 190 0]
[ 0 0 0 0 0 0 0 0 1 48]]
Patience: 3 / 10
== Epoch: 21 | Classification Loss: 0.507941 | Representation DL Loss: 0.168473 | Accuracy: 100.0000% ==
== Epoch: 21 | Validation CLS Loss: 3.721588 | Validation Representation DL Loss: 0.146946 | F1 Score: 0.981061 == | Current accuracy: 0.988479 ==
[[100 0 0 0 0 0 0 0 0 0]
[ 0 91 0 0 2 0 0 0 0 0]
[ 0 0 252 0 0 0 0 0 0 0]
[ 0 0 1 81 0 0 0 0 0 0]
[ 0 0 0 0 21 0 0 0 0 0]
[ 0 0 0 0 0 24 0 0 2 0]
[ 0 0 0 0 0 0 22 0 0 0]
[ 0 0 0 0 0 0 0 22 0 0]
[ 0 0 0 2 0 3 0 0 185 0]
[ 0 0 0 0 0 0 0 0 0 60]]
Patience: 4 / 10
== Epoch: 22 | Classification Loss: 0.507223 | Representation DL Loss: 0.155871 | Accuracy: 100.0000% ==
== Epoch: 22 | Validation CLS Loss: 3.725919 | Validation Representation DL Loss: 0.140846 | F1 Score: 0.981810 == | Current accuracy: 0.989631 ==
[[119 0 0 0 0 0 0 0 0 0]
[ 0 74 0 0 3 0 0 0 0 0]
[ 0 0 256 0 0 0 0 0 0 0]
[ 0 0 0 59 0 0 0 0 0 0]
[ 0 0 0 0 38 0 0 0 0 0]
[ 0 0 0 0 0 20 0 0 3 0]
[ 0 0 0 0 0 0 15 0 0 0]
[ 0 0 0 0 0 0 0 23 0 0]
[ 0 0 0 0 0 1 0 0 202 0]
[ 0 0 0 0 0 0 0 0 2 53]]
Patience: 5 / 10
== Epoch: 23 | Classification Loss: 0.506500 | Representation DL Loss: 0.144887 | Accuracy: 100.0000% ==
== Epoch: 23 | Validation CLS Loss: 3.806984 | Validation Representation DL Loss: 0.135816 | F1 Score: 0.981420 == | Current accuracy: 0.986175 ==
[[109 0 0 0 0 0 0 0 0 0]
[ 0 62 0 0 1 0 0 0 0 0]
[ 0 0 220 0 0 0 0 0 0 0]
[ 0 0 2 94 0 0 0 0 0 0]
[ 0 0 0 0 30 0 0 0 0 0]
[ 0 0 2 0 0 25 0 0 2 0]
[ 0 0 0 0 0 0 25 0 0 0]
[ 0 0 0 0 0 0 0 21 0 0]
[ 0 0 0 2 0 2 0 0 205 0]
[ 0 0 0 0 0 0 0 0 1 65]]
Patience: 6 / 10
== Epoch: 24 | Classification Loss: 0.506209 | Representation DL Loss: 0.136717 | Accuracy: 100.0000% ==
== Epoch: 24 | Validation CLS Loss: 3.772007 | Validation Representation DL Loss: 0.135068 | F1 Score: 0.980062 == | Current accuracy: 0.988479 ==
[[119 0 0 0 0 0 0 0 0 0]
[ 0 63 0 0 1 0 0 0 0 0]
[ 0 0 236 0 0 0 0 0 0 0]
[ 0 0 2 79 0 0 0 0 0 0]
[ 0 0 0 0 23 0 0 0 0 0]
[ 0 0 2 0 0 15 0 0 1 0]
[ 0 0 0 0 0 0 27 0 0 0]
[ 0 0 0 0 0 0 0 28 0 0]
[ 0 0 0 2 0 1 0 0 196 0]
[ 0 0 0 0 0 0 0 0 1 72]]
Patience: 7 / 10
== Epoch: 25 | Classification Loss: 0.506000 | Representation DL Loss: 0.136285 | Accuracy: 100.0000% ==
== Epoch: 25 | Validation CLS Loss: 3.553699 | Validation Representation DL Loss: 0.117606 | F1 Score: 0.981745 == | Current accuracy: 0.986175 ==
[[ 91 0 0 0 0 0 0 0 0 0]
[ 0 66 0 0 0 0 0 0 0 0]
[ 0 0 261 0 0 0 0 0 0 0]
[ 0 0 2 84 0 0 0 0 0 0]
[ 0 0 0 0 29 0 0 0 0 0]
[ 0 0 3 0 0 26 0 0 4 0]
[ 0 0 0 0 0 0 24 0 0 0]
[ 0 0 0 0 0 0 0 20 0 0]
[ 0 0 0 2 0 1 0 0 209 0]
[ 0 0 0 0 0 0 0 0 0 46]]
Patience: 8 / 10
== Epoch: 26 | Classification Loss: 0.505666 | Representation DL Loss: 0.139120 | Accuracy: 100.0000% ==
== Epoch: 26 | Validation CLS Loss: 3.644901 | Validation Representation DL Loss: 0.135973 | F1 Score: 0.984632 == | Current accuracy: 0.990783 ==
[[ 94 0 0 0 0 0 0 0 0 0]
[ 0 77 0 0 2 0 0 0 0 0]
[ 0 0 258 0 0 0 0 0 0 0]
[ 0 0 0 84 0 0 0 0 0 0]
[ 0 0 0 0 26 0 0 0 0 0]
[ 0 0 1 0 0 25 0 0 2 0]
[ 0 0 0 0 0 0 16 0 0 0]
[ 0 0 0 0 0 0 0 20 0 0]
[ 0 0 0 0 0 1 0 0 195 0]
[ 0 0 0 0 0 0 0 0 2 65]]
Patience: 9 / 10
== Epoch: 27 | Classification Loss: 0.505538 | Representation DL Loss: 0.139681 | Accuracy: 100.0000% ==
== Epoch: 27 | Validation CLS Loss: 3.967983 | Validation Representation DL Loss: 0.126298 | F1 Score: 0.984596 == | Current accuracy: 0.990783 ==
[[118 0 0 0 0 0 0 0 0 0]
[ 0 74 0 0 2 0 0 0 0 0]
[ 0 0 238 0 0 0 0 0 0 0]
[ 0 0 1 79 0 0 0 0 0 0]
[ 0 0 0 0 22 0 0 0 0 0]
[ 0 0 1 0 0 22 0 0 2 0]
[ 0 0 0 0 0 0 24 0 0 0]
[ 0 0 0 0 0 0 0 31 0 0]
[ 0 0 0 2 0 0 0 0 187 0]
[ 0 0 0 0 0 0 0 0 0 65]]
Patience: 10 / 10
== Epoch: 28 | Classification Loss: 0.505209 | Representation DL Loss: 0.133734 | Accuracy: 100.0000% ==
== Epoch: 28 | Validation CLS Loss: 3.805728 | Validation Representation DL Loss: 0.117855 | F1 Score: 0.985936 == | Current accuracy: 0.991935 ==
[[116 0 0 0 0 0 0 0 0 0]
[ 0 85 0 0 3 0 0 0 0 0]
[ 0 0 240 0 0 0 0 0 0 0]
[ 0 0 1 78 0 0 0 0 0 0]
[ 0 0 0 0 24 0 0 0 0 0]
[ 0 0 0 0 0 25 0 0 1 0]
[ 0 0 0 0 0 0 33 0 0 0]
[ 0 0 0 0 0 0 0 23 0 0]
[ 0 0 0 0 0 1 0 0 185 0]
[ 0 0 0 0 0 0 0 0 1 52]]
Patience: 11 / 10
/data/wanh/CANAL/model.py:105: ImplicitModificationWarning: Trying to modify attribute `.obs` of view, initializing view as actual.
adata_i.obs['celltype'] = current_label_dict[i]
/data/wanh/ENTER/envs/pytorch/lib/python3.7/site-packages/anndata/_core/anndata.py:1785: FutureWarning: X.dtype being converted to np.float32 from float64. In the next version of anndata (0.9) conversion will not be automatic. Pass dtype explicitly to avoid this warning. Pass `AnnData(X, dtype=X.dtype, ...)` to get the future behavour.
[AnnData(sparse.csr_matrix(a.shape), obs=a.obs) for a in all_adatas],
example bank after updating:
AnnData object with n_obs × n_vars = 996 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch', 'celltype', 'rank', 'stage'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
cell type composition of this example bank:
(array(['B cell', 'T cell', 'basal cell', 'endothelial cell',
'luminal epithelial cell of mammary gland', 'macrophage',
'mesenchymal stem cell', 'natural killer cell',
'skeletal muscle satellite cell', 'stromal cell'], dtype=object), array([ 99, 99, 100, 99, 100, 99, 100, 100, 100, 100]))
dataset composition from each stage of this example bank:
(array([1, 2, 3]), array([282, 382, 332]))
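The accuracy and F1 score printed for each validation epoch can be recomputed from the confusion matrix that follows them. As a minimal sketch, the snippet below takes the stage 3, epoch 17 matrix from the log above and computes overall accuracy as trace / total, plus an unweighted (macro) mean of the per-class F1 scores. That the logged F1 is a macro average is inferred from the numbers agreeing, not confirmed from model.py, and the variable names are illustrative.

```python
import numpy as np

# Confusion matrix copied from the stage 3 log, epoch 17.
cm = np.array([
    [103,  0,   0,  0,  0,  0,  0,  0,   0,  0],
    [  0, 86,   0,  0,  0,  0,  0,  0,   0,  0],
    [  0,  0, 265,  0,  0,  0,  0,  0,   0,  0],
    [  0,  0,   0, 68,  0,  0,  0,  0,   0,  0],
    [  0,  0,   0,  0, 18,  0,  0,  0,   0,  0],
    [  0,  0,   0,  0,  0, 27,  0,  0,   1,  0],
    [  0,  0,   0,  0,  0,  0, 24,  0,   0,  0],
    [  0,  0,   0,  0,  0,  0,  0, 18,   0,  0],
    [  0,  0,   0,  1,  0,  0,  0,  0, 185,  0],
    [  0,  0,   0,  0,  0,  0,  0,  0,   0, 72],
])

# Correctly classified cells sit on the diagonal.
tp = np.diag(cm)
accuracy = tp.sum() / cm.sum()

# Per-class F1 = 2*TP / (2*TP + FP + FN). FP + FN is the off-diagonal mass
# in the class's row and column, so the result does not depend on whether
# rows hold true or predicted labels.
fp_plus_fn = cm.sum(axis=0) + cm.sum(axis=1) - 2 * tp
f1_per_class = 2 * tp / (2 * tp + fp_plus_fn)
macro_f1 = f1_per_class.mean()

print(f"accuracy = {accuracy:.6f}")
print(f"macro F1 = {macro_f1:.6f}")
```

Both values agree with the "Current accuracy" and "F1 Score" fields logged for that epoch.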
stage 4
load the pre-processed dataset "Spleen"
[7]:
dataset_stage_4 = "TabulaMuris_Spleen_10X"
data_path_stage_4 = './data/{}/{}_train.h5ad'.format(experiments, dataset_stage_4)
adata_stage_4 = sc.read_h5ad(data_path_stage_4)
cell_type_stage_4 = adata_stage_4.obs['cell_type1']
print(adata_stage_4)
print(np.unique(np.array(cell_type_stage_4), return_counts=True))
AnnData object with n_obs × n_vars = 9010 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
uns: 'hvg', 'log1p'
(array(['B cell', 'T cell', 'macrophage', 'natural killer cell'],
dtype=object), array([6541, 1816, 431, 222]))
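All four Spleen cell types printed above already appear in the example bank built during stages 1 to 3, so this stage is not expected to add new classes. A minimal sketch of that check with a plain set difference follows; the variable names are illustrative and this is not CANAL's own detection code.

```python
# Cell types present in the example bank after stage 3 (from the log above).
known_cell_types = [
    'B cell', 'T cell', 'basal cell', 'endothelial cell',
    'luminal epithelial cell of mammary gland', 'macrophage',
    'mesenchymal stem cell', 'natural killer cell',
    'skeletal muscle satellite cell', 'stromal cell',
]

# Cell types annotated in the Spleen training set.
spleen_cell_types = ['B cell', 'T cell', 'macrophage', 'natural killer cell']

# Anything in the new dataset that the model has not seen before.
new_cell_types = sorted(set(spleen_cell_types) - set(known_cell_types))
print("new cell types:", new_cell_types)
```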
fine-tune the CANAL model for the fourth stage
[8]:
CANAL.train(experiments = experiments, pre_dataset = dataset_stage_3, dataset = dataset_stage_4,
adata = adata_stage_4, cell_type = cell_type_stage_4, current_stage = 4, is_final_stage = True,
ckpt_dir='./ckpts/', rehearsal_size=1000, SEED = seed)
current data: AnnData object with n_obs × n_vars = 9010 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
uns: 'hvg', 'log1p'
(array(['B cell', 'T cell', 'macrophage', 'natural killer cell'],
dtype=object), array([6541, 1816, 431, 222]))
model constructing begin!
new cell types: []
example bank for experience replay: AnnData object with n_obs × n_vars = 996 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch', 'celltype', 'rank', 'stage'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
/data/wanh/CANAL/performer_pytorch/performer_pytorch.py:115: UserWarning: torch.qr is deprecated in favor of torch.linalg.qr and will be removed in a future PyTorch release.
The boolean parameter 'some' has been replaced with a string parameter 'mode'.
Q, R = torch.qr(A, some)
should be replaced with
Q, R = torch.linalg.qr(A, 'reduced' if some else 'complete') (Triggered internally at ../aten/src/ATen/native/BatchLinearAlgebra.cpp:2497.)
q, r = torch.qr(unstructured_block.cpu(), some = True)
/data/wanh/ENTER/envs/pytorch/lib/python3.7/site-packages/torch/nn/_reduction.py:42: UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
warnings.warn(warning.format(ret))
model constructing finished!
label train: [0 1 2 3 4 5 6 7 8 9] 10
label val: [0 1 2 3 4 5 6 7 8 9] 10
== Begin finetuning: | Final stage | Current stage: 4 | CANAL | Dataset: Cross_tissue TabulaMuris_Spleen_10X ==
== Epoch: 1 | Classification Loss: 0.566440 | Representation DL Loss: 0.005053 | Accuracy: 97.8359% ==
== Epoch: 1 | Validation CLS Loss: 2.590173 | Validation Representation DL Loss: 0.018307 | F1 Score: 0.969873 == | Current accuracy: 0.979520 ==
[[1292 16 0 1 0 0 0 7 0 0]
[ 6 364 0 1 0 0 0 0 0 0]
[ 0 0 23 0 0 0 0 0 0 0]
[ 4 0 0 108 0 3 0 0 0 0]
[ 0 3 0 0 64 0 0 0 0 0]
[ 0 0 0 0 0 24 0 0 0 0]
[ 0 0 0 0 0 0 33 0 0 0]
[ 0 0 0 0 0 0 0 23 0 0]
[ 0 0 0 0 0 0 0 0 18 0]
[ 0 0 0 0 0 0 0 0 0 12]]
Patience: 0 / 10
== Epoch: 2 | Classification Loss: 0.554336 | Representation DL Loss: 0.032166 | Accuracy: 97.9610% ==
== Epoch: 2 | Validation CLS Loss: 2.036643 | Validation Representation DL Loss: 0.039645 | F1 Score: 0.989236 == | Current accuracy: 0.985514 ==
[[1309 6 0 4 0 0 0 0 0 0]
[ 7 372 0 2 0 0 0 0 0 0]
[ 0 0 19 0 0 0 0 0 0 0]
[ 10 0 0 95 0 0 0 0 0 0]
[ 0 0 0 0 54 0 0 0 0 0]
[ 0 0 0 0 0 24 0 0 0 0]
[ 0 0 0 0 0 0 23 0 0 0]
[ 0 0 0 0 0 0 0 24 0 0]
[ 0 0 0 0 0 0 0 0 31 0]
[ 0 0 0 0 0 0 0 0 0 22]]
Patience: 0 / 10
== Epoch: 3 | Classification Loss: 0.541880 | Representation DL Loss: 0.025442 | Accuracy: 98.4989% ==
== Epoch: 3 | Validation CLS Loss: 2.205218 | Validation Representation DL Loss: 0.024612 | F1 Score: 0.991630 == | Current accuracy: 0.985514 ==
[[1306 11 0 1 0 0 0 0 0 0]
[ 9 383 0 1 1 0 0 0 0 0]
[ 0 0 24 0 0 0 0 0 0 0]
[ 5 1 0 105 0 0 0 0 0 0]
[ 0 0 0 0 62 0 0 0 0 0]
[ 0 0 0 0 0 17 0 0 0 0]
[ 0 0 0 0 0 0 17 0 0 0]
[ 0 0 0 0 0 0 0 13 0 0]
[ 0 0 0 0 0 0 0 0 20 0]
[ 0 0 0 0 0 0 0 0 0 26]]
Patience: 1 / 10
== Epoch: 4 | Classification Loss: 0.531079 | Representation DL Loss: 0.028071 | Accuracy: 99.0368% ==
== Epoch: 4 | Validation CLS Loss: 2.666437 | Validation Representation DL Loss: 0.032383 | F1 Score: 0.989042 == | Current accuracy: 0.986014 ==
[[1315 11 0 1 0 0 0 0 0 0]
[ 5 386 0 1 5 0 0 0 0 0]
[ 0 0 18 0 0 0 0 0 0 0]
[ 5 0 0 105 0 0 0 0 0 0]
[ 0 0 0 0 58 0 0 0 0 0]
[ 0 0 0 0 0 20 0 0 0 0]
[ 0 0 0 0 0 0 21 0 0 0]
[ 0 0 0 0 0 0 0 18 0 0]
[ 0 0 0 0 0 0 0 0 16 0]
[ 0 0 0 0 0 0 0 0 0 17]]
Patience: 0 / 10
== Epoch: 5 | Classification Loss: 0.526279 | Representation DL Loss: 0.033105 | Accuracy: 99.1243% ==
== Epoch: 5 | Validation CLS Loss: 2.989892 | Validation Representation DL Loss: 0.030650 | F1 Score: 0.989795 == | Current accuracy: 0.987512 ==
[[1291 14 0 3 0 0 0 0 0 0]
[ 0 410 0 2 4 0 0 0 0 0]
[ 0 0 18 0 0 0 0 0 0 0]
[ 2 0 0 87 0 0 0 0 0 0]
[ 0 0 0 0 60 0 0 0 0 0]
[ 0 0 0 0 0 25 0 0 0 0]
[ 0 0 0 0 0 0 18 0 0 0]
[ 0 0 0 0 0 0 0 19 0 0]
[ 0 0 0 0 0 0 0 0 25 0]
[ 0 0 0 0 0 0 0 0 0 24]]
Patience: 0 / 10
== Epoch: 6 | Classification Loss: 0.520931 | Representation DL Loss: 0.033596 | Accuracy: 99.4996% ==
== Epoch: 6 | Validation CLS Loss: 2.708388 | Validation Representation DL Loss: 0.031730 | F1 Score: 0.989429 == | Current accuracy: 0.984515 ==
[[1323 15 0 1 0 0 0 0 0 0]
[ 6 360 0 2 3 0 0 0 0 0]
[ 0 0 25 0 0 0 0 0 0 0]
[ 2 2 0 105 0 0 0 0 0 0]
[ 0 0 0 0 54 0 0 0 0 0]
[ 0 0 0 0 0 24 0 0 0 0]
[ 0 0 0 0 0 0 28 0 0 0]
[ 0 0 0 0 0 0 0 17 0 0]
[ 0 0 0 0 0 0 0 0 18 0]
[ 0 0 0 0 0 0 0 0 0 17]]
Patience: 1 / 10
== Epoch: 7 | Classification Loss: 0.516694 | Representation DL Loss: 0.037306 | Accuracy: 99.6122% ==
== Epoch: 7 | Validation CLS Loss: 2.837335 | Validation Representation DL Loss: 0.034606 | F1 Score: 0.990969 == | Current accuracy: 0.983516 ==
[[1325 16 0 0 0 0 0 0 0 0]
[ 10 369 0 1 1 0 0 0 0 0]
[ 0 0 23 0 0 0 0 0 0 0]
[ 3 1 0 104 0 0 0 0 0 0]
[ 0 1 0 0 58 0 0 0 0 0]
[ 0 0 0 0 0 16 0 0 0 0]
[ 0 0 0 0 0 0 18 0 0 0]
[ 0 0 0 0 0 0 0 18 0 0]
[ 0 0 0 0 0 0 0 0 21 0]
[ 0 0 0 0 0 0 0 0 0 17]]
Patience: 2 / 10
== Epoch: 8 | Classification Loss: 0.512596 | Representation DL Loss: 0.036774 | Accuracy: 99.7123% ==
== Epoch: 8 | Validation CLS Loss: 2.370672 | Validation Representation DL Loss: 0.034355 | F1 Score: 0.987327 == | Current accuracy: 0.985015 ==
[[1311 8 0 3 0 0 0 0 0 0]
[ 3 343 0 0 5 0 0 0 0 0]
[ 0 0 22 0 0 0 0 0 0 0]
[ 7 4 0 108 0 0 0 0 0 0]
[ 0 0 0 0 82 0 0 0 0 0]
[ 0 0 0 0 0 20 0 0 0 0]
[ 0 0 0 0 0 0 20 0 0 0]
[ 0 0 0 0 0 0 0 21 0 0]
[ 0 0 0 0 0 0 0 0 24 0]
[ 0 0 0 0 0 0 0 0 0 21]]
Patience: 3 / 10
== Epoch: 9 | Classification Loss: 0.513157 | Representation DL Loss: 0.037117 | Accuracy: 99.6748% ==
== Epoch: 9 | Validation CLS Loss: 2.588495 | Validation Representation DL Loss: 0.033176 | F1 Score: 0.988307 == | Current accuracy: 0.982018 ==
[[1301 19 0 2 0 0 0 0 0 0]
[ 2 371 0 1 2 0 0 0 0 0]
[ 0 0 18 0 0 0 0 0 0 0]
[ 9 1 0 105 0 0 0 0 0 0]
[ 0 0 0 0 71 0 0 0 0 0]
[ 0 0 0 0 0 22 0 0 0 0]
[ 0 0 0 0 0 0 22 0 0 0]
[ 0 0 0 0 0 0 0 16 0 0]
[ 0 0 0 0 0 0 0 0 16 0]
[ 0 0 0 0 0 0 0 0 0 24]]
Patience: 4 / 10
== Epoch: 10 | Classification Loss: 0.510636 | Representation DL Loss: 0.036718 | Accuracy: 99.7623% ==
== Epoch: 10 | Validation CLS Loss: 2.359062 | Validation Representation DL Loss: 0.035854 | F1 Score: 0.992940 == | Current accuracy: 0.988511 ==
[[1328 12 0 2 0 0 0 0 0 0]
[ 4 357 0 2 2 0 0 0 0 0]
[ 0 0 13 0 0 0 0 0 0 0]
[ 1 0 0 116 0 0 0 0 0 0]
[ 0 0 0 0 65 0 0 0 0 0]
[ 0 0 0 0 0 15 0 0 0 0]
[ 0 0 0 0 0 0 25 0 0 0]
[ 0 0 0 0 0 0 0 22 0 0]
[ 0 0 0 0 0 0 0 0 24 0]
[ 0 0 0 0 0 0 0 0 0 14]]
Patience: 0 / 10
== Epoch: 11 | Classification Loss: 0.508377 | Representation DL Loss: 0.036215 | Accuracy: 99.8374% ==
== Epoch: 11 | Validation CLS Loss: 2.127766 | Validation Representation DL Loss: 0.039598 | F1 Score: 0.993257 == | Current accuracy: 0.991009 ==
[[1331 8 0 1 0 0 0 0 0 0]
[ 3 360 0 0 3 0 0 0 0 0]
[ 0 0 25 0 0 0 0 0 0 0]
[ 3 0 0 112 0 0 0 0 0 0]
[ 0 0 0 0 58 0 0 0 0 0]
[ 0 0 0 0 0 18 0 0 0 0]
[ 0 0 0 0 0 0 21 0 0 0]
[ 0 0 0 0 0 0 0 27 0 0]
[ 0 0 0 0 0 0 0 0 16 0]
[ 0 0 0 0 0 0 0 0 0 16]]
Patience: 0 / 10
== Epoch: 12 | Classification Loss: 0.508645 | Representation DL Loss: 0.041663 | Accuracy: 99.8374% ==
== Epoch: 12 | Validation CLS Loss: 3.278699 | Validation Representation DL Loss: 0.034119 | F1 Score: 0.990750 == | Current accuracy: 0.987013 ==
[[1339 14 0 0 0 0 0 0 0 0]
[ 3 354 0 1 3 0 0 0 0 0]
[ 0 0 14 0 0 0 0 0 0 0]
[ 2 3 0 108 0 0 0 0 0 0]
[ 0 0 0 0 57 0 0 0 0 0]
[ 0 0 0 0 0 19 0 0 0 0]
[ 0 0 0 0 0 0 18 0 0 0]
[ 0 0 0 0 0 0 0 29 0 0]
[ 0 0 0 0 0 0 0 0 25 0]
[ 0 0 0 0 0 0 0 0 0 13]]
Patience: 1 / 10
== Epoch: 13 | Classification Loss: 0.508340 | Representation DL Loss: 0.035979 | Accuracy: 99.8624% ==
== Epoch: 13 | Validation CLS Loss: 1.934230 | Validation Representation DL Loss: 0.032762 | F1 Score: 0.989765 == | Current accuracy: 0.986513 ==
[[1334 9 0 1 0 0 0 0 0 0]
[ 7 355 0 3 4 0 0 0 0 0]
[ 0 0 29 0 0 0 0 0 0 0]
[ 2 1 0 93 0 0 0 0 0 0]
[ 0 0 0 0 74 0 0 0 0 0]
[ 0 0 0 0 0 18 0 0 0 0]
[ 0 0 0 0 0 0 15 0 0 0]
[ 0 0 0 0 0 0 0 16 0 0]
[ 0 0 0 0 0 0 0 0 24 0]
[ 0 0 0 0 0 0 0 0 0 17]]
Patience: 2 / 10
== Epoch: 14 | Classification Loss: 0.508412 | Representation DL Loss: 0.034812 | Accuracy: 99.8749% ==
== Epoch: 14 | Validation CLS Loss: 2.697604 | Validation Representation DL Loss: 0.034268 | F1 Score: 0.990647 == | Current accuracy: 0.987013 ==
[[1285 13 0 0 0 0 0 0 0 0]
[ 4 426 0 1 4 0 0 0 0 0]
[ 0 0 15 0 0 0 0 0 0 0]
[ 3 1 0 91 0 0 0 0 0 0]
[ 0 0 0 0 59 0 0 0 0 0]
[ 0 0 0 0 0 17 0 0 0 0]
[ 0 0 0 0 0 0 17 0 0 0]
[ 0 0 0 0 0 0 0 29 0 0]
[ 0 0 0 0 0 0 0 0 20 0]
[ 0 0 0 0 0 0 0 0 0 17]]
Patience: 3 / 10
== Epoch: 15 | Classification Loss: 0.506742 | Representation DL Loss: 0.034353 | Accuracy: 99.9375% ==
== Epoch: 15 | Validation CLS Loss: 2.750621 | Validation Representation DL Loss: 0.032656 | F1 Score: 0.987614 == | Current accuracy: 0.985015 ==
[[1297 10 0 3 0 0 0 0 0 0]
[ 6 369 0 2 8 0 0 0 0 0]
[ 0 0 20 0 0 0 0 0 0 0]
[ 1 0 0 101 0 0 0 0 0 0]
[ 0 0 0 0 71 0 0 0 0 0]
[ 0 0 0 0 0 18 0 0 0 0]
[ 0 0 0 0 0 0 28 0 0 0]
[ 0 0 0 0 0 0 0 20 0 0]
[ 0 0 0 0 0 0 0 0 19 0]
[ 0 0 0 0 0 0 0 0 0 29]]
Patience: 4 / 10
== Epoch: 16 | Classification Loss: 0.507812 | Representation DL Loss: 0.035951 | Accuracy: 99.8749% ==
== Epoch: 16 | Validation CLS Loss: 2.147116 | Validation Representation DL Loss: 0.033321 | F1 Score: 0.989715 == | Current accuracy: 0.986513 ==
[[1323 12 0 2 0 0 0 0 0 0]
[ 5 377 0 0 6 0 0 0 0 0]
[ 0 0 21 0 0 0 0 0 0 0]
[ 1 1 0 107 0 0 0 0 0 0]
[ 0 0 0 0 62 0 0 0 0 0]
[ 0 0 0 0 0 17 0 0 0 0]
[ 0 0 0 0 0 0 19 0 0 0]
[ 0 0 0 0 0 0 0 16 0 0]
[ 0 0 0 0 0 0 0 0 16 0]
[ 0 0 0 0 0 0 0 0 0 17]]
Patience: 5 / 10
== Epoch: 17 | Classification Loss: 0.508008 | Representation DL Loss: 0.035319 | Accuracy: 99.8749% ==
== Epoch: 17 | Validation CLS Loss: 2.506884 | Validation Representation DL Loss: 0.034275 | F1 Score: 0.989352 == | Current accuracy: 0.984515 ==
[[1320 9 0 4 0 0 0 0 0 0]
[ 8 364 0 3 1 0 0 0 0 0]
[ 0 0 17 0 0 0 0 0 0 0]
[ 4 2 0 108 0 0 0 0 0 0]
[ 0 0 0 0 51 0 0 0 0 0]
[ 0 0 0 0 0 23 0 0 0 0]
[ 0 0 0 0 0 0 15 0 0 0]
[ 0 0 0 0 0 0 0 24 0 0]
[ 0 0 0 0 0 0 0 0 24 0]
[ 0 0 0 0 0 0 0 0 0 25]]
Patience: 6 / 10
== Epoch: 18 | Classification Loss: 0.506743 | Representation DL Loss: 0.034827 | Accuracy: 99.9249% ==
== Epoch: 18 | Validation CLS Loss: 2.453036 | Validation Representation DL Loss: 0.030274 | F1 Score: 0.988141 == | Current accuracy: 0.986014 ==
[[1314 10 0 2 0 0 0 0 0 0]
[ 5 376 0 1 5 0 0 0 0 0]
[ 0 0 30 0 0 0 0 0 0 0]
[ 3 1 0 86 0 0 0 0 0 0]
[ 0 1 0 0 68 0 0 0 0 0]
[ 0 0 0 0 0 17 0 0 0 0]
[ 0 0 0 0 0 0 18 0 0 0]
[ 0 0 0 0 0 0 0 19 0 0]
[ 0 0 0 0 0 0 0 0 26 0]
[ 0 0 0 0 0 0 0 0 0 20]]
Patience: 7 / 10
== Epoch: 19 | Classification Loss: 0.506775 | Representation DL Loss: 0.033257 | Accuracy: 99.9249% ==
== Epoch: 19 | Validation CLS Loss: 2.646988 | Validation Representation DL Loss: 0.030506 | F1 Score: 0.990151 == | Current accuracy: 0.986513 ==
[[1291 12 0 4 0 0 0 0 0 0]
[ 3 395 0 0 4 0 0 0 0 0]
[ 0 0 23 0 0 0 0 0 0 0]
[ 3 1 0 106 0 0 0 0 0 0]
[ 0 0 0 0 67 0 0 0 0 0]
[ 0 0 0 0 0 19 0 0 0 0]
[ 0 0 0 0 0 0 18 0 0 0]
[ 0 0 0 0 0 0 0 23 0 0]
[ 0 0 0 0 0 0 0 0 17 0]
[ 0 0 0 0 0 0 0 0 0 16]]
Patience: 8 / 10
== Epoch: 20 | Classification Loss: 0.506588 | Representation DL Loss: 0.035759 | Accuracy: 99.9500% ==
== Epoch: 20 | Validation CLS Loss: 2.489999 | Validation Representation DL Loss: 0.031861 | F1 Score: 0.987091 == | Current accuracy: 0.979021 ==
[[1310 24 0 1 0 0 0 0 0 0]
[ 6 389 0 2 4 0 0 0 0 0]
[ 0 0 12 0 0 0 0 0 0 0]
[ 3 2 0 100 0 0 0 0 0 0]
[ 0 0 0 0 62 0 0 0 0 0]
[ 0 0 0 0 0 24 0 0 0 0]
[ 0 0 0 0 0 0 15 0 0 0]
[ 0 0 0 0 0 0 0 17 0 0]
[ 0 0 0 0 0 0 0 0 14 0]
[ 0 0 0 0 0 0 0 0 0 17]]
Patience: 9 / 10
== Epoch: 21 | Classification Loss: 0.506617 | Representation DL Loss: 0.037148 | Accuracy: 99.9124% ==
== Epoch: 21 | Validation CLS Loss: 2.641935 | Validation Representation DL Loss: 0.032302 | F1 Score: 0.991922 == | Current accuracy: 0.987512 ==
[[1322 7 0 2 0 0 0 0 0 0]
[ 8 362 0 0 2 0 0 0 0 0]
[ 0 0 11 0 0 0 0 0 0 0]
[ 5 1 0 113 0 0 0 0 0 0]
[ 0 0 0 0 70 0 0 0 0 0]
[ 0 0 0 0 0 18 0 0 0 0]
[ 0 0 0 0 0 0 25 0 0 0]
[ 0 0 0 0 0 0 0 17 0 0]
[ 0 0 0 0 0 0 0 0 16 0]
[ 0 0 0 0 0 0 0 0 0 23]]
Patience: 10 / 10
== Epoch: 22 | Classification Loss: 0.506135 | Representation DL Loss: 0.036922 | Accuracy: 99.9249% ==
== Epoch: 22 | Validation CLS Loss: 2.578008 | Validation Representation DL Loss: 0.031964 | F1 Score: 0.996073 == | Current accuracy: 0.990509 ==
[[1295 12 0 0 0 0 0 0 0 0]
[ 5 385 0 0 0 0 0 0 0 0]
[ 0 0 14 0 0 0 0 0 0 0]
[ 1 1 0 104 0 0 0 0 0 0]
[ 0 0 0 0 61 0 0 0 0 0]
[ 0 0 0 0 0 30 0 0 0 0]
[ 0 0 0 0 0 0 21 0 0 0]
[ 0 0 0 0 0 0 0 25 0 0]
[ 0 0 0 0 0 0 0 0 30 0]
[ 0 0 0 0 0 0 0 0 0 18]]
Patience: 11 / 10
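The Patience counter in the log above behaves like a standard early-stopping rule: it resets to 0 whenever validation accuracy strictly exceeds the best value so far, increases by 1 otherwise, and training ends once it passes 10. The sketch below replays the stage 4 validation accuracies through such a rule and reproduces the logged counter. The rule is inferred from the log output only; the actual logic in model.py may differ in detail, and the function name is hypothetical.

```python
def replay_patience(accuracies, max_patience=10):
    """Return the patience counter after each epoch, stopping once it exceeds max_patience."""
    best = float("-inf")
    patience = 0
    counters = []
    for acc in accuracies:
        if acc > best:       # strict improvement resets the counter
            best = acc
            patience = 0
        else:                # ties and regressions both count against patience
            patience += 1
        counters.append(patience)
        if patience > max_patience:
            break            # early stop
    return counters

# "Current accuracy" values from the stage 4 log, epochs 1 to 22.
stage4_val_acc = [
    0.979520, 0.985514, 0.985514, 0.986014, 0.987512, 0.984515,
    0.983516, 0.985015, 0.982018, 0.988511, 0.991009, 0.987013,
    0.986513, 0.987013, 0.985015, 0.986513, 0.984515, 0.986014,
    0.986513, 0.979021, 0.987512, 0.990509,
]

counters = replay_patience(stage4_val_acc)
print(counters)
```

Under this rule the best stage 4 validation accuracy is the one from epoch 11 (0.991009), and the run ends at epoch 22 when the counter reaches 11.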
load the unlabeled 10X test data and apply CANAL to predict cell types
[8]:
# Load the held-out 10X test split of each tissue and print its cell type composition.
adata_test1_10X = sc.read_h5ad("/data/wanh/DA/scBERT-master/data/TabulaMuris/TabulaMuris_Lung_10X_test.h5ad")
print(adata_test1_10X, np.unique(np.array(adata_test1_10X.obs['cell_type1']), return_counts=True))
adata_test2_10X = sc.read_h5ad("/data/wanh/DA/scBERT-master/data/TabulaMuris/TabulaMuris_Mammary_Gland_10X_test.h5ad")
print(adata_test2_10X, np.unique(np.array(adata_test2_10X.obs['cell_type1']), return_counts=True))
adata_test3_10X = sc.read_h5ad("/data/wanh/DA/scBERT-master/data/TabulaMuris/TabulaMuris_Limb_Muscle_10X_test.h5ad")
print(adata_test3_10X, np.unique(np.array(adata_test3_10X.obs['cell_type1']), return_counts=True))
adata_test4_10X = sc.read_h5ad("/data/wanh/DA/scBERT-master/data/TabulaMuris/TabulaMuris_Spleen_10X_test.h5ad")
print(adata_test4_10X, np.unique(np.array(adata_test4_10X.obs['cell_type1']), return_counts=True))
# Concatenate the four test sets into a single AnnData object.
adata_test_10X = sc.AnnData.concatenate(adata_test1_10X, adata_test2_10X, adata_test3_10X, adata_test4_10X)
print(adata_test_10X, np.unique(np.array(adata_test_10X.obs['cell_type1']), return_counts=True))
# Predict cell types with the checkpoint saved after the final (fourth) stage.
pred_cell_type_10X = CANAL.predict(adata_predict = adata_test_10X, ckpt_dir = './ckpts/', experiments = experiments,
                                   stage_num = 4, dataset = dataset_stage_4)
AnnData object with n_obs × n_vars = 500 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
uns: 'hvg', 'log1p' (array(['B cell', 'T cell', 'endothelial cell', 'macrophage',
'natural killer cell', 'stromal cell'], dtype=object), array([ 17, 35, 50, 31, 103, 264]))
AnnData object with n_obs × n_vars = 500 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
uns: 'hvg', 'log1p' (array(['B cell', 'T cell', 'basal cell', 'endothelial cell',
'luminal epithelial cell of mammary gland', 'macrophage',
'stromal cell'], dtype=object), array([ 78, 203, 51, 28, 48, 22, 70]))
AnnData object with n_obs × n_vars = 500 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
uns: 'hvg', 'log1p' (array(['B cell', 'T cell', 'endothelial cell', 'macrophage',
'mesenchymal stem cell', 'skeletal muscle satellite cell'],
dtype=object), array([ 63, 51, 173, 34, 138, 41]))
AnnData object with n_obs × n_vars = 500 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
uns: 'hvg', 'log1p' (array(['B cell', 'T cell', 'macrophage', 'natural killer cell'],
dtype=object), array([345, 114, 33, 8]))
/data/wanh/ENTER/envs/pytorch/lib/python3.7/site-packages/anndata/_core/anndata.py:1785: FutureWarning: X.dtype being converted to np.float32 from float64. In the next version of anndata (0.9) conversion will not be automatic. Pass dtype explicitly to avoid this warning. Pass `AnnData(X, dtype=X.dtype, ...)` to get the future behavour.
[AnnData(sparse.csr_matrix(a.shape), obs=a.obs) for a in all_adatas],
AnnData object with n_obs × n_vars = 2000 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm' (array(['B cell', 'T cell', 'basal cell', 'endothelial cell',
'luminal epithelial cell of mammary gland', 'macrophage',
'mesenchymal stem cell', 'natural killer cell',
'skeletal muscle satellite cell', 'stromal cell'], dtype=object), array([503, 403, 51, 251, 48, 120, 138, 111, 41, 334]))
== Begin predicting after 4 fine-tuning stages: | Experiments: Cross_tissue ==
Annotation: ['B cell' 'T cell' 'endothelial cell' 'macrophage' 'natural killer cell'
'stromal cell' 'basal cell' 'luminal epithelial cell of mammary gland'
'mesenchymal stem cell' 'skeletal muscle satellite cell']
/data/wanh/CANAL/performer_pytorch/performer_pytorch.py:115: UserWarning: torch.qr is deprecated in favor of torch.linalg.qr and will be removed in a future PyTorch release.
The boolean parameter 'some' has been replaced with a string parameter 'mode'.
Q, R = torch.qr(A, some)
should be replaced with
Q, R = torch.linalg.qr(A, 'reduced' if some else 'complete') (Triggered internally at ../aten/src/ATen/native/BatchLinearAlgebra.cpp:2497.)
q, r = torch.qr(unstructured_block.cpu(), some = True)
evaluate the annotation performance
[9]:
true_celltype_10X = np.array(adata_test_10X.obs['cell_type1'])
CANAL.evaluation(pred_cell_type=pred_cell_type_10X, true_celltype=true_celltype_10X)
== Predict total accuracy: 0.969000 ==|== F1 Score: 0.971300 ==|== ARI: 0.938800 ==
Confusion matrix:
[[501   2   0   0   0   0   0   0   0   0]
 [ 10 393   0   0   0   0   0   0   0   0]
 [  0   0  51   0   0   0   0   0   0   0]
 [  0   0   0 251   0   0   0   0   0   0]
 [  0   0   0   0  48   0   0   0   0   0]
 [  2   1   0   0   0 116   1   0   0   0]
 [  0   0   0   0   0   0 134   0   0   4]
 [  0   0   0   0   0   0   0 111   0   0]
 [  0   0   0   0   0   0   0   0  40   1]
 [  0   0   0   2   0   8  31   0   0 293]]
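The reported accuracy can be cross-checked directly against the confusion matrix. The sketch below is not part of CANAL; it assumes that rows are true labels and columns are predicted labels, both in the sorted order returned by np.unique. That assumption is supported by the row sums, which reproduce the per-type cell counts of the merged 10X test set printed above. Plain Python lists are used so the check runs without extra dependencies; the variable names are illustrative.

```python
# Cell types in sorted order (the order np.unique returns for the test labels).
labels = [
    "B cell", "T cell", "basal cell", "endothelial cell",
    "luminal epithelial cell of mammary gland", "macrophage",
    "mesenchymal stem cell", "natural killer cell",
    "skeletal muscle satellite cell", "stromal cell",
]

# Confusion matrix copied from the 10X evaluation output.
# Assumption: rows = true cell types, columns = predicted cell types.
cm_10x = [
    [501,   2,  0,   0,  0,   0,   0,   0,  0,   0],
    [ 10, 393,  0,   0,  0,   0,   0,   0,  0,   0],
    [  0,   0, 51,   0,  0,   0,   0,   0,  0,   0],
    [  0,   0,  0, 251,  0,   0,   0,   0,  0,   0],
    [  0,   0,  0,   0, 48,   0,   0,   0,  0,   0],
    [  2,   1,  0,   0,  0, 116,   1,   0,  0,   0],
    [  0,   0,  0,   0,  0,   0, 134,   0,  0,   4],
    [  0,   0,  0,   0,  0,   0,   0, 111,  0,   0],
    [  0,   0,  0,   0,  0,   0,   0,   0, 40,   1],
    [  0,   0,  0,   2,  0,   8,  31,   0,  0, 293],
]

n = len(labels)
total = sum(sum(row) for row in cm_10x)        # all test cells
correct = sum(cm_10x[k][k] for k in range(n))  # diagonal = correct predictions
accuracy = correct / total

# Row sums give the number of true cells per type.
row_sums = [sum(row) for row in cm_10x]

# The largest off-diagonal entry is the most frequent misclassification.
n_err, true_idx, pred_idx = max(
    (cm_10x[i][j], i, j) for i in range(n) for j in range(n) if i != j
)

print(f"accuracy = {accuracy:.4f}")
print(f"most frequent error: {labels[true_idx]} -> {labels[pred_idx]} ({n_err} cells)")
```

Under this reading, 1938 of 2000 cells lie on the diagonal, and the single largest source of error is stromal cells predicted as mesenchymal stem cells.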
load the unlabeled SS2 test data and apply CANAL to predict cell types
[10]:
# Load the four SS2 test sets; print each object together with its cell-type counts
adata_test_Lung_SS2 = sc.read_h5ad("/data/wanh/DA/scBERT-master/data/TabulaMuris/TabulaMuris_Lung_SS2.h5ad")
print(adata_test_Lung_SS2, np.unique(np.array(adata_test_Lung_SS2.obs['cell_type1']), return_counts=True))
adata_test_Mammary_Gland_SS2 = sc.read_h5ad("/data/wanh/DA/scBERT-master/data/TabulaMuris/TabulaMuris_Mammary_Gland_SS2.h5ad")
print(adata_test_Mammary_Gland_SS2, np.unique(np.array(adata_test_Mammary_Gland_SS2.obs['cell_type1']), return_counts=True))
adata_test_Limb_Muscle_SS2 = sc.read_h5ad("/data/wanh/DA/scBERT-master/data/TabulaMuris/TabulaMuris_Limb_Muscle_SS2.h5ad")
print(adata_test_Limb_Muscle_SS2, np.unique(np.array(adata_test_Limb_Muscle_SS2.obs['cell_type1']), return_counts=True))
adata_test_Spleen_SS2 = sc.read_h5ad("/data/wanh/DA/scBERT-master/data/TabulaMuris/TabulaMuris_Spleen_SS2.h5ad")
print(adata_test_Spleen_SS2, np.unique(np.array(adata_test_Spleen_SS2.obs['cell_type1']), return_counts=True))
# Merge the four tissues into a single test set
adata_test_SS2 = sc.AnnData.concatenate(adata_test_Lung_SS2, adata_test_Mammary_Gland_SS2, adata_test_Limb_Muscle_SS2, adata_test_Spleen_SS2)
print(adata_test_SS2, np.unique(np.array(adata_test_SS2.obs['cell_type1']), return_counts=True))
# Predict with the checkpoint saved after the fourth fine-tuning stage
pred_cell_type_SS2 = CANAL.predict(adata_predict = adata_test_SS2, ckpt_dir = './ckpts/', experiments = experiments,
                                   stage_num = 4, dataset = dataset_stage_4)
AnnData object with n_obs × n_vars = 1263 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
uns: 'hvg', 'log1p' (array(['B cell', 'T cell', 'endothelial cell', 'natural killer cell',
'stromal cell'], dtype=object), array([ 57, 53, 693, 37, 423]))
AnnData object with n_obs × n_vars = 2405 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
uns: 'hvg', 'log1p' (array(['basal cell', 'endothelial cell',
'luminal epithelial cell of mammary gland', 'stromal cell'],
dtype=object), array([1340, 47, 578, 440]))
AnnData object with n_obs × n_vars = 1090 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
uns: 'hvg', 'log1p' (array(['B cell', 'T cell', 'endothelial cell', 'macrophage',
'mesenchymal stem cell', 'skeletal muscle satellite cell'],
dtype=object), array([ 71, 35, 141, 45, 258, 540]))
AnnData object with n_obs × n_vars = 1697 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
uns: 'hvg', 'log1p' (array(['B cell', 'T cell', 'macrophage'], dtype=object), array([1297, 352, 48]))
/data/wanh/ENTER/envs/pytorch/lib/python3.7/site-packages/anndata/_core/anndata.py:1785: FutureWarning: X.dtype being converted to np.float32 from float64. In the next version of anndata (0.9) conversion will not be automatic. Pass dtype explicitly to avoid this warning. Pass `AnnData(X, dtype=X.dtype, ...)` to get the future behavour.
[AnnData(sparse.csr_matrix(a.shape), obs=a.obs) for a in all_adatas],
AnnData object with n_obs × n_vars = 6455 × 1000
obs: 'cell_type1', 'cell_ontology_id', 'cluster', 'free_annotation', 'donor', 'gender', 'channel', 'region', 'organ', 'tissue_tSNE_1', 'tissue_tSNE_2', 'cell_ontology_class', 'platform', 'dataset_name', 'organism', 'data_type', 'n_genes', 'n_counts', '__libsize__', 'method', 'plate', 'subsetA', 'subsetA_cluster.ids', 'subsetB', 'subsetB_cluster.ids', 'batch'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm' (array(['B cell', 'T cell', 'basal cell', 'endothelial cell',
'luminal epithelial cell of mammary gland', 'macrophage',
'mesenchymal stem cell', 'natural killer cell',
'skeletal muscle satellite cell', 'stromal cell'], dtype=object), array([1425, 440, 1340, 881, 578, 93, 258, 37, 540, 863]))
== Begin predicting after 4 fine-tuning stages: | Experiments: Cross_tissue ==
Annotation: ['B cell' 'T cell' 'endothelial cell' 'macrophage' 'natural killer cell'
'stromal cell' 'basal cell' 'luminal epithelial cell of mammary gland'
'mesenchymal stem cell' 'skeletal muscle satellite cell']
evaluate the annotation performance
[11]:
true_celltype_SS2 = np.array(adata_test_SS2.obs['cell_type1'])
CANAL.evaluation(pred_cell_type=pred_cell_type_SS2, true_celltype=true_celltype_SS2)
== Predict total accuracy: 0.936000 ==|== F1 Score: 0.842000 ==|== ARI: 0.919200 ==
Confusion matrix:
[[1365    8    0    0    1   49    0    2    0    0]
 [   0  438    0    0    0    1    0    1    0    0]
 [   4    1 1295    3   15    5    0    0    4   13]
 [   0    0    1  877    0    0    1    0    1    1]
 [   0    0    0    0  578    0    0    0    0    0]
 [   0    0    0    0    0   45    0   48    0    0]
 [   0    0    0    0    0    0  250    0    0    8]
 [   0    4    0    0    0    0    0   33    0    0]
 [   0    1    0    2    0    3    3    0  530    1]
 [   0    0    0    8    0    0  224    0    0  631]]
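On the SS2 data the F1 score (0.842) is noticeably lower than the accuracy (0.936). The sketch below suggests why, under two assumptions that the tutorial does not state: rows are true labels and columns are predicted labels in sorted label order, and the F1 score is the unweighted (macro) average of per-class F1. How CANAL.evaluation computes F1 internally is not shown here, but a macro average over this matrix reproduces the reported value. Plain Python lists are used so the check runs without extra dependencies.

```python
# Cell types in sorted order (the order np.unique returns for the test labels).
labels = [
    "B cell", "T cell", "basal cell", "endothelial cell",
    "luminal epithelial cell of mammary gland", "macrophage",
    "mesenchymal stem cell", "natural killer cell",
    "skeletal muscle satellite cell", "stromal cell",
]

# Confusion matrix copied from the SS2 evaluation output.
# Assumption: rows = true cell types, columns = predicted cell types.
cm_ss2 = [
    [1365,   8,    0,   0,   1, 49,   0,  2,   0,   0],
    [   0, 438,    0,   0,   0,  1,   0,  1,   0,   0],
    [   4,   1, 1295,   3,  15,  5,   0,  0,   4,  13],
    [   0,   0,    1, 877,   0,  0,   1,  0,   1,   1],
    [   0,   0,    0,   0, 578,  0,   0,  0,   0,   0],
    [   0,   0,    0,   0,   0, 45,   0, 48,   0,   0],
    [   0,   0,    0,   0,   0,  0, 250,  0,   0,   8],
    [   0,   4,    0,   0,   0,  0,   0, 33,   0,   0],
    [   0,   1,    0,   2,   0,  3,   3,  0, 530,   1],
    [   0,   0,    0,   8,   0,  0, 224,  0,   0, 631],
]

n = len(labels)
row_sums = [sum(cm_ss2[i]) for i in range(n)]                        # true cells per type
col_sums = [sum(cm_ss2[i][j] for i in range(n)) for j in range(n)]   # predicted cells per type

# Per-class F1 = 2 * TP / (number of true cells + number of predicted cells).
f1 = [2 * cm_ss2[k][k] / (row_sums[k] + col_sums[k]) for k in range(n)]
macro_f1 = sum(f1) / n                                               # every class weighted equally
accuracy = sum(cm_ss2[k][k] for k in range(n)) / sum(row_sums)       # every cell weighted equally

print(f"accuracy = {accuracy:.3f}, macro F1 = {macro_f1:.3f}")
# The three weakest classes explain the gap between the two metrics.
worst = sorted(zip(labels, f1), key=lambda pair: pair[1])[:3]
for name, score in worst:
    print(f"{name}: F1 = {score:.3f}")
```

Accuracy weights every cell equally, so the large, well-predicted B cell and basal cell populations dominate it. The macro average weights every cell type equally, so the small macrophage and natural killer cell classes (93 and 37 cells, frequently confused with each other) and the stromal cells predicted as mesenchymal stem cells pull it down.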