Date of Award

5-2023

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Degree Discipline

Electrical Engineering

Abstract

Single-cell sequencing is a recently developed technology that enables researchers to obtain genomic, transcriptomic, or multi-omics information through gene expression analysis. Compared to traditional sequencing methods, it can resolve highly heterogeneous cell type information, and it is gaining popularity in biomedical research. This analysis can also support early diagnosis and drug development for tumor cells and cancer cell types. In the gene expression profiling workflow, identifying cell types is an important task, but it faces challenges such as the curse of dimensionality, sparsity, batch effects, and overfitting. These challenges can be addressed with feature selection, which reduces the feature dimension by keeping the most relevant features. In this work, a recurrent neural network-based feature selection model is proposed to extract relevant features from high-dimensional, low-sample-size data. A deep learning-based gene embedding model is also proposed to reduce the sparsity of single-cell data for cell type identification. The proposed frameworks are implemented with different recurrent neural network architectures and evaluated on real-world microarray datasets and single-cell RNA-seq data, where the proposed models perform better than other feature selection models. Because labeling data is cumbersome and time-consuming and requires manual effort and domain expertise, a semi-supervised model is also implemented using the same gene embedding workflow. Different ratios of labeled data are used in the experiments to validate the concept. Experimental results show that the proposed semi-supervised approach, built on the gene embedding concept, achieves very encouraging performance even when only a limited amount of labeled data is used. In addition, a graph attention-based autoencoder model is studied to learn latent features by incorporating prior knowledge with gene expression data for cell type classification.

Index Terms: Single-Cell Gene Expression Data, Gene Embedding, Semi-Supervised Model, Incorporate Prior Knowledge, Gene-Gene Interaction Network, Deep Learning, Graph Autoencoder

Committee Chair/Advisor

Xiangfang Li

Committee Co-Chair

Xishuang Dong

Committee Member

John Fuller

Committee Member

Lijun Qian

Committee Member

Lin Li

Publisher

Prairie View A&M University

Rights

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Date of Digitization

6/09/2023

Contributing Institution

John B Coleman Library

City of Publication

Prairie View

MIME Type

application/pdf

Recommended Citation

Chowdhury, S. (2023). Cell Type Classification Via Deep Learning On Single-Cell Gene Expression Data. Retrieved from https://digitalcommons.pvamu.edu/pvamu-dissertations/15
