
Bioinformatics Learning: Pathway Enrichment I (GO Analysis)



Theoretical knowledge of enrichment analysis

Enrichment analysis (EA) is a statistical method widely used in bioinformatics to test whether particular functions or features are over-represented in a set of genes. Its main purpose is to identify biologically meaningful patterns and functions in large amounts of gene data. Depending on the goal and the method, enrichment analysis can be divided into the following types:

Gene Ontology enrichment analysis: This is the most commonly used type of enrichment analysis. It examines which Gene Ontology (GO) terms are enriched in a gene set, which helps researchers understand what the genes have in common in terms of biological processes, molecular functions and cellular components.

Pathway enrichment analysis: This type focuses on the role of genes in metabolic and signaling pathways. By examining which pathways are enriched in a gene set, researchers can understand the functions and regulatory mechanisms of these genes in an organism. Pathway databases such as KEGG (Kyoto Encyclopedia of Genes and Genomes) and Reactome are common resources for pathway enrichment analysis.

Gene set enrichment analysis (GSEA): GSEA tests whether predefined gene sets (e.g., GO terms, pathways, disease or phenotype signatures) are concentrated at the top or bottom of a gene list ranked by a statistic such as fold change. Because it uses the whole ranked list rather than a cutoff-based list of differentially expressed genes, it can reveal coordinated but modest changes and their potential biological significance.

Protein-protein interaction enrichment analysis: This type focuses on the interactions between proteins and helps researchers understand how the proteins encoded by a gene set function in cell signaling and metabolic processes.

Gene expression regulation enrichment analysis: This type focuses on the role of transcription factors, miRNAs and other regulators in controlling gene expression. It helps researchers understand the regulatory mechanisms behind a gene set and how its genes relate to one another.
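To make the difference between a cutoff-based gene list and GSEA more concrete, here is a minimal base-R sketch of the unweighted running-sum score that GSEA builds on. The gene names, the ranking and the helper `running_es()` are hypothetical and exist only for illustration; real GSEA also weights hits by the ranking statistic and assesses significance by permutation, both of which are left out here.

```r
# Toy ranked list: genes ordered from most up-regulated to most down-regulated.
# Names and order are made up for illustration.
ranked.genes <- c("G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8", "G9", "G10")
gene.set <- c("G1", "G2", "G4")   # hypothetical gene set of interest

# Unweighted running sum: step up at each set member, step down otherwise.
running_es <- function(ranked, set) {
  hit <- ranked %in% set
  n.hit <- sum(hit)
  n.miss <- length(ranked) - n.hit
  stopifnot(n.hit > 0, n.miss > 0)
  step <- ifelse(hit, 1 / n.hit, -1 / n.miss)
  running <- cumsum(step)
  # Enrichment score: the running-sum value furthest from zero
  running[which.max(abs(running))]
}

es <- running_es(ranked.genes, gene.set)            # set sits near the top
es.bottom <- running_es(rev(ranked.genes), gene.set) # same set, reversed ranking
```

A positive score means the set is concentrated near the top of the ranking, a negative score means it is concentrated near the bottom.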

Theoretical knowledge of GO analysis

Gene Ontology (GO) is a standardized vocabulary for describing the properties of genes and gene products. It provides an organized way to represent the various roles that genes play within an organism. GO describes genes along three aspects: cellular component (CC), biological process (BP), and molecular function (MF).

Cellular component (CC): This aspect describes where a gene product (e.g., a protein) is located within the cell, for example in the nucleus, the cytoplasm, the mitochondrial membrane, or another organelle. Knowing the location helps explain the role of the gene product within the cell.

Biological process (BP): This aspect describes the biological processes in which genes take part, such as cell growth, signaling, regulation of gene expression, and metabolic pathways. Knowing which processes a gene is involved in helps explain the physiological functions of an organism and the mechanisms by which disease arises.

Molecular function (MF): This aspect describes what a gene product does at the molecular level, usually by interacting with other molecules or catalyzing biochemical reactions. For example, a gene product may be an enzyme that catalyzes a specific reaction, or a structural protein involved in assembling and maintaining the cytoskeleton.

Gene ontology provides researchers with a systematic way to represent and share knowledge about the functions and processes of genes and gene products in organisms. This helps to facilitate the development of gene function studies, improve research efficiency, and provide important information for disease treatment and drug development.
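As a small illustration of the three aspects, the base-R sketch below stores a few gene-to-term annotations in a table and splits them by ontology. The table is a toy: the GO IDs are real terms, but the rows are an illustrative subset chosen by hand, not a complete annotation, and the object names are assumptions.

```r
# Toy annotation table: one row per (gene, GO term) pair.
# A small illustrative subset, not a complete annotation.
go.annot <- data.frame(
  gene     = c("TP53", "TP53", "TP53", "CYCS", "CYCS"),
  go_id    = c("GO:0005634", "GO:0006915", "GO:0003677",
               "GO:0005739", "GO:0006915"),
  term     = c("nucleus", "apoptotic process", "DNA binding",
               "mitochondrion", "apoptotic process"),
  ontology = c("CC", "BP", "MF", "CC", "BP"),
  stringsAsFactors = FALSE
)

# One sub-table per aspect (BP, CC, MF)
by.ontology <- split(go.annot, go.annot$ontology)

# Genes annotated to each biological-process term
bp.genes <- split(by.ontology$BP$gene, by.ontology$BP$term)
```

Enrichment tools work on exactly this kind of term-to-genes mapping, restricted to the aspect you choose.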


Basic steps of GO analysis:

Preparing a gene list: First, you need a list of genes to analyze. This is usually a set of differentially expressed genes, or genes associated with a specific condition, obtained from experimental data.

Mapping genes to GO terms: Each gene in the list is then mapped to its GO terms. This can be done with bioinformatics tools and databases (e.g., DAVID, Ensembl, AmiGO).

Statistical enrichment: Next, the enrichment of each GO term in the gene list is calculated. This is usually done by comparing the number of list genes annotated to the term with the number expected by chance, given how often the term occurs in the background genome. Commonly used tests include the hypergeometric test, Fisher's exact test, and the chi-square test.

Multiple comparison correction: Because GO enrichment analysis tests a large number of hypotheses at once, the p-values must be corrected for multiple comparisons to limit false positives. Common methods include the Bonferroni correction and the Benjamini-Hochberg correction (FDR).

Interpretation and visualization of results: Finally, the gene list is interpreted on the basis of the enrichment results to identify the biological processes, molecular functions and cellular components that matter. The results can also be displayed graphically with visualization tools (e.g., Cytoscape, REVIGO) for easier understanding and communication.
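The statistical enrichment step can be sketched for a single GO term in base R. All counts below are made up for illustration, and the variable names are assumptions; the point is only to show how the observed count is compared with the count expected by chance, and that the hypergeometric test and a one-sided Fisher's exact test give the same p-value.

```r
# Hypothetical counts for one GO term (all numbers are made up)
N <- 20000  # genes in the background
K <- 400    # background genes annotated to the term
n <- 200    # genes in the input list
k <- 12     # input-list genes annotated to the term

expected <- n * K / N            # annotated genes expected by chance
fold.enrichment <- k / expected  # observed / expected

# P(X >= k) under the hypergeometric distribution
p.hyper <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)

# The same test written as a one-sided Fisher's exact test on a 2 x 2 table
tab <- matrix(c(k, n - k, K - k, N - K - (n - k)), nrow = 2,
              dimnames = list(c("annotated", "not annotated"),
                              c("in list", "not in list")))
p.fisher <- fisher.test(tab, alternative = "greater")$p.value
```

In a real analysis this test is repeated for every GO term, which is why the correction step follows.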

Gene Ontology analysis helps researchers extract biological meaning from gene expression data by assessing whether specific biological processes, molecular functions or cellular components are enriched in a gene set. This reveals how the genes are related in terms of process and function, and so contributes to understanding gene regulation and mechanisms of action in organisms.
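The effect of multiple comparison correction is easiest to see on a handful of p-values. The sketch below uses base R's `p.adjust()` on five made-up p-values with hypothetical term names; it is an illustration of the two corrections named in the steps, not output from a real analysis.

```r
# Hypothetical raw p-values for five GO terms (made up for illustration)
p.raw <- c(termA = 0.001, termB = 0.008, termC = 0.020,
           termD = 0.045, termE = 0.300)

p.bonf <- p.adjust(p.raw, method = "bonferroni")  # family-wise error rate
p.bh   <- p.adjust(p.raw, method = "BH")          # false discovery rate

# Terms that stay significant at 0.05 under each correction
sig.bonf <- names(p.raw)[p.bonf < 0.05]
sig.bh   <- names(p.raw)[p.bh < 0.05]
```

Bonferroni is the stricter of the two, so it keeps fewer terms here than Benjamini-Hochberg, which is the more common choice for enrichment results.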

GO Analysis Code

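# Seurat provides FindMarkers() and FindAllMarkers()
library(Seurat)

# Differential expression between two groups. group.by is blank here;
# set it to the metadata column that holds these labels.
degdf <- FindMarkers(scRNA1, ident.1 = "DapiNeg1", ident.2 = "DapiNeg2",
                     logfc.threshold = 0.5, group.by = "")
# Alternatively, find markers for every cluster. This overwrites the result
# above, so keep only the call you need.
degdf <- FindAllMarkers(scRNA1)

# Save the result; fill in the output path
saveRDS(degdf, "")
# degdf <- readRDS("")

# clusterProfiler, GOSemSim and org.Hs.eg.db are Bioconductor packages:
# BiocManager::install(c("clusterProfiler", "GOSemSim", "org.Hs.eg.db"))
# BiocManager::install("")
# BiocManager::install("rlang")
# BiocManager::install("vctrs", force = TRUE)
# options(connectionObserver = NULL)
library(org.Hs.eg.db)
library(GOSemSim)
library(clusterProfiler)

# FindAllMarkers() stores gene symbols in the `gene` column; its row names get
# suffixes when a gene marks several clusters. For FindMarkers() output,
# use rownames(degdf) instead.
degs.list <- unique(degdf$gene)

# GO enrichment for the biological process (BP) ontology
erich.go.BP <- enrichGO(gene = degs.list,
                        OrgDb = org.Hs.eg.db,
                        keyType = "SYMBOL",
                        ont = "BP",
                        pvalueCutoff = 0.05,
                        qvalueCutoff = 0.05)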


The creation of the GO Project dates back to 1998 and its results and methods have been widely adopted and published in numerous research papers. The following are some of the key publications on the GO Project:

  1. Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D., Butler, H., Cherry, J. M., Davis, A. P., Dolinski, K., Dwight, S. S., Eppig, J. T., Harris, M. A., Hill, D. P., Issel-Tarver, L., Kasarskis, A., Lewis, S., Matese, J. C., Richardson, J. E., Ringwald, M., Rubin, G. M., & Sherlock, G. (2000). Gene Ontology: tool for the unification of biology. Nature Genetics, 25(1), 25–29. doi:10.1038/75556

This paper is an important document on the creation and initial realization of the Gene Ontology project. The authors describe the background of the project, its goals, and the conceptualization and realization of the three main components (cellular components, biological processes, and molecular functions).

  2. The Gene Ontology Consortium. (2017). Expansion of the Gene Ontology knowledgebase and resources. Nucleic Acids Research, 45(D1), D331–D338. doi:10.1093/nar/gkw1108

This paper describes the expansion of the Gene Ontology knowledgebase and resources, including developments in the ontology terms, annotations, and tools. It also reports recent advances in the GO project that support the study of gene function.