Vahid Shariati

Senior Bioinformatician
(Genomics Senior Scientist)

About

Dr. Vahid Shariati is a bioinformatician with extensive expertise in genomics research across several species. Over 10 years at Iran’s National Institute of Genetic Engineering and Biotechnology (NIGEB), he has led whole-genome, pan-genome, transcriptome, and meta-analysis projects in human, plant, and bacterial species. Recently, his focus has shifted to human cancer research, employing meta-analysis, artificial intelligence (AI), and multi-omics integration to identify mechanisms of cancer development and progression, discover biomarkers, and develop predictive models. Moreover, his experience as a full-stack developer enables him to design scalable pipelines, databases, and tools for next-generation sequencing (NGS) and genomics data.
Interests
  • Bioinformatics
  • Genomics
  • Artificial Intelligence
  • Computational biology
  • Systems biology
Education
  • PhD in Genomics

    Sant' Anna University, Italy, 2010

  • MSc in Molecular Biology

    University of Shiraz, Iran, 2007

  • BSc in Molecular Biology

    University of Mazandaran, Iran, 2004

Experience

  1. Senior Bioinformatician

    NIGEB
    • Population genome diversity
    • Pan-cancer studies
    • Meta-analysis and omics integration
    • Drug discovery
    • AI models
    • Training and leading junior scientists
  2. Scientific Director of Genome Center

    NIGEB
    • Genomics services and lab establishment
    • Analysis pipeline development
    • Scientific consulting
  3. Bioinformatics Research Scientist

    NIGEB
    • Pan and core genome analysis
    • Algorithms for the identification and annotation of genetic variants
    • Bioinformatics support for research projects
    • Data mining, comparative genomics
    • Transcriptome Analysis
    • Population genetics
  4. Research Assistant

    University of Milan
    • Analysis of genomic data
    • Partial chromosome assembly for research project
    • Statistical analysis and visualization
    • Preparation of publications and research reports

Education

  1. PhD in Genomics

    Sant' Anna University, Italy, 2010
    Top student
  2. MSc in Molecular Biology

    University of Shiraz, Iran, 2007
    GPA: 3.8/4.0
  3. BSc in Molecular Biology

    University of Mazandaran, Iran, 2004
    GPA: 3.4/4.0
Skills
Programming
Python

Data Analysis, Pipeline Development, Backend (Flask/Django)
95%

R

Statistical Analysis, Visualization
80%

Bash

Scripting, NGS Pipeline Automation, System Management
95%

JavaScript

Full-Stack Development (React, Node.js, REST APIs)
80%

SQL

Database Design & Querying
70%

Analysis
Multi-Omics

Genomics, Transcriptomics, Proteomics
90%

Systems Biology

Network Analysis, Pathway Modeling
90%

Machine Learning

Supervised/Unsupervised Models, CNNs, GNNs, VAE, LLM
80%

Population

Pan-genome, GWAS, MetaQTL, Demographic Analysis
75%

Cheminformatics

Drug-Target Prediction, Virtual Screening
75%

Soft Skills
Problem Solving

Analytical, Innovative, Adaptable
95%

Communication

Technical Writing, Presentations, Stakeholder Collaboration
95%

Teamwork

Cross-functional Collaboration
95%

Independence

Self-directed Research, Project Ownership
95%

Learning

Continuous proactive engagement with emerging fields
95%

Languages
English
80%
Italian
60%
French
30%
Persian
100%
Project Experiences
Cancers Meta-analysis

The Cancers Meta-analysis projects integrate transcriptome, genome, and epigenome datasets to advance precision oncology through rigorous meta-analysis and AI models. They focus on high-impact cancers, including gastric cancer, triple-negative breast cancer (TNBC), colorectal cancer, lung cancer, glioblastoma, and pancreatic cancer.

Jan 26, 2024

Drug Discovery

These projects bridge AI-driven target prioritization with precision experimental pipelines to uncover novel therapeutics. Leveraging structural bioinformatics and high-throughput virtual screening, the platform computationally evaluates millions of compounds against metagenes, that is, AI-identified genes critical to disease mechanisms (e.g., cancer progression, antimicrobial resistance).

Jan 13, 2024

Pipelines and Tools

These projects deliver modular software solutions that streamline data analysis, diagnostics, and industrial workflows across research and commercial sectors. By developing customizable pipelines in Python (automation, machine learning integration), Bash (high-performance computing orchestration), R (statistical modeling, visualization), and JavaScript (interactive web interfaces), the projects enable users to process complex datasets reproducibly and at scale.

Nov 21, 2023

Olive Genomics

This project harnesses global genome and transcriptome sequencing of olive genotypes to decode biodiversity, evolutionary adaptation, and human-driven migration patterns. By assembling high-resolution datasets from wild and cultivated varieties across continents, the project maps genetic diversity hotspots, identifies loci underpinning traits like drought tolerance and fruit yield, and reconstructs ancient dispersal routes shaped by trade and agriculture.

Oct 26, 2023

Small Genomes

This project leverages high-resolution genome sequencing and annotation to decode the functional and evolutionary landscapes of bacteria, fungi, and viruses. By systematically assembling and annotating small genomes across diverse taxa and ecological niches, the project uncovers conserved metabolic pathways, virulence factors, and niche-specific adaptations.

Oct 22, 2023

Upcoming Publications
A comprehensive transcriptomic meta-analysis leveraging deep learning to uncover molecular signatures and potential therapeutic targets in Triple-Negative Breast Cancers


Decoding Gastric Cancer: An AI-Driven Transcriptomic Meta-Analysis


Gastric cancer (GC) remains a leading cause of global cancer mortality, necessitating deeper insights into its molecular mechanisms. This meta-analysis and systematic review integrated transcriptomic data from 28 studies (14 RNA-seq, 13 microarray) to identify critical genes and pathways driving GC progression. Leveraging AI-driven approaches for data harmonization and batch effect correction, we standardized raw datasets from public repositories (GEO, SRA, TCGA) and performed rigorous quality control. Differential expression analysis using edgeR and LIMMA identified 1,163 differentially expressed genes (DEGs), including CST1 (most up-regulated) and PGA3 (most down-regulated). Pathway enrichment revealed tumor proliferation (E2F targets, G2-M checkpoint), ECM remodeling (collagens, MMPs), immune evasion (CXCL chemokines), and metabolic reprogramming as key processes. Protein-protein interaction (PPI) network analysis highlighted hub genes such as AURKA, COL1A1, and IL6, while AI-enhanced clustering delineated functional modules linked to metastasis and prognosis. Survival and immune infiltration analyses underscored the clinical relevance of identified genes. Notably, ERBB4 down-regulation and collagen family up-regulation were mechanistically tied to apoptosis resistance and microenvironment stiffening. AI algorithms further aided in resolving dataset heterogeneity and prioritizing high-confidence biomarkers. This study provides a comprehensive transcriptomic landscape of GC, emphasizing the interplay between genetic drivers, tumor microenvironment, and immune evasion. The integration of AI methodologies enhanced robustness in cross-study data synthesis, offering novel therapeutic targets and underscoring the potential of computational strategies in advancing GC research. These findings illuminate pathways for precision oncology and underscore the need for multi-omics approaches to unravel GC complexity.

Oleuropein's Effects on Breast Cancer Revealed by RNA-Sequencing and Machine Learning


Breast cancer (BC) remains a leading cause of cancer-related morbidity and mortality worldwide, highlighting the critical need for innovative treatment strategies. Phytochemicals, bioactive compounds derived from plants, have emerged as promising candidates in cancer therapy due to their diverse anti-cancer properties. Oleuropein, a polyphenol found in olive oil, has shown potential in modulating key signaling pathways, inducing apoptosis, and inhibiting metastasis in various cancer models. In this study, we investigated the effects of oleuropein on the gene expression profile of the MDA-MB-231 BC cell line using RNA sequencing. The cell line was treated with 200 μL of oleuropein for 48 hours, total RNA was extracted from both treated and untreated cells, and RNA sequencing was performed to assess global gene expression changes. Differential gene expression (DEG) analysis was conducted to evaluate the pharmacological effects of oleuropein treatment through pathway analysis and deep learning models. A comprehensive RNA-sequencing analysis revealed a total of 137 differentially expressed genes in MDA-MB-231 cells treated with oleuropein. Of these, 115 genes were downregulated, while 21 genes were upregulated. These findings suggest that oleuropein exerts a significant impact on breast cancer cells by modulating multiple molecular mechanisms. The downregulation of numerous genes involved in cell proliferation, survival, and invasion pathways indicates the potential for oleuropein to inhibit tumor growth and metastasis in BC.

Towards a pan-cancer atlas of endoplasmic reticulum stress network


Endoplasmic reticulum (ER) stress and the unfolded protein response (UPR) pathway play pivotal roles in cancer progression and therapy resistance, yet their pan-cancer dynamics and clinical implications remain poorly understood. This study presents a comprehensive analysis of ER stress and UPR pathway activity across 32 cancer types using The Cancer Genome Atlas (TCGA) data. By integrating gene-centric and pathway-centric approaches, including single-sample Gene Set Enrichment Analysis (ssGSEA), we characterized the expression landscape, tumor microenvironment interactions, and clinical relevance of UPR signaling. Our results revealed coordinated ER stress gene expression patterns in primary tumors, with UPR pathway activity significantly elevated in most cancers compared to adjacent normal tissues. Tumor purity inversely correlated with ER stress activity, underscoring microenvironmental influences. Differential expression analysis identified 61 UPR-related genes dysregulated across cancers, with IRE1 and PERK branches predominantly upregulated. Clinically, elevated UPR activity correlated with poor prognosis, advanced tumor stages, and resistance to therapies targeting EGFR, chromatin remodeling, and DNA repair. Co-expression networks highlighted UPR interactions with DNA repair and extracellular matrix pathways, while hallmark pathway analysis linked UPR to mTORC1 signaling, hypoxia, and epithelial-mesenchymal transition. Immune profiling revealed UPR-associated shifts in cytotoxic T cells and macrophages, suggesting microenvironmental modulation. Drug response analysis demonstrated UPR-mediated resistance to EGFR inhibitors and PARP inhibitors, implicating IRE1 as a key contributor. This study establishes the UPR as a central regulator of cancer progression, offering insights into its dual roles in tumor survival and therapy resistance. 
Our findings advocate for UPR pathway inhibition as a promising strategy to enhance treatment efficacy, particularly in lung, gastrointestinal, and kidney cancers.

Pan-cancer analysis of SQSTM1/p62 reveals its comprehensive contribution to shaping tumor microenvironment and anti-tumor immunity


Sequestosome 1 (SQSTM1)/p62 is a multifunctional protein involved in diverse physiological processes, and its dysregulation has been implicated in tumorigenesis. Using TCGA pan-cancer data, we analyzed p62 genomic alterations, expression patterns, and clinical relevance. Our results show that p62 mutations and copy number variations (CNVs) are rare, suggesting that these genomic alterations have limited prognostic significance. However, p62 gene expression was significantly elevated in several cancers, including BRCA and LUAD, where it correlated with poorer overall survival and advanced tumor stages. Pathway analyses showed a strong association between p62 and oncogenic features, such as oxidative phosphorylation, reactive oxygen species (ROS), increased tumor mutation burden (TMB), and microsatellite instability (MSI). Interestingly, p62 expression was inversely associated with immune cell infiltration and positively correlated with immunosuppressive markers, suggesting its role in fostering an immunosuppressive tumor microenvironment (TME) in most types of cancer. Therefore, p62 plays a pivotal role in cancer as both a driver of oncogenesis and a modulator of the TME, supporting its potential as a biomarker and therapeutic target to enhance the efficacy of immunotherapies, particularly immune checkpoint inhibitors (ICIs). Finally, through docking-based virtual screening, we identified four natural-product-derived inhibitors with favorable pharmacokinetic profiles that target the PB1 domain of p62, which is essential for its self-oligomerization.

Advanced Molecular Mechanisms of Baicalein's Neuroprotective Effects in Neurodegenerative Disease Treatments


Baicalein is a flavonoid that demonstrates extensive and significant therapeutic potential for numerous neurodegenerative diseases (NDs). Because baicalein shows shared effects across NDs, our study adopts a comprehensive approach to illuminate the mechanisms underlying these effects. We began by computationally identifying the potential protein targets of baicalein in Homo sapiens using SwissTargetPrediction. Concurrently, we used the DisGeNET database to identify genes related to NDs. By integrating these genes with the baicalein-predicted targets, we built an inclusive network that highlights complex relationships between genes, diseases, and baicalein. Our findings revealed that baicalein predominantly affects the AGE/RAGE pathway, interleukin signaling (notably IL-18 and IL-17), and NF-κB signaling, pathways associated with inflammation and immune-related functions. Furthermore, the effects of baicalein extend to pathways with critical roles in NDs, such as BDNF and PI3K-Akt signaling. Using protein-protein interaction networks to validate our findings, we identified key hub proteins (AR, EGFR, SIRT1, MAPK3, APP, ESR1, PTGS2, MMP9, and GSK3B) that may mediate the therapeutic effects of baicalein against NDs. Molecular docking indicated strong binding affinities between these hub proteins and baicalein. In summary, our detailed study highlights baicalein's potential as a promising treatment for NDs, offering a molecular basis for its effectiveness.

Recent Publications
Machine learning-aided microRNA discovery for olive oil quality


MicroRNAs (miRNAs) are key regulators of gene expression in plants, influencing various biological processes such as oil quality and seed development. Although our knowledge about miRNAs in olive (Olea europaea L.) is progressing, with several miRNAs identified in previous studies, most of the reported miRNAs were predicted without the aid of a reference genome, primarily due to limited genome accessibility at the time. Significant knowledge gaps therefore remain in this area. This study addresses the complexities of miRNA detection in olive, using a high-quality reference genome and a combination of genomics and machine learning-based methods. By leveraging random forest and support vector machine algorithms, we successfully identified 56 novel miRNAs in olive, surpassing the limitations of conventional homology-based methods. Our subsequent analysis revealed that some of these miRNAs are implicated in the regulation of key genes involved in oil quality. Within the context of oil biosynthesis pathways, the novel miRNA Oeu124369 regulates fatty acid biosynthesis by targeting acetyl-CoA acyltransferase 1 and palmitoyl-protein thioesterase, thereby influencing the production of acetyl-CoA and palmitic acid, respectively. These findings underscore the power of machine learning in unraveling the complex miRNA regulatory network in olive and provide a high-quality miRNA resource for future research aimed at improving olive oil production by exploring the target genes of the identified miRNAs and their roles in biological processes.

Unraveling the genetic basis of oil quality in olives: a comparative transcriptome analysis


The balanced fatty acid profile of olive oil not only enhances its stability but also contributes to its positive effects on health, making it a valuable dietary choice. Olive oil's high content of unsaturated fatty acids and low content of saturated fatty acids contribute to its beneficial effects on cardiovascular diseases and cancer. The quantities of these fatty acids in olive oil may fluctuate due to various factors, with genotype being a crucial determinant of the oil's quality. This study investigated the genetic basis of oil quality by comparing the transcriptomes of two Iranian cultivars with contrasting oil profiles: Mari, known for its high oleic acid content, and Shengeh, characterized by high linoleic acid at Jaén index four. Gas chromatography confirmed a significant difference in fatty acid composition between the two cultivars. Mari exhibited significantly higher oleic acid content (78.48%) compared to Shengeh (48.05%), while linoleic acid content was significantly lower in Mari (4.76%) than in Shengeh (26.69%). Using RNA sequencing at Jaén index four, we analyzed genes involved in fatty acid biosynthesis. Differential expression analysis identified 2775 genes showing statistically significant differences between the cultivars. Investigating these genes across nine fundamental pathways involved in oil quality led to the identification of 25 effective genes. Further analysis revealed 78 transcription factors and 95 transcription binding sites involved in oil quality, with BPC6 and RGA emerging as unique factors. This research provides a comprehensive understanding of the genetic and molecular mechanisms underlying oil quality in olive cultivars. The findings have practical implications for olive breeders and producers, potentially streamlining cultivar selection processes and contributing to the production of high-quality olive oil.

Interaction between high-intensity interval training and high-protein diet on gut microbiota composition and body weight in obese male rats


Diet and exercise are two critical factors that regulate gut microbiota, affecting weight management. The present study investigated the effect of 10 weeks of high-intensity interval training (HIIT) and a high-protein diet (HPD) on gut microbiota composition and body weight changes in obese male Wistar rats. Forty obese rats were randomly divided into five groups: HPD, HIIT + HPD, HIIT + high-fat diet (HFD) (continuing HFD during intervention), obese control 1 (OC1; continuing HFD during intervention), and obese control 2 (OC2; cutting off HFD at the beginning of the intervention and continuing standard diet). Eight non-obese Wistar rats served as a non-obese control (NOC) group (standard diet). Microbial community composition and diversity (by sequencing 16S rRNA genes derived from fecal samples), body weight, and Lee index were assessed. The body weight and Lee index in the NOC, HIIT + HFD, HPD, and HIIT + HPD groups were significantly lower than those in the OC1 and OC2 groups, and both measures were lower in the HPD and HIIT + HPD groups than in the HIIT + HFD group. Also, HFD consumption and switching from HFD to a standard diet or HPD increased gut microbiota dysbiosis. Furthermore, HIIT along with HFD increased the adverse effects of HFD on gut microbiota, while HIIT + HPD increased microbial richness, improved gut microbiota dysbiosis, and changed the rats’ phenotype to lean. It appears that HFD discontinuation without HIIT does not improve gut microbiota dysbiosis. Also, HIIT + HFD, HPD, and HIIT + HPD slow down HFD-induced weight gain, but HIIT + HPD is a more reliable strategy for weight management due to its beneficial effects on gut microbiota composition.
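
For readers unfamiliar with the Lee index mentioned above, it is commonly defined for rodents as the cube root of body weight (g) divided by naso-anal length (cm). The sketch below assumes that common definition; some studies multiply the result by a constant such as 1000, and the abstract does not state which variant was used, so the `scale` parameter and the example values are illustrative assumptions only.

```python
def lee_index(body_weight_g: float, naso_anal_length_cm: float, scale: float = 1.0) -> float:
    """Lee index under the common definition: cube_root(weight in g) / length in cm.

    `scale` is an assumption for studies that report the index multiplied
    by a constant (for example 1000); the default leaves it unscaled.
    """
    if body_weight_g <= 0 or naso_anal_length_cm <= 0:
        raise ValueError("weight and length must be positive")
    return scale * body_weight_g ** (1.0 / 3.0) / naso_anal_length_cm


# Hypothetical rat: 343 g body weight, 20 cm naso-anal length.
example_value = lee_index(343.0, 20.0)
print(round(example_value, 3))
```

Because the index normalizes weight by body length, it allows animals of different sizes to be compared on a single obesity scale.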

Meta-analysis of transcriptome reveals key genes relating to oil quality in olive


A deep search of published RNA-seq data identified thirty-nine experiments associated with the olive transcriptome, four of which proved to be ideal for meta-analysis. The meta-analysis confirmed the genes identified in previous studies and revealed new genes that had not been identified before. According to the IDR index, the meta-analysis had good power to identify new differentially expressed genes. The key genes were investigated in the metabolic pathways and were grouped into four classes based on the biosynthetic cycle of fatty acids and factors that affect oil quality. Galactose metabolism, the glycolysis pathway, pyruvate metabolism, fatty acid biosynthesis, glycerolipid metabolism, and terpenoid backbone biosynthesis were the main pathways in olive oil quality. In galactose metabolism, raffinose is a suitable source of carbon, along with other available carbon sources, in fruit development. The results showed that the biosynthesis of acetyl-CoA in glycolysis and pyruvate metabolism is a stable pathway to begin the biosynthesis of fatty acids. Key genes in oleic acid production, an indicator of oil quality, and critical genes that play an important role in the production of triacylglycerols were identified at different developmental stages. Among the minor compounds, terpenoid backbone biosynthesis was investigated, and important enzymes were identified as an interconnected network that produces precursors for the biosynthesis of monoterpenes, diterpenes, triterpenes, tetraterpenes, and sesquiterpenes.

Transcriptional profile of ovine oocytes matured under lipopolysaccharide treatment in vitro


Lipopolysaccharide (LPS) derived from the cell wall of Gram-negative bacteria is known to cause ruminal acidosis and/or infectious diseases such as metritis and mastitis, which have a significant negative impact on reproductive performance. This study aimed to investigate the effect of LPS on oocyte maturation and subsequent development in vitro. Ovine cumulus oocyte complexes (COCs) were matured in a medium supplemented with 0 (control), 0.01, 0.1, 1 and 10 μg/mL LPS. Nuclear maturation, cleavage and blastocyst rate, mitochondrial membrane potential (ΔΨm), intracellular reactive oxygen species (ROS) content, and changes in transcript abundance were evaluated. The percentage of oocytes reaching the MII stage was lower following exposure to 10 μg/mL LPS than in the control group (P < 0.05). Moreover, the blastocyst rate decreased with 1 and 10 μg/mL LPS compared to the control group (P < 0.05). ROS overproduction accompanied by a decreased ΔΨm was recorded in LPS-treated oocytes in comparison to the control group (P < 0.05). The 3′ tag digital gene expression profiling method revealed that 7887 genes were expressed, while only seven genes exhibited changes in transcript abundance following exposure to LPS. Tripartite motif containing 25 (TRIM25), Tripartite motif containing 26 (TRIM26), Zona pellucida glycoprotein 3 (ZP3), Family with sequence similarity 50 member A (FAM50A), Glyoxylate and hydroxypyruvate reductase (GRHPR), and NADH:ubiquinone oxidoreductase subunit A8 (NDUFA8) were down-regulated (P < 0.05), while only Centrin 3 (CETN3) was up-regulated (P < 0.05). Our findings show that LPS has undesirable effects on the maturation competence of ovine oocytes and subsequent embryo development. In addition, the transcriptomic profiling results may shed more light on the molecular mechanisms of LPS-induced infertility in ruminants.

Comprehensive genomic analysis of an indigenous Pseudomonas pseudoalcaligenes degrading phenolic compounds


Environmental contamination with aromatic compounds is a universal challenge. Aromatic-degrading microorganisms isolated from the same or similar polluted environments seem to be more suitable for bioremediation. Moreover, microorganisms adapted to contaminated environments are able to use toxic compounds as the sole sources of carbon and energy. An indigenous strain of Pseudomonas, isolated from the Mahshahr Petrochemical plant in the Khuzestan province, southwest of Iran, was studied genetically. It was characterized as a novel Gram-negative, aerobic, halotolerant, rod-shaped bacterium designated Pseudomonas YKJ, which was resistant to chloramphenicol and ampicillin. The genome of the strain was completely sequenced using Illumina technology to identify its genetic characteristics. MLST analysis revealed that the YKJ strain belongs to the genus Pseudomonas, with the highest sequence similarity to Pseudomonas pseudoalcaligenes strain CECT 5344 (99% identity). Core- and pan-genome analysis indicated that P. pseudoalcaligenes contains 1,671 core and 3,935 unique genes for coding DNA sequences. The metabolic and degradation pathways for aromatic pollutants were investigated using the NCBI and KEGG databases. Genomic and experimental analyses showed that the YKJ strain is able to degrade certain aromatic compounds including bisphenol A, phenol, benzoate, styrene, xylene, benzene and chlorobenzene. Moreover, antibiotic resistance and chemotaxis properties of the YKJ strain were found to be controlled by two-component regulatory systems.
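
The core, pan, and unique gene counts reported above follow from simple set definitions once genes have been grouped into families. The sketch below illustrates those definitions only: it assumes gene-family IDs have already been assigned (real analyses first cluster orthologs by sequence similarity), and the strain names and gene IDs are hypothetical, not data from the study.

```python
from collections import Counter
from typing import Dict, Set, Tuple


def pan_core_unique(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (pan, core, unique) gene-family sets for a collection of genomes.

    pan    = families present in at least one genome (union)
    core   = families present in every genome (intersection)
    unique = families present in exactly one genome
    """
    if not genomes:
        raise ValueError("at least one genome is required")
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    # Count in how many genomes each family occurs.
    occurrences = Counter(gene for genes in gene_sets for gene in genes)
    unique = {gene for gene, n in occurrences.items() if n == 1}
    return pan, core, unique


# Hypothetical toy input: three strains with pre-assigned gene-family IDs.
toy_genomes = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g1", "g2", "g4"},
    "strain_C": {"g1", "g5"},
}
pan, core, unique = pan_core_unique(toy_genomes)
print(len(pan), len(core), len(unique))
```

Families that are neither core nor unique (here `g2`) form the accessory genome, which is often where niche-specific functions such as degradation pathways are found.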

PrESOgenesis: a two-layer multi-label predictor for identifying fertility-related proteins using support vector machine and pseudo amino acid composition approach


Successful spermatogenesis and oogenesis are the two genetically independent processes preceding embryo development. To date, several fertility-related proteins have been described in mammalian species. Nevertheless, further studies are required to discover more proteins associated with the development of germ cells and embryogenesis in order to shed more light on these processes. This work builds on our previous software (OOgenesis_Pred) and extends it to predict new fertility-related proteins and their classes (embryogenesis, spermatogenesis, and oogenesis) using a support vector machine with Chou’s pseudo-amino acid composition features. The results of five-fold cross-validation, as well as an independent test, demonstrated that this method is capable of predicting fertility-related proteins and their classes with an accuracy of more than 80%. Moreover, by using feature selection methods, important properties of fertility-related proteins were identified that allowed for their accurate classification. Based on the proposed method, a two-layer classifier software named “PrESOgenesis” (https://github.com/mrb20045/PrESOgenesis) was developed. The tool identifies a query sequence (protein or transcript) as a fertility-related or non-fertility-related protein at the first layer and then classifies each predicted fertility-related protein into the embryogenesis, spermatogenesis, or oogenesis class at the second layer.
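
The two-layer idea described above can be sketched in a few lines. This is not the PrESOgenesis implementation: the feature function computes only plain amino acid composition (pseudo-amino acid composition extends it with sequence-order terms that are omitted here), and the two classifiers are hypothetical stand-in callables where the published tool uses trained support vector machines.

```python
from typing import Callable, List, Optional

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence: str) -> List[float]:
    """Fraction of each standard amino acid in a protein sequence (20 values)."""
    residues = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return [residues.count(aa) / len(residues) for aa in AMINO_ACIDS]


def two_layer_predict(
    features: List[float],
    is_fertility_related: Callable[[List[float]], bool],
    classify_fertility_class: Callable[[List[float]], str],
) -> Optional[str]:
    """Layer 1 filters fertility-related proteins; layer 2 assigns a class.

    Returns None when layer 1 rejects the query, otherwise the layer-2 label.
    """
    if not is_fertility_related(features):
        return None
    return classify_fertility_class(features)


# Hypothetical stand-in rules for illustration only (not trained models).
def toy_layer1(feats: List[float]) -> bool:
    return feats[AMINO_ACIDS.index("K")] > 0.1


def toy_layer2(feats: List[float]) -> str:
    return "oogenesis" if feats[AMINO_ACIDS.index("S")] > 0.1 else "spermatogenesis"


query = amino_acid_composition("MKKSSLKA")  # made-up sequence
print(two_layer_predict(query, toy_layer1, toy_layer2))
```

Separating the layers means the class-level model only sees sequences that already passed the fertility filter, which mirrors the design the abstract describes.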