Construction of gene networks and interpretation of functional roles and disease phenotypes
NetCrafter is an intuitive, user-friendly tool for automatically visualizing gene and function networks,
and predicting representative functions and disease phenotypes for each networked cluster.
Users can obtain visualized, annotated networks by simply providing a list of genes or functions; no additional relationship data is needed. NetCrafter uses predefined gene-to-function relationships retrieved from Gene Ontology (GO) and gene-to-phenotype data from the Human Phenotype Ontology (HPO) to generate the gene or function/phenotype networks and their annotations.
Network analysis in Q-omics
User-provided gene or function lists can be used directly to generate multi-functional/-phenotypic networks, allowing analysis of the representative functions/phenotypes within each networked cluster.
NetCrafter is integrated into Q-omics, enabling network generation and interpretation for RNA, protein, and CRISPR data retrieved from Q-omics data mining workflows.
How NetCrafter creates gene networks
1. Input data: a list of genes
2. Quantify the functional similarity for each gene pair
3. Network visualization of connected genes defined by the functional similarity
4. Assigning a representative function to each network: the dominant function that
contributes most to the edges within that network
* Nodes represent genes and edges represent the functional similarity (weighted sum of
shared functions)
* Node size indicates how many functions are associated with the corresponding gene
* Edge length is inversely proportional to the Tanimoto score
* The representative function of a network cluster is the function that contributes most to
the formation of edges
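The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not NetCrafter's implementation: the gene names, function names, and weights are hypothetical, the `min_score` edge cutoff is an assumption (no threshold is stated here), and a "networked cluster" is assumed to be a connected component of the resulting graph.

```python
from collections import defaultdict
from itertools import combinations

def build_gene_clusters(gene_functions, weights, min_score=0.1):
    """Return [(sorted gene list, representative function), ...].

    gene_functions: gene -> set of function names (hypothetical input)
    weights: function name -> weight
    min_score: assumed cutoff below which no edge is drawn
    """
    # Step 2: weighted Tanimoto similarity for every gene pair
    edges = {}
    for g1, g2 in combinations(sorted(gene_functions), 2):
        shared = gene_functions[g1] & gene_functions[g2]
        union = gene_functions[g1] | gene_functions[g2]
        denom = sum(weights[f] for f in union)
        score = sum(weights[f] for f in shared) / denom if denom else 0.0
        if score > 0 and score >= min_score:
            edges[(g1, g2)] = shared

    # Step 3: connect genes that share an edge
    neighbors = defaultdict(set)
    for g1, g2 in edges:
        neighbors[g1].add(g2)
        neighbors[g2].add(g1)

    clusters, seen = [], set()
    for gene in sorted(neighbors):
        if gene in seen:
            continue
        # Depth-first search collects one connected component
        stack, members = [gene], set()
        while stack:
            node = stack.pop()
            if node not in members:
                members.add(node)
                stack.extend(neighbors[node] - members)
        seen |= members

        # Step 4: the function contributing the most weight to this cluster's edges
        contribution = defaultdict(float)
        for (g1, _), shared in edges.items():
            if g1 in members:
                for f in shared:
                    contribution[f] += weights[f]
        representative = max(sorted(contribution), key=contribution.get)
        clusters.append((sorted(members), representative))
    return clusters

# Hypothetical toy input
gene_functions = {
    "G1": {"apoptosis", "dna_repair"},
    "G2": {"apoptosis", "dna_repair", "cell_cycle"},
    "G3": {"cell_cycle"},
    "G4": {"lipid_transport", "ion_transport"},
    "G5": {"lipid_transport"},
}
weights = {"apoptosis": 50, "dna_repair": 250, "cell_cycle": 100,
           "lipid_transport": 200, "ion_transport": 150}

clusters = build_gene_clusters(gene_functions, weights)
for members, representative in clusters:
    print(members, "->", representative)
```

In this toy input, G3 joins the first cluster only through its edge to G2, which shows how a cluster can contain gene pairs that share no function directly.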
Calculating the functional similarity between genes
NetCrafter calculates the functional similarity between genes using a weighted Tanimoto coefficient (Jaccard index) based on the overlap of shared functions.
Over 7,000 functions, each containing 3 to 300 genes, are retrieved from Gene Ontology (GO) biological terms and Reactome DB.
Each function is assigned a weight based on the number of genes it contains.
Functions with fewer genes are assigned higher weights, contributing more to the Tanimoto similarity score. As a result, two genes sharing functions with higher weights will have a greater functional similarity.
NetCrafter assigns a Tanimoto weight to each of the ~7,000 functions. The weight of each function is inversely related to the number of genes it contains.
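The exact weighting formula is not given here. The linear weights listed in the worked example in this section are, however, consistent with a simple rule: weight = (smallest function size + largest function size) - number of genes, that is, 303 - n for functions of 3 to 300 genes. The sketch below encodes that inferred rule; it is an assumption drawn from the example values, the name `linear_weight` is hypothetical, and the non-linear scheme is not reproduced because its form is not stated.

```python
MIN_GENES = 3    # smallest function size stated in the text
MAX_GENES = 300  # largest function size stated in the text

def linear_weight(n_genes: int) -> int:
    """Inferred linear weight: smaller functions receive larger weights."""
    if not MIN_GENES <= n_genes <= MAX_GENES:
        raise ValueError(f"function size {n_genes} is outside {MIN_GENES}-{MAX_GENES}")
    return MIN_GENES + MAX_GENES - n_genes

# Function sizes taken from the worked example (functions a to g)
example_sizes = {"a": 3, "b": 300, "c": 100, "d": 233, "e": 15, "f": 35, "g": 9}
example_weights = {name: linear_weight(n) for name, n in example_sizes.items()}
print(example_weights)
```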
Example: Calculating the functional similarity of two genes based on weighted Tanimoto scores
Each entry reads function(name, number of genes, linear weight, non-linear weight).
Gene1 belongs to 4 functions: function(a, 3, 300, 297), function(d, 233, 70, 40), function(g, 9, 294, 273), function(e, 15, 288, 218)
Gene2 belongs to 5 functions: function(b, 300, 3, 1), function(c, 100, 203, 140), function(f, 35, 268, 218), function(g, 9, 294, 273), function(e, 15, 288, 218)
Ratio of functional overlap: (g + e) / (a + b + c + d + e + f + g)
Standard Tanimoto score = (1 + 1) / (1 + 1 + 1 + 1 + 1 + 1 + 1) = 2/7 = 0.29
Linear weighted Tanimoto score = (294 + 288) / (300 + 3 + 203 + 70 + 288 + 268 + 294) = 582/1426 = 0.41
Non-linear weighted Tanimoto score = (273 + 218) / (297 + 1 + 140 + 40 + 218 + 218 + 273) = 491/1187 = 0.41
NetCrafter uses the linear or non-linear weighted Tanimoto score to quantify functional similarity in gene networks, while the standard Tanimoto score is used for function networks.
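The calculation in the example can be reproduced with a short function. This is a sketch of the weighted Tanimoto ratio as described in this section, not NetCrafter's own code; the function name `weighted_tanimoto` is hypothetical, and the weights are the linear weights listed for functions a to g. Passing a weight of 1 for every function yields the standard score, and a table of non-linear weights could be passed in the same way.

```python
from typing import Dict, Set

def weighted_tanimoto(funcs_a: Set[str], funcs_b: Set[str],
                      weights: Dict[str, float]) -> float:
    """Sum of weights of shared functions divided by sum of weights of all functions."""
    shared = funcs_a & funcs_b
    union = funcs_a | funcs_b
    denominator = sum(weights[f] for f in union)
    if denominator == 0:
        return 0.0  # neither gene has any weighted function
    return sum(weights[f] for f in shared) / denominator

# Function memberships from the example
gene1 = {"a", "d", "g", "e"}
gene2 = {"b", "c", "f", "g", "e"}

# Standard score: every function counts equally
unit_weights = {f: 1.0 for f in gene1 | gene2}
# Linear weights as listed in the example
linear_weights = {"a": 300, "b": 3, "c": 203, "d": 70, "e": 288, "f": 268, "g": 294}

standard_score = weighted_tanimoto(gene1, gene2, unit_weights)
linear_score = weighted_tanimoto(gene1, gene2, linear_weights)
print(f"standard: {standard_score:.2f}, linear weighted: {linear_score:.2f}")
```

Because the two shared functions (g and e) are small and therefore heavily weighted, the linear weighted score comes out higher than the standard score for this gene pair.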
How NetCrafter creates function networks
1. Input data: a list of functions
2. Quantify the gene similarity for each function pair
3. Network visualization of connected functions defined by the gene similarity
4. Assigning a representative function to each network based on node size
* Nodes represent functions and edges represent the gene similarity (standard Tanimoto
score)
* Node size indicates how many genes are associated with the corresponding function
* Edge length is inversely proportional to the Tanimoto score
* The representative function of a network cluster is determined by the function with the
largest number of associated genes (i.e., the function represented by the largest node)
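The function-network procedure can be sketched the same way. The example below is illustrative only: the function and gene names are hypothetical, the `min_score` edge cutoff is an assumption (no threshold is stated here), and clusters are assumed to be connected components, found with a small union-find structure.

```python
from itertools import combinations

def standard_tanimoto(genes_a, genes_b):
    """Unweighted Tanimoto (Jaccard) score between two gene sets."""
    union = genes_a | genes_b
    return len(genes_a & genes_b) / len(union) if union else 0.0

def function_clusters(function_genes, min_score=0.3):
    """Return [(function list, representative function), ...] for connected functions."""
    parent = {f: f for f in function_genes}

    def find(f):
        # Follow parent links to the cluster root, shortening the path on the way
        while parent[f] != f:
            parent[f] = parent[parent[f]]
            f = parent[f]
        return f

    # Merge every pair of functions whose gene overlap reaches the assumed cutoff
    for f1, f2 in combinations(sorted(function_genes), 2):
        if standard_tanimoto(function_genes[f1], function_genes[f2]) >= min_score:
            parent[find(f1)] = find(f2)

    groups = {}
    for f in sorted(function_genes):
        groups.setdefault(find(f), []).append(f)

    result = []
    for members in groups.values():
        if len(members) < 2:
            continue  # an unconnected function forms no network
        # Representative: the function with the most genes (the largest node)
        representative = max(members, key=lambda f: len(function_genes[f]))
        result.append((members, representative))
    return result

# Hypothetical toy input: function -> set of genes
function_genes = {
    "F1": {"A", "B", "C", "D"},
    "F2": {"B", "C", "D"},
    "F3": {"C", "D", "E"},
    "F4": {"X", "Y"},
}
networks = function_clusters(function_genes)
print(networks)
```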