First Glance at KEGGgraph
Posted on
This post is based on
- Zhang JD, Wiemann S (2022-11-01). KEGGgraph: a graph approach to KEGG PATHWAY in R and Bioconductor.
- Zhang JD (2022-11-01). KEGGgraph: Application Examples
Pathways are stored and presented as graphs on the KEGG server side, where
- nodes are molecules (protein, compound, etc)
- edges represent relation types between the nodes, e.g., activation or phosphorylation (氧化磷酸化作用)
KEGGgraph offers the following functionalities:
- Parsing: one node in KEGG pathway does not necessarily map to merely one gene product, for example, the node “ERK” in the human TGF-Beta signaling pathway contains two homologues, MAPK1 and MAPK3. User can set whether to expand these nodes topologically.
- Graph operations: subset and merge
- Visualization
the pathway identifier is in the form of [a-z]3[0-9]5
, where the
three-alphabet code represent the organism and the five digits represent
pathway.
Since Auguest 2009, KGML files are stored separately in different
subdirectories depending on whether they record metabolic or
nonmetabolic pathways. retrieveKGML
first tries to download the file
from the non-metabolic subdirectory, and tries the metabolic directory
in case no file was found in the non-metabolic step.
Parsing and graph feature query
first read in KGML file for human MAPK signaling pathway (with KEGG ID hsa04010)”
mapkKGML = system.file("extdata/hsa04010.xml", package = "KEGGgraph")
we can either parse them into an object of KEGGPathway or an object of graph.
- KEGGPathway maintains the information of the pathway (title, link, organism, etc)
- graph objects are more natural approach and can be directly plugged in many other tools
library(KEGGgraph)
##
## Attaching package: 'KEGGgraph'
## The following object is masked from 'package:graphics':
##
## plot
## The following object is masked from 'package:base':
##
## plot
mapkG = parseKGML2Graph(mapkKGML, expandGenes = TRUE)
mapkG
## A graphNEL graph with directed edges
## Number of Nodes = 265
## Number of Edges = 876
alternatively,
mapkpathway = parseKGML(mapkKGML)
mapkpathway
## KEGG Pathway
## [ Title ]: MAPK signaling pathway
## [ Name ]: path:hsa04010
## [ Organism ]: hsa
## [ Number ] :04010
## [ Image ] :http://www.genome.jp/kegg/pathway/hsa/hsa04010.gif
## [ Link ] :http://www.genome.jp/dbget-bin/show_pathway?hsa04010
## ------------------------------------------------------------
## Statistics:
## 136 node(s)
## 171 edge(s)
## 0 reaction(s)
## ------------------------------------------------------------
mapkG2 = KEGGpathway2Graph(mapkpathway, expandGenes = TRUE)
expandGenes
: ?? not exactly clear aboutgenesOnly
Edges in KEGG pathways are directional, not guarantee a reverse relation, although reciprocal edges are also allowed.
extract the node attributes
mapkGnodedata = getKEGGnodeData(mapkG)
mapkGnodedata[[2]]
## KEGG Node (Entry 'hsa:5924'):
## ------------------------------------------------------------
## [ displayName ]: RASGRF1, GRF1...
## [ Name ]: hsa:5924
## [ Type ]: gene
## [ Link ]: http://www.genome.jp/dbget-bin/www_bget?hsa+5923+5924
## ------------------------------------------------------------
alternatively,
getKEGGnodeData(mapkG, 'hsa:5924')
## KEGG Node (Entry 'hsa:5924'):
## ------------------------------------------------------------
## [ displayName ]: RASGRF1, GRF1...
## [ Name ]: hsa:5924
## [ Type ]: gene
## [ Link ]: http://www.genome.jp/dbget-bin/www_bget?hsa+5923+5924
## ------------------------------------------------------------
Similarly,
mapkGedgedata = getKEGGedgeData(mapkG)
mapkGedgedata[[4]]
## KEGG Edge (Type: PPrel):
## ------------------------------------------------------------
## [ Entry 1 ID ]: hsa:5923
## [ Entry 2 ID ]: hsa:3845
## [ Subtype ]:
## [ Subtype name ]: activation
## [ Subtype value ]: -->
## ------------------------------------------------------------
alternatively,
getKEGGedgeData(mapkG,'hsa:5923~hsa:3845')
## KEGG Edge (Type: PPrel):
## ------------------------------------------------------------
## [ Entry 1 ID ]: hsa:5923
## [ Entry 2 ID ]: hsa:3845
## [ Subtype ]:
## [ Subtype name ]: activation
## [ Subtype value ]: -->
## ------------------------------------------------------------
Roughly speaking the out-degree reflects the regulatory role, while the in-degree suggests the subjectability of the protein to intermolecular regulations.
mapkGoutdegrees = sapply(edges(mapkG), length)
mapkGindegrees = sapply(inEdges(mapkG), length)
topouts = sort(mapkGoutdegrees, decreasing = T)
topins = sort(mapkGindegrees, decreasing = T)
topouts[1:3]
## hsa:5594 hsa:5595 hsa:1432
## 26 26 13
topins[1:3]
## hsa:5923 hsa:5924 hsa:10125
## 26 26 26
Graph subset and merge
library(Rgraphviz)
## Loading required package: graph
## Loading required package: BiocGenerics
##
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:stats':
##
## IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
##
## anyDuplicated, aperm, append, as.data.frame, basename, cbind,
## colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
## get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
## match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
## Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort,
## table, tapply, union, unique, unsplit, which.max, which.min
## Loading required package: grid
set.seed(123)
randomNodes = sample(nodes(mapkG), 25)
mapkGsub = subGraph(randomNodes, mapkG)
mapkGsub
## A graphNEL graph with directed edges
## Number of Nodes = 25
## Number of Edges = 6
makeAttr <- function(graph, default, valNodeList) {
tmp <- nodes(graph)
x <- rep(default, length(tmp)); names(x) <- tmp
if(!missing(valNodeList)) {
stopifnot(is.list(valNodeList))
allnodes <- unlist(valNodeList)
stopifnot(all(allnodes %in% tmp))
for(i in seq(valNodeList)) {
x[valNodeList[[i]]] <- names(valNodeList)[i]
}
}
return(x)
}
outs = sapply(edges(mapkGsub), length) > 0
ins = sapply(inEdges(mapkGsub), length) > 0
ios = outs | ins
translate the KEGG IDs into Gene Symbol
library(org.Hs.eg.db)
## Loading required package: AnnotationDbi
## Loading required package: stats4
## Loading required package: Biobase
## Welcome to Bioconductor
##
## Vignettes contain introductory material; view with
## 'browseVignettes()'. To cite Bioconductor, see
## 'citation("Biobase")', and for packages 'citation("pkgname")'.
## Loading required package: IRanges
## Loading required package: S4Vectors
##
## Attaching package: 'S4Vectors'
## The following objects are masked from 'package:Rgraphviz':
##
## from, to
## The following objects are masked from 'package:base':
##
## expand.grid, I, unname
##
ioGeneID = translateKEGGID2GeneID(names(ios))
nodesNames = sapply(mget(ioGeneID, org.Hs.egSYMBOL, ifnotfound=NA), "[[", 1)
names(nodesNames) = names(ios)
plot it
# https://www.bioconductor.org/packages/release/bioc/vignettes/KEGGgraph/inst/doc/KEGGgraph.R
nAttrs <- list();
nAttrs$fillcolor <- makeAttr(mapkGsub, "lightgrey", list(orange=names(ios)[ios]))
nAttrs$label <- nodesNames
plot(mapkGsub, "neato", nodeAttrs=nAttrs,
attrs=list(node=list(fillcolor="lightgreen",
width="0.75", shape="ellipse"),
edge=list(arrowsize="0.7")))
merge two graphs
wntKGML = system.file("extdata/hsa04310.xml", package = "KEGGgraph")
wntG = parseKGML2Graph(wntKGML)
graphs = list(mapk = mapkG, wnt = wntG)
merged = mergeGraphs(graphs)
merged
## A graphNEL graph with directed edges
## Number of Nodes = 386
## Number of Edges = 1628
Using other graph tools
question: which nodes are of the highest importance in MAPK signalling pathway solution: relative betweenness centrality
betweenness if a centrality measure of a node within a graph. Nodes that occur on many shortest paths between other vertices have higher betweenness than those that do not. It is scaled by the factor of $(n-1)(n-2)/2$ to get relative betweenness centrality, where $n$ is the number of nodes in the graph. Both measurements estimate the importance or the role of the node in the graph.
library(RBGL)
bcc = brandes.betweenness.centrality(mapkG)
rbccs = bcc$relative.betweenness.centrality.vertices[1L,]
toprbccs = sort(rbccs, decreasing = TRUE)[1:4]
toprbccs
## hsa:4214 hsa:2885 hsa:5605 hsa:5604
## 0.2685233 0.2467144 0.2366450 0.2366450
Parsing chemical compound reaction network
KEGG pathway captures two kinds of network:
- the protein network: consists relations (edges) between gene products
- the chemical network: the reactions between chemical compounds
The metabolic pathway can be viewed both as a network of proteins (enzymes) and as a network of chemical compounds, whereas regulatory pathways are always viewed as protein networks only.
check the example of Glycine, serine and threonine metabolism pathway
mapfile = system.file("extdata/map00260.xml", package = "KEGGgraph")
map = parseKGML(mapfile)
map
## KEGG Pathway
## [ Title ]: Glycine, serine and threonine metabolism
## [ Name ]: path:map00260
## [ Organism ]: map
## [ Number ] :00260
## [ Image ] :http://www.genome.jp/kegg/pathway/map/map00260.gif
## [ Link ] :http://www.genome.jp/dbget-bin/show_pathway?map00260
## ------------------------------------------------------------
## Statistics:
## 144 node(s)
## 371 edge(s)
## 68 reaction(s)
## ------------------------------------------------------------
reactions = getReactions(map)
reactions[[1]]
## KEGG Reaction(rn:R08211)
## ------------------------------------------------------------
## [ Name ]: rn:R08211
## [ Type ]: irreversible
## [ Substrate Name ]: cpd:C00576
## [ Product Name ]: cpd:C00719
Expand embedded pathways
mapkGembed = parseKGMLexpandMaps(mapkKGML)
only consider gene after using genesOnly = FALSE
,
mapkGall = parseKGML2Graph(mapkKGML, genesOnly = FALSE)
mapkGall
## A graphNEL graph with directed edges
## Number of Nodes = 277
## Number of Edges = 891
mapkGsub = subGraphByNodeType(mapkGall, "gene")
mapkGsub
## A graphNEL graph with directed edges
## Number of Nodes = 265
## Number of Edges = 876
Annotation
translate KEGG identifier (KEGGID) into Entrez GeneID,
toprbccKEGGID = names(toprbccs)
toprbccKEGGID
## [1] "hsa:4214" "hsa:2885" "hsa:5605" "hsa:5604"
toprbccGeneID = translateKEGGID2GeneID(toprbccKEGGID)
toprbccGeneID
## [1] "4214" "2885" "5605" "5604"
To convert GeneID to other identifiers, the authors recommend genome wide annotation packages,
- for human it is
org.Hs.eg.db
- for other organisms, http://www.bioconductor.org/packages/release/data/annotation/
Parse created or edited pathway
- KGML-ED program is a tool designed for the dynamic visualization, interactive navigation and editing of KEGG pathway diagrams. The program is able to modify or even create de novo new pathways and export them into KGML (KEGG XML) files
- KEGGgraph captures the graph information of KEGG PATHWAY by parsing KGML files, it is also capable to parse the KGML files exported from KGML-ED program.
Joint use: allow user edit KGML files from KEGG (e.g., by adding/removing new nodes/edges), or create new pathways from scratch, and then analyse them.
Demonstrate the parse of a toy pathway (network motif) created by KGML-ED, which illustrates a feed-forward loop (FFL)
toyKGML = system.file("extdata/kgml-ed-toy.xml", package = "KEGGgraph")
$ cat /media/weiya/PSSD/Programs/R/4.2.1/lib/R/library/KEGGgraph/extdata/kgml-ed-toy.xml
<?xml version="1.0" encoding="UTF-8"?>
<pathway name="path:KGML_ED_TOY_00001" org="hsa" number="00001" title="KGML-ED toy pathway" image="" link="">
<entry id="1" name="CRP" type="gene">
<graphics name="CRP" fgcolor="#000000" bgcolor="#FFFFFF" type="circle" x="210" y="180" width="120" height="60" />
</entry>
<entry id="2" name="GalS" type="gene">
<graphics name="GalS" fgcolor="#000000" bgcolor="#FFFFFF" type="circle" x="450" y="180" width="120" height="60" />
</entry>
<entry id="3" name="galETK" type="gene">
<graphics name="galETK" fgcolor="#000000" bgcolor="#FFFFFF" type="circle" x="450" y="360" width="120" height="60" />
</entry>
<entry id="4" name="cAMP" type="compound">
<graphics name="cAMP" fgcolor="#000000" bgcolor="#FFFFFF" type="circle" x="100" y="90" width="60" height="60" />
</entry>
<entry id="5" name="galactose" type="compound">
<graphics name="galactose" fgcolor="#000000" bgcolor="#FFFFFF" type="circle" x="590" y="90" width="60" height="60" />
</entry>
<relation entry1="1" entry2="2" type="PPrel">
<subtype name="activation" value="-->"/>
</relation>
<relation entry1="2" entry2="3" type="GErel">
<subtype name="repression" value="--|"/>
</relation>
<relation entry1="1" entry2="3" type="GErel">
<subtype name="expression" value="--|"/>
</relation>
<relation entry1="4" entry2="1" type="PCrel">
<subtype name="activation" value="-->"/>
</relation>
<relation entry1="5" entry2="2" type="PCrel">
<subtype name="activation" value="-->"/>
</relation>
<relation entry1="2" entry2="2" type="PPrel">
<subtype name="inhibition" value="--|"/>
</relation>
</pathway>
toyGraph = parseKGML2Graph(toyKGML, genesOnly = FALSE)
toyGraph
## A graphNEL graph with directed edges
## Number of Nodes = 5
## Number of Edges = 6
nodes(toyGraph)
## [1] "CRP" "GalS" "galETK" "cAMP" "galactose"
library(Rgraphviz)
plot(toyGraph)
alternatively, use
plotKEGGgraph(toyGraph)
render the attributes
nodeInfo = getKEGGnodeData(toyGraph)
nodeType = sapply(nodeInfo, getType)
makeNodeRenderAttrs = function(g, label = nodes(g), shape = "ellipse", fill="#e0e0e0", ...) {
rv = list(label = label, shape = shape, fill = fill, ...)
nA = nodeRenderInfo(g)
for (i in seq(along = rv)) {
if (length(rv[[i]]) == 1) {
rv[[i]] = rep(rv[[i]], numNodes(g))
} else {
if (length(rv[[i]]) != numNodes(g))
stop("Attributes vector must have as many elements as 'g' has nodes.")
}
names(rv[[i]]) = nodes(g)
nA[[ names(rv)[[i]] ]] = rv[[i]]
}
nodeRenderInfo(g) = nA
return(g)
}
toyDrawn = plotKEGGgraph(toyGraph)
toyDrawnRefine = makeNodeRenderAttrs(toyDrawn, fill = ifelse(nodeType == "gene", "lightblue", "orange"),
shape = ifelse(nodeType == "gene", "ellipse", "rectangle"))
renderGraph(toyDrawnRefine)
Microarray analysis
KEGG PATHWAY contains pathway maps for nearly 1000 organisms and has been intensively expanded with pathways including signaling transduction, cellular processes and human diseases in recent years.
The microarray data is Affemetric GeneChip (GEO accession: GSE4107), containing 10 normal samples and 12 colorectal cancer samples, is normalized with affy
and limma
package. The result of topTable
in limma
is used as the input data.
load(system.file("extdata/colorectalcancerSPIA.RData", package="KEGGgraph"))
head(top)
## ID logFC AveExpr t P.Value adj.P.Val B
## 10738 201289_at 5.960206 6.226928 23.94388 1.789221e-17 9.782565e-13 25.40124
## 18604 209189_at 5.143502 7.487305 17.42995 1.560212e-14 2.843486e-10 21.02120
## 11143 201694_s_at 4.148081 7.038281 16.46040 5.153117e-14 7.043667e-10 20.14963
## 10490 201041_s_at 2.429889 9.594413 14.06891 1.293706e-12 1.414667e-08 17.66883
## 10913 201464_x_at 1.531126 8.221044 10.98634 1.685519e-10 1.151947e-06 13.61189
## 11463 202014_at 1.429269 5.327647 10.45906 4.274251e-10 2.418771e-06 12.80131
FC
indicates Fold Change, a measure describing how much a quantity changes between an original and a subsequent measurement.
Entrez is a molecular biology database system that provides integrated access to nucleotide and protein sequence data, gene-centered and genomic mapping information, 3D structure data, PubMed MEDLINE, and more.
整理好了表达矩阵以后,我们需要将探针的id转换成为基因的Gene symbol。对于探针id的转换过程,目前主要是通过R包来进行转换。 随着芯片平台的普遍使用,其基因的注释信息也被整理成了不同的R包;因此,通常情况下我们使用R包来注释。不同的平台,对应着不同的R包。首先,我们来看一下这个数据集使用的平台类型。
该芯片使用的是我们最常见的平台,GPL570 对于GPL570,其对应的R包是hgu133plus2.db包
基因探针,即核酸探针,是一段带有检测标记,且顺序已知的,与目的基因互补的核酸序列(DNA或RNA)。基因探针通过分子杂交与目的基因结合,产生杂交信号,能从浩瀚的基因组中把目的基因显示出来。
source: https://baike.baidu.com/item/%E5%9F%BA%E5%9B%A0%E6%8E%A2%E9%92%88
library(hgu133plus2.db)
##
ls("package:hgu133plus2.db")
## [1] "hgu133plus2" "hgu133plus2_dbconn"
## [3] "hgu133plus2_dbfile" "hgu133plus2_dbInfo"
## [5] "hgu133plus2_dbschema" "hgu133plus2.db"
## [7] "hgu133plus2ACCNUM" "hgu133plus2ALIAS2PROBE"
## [9] "hgu133plus2CHR" "hgu133plus2CHRLENGTHS"
## [11] "hgu133plus2CHRLOC" "hgu133plus2CHRLOCEND"
## [13] "hgu133plus2ENSEMBL" "hgu133plus2ENSEMBL2PROBE"
## [15] "hgu133plus2ENTREZID" "hgu133plus2ENZYME"
## [17] "hgu133plus2ENZYME2PROBE" "hgu133plus2GENENAME"
## [19] "hgu133plus2GO" "hgu133plus2GO2ALLPROBES"
## [21] "hgu133plus2GO2PROBE" "hgu133plus2MAP"
## [23] "hgu133plus2MAPCOUNTS" "hgu133plus2OMIM"
## [25] "hgu133plus2ORGANISM" "hgu133plus2ORGPKG"
## [27] "hgu133plus2PATH" "hgu133plus2PATH2PROBE"
## [29] "hgu133plus2PFAM" "hgu133plus2PMID"
## [31] "hgu133plus2PMID2PROBE" "hgu133plus2PROSITE"
## [33] "hgu133plus2REFSEQ" "hgu133plus2SYMBOL"
## [35] "hgu133plus2UNIPROT"
then we can obtain the correspondence between probe id and gene symbol
ids = toTable(hgu133plus2SYMBOL)
head(ids)
## probe_id symbol
## 1 1007_s_at DDR1
## 2 1053_at RFC2
## 3 117_at HSPA6
## 4 121_at PAX8
## 5 1255_g_at GUCA1A
## 6 1294_at UBA7
Only consider the probes annotated with EntrezGeneID and differentially expressed with FDR p-value less than 0.05.
x = hgu133plus2ENTREZID
x
## ENTREZID map for chip hgu133plus2 (object of class "ProbeAnnDbBimap")
x[top$ID]
## ENTREZID submap for chip hgu133plus2 (object of class "ProbeAnnDbBimap")
top$ENTREZ = unlist(as.list(x[top$ID]))
remove NA
top = top[!is.na(top$ENTREZ),]
top = top[!duplicated(top$ENTREZ),]
tg1 = top[top$adj.P.Val < 0.05, ]
DE_Colorectal = tg1$logFC
names(DE_Colorectal) = as.vector(tg1$ENTREZ)
All_Colorectal = top$ENTREZ
length(DE_Colorectal)
## [1] 3630
length(All_Colorectal)
## [1] 17716
The SPIA algorithm takes two vectors as input:
DE_Colorectal
: log2 fold changes of differentially expressed genesAll_Colorectal
: those of all EntrezGeneID annotated genes
and it will return a table of pathways ranked from the most to the least significant. The results show that Colorectal cancer pathway (ID:05210) is significantly activated (ranked the 5th of the result table).
Retrieve the KGML remotely from the KEGG FTP site
tmp = "hsa05210.xml"
retrieveKGML(pathwayid = "05210", organism = "hsa", destfile = tmp)
The file has been attached in the KEGGgraph package. Parse it into graph and observe that around 30% percent of the genes in the pathway are differentially expressed.
colFile = system.file("extdata/hsa05210.xml", package = "KEGGgraph")
g = parseKGML2Graph(colFile)
deKID = translateGeneID2KEGGID(names(DE_Colorectal))
allKID = translateGeneID2KEGGID(All_Colorectal)
isDiffExp = nodes(g) %in% deKID
sprintf("%2.2f%% genes differentially-expressed", mean(isDiffExp)*100)
## [1] "29.76% genes differentially-expressed"
visualize the pathway
plot(g)
# https://www.bioconductor.org/packages/release/bioc/vignettes/KEGGgraph/inst/doc/KEGGgraphApp.R
library(RColorBrewer)
ar = 20
cols = rev(colorRampPalette(brewer.pal(6, "RdBu"))(ar))
logfcs = DE_Colorectal[match(nodes(g), deKID)]
names(logfcs) = nodes(g)
logfcs[is.na(logfcs)] = 0
incol = round((logfcs + 2)*5)
incol[incol > ar] = ar
undetected = !nodes(g) %in% allKID
logcol = cols[incol]
logcol[logfcs == 0] = "darkgrey"
logcol[undetected] = "yellow"
names(logcol) = names(logfcs)
nA = makeNodeAttrs(g, fillcolor = logcol, label = "", width = 10, height = 1.2)
layout(mat = matrix(c(rep(1, 8), 2), ncol = 1, byrow = TRUE)) ## very important!!! without it, the circles reduce to a line, it is (almost) equivalent to `c(rep(1, 1), 2)`.
plot(g, "dot", nodeAttrs = nA)
image(as.matrix(seq(1, ar)), col = cols, yaxt = "n", xaxt = "n")
- Not all the genes are connected with each other, 34/84 nodes in the pathway have a degree of 0
d = degree(g)
sum(d$inDegree == 0 & d$outDegree == 0)
## [1] 34
# https://www.bioconductor.org/packages/release/bioc/vignettes/KEGGgraph/inst/doc/KEGGgraphApp.R
gDeg = degree(g)
gIsSingle = gDeg[[1]] + gDeg[[2]] == 0
options(digits=3)
gGeneID = translateKEGGID2GeneID(nodes(g))
gSymbol = sapply(gGeneID, function(x) mget(x, org.Hs.egSYMBOL, ifnotfound = NA)[[1]])
isUp = logfcs > 0
isDown = logfcs < 0
singleUp = isUp & gIsSingle
singleDown = isDown & gIsSingle
they are not connected to any other node. There are several reasons for this.
- genes with genetic alterations are indicated in the pathway map, but not interweaved into the pathway
- in some pathways, especially human disease pathways, KEGG records
genes involved in the disease but does not provide any edge in that
pathway, although one can find their interaction partners in other
pathways, e.g., Grb2 has no edges in colorectal cancer pathway,
however in other maps including MAPK signaling pathway, both up- and
down-stream interactors are found.
- one solution is to merge (union) the related pathways
together:
mergeKEGGgraphs
- one solution is to merge (union) the related pathways
together:
- From the visualization, it is difficult to recognize any patterns.
To examine the pathway, use Divide and Conquer strategy, namely
to subset the graph into subgraphs:
subKEGGgraph
ups = nodes(g)[logfcs > 0]
upNs = unique(unlist(neighborhood(g, ups, return.self = TRUE)))
upSub = subKEGGgraph(upNs, g)
upNeighbor = nodes(upSub)[sapply(neighborhood(upSub, nodes(upSub)), length) > 0]
upNeighbor = setdiff(upNeighbor, nodes(g)[undetected])
upSub = subKEGGgraph(upNeighbor, upSub)
upSubGID = translateKEGGID2GeneID(nodes(upSub))
upSymbol = gSymbol[upSubGID]
upnA = makeNodeAttrs(upSub, fillcolor = logcol[nodes(upSub)], label = upSymbol, fixedsize = TRUE, width = 10, height = 10, font = 20)
plot(upSub, "dot", nodeAttrs = upnA)
Use degree centrality (the sum of in- and out-degree of each node) as a measure of the relative importance of the node compared to others.
- 8 nodes have an degree equal or larger than three, and 5 of them are upregulated to different extent.
sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.5 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/libf77blas.so.3.10.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
##
## locale:
## [1] LC_CTYPE=zh_CN.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.utf8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats4 grid stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] RColorBrewer_1.1-3 hgu133plus2.db_3.13.0 RBGL_1.74.0
## [4] org.Hs.eg.db_3.16.0 AnnotationDbi_1.60.0 IRanges_2.32.0
## [7] S4Vectors_0.36.0 Biobase_2.58.0 Rgraphviz_2.42.0
## [10] graph_1.76.0 BiocGenerics_0.44.0 KEGGgraph_1.58.0
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.9 highr_0.9 GenomeInfoDb_1.34.3
## [4] XVector_0.38.0 compiler_4.2.1 zlibbioc_1.44.0
## [7] bitops_1.0-7 tools_4.2.1 digest_0.6.30
## [10] bit_4.0.5 evaluate_0.18 RSQLite_2.2.18
## [13] memoise_2.0.1 pkgconfig_2.0.3 png_0.1-7
## [16] rlang_1.0.6 cli_3.4.1 DBI_1.1.3
## [19] rstudioapi_0.14 yaml_2.3.6 xfun_0.35
## [22] fastmap_1.1.0 GenomeInfoDbData_1.2.9 stringr_1.4.1
## [25] httr_1.4.4 knitr_1.41 Biostrings_2.66.0
## [28] vctrs_0.5.1 bit64_4.0.5 R6_2.5.1
## [31] XML_3.99-0.12 rmarkdown_2.18 blob_1.2.3
## [34] magrittr_2.0.3 htmltools_0.5.3 KEGGREST_1.38.0
## [37] stringi_1.7.8 RCurl_1.98-1.9 cachem_1.0.6
## [40] crayon_1.5.2