This post introduces pathview, an R package for pathway-based data integration and visualization.
First install the package and load the demo data:
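# biocLite() has been retired; current Bioconductor releases install through BiocManager
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("pathview")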
library(pathview)
# Load the demo data
data(gse16873.d)
data(demo.paths)
The gene expression change data (gse16873.d) has gene IDs as rows and sample IDs as columns, with values ranging roughly from -1 to 1.
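To make the expected input shape concrete, here is a minimal sketch in base R. The matrix `gene.mat`, its sample names, and its values are made up for illustration; only the layout (Entrez gene IDs as row names, samples as columns) mirrors what is described above.

```r
# Toy expression-change matrix: values and sample names are invented for illustration.
# Row names are Entrez gene IDs (1017 = CDK2, 1019 = CDK4, 595 = CCND1).
gene.mat <- matrix(
  c(-0.8, 0.3, 0.9,    # sample1
     0.1, -0.5, 0.6),  # sample2
  nrow = 3,
  dimnames = list(c("1017", "1019", "595"), c("sample1", "sample2"))
)

# A single column gives a named numeric vector (one sample)
one.sample <- gene.mat[, 1]

# Several columns stay a matrix (multiple samples)
two.samples <- gene.mat[, 1:2]

# Check the value range of the data
range(gene.mat)
```

Either form, a named vector or a matrix, can be passed as gene.data, which is why the examples in this post select one column with gse16873.d[, 1].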
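Next, visualize a single sample on a classic signaling pathway. The data and the pathway ("Cell Cycle") are specified through gene.data and pathway.id; the expression profile is human, so species = "hsa":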
pv.out <- pathview(gene.data = gse16873.d[, 1], pathway.id = demo.paths$sel.paths[1], species = "hsa", out.suffix = "gse16873", kegg.native = TRUE)
To inspect the data behind each node in the plot, look at the table below, which lists the KEGG name and ID of every node:
head(pv.out$plot.data.gene)
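If you want to leave out nodes unrelated to your own data, render the pathway with Graphviz by setting kegg.native = FALSE (here same.layer = FALSE also places the legend on a separate page):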
pv.out <- pathview(gene.data = gse16873.d[, 1], pathway.id = demo.paths$sel.paths[1], species = "hsa", out.suffix = "gse16873.2layer", kegg.native = FALSE, sign.pos = demo.paths$spos[1], same.layer = FALSE)
To split grouped nodes and draw them separately:
pv.out <- pathview(gene.data = gse16873.d[, 1], pathway.id = demo.paths$sel.paths[1], species = "hsa", out.suffix = "gse16873.split", kegg.native = FALSE, sign.pos = demo.paths$spos[1], split.group = TRUE)
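To draw the relationships among all nodes in full, including indirect ones (expand.node = TRUE expands multi-gene nodes into single-gene nodes):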
pv.out <- pathview(gene.data = gse16873.d[, 1], pathway.id = demo.paths$sel.paths[1], species = "hsa", out.suffix = "gse16873.split.expanded", kegg.native = FALSE, sign.pos = demo.paths$spos[1], split.group = TRUE, expand.node = TRUE)
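pathview can also integrate gene data and compound data onto metabolic pathways for visualization. Data on small molecules, metabolites, enzymes and so on, as well as multi-sample plots, are all supported by the package.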
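As a rough sketch of the gene plus compound integration mentioned above, the example below maps the first sample of gse16873.d together with simulated compound data onto a metabolic pathway. The pathway hsa00640 (propanoate metabolism), the output suffix, and the variable names sim.cpd.data and pv.cpd are choices made here for illustration; the compound values come from sim.mol.data() and are not real measurements. The call needs network access to KEGG.

```r
# Sketch only: requires the pathview package and network access to KEGG.
# The guard skips the example when pathview is not installed.
if (requireNamespace("pathview", quietly = TRUE)) {
  library(pathview)
  data(gse16873.d)

  # Simulated compound data (not real measurements), named by KEGG compound ID
  sim.cpd.data <- sim.mol.data(mol.type = "cpd", nmol = 3000)

  # hsa00640 is an arbitrary metabolic pathway picked for illustration
  pv.cpd <- pathview(gene.data = gse16873.d[, 1], cpd.data = sim.cpd.data,
                     pathway.id = "00640", species = "hsa",
                     out.suffix = "gse16873.cpd", kegg.native = TRUE)
}
```

For a multi-sample plot, gene.data can be given several columns instead of one, for example gse16873.d[, 1:3].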