Problem description
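I have a dataset in which I am plotting a heatmap to compare 7 groups. For each group I also have 2 columns of data describing that group. I am trying to create an interactive plot that shows, for each group, the information from its own description columns.
Here is an example of the data, where each of the 7 groups has 2 columns of corresponding information: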
df <- structure(list(Group1 = c(9.420318259,5.801092847,4.890727291,4.589825753,4.836092781),Group2 = c(14.57805564,8.798453748,7.982599836,7.951599435,10.81418654),Group3 = c(14.49131554,7.975284646,8.258878348,7.922657108,13.3205827),Group4 = c(11.44447147,6.208332721,6.529806574,4.882623805,10.69676399),Group5 = c(22.86835197,10.94297858,7.197041788,9.237584441,12.70083108),Group6 = c(10.62687539,6.458410247,7.461916094,6.308454021,12.39464562),Group7 = c(11.09404106,6.420303272,6.821000583,5.0727153,11.13903127),Group1_Genes = c(46L,17L,23L,16L,27L),Group1_score = c(0.719,0.757,0.71,0.807,0.761),Group2_Genes = c(58L,22L,30L,40L),Group2_score = c(0.754,0.766,0.741,0.774),Group3_Genes = c(37L,14L,13L,22L),Group3_score = c(0.798,0.788,0.81,0.879,0.805),Group4_Genes = c(55L,20L,29L,21L,42L),Group4_score = c(0.774,0.768,0.822,0.781),Group5_Genes = c(71L,24L,37L,53L),Group5_score = c(0.766,0.767,0.765,0.811,0.771
),Group6_Genes = c(69L,Group6_score = c(0.772,0.771),Group7_Genes = c(58L,33L,48L),Group7_score = c(0.79,0.78,0.774,0.817,0.78
)),row.names = c("Cardiac Hypertrophy","Cellular Effects of Adrenaline","Metastasis Signaling","Hormone Signaling","Estrogen Receptor Signaling"
),class = "data.frame")
#One row of this data looks like:
Pathway Group1 Group2 Group3 Group4 Group5 Group6 Group7 Group1_score Group1_Genes Group2_score Group2_Genes ...
Cardiac 0.7 0.8 0.5 0.7 0.3 0.6 0.6 0.6 34 0.4 65
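I am trying to plot Groups 1-7 (which are also columns 1:7) in a heatmap, and then use the remaining columns as hover text by adapting the answer to another question (How to create an interactive heatmaply plot with custom text in R?):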
groups <- as.matrix(df[,1:7])
labels1 <-
df %>%
mutate(label1 = paste(
"Gene Overlap:",Group1_Genes,"\nMean_GB_score:",Group1_score
)) %>%
transmute(across(Group1,~label1)) %>%
as.matrix()
labels2 <-
df %>%
mutate(label2 = paste(
"Gene Overlap:",Group2_Genes,Group2_score
)) %>%
transmute(across(Group2,~label2)) %>%
as.matrix()
#I repeat making 7 labels objects to then cbind:
labels = cbind(labels1,labels2,labels3,labels4,labels5,labels6,labels7)
heatmaply(groups,custom_hovertext = labels,file = "heatmaply_plot.html",scale_fill_gradient_fun = ggplot2::scale_fill_gradient2(
low = "pink",high = "red"))
However, trying this produces an error:
Error in custom_hovertext[rowInd,colInd,drop = FALSE] :
subscript out of bounds
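Is there a way for me to create custom_text in heatmaply() that specifies the hover text separately for each column of the heatmap, rather than one global piece of information applied to every heatmap square?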
Solution
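library(dplyr)
library(tidyr)
library(tibble)  # for rownames_to_column()

labels_df <-
  df %>%
  select(ends_with("score"), ends_with("Genes")) %>%
  rownames_to_column() %>%
  # long format: one row per pathway and per "GroupN_var" column
  pivot_longer(-rowname) %>%
  separate(name, c("Group", "var")) %>%
  # one row per pathway/group pair, with columns Genes and score
  pivot_wider(id_cols = c(rowname, Group), names_from = var, values_from = value) %>%
  mutate(label = paste(
    "Gene Overlap:", Genes, "\nMean_GB_Score:", score
  )) %>%
  # back to wide: one label column per group
  pivot_wider(id_cols = rowname, names_from = Group, values_from = label)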
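You can check what happens at each step by breaking the chain anywhere and running the code up to that point. Basically we are just reshaping the data into a more usable tidy format, so that we do not need to type 7 near-identical expressions to compute the labels. Then we reshape back into the wide format that heatmaply needs.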
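One important point: after all this reshaping, the rows happen to end up in the same order they started in. That is convenient, but it is worth checking rather than assuming, because the hover text is matched to the heatmap cells by position.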
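One way to make that check explicit is sketched below on a small made-up data frame. The names toy_df, tidy_step and toy_labels and all of their values are invented for illustration and are not taken from the question. The chain is stopped after the first pivot_wider() so the tidy intermediate can be inspected, and the final assertion fails loudly if the reshaping changed the row order.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Toy data, invented for illustration only (2 groups, 3 pathways)
toy_df <- data.frame(
  Group1 = c(1.5, 2.5, 3.5),
  Group2 = c(4.5, 5.5, 6.5),
  Group1_Genes = c(10L, 20L, 30L),
  Group1_score = c(0.1, 0.2, 0.3),
  Group2_Genes = c(40L, 50L, 60L),
  Group2_score = c(0.4, 0.5, 0.6),
  row.names = c("PathwayC", "PathwayA", "PathwayB")
)

# Chain stopped after the first pivot_wider():
# one row per pathway/group pair, with columns score and Genes
tidy_step <- toy_df %>%
  select(ends_with("score"), ends_with("Genes")) %>%
  rownames_to_column() %>%
  pivot_longer(-rowname) %>%
  separate(name, c("Group", "var")) %>%
  pivot_wider(id_cols = c(rowname, Group), names_from = var, values_from = value)

# Build the labels and go back to one column per group
toy_labels <- tidy_step %>%
  mutate(label = paste("Gene Overlap:", Genes, "\nMean_GB_Score:", score)) %>%
  pivot_wider(id_cols = rowname, names_from = Group, values_from = label)

# Fails loudly if the reshaping changed the row order
stopifnot(identical(toy_labels$rowname, rownames(toy_df)))
```

With the real data, the same assertion would compare labels_df$rowname against rownames(df).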
The labels in matrix form:
labels_mat <-
labels_df %>%
select(Group1:Group7) %>%
as.matrix()
Finally:
heatmaply(
groups,custom_hovertext = labels_mat,scale_fill_gradient_fun = ggplot2::scale_fill_gradient2(low = "pink",high = "red")
)
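The error message in the question shows heatmaply indexing custom_hovertext by row and column, so a shape mismatch between the two matrices is one plausible cause. That is an assumption, since the question does not show how labels3 to labels7 were built. A minimal base-R sketch of a check to run before plotting follows; check_hovertext is a hypothetical helper, not part of heatmaply, and the matrices are made up.

```r
# Hypothetical helper (not part of heatmaply): verify that the hover-text
# matrix has the same shape as the data matrix before plotting.
check_hovertext <- function(values, hovertext) {
  stopifnot(
    is.matrix(values),
    is.matrix(hovertext),
    identical(dim(values), dim(hovertext))
  )
  invisible(TRUE)
}

# Made-up matrices for illustration only
vals <- matrix(
  1:6, nrow = 3,
  dimnames = list(c("a", "b", "c"), c("Group1", "Group2"))
)
txt <- matrix(paste("label", 1:6), nrow = 3)

# Passes silently because both matrices are 3 x 2
check_hovertext(vals, txt)
```

With the objects from the answer, the call would be check_hovertext(groups, labels_mat).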