Using a Sankey diagram to view data flow in R with ggalluvial, and plot cosmetics with ggplot

Question

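I have a data table of patient cluster assignments before treatment (consensus) and after treatment (single drug), and I want to show how patients flow between clusters before and after treatment. The actual cluster numbers don't matter here. The important point is that most patients who cluster together before treatment also cluster together after treatment, while a few move.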

Here is a screenshot of the data:

[image not available]

Dummy dataset:

structure(list(Stimulation = c("3S","3S","3S"),Patient.ID =       c("S3077497","S1041120","S162465","S563275","S2911623","S3117192","S2859024","S2088278","S3306185","S190789","S12146451","S2170842","S115594","S2024203","S1063872","S2914138","S303984","S570813","S2176683","S820460","S1235729","S3009401","S2590229","S629309","S1208256","S2572773","S3180483","S3032079","S3217608","S5566943","S5473728","S104259","S2795346","S2848989","S2889801","S2813983","S2528246","S3151923","S2592908","S2603793","S5565867","S3127064","S675629","S834679","S3011944","S5011583","S2687896","S2998620","S651963","S2104595","S2433454","S2565220","S3307762","S294778","S995510","S2476822","S140868","S1018263","S2990223","S5524130","S1042529","S999706","S363003","S2303087","S868213","S5568359","S3174542","S521782","S3294727"),`Cluster assigned consensus` = c(2,2,5,3,1,4,7,8,6,7),`Cluster assigned single drug` = c("1","1","2","3","4","5","6","7","8","8"),count = c(1,1)),row.names = c(NA,-69L),class =     c("tbl_df","tbl","data.frame"))

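This is my first time making a Sankey plot, so I'm no expert. I added the count column so that each patient has a count of 1; the thickness of each flow then comes from summing those counts.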
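The vectors in the dput output above have unequal lengths as displayed here, so it cannot be pasted back in directly. As a rough sketch of the count-column idea, the snippet below builds a small stand-in table and sums the counts per (before, after) pair. The name `toy`, its column names, and all values are invented for illustration and are not the asker's data.

```r
# Hypothetical stand-in data: IDs and cluster labels are invented for illustration
n <- 12
toy <- data.frame(
  Patient.ID = paste0("P", seq_len(n)),
  consensus  = rep(1:3, each = 4)
)

# Most patients keep their cluster after treatment; two of them move
toy$single_drug <- toy$consensus
toy$single_drug[c(4, 8)] <- c(2, 3)

# One row per patient, so each patient contributes a count of 1
toy$count <- 1

# Summing count per (before, after) pair gives the thickness of each flow
flows <- aggregate(count ~ consensus + single_drug, data = toy, FUN = sum)
print(flows)
```

Each row of `flows` corresponds to one ribbon in the alluvial plot, and its `count` is what the plot uses as the ribbon's height.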

I adapted the code from an R tutorial. Here is the visualization code:

library(ggplot2)
library(ggalluvial)

ggplot(data = CLL3S,
       aes(axis1 = `Cluster assigned consensus`,
           axis2 = `Cluster assigned single drug`,
           y = count)) +
  scale_x_discrete(limits = c("Consensus cluster", "Single-drug cluster"), expand = c(.1, .1)) +
  xlab("Clusters") +
  geom_alluvium(aes(fill = `Cluster assigned consensus`)) +
  geom_stratum() +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  theme_minimal() +
  ggtitle("Patient flow between the Consensus clusters and Single-drug treated clusters",
          "3S stimulated patients")

This sort of works, but the figure isn't pretty:

[image not available]

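As you can see, the cluster numbers sit inside huge blank white boxes. How can I make the boxes smaller? And how can I give the boxes different colors, and make sure that when I change the fill in geom_alluvium, the flows match the colors of the boxes (the consensus boxes)?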

Answer

You can control both in geom_stratum: width sets the box size and fill sets the box colors. Try this:

library(ggplot2)
library(ggalluvial)
library(RColorBrewer)

# Interpolate the 8-color "Set2" palette to the number of colors you want
nb.cols <- 10
mycolor1 <- colorRampPalette(brewer.pal(8, "Set2"))(nb.cols)
# brewer.pal() needs n >= 3 (smaller values trigger a warning); mycolor2 is not used below
mycolor2 <- colorRampPalette(brewer.pal(3, "Set2"))(nb.cols)

# Outline colors applied to the legend keys via override.aes
mycolors <- c("red", "blue", "green", "orange")

ggplot(data = CLL3S,
       aes(y = count,
           axis1 = `Cluster assigned consensus`,
           axis2 = `Cluster assigned single drug`)) +
  scale_x_discrete(limits = c("Consensus cluster", "Single-drug cluster"), expand = c(.1, .1)) +
  labs(x = "Clusters") +
  geom_alluvium(aes(fill = `Cluster assigned consensus`)) +
  # width narrows the boxes; fill supplies one color per stratum (8 per axis here)
  geom_stratum(width = 1/4, fill = c(mycolor1[1:8], mycolor1[1:8]), color = "red") +
  # geom_stratum() +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  # scale_fill_manual(values = mycolors) +
  theme_minimal() +
  guides(fill = guide_legend(override.aes = list(color = mycolors))) +
  ggtitle("Patient flow between the Consensus clusters and Single-drug treated clusters",
          "3S stimulated patients")

Output:

[image not available]
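As a possible variation on the code above (a sketch, not part of the original answer), the box colors can be mapped through an aesthetic instead of passed as a fixed vector, so flows and boxes share one fill scale. This assumes the `stratum` computed variable, already used for the labels, can also drive `fill`. The table `toy` and its column names are hypothetical stand-ins for CLL3S, with the cluster columns stored as factors so the fill scale is discrete.

```r
library(ggplot2)
library(ggalluvial)

# Hypothetical stand-in data (not the asker's CLL3S table)
toy <- data.frame(
  consensus   = factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3)),
  single_drug = factor(c(1, 1, 2, 2, 2, 3, 3, 3, 3)),
  count       = 1
)

p <- ggplot(toy, aes(axis1 = consensus, axis2 = single_drug, y = count)) +
  # Flows are colored by the consensus cluster they start from
  geom_alluvium(aes(fill = consensus)) +
  # Boxes take their fill from the computed stratum label; width narrows them
  geom_stratum(aes(fill = after_stat(stratum)), width = 1/4, color = "grey30") +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Consensus cluster", "Single-drug cluster"),
                   expand = c(.1, .1)) +
  # One discrete scale serves both layers, so matching labels get matching colors
  scale_fill_brewer(palette = "Set2", name = "Cluster") +
  theme_minimal()

print(p)
```

Because both layers use the same discrete scale, changing the palette in one place recolors the flows and the boxes together. Boxes on the single-drug axis are colored by their own label, which may or may not be what you want.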