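Visualizing the distribution of different subgroups

Question

I am working with the d.pizza data. It has a variable called delivery_min, the delivery time in minutes, and a variable called area, which takes one of three values (Camden, Westminster and Brent). I would like to draw a density plot to visualize the distribution of delivery times in these three areas.

Here is what I tried: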

 plot.ecdf(d.pizza$delivery_min)

This code works, but how can I do the same thing for each area separately?

Output of head(d.pizza):

  index       date week weekday        area count rabate  price operator  driver delivery_min
1     1 01.03.2014    9       6      Camden     5   TRUE 65.655   Rhonda  Taylor         20.0
2     2 01.03.2014    9       6 Westminster     2  FALSE 26.980   Rhonda Butcher         19.6
3     3 01.03.2014    9       6 Westminster     3  FALSE 40.970  Allanah Butcher         17.8
4     4 01.03.2014    9       6       Brent     2  FALSE 25.980  Allanah  Taylor         37.3
5     5 01.03.2014    9       6       Brent     5   TRUE 57.555   Rhonda  Carter         21.8
6     6 01.03.2014    9       6      Camden     1  FALSE 13.990  Allanah  Taylor         48.7
  temperature wine_ordered wine_delivered wrongpizza quality
1        53.0            0              0      FALSE  medium
2        56.4            0              0      FALSE    high
3        36.5            0              0      FALSE    <NA>
4          NA            0              0      FALSE    <NA>
5        50.0            0              0      FALSE  medium
6        27.0            0              0      FALSE     low

Answers

library(DescTools)

data(d.pizza)
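summary(d.pizza$delivery_min)  # range of delivery times, used for xlim below

# Draw an empty canvas, then add one ECDF per area on top of it
areas <- levels(factor(d.pizza$area))
plot(NULL, ylab = '', xlab = '', xlim = c(5, 66), ylim = 0:1)
for (A in seq_along(areas)) {
    plot.ecdf(d.pizza$delivery_min[d.pizza$area == areas[A]], pch = 20, col = A + 1, add = TRUE)
}
# pch is needed here, otherwise the legend shows labels without any symbols
legend("bottomright", legend = areas, bty = 'n', pch = 20, col = seq_along(areas) + 1)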
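
The question asks for a density plot, while the loop above draws ECDFs. The same per-area idea can be sketched with density() in base R. This is only a minimal sketch: the `pizza` data frame and its values are simulated stand-ins (hypothetical), and with DescTools loaded `d.pizza` could be used in its place.

```r
# Stand-in data (hypothetical values); replace `pizza` with d.pizza if DescTools is loaded.
set.seed(1)
pizza <- data.frame(
  delivery_min = c(rnorm(50, 20, 5), rnorm(50, 25, 6), rnorm(50, 30, 8), NA),
  area = factor(c(rep(c("Brent", "Camden", "Westminster"), each = 50), NA))
)

# Keep only rows where both variables are present, then split times by area.
ok <- complete.cases(pizza[, c("delivery_min", "area")])
by_area <- split(pizza$delivery_min[ok], droplevels(pizza$area[ok]))

# One kernel density estimate per area.
dens <- lapply(by_area, density)

# Empty canvas large enough for all curves, then one line per area.
plot(NULL,
     xlim = range(pizza$delivery_min[ok]),
     ylim = c(0, max(sapply(dens, function(d) max(d$y)))),
     xlab = "Delivery time (min)", ylab = "Density",
     main = "Delivery time density by area")
for (i in seq_along(dens)) lines(dens[[i]], col = i + 1, lwd = 2)
legend("topright", legend = names(dens), col = seq_along(dens) + 1, lwd = 2, bty = "n")
```

Dropping incomplete rows first matters because density() stops with an error on missing values unless na.rm = TRUE is passed.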

[Figure: ECDF by area]

,

You can do it like this:

library(DescTools)

data(d.pizza)

# The first call creates the plot; the later calls need add = TRUE to overlay instead of replacing it
plot.ecdf(subset(d.pizza, area == "Camden")$delivery_min, col = "red", main = "ECDF for pizza deliveries")
plot.ecdf(subset(d.pizza, area == "Westminster")$delivery_min, add = TRUE, col = "blue")
plot.ecdf(subset(d.pizza, area == "Brent")$delivery_min, add = TRUE, col = "green")


,

I recommend the ggplot2 library for data visualization in R. Here is some ggplot2 code that creates three overlaid density plots:

library(ggplot2)

# make example dataframe
d.pizza <- data.frame(delivery_min = rnorm(n=30),area = rep(c("Camden","Westminster","Brent"),10))

# plot data in ggplot2
ggplot(d.pizza,aes(x = delivery_min,fill = area,color = area)) + geom_density(alpha = 0.5)

If you would rather have histograms, you can do that too:

ggplot(d.pizza, aes(x = delivery_min, fill = area, color = area)) + geom_histogram(alpha = 0.5, position = 'identity')
