Problem description
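I have code that looks at a data frame and determines the day of the week on which each event started (started_at).
Now I want to plot the hour of the day, on a 24-hour clock, at which events start.
To me it should be as simple as swapping the weekdays() function for the corresponding "hour" function (if one exists) and then changing the levels to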
factor( levels=c("0:00","01:00...))
Here is the code I thought should work, with weekdays() replaced by the
lubridate::hour()
function:
df_mc_hour <- df_clean_distances %>%
mutate(hour = factor(lubridate::hour(df_clean_distances$started_at),levels=c("0:00","1:00","2:00","3:00","4:00","5:00","6:00","7:00","8:00","9:00","10:00","11:00","12:00","13:00","14:00","15:00","16:00","17:00","18:00","19:00","20:00","21:00","22:00","23:00","24:00")),mc=c(member_casual)) %>% tabyl(hour,member_casual)
df_mc_hour
This gave me the following table, so it partly works: the levels are there, but every row lands in <NA>.
hour casual member
0:00 0 0
1:00 0 0
2:00 0 0
3:00 0 0
4:00 0 0
5:00 0 0
6:00 0 0
7:00 0 0
8:00 0 0
9:00 0 0
10:00 0 0
11:00 0 0
12:00 0 0
13:00 0 0
14:00 0 0
15:00 0 0
16:00 0 0
17:00 0 0
18:00 0 0
19:00 0 0
20:00 0 0
21:00 0 0
22:00 0 0
23:00 0 0
24:00 0 0
<NA> 1424941 2049543
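The all-NA result above is consistent with how factor() matches values against levels: the hour values are plain numbers (0 to 23), and none of them equals a level string such as "0:00". Below is a minimal sketch in base R. The `started_at` vector is made-up toy data, and `as.POSIXlt()$hour` is used as a stand-in for `lubridate::hour()` only to keep the example free of dependencies.

```r
# Toy timestamps (hypothetical data, not from the question's data frame)
started_at <- as.POSIXct(c("2021-06-01 00:05:00", "2021-06-01 08:15:00",
                           "2021-06-01 17:40:00"), tz = "UTC")

# Numeric hour of day, 0..23 (stand-in for lubridate::hour())
hours <- as.POSIXlt(started_at)$hour

# Levels that never match the numeric values: every entry becomes NA
bad <- factor(hours, levels = c("0:00", "8:00", "17:00"))

# Match on the numeric values and attach the display labels separately
good <- factor(hours, levels = 0:23, labels = sprintf("%d:00", 0:23))

print(bad)
print(good)
```

Note that hours run from 0 to 23, so a "24:00" level is never needed.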
I tried using
format(Sys.time(),format = "%H")
but that did not work. I think it is because started_at is a POSIXct type.
Here is my second attempt:
df_mc_hour <- df_clean_distances %>%
mutate(hour = factor(format(df_clean_distances$started_at,format =
"%H"),mc=c(member_casual)) %>%
tabyl(hour,member_casual)
df_mc_hour
I then changed the above to the following:
df_mc_hour <- df_clean_distances %>%
mutate(hour = factor(format(df_clean_distances$started_at,format = "%H"),levels=c("00","01","02","03","04","05","06","07","08","09","10","11","12","13","14","15","16","17","18","19","20","21","22","23")),mc=c(member_casual)) %>%
tabyl(hour,member_casual)
The third attempt gave this output, and it worked.
hour casual member
00 22430 12100
01 13979 6796
02 7686 3661
03 4114 2319
04 3434 3574
05 5400 17244
06 12864 56244
07 23101 94422
08 31710 102599
09 40708 86997
10 58289 92457
11 80432 115845
12 97999 136178
13 106561 135697
14 113327 135849
15 119440 150640
16 126388 179952
17 139725 215323
18 125940 186883
19 96204 130136
20 67605 79977
21 48808 47913
22 43689 33737
23 35108 23000
Classes ‘tabyl’ and 'data.frame': 24 obs. of 3 variables:
$ hour : Factor w/ 24 levels "00",..: 1 2 3 4 5 6 7 8 9 10 ...
$ casual: num 22430 13979 7686 4114 3434 ...
$ member: num 12100 6796 3661 2319 3574 ...
- attr(*,"core")='data.frame': 24 obs. of 3 variables:
..$ hour : Factor w/ 24 levels "00",..: 1 2 3 4 5 6 7 8 9 10 ...
..$ casual: num [1:24] 22430 13979 7686 4114 3434 ...
..$ member: num [1:24] 12100 6796 3661 2319 3574 ...
- attr(*,"tabyl_type")= chr "two_way"
- attr(*,"var_names")=List of 2
..$ row: chr "hour"
..$ col: chr "member_casual"
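The reason this version matches is that format(..., "%H") returns zero-padded strings "00" to "23", so levels written the same way line up with the values. A small sketch of the same idea follows, under some assumptions: `rides` is a made-up toy data frame standing in for `df_clean_distances`, and base R `table()` is swapped in for `tabyl()` so the example needs no extra packages.

```r
# Toy data standing in for df_clean_distances (hypothetical values)
rides <- data.frame(
  started_at = as.POSIXct(c("2021-06-01 00:05:00", "2021-06-01 08:15:00",
                            "2021-06-01 08:45:00", "2021-06-01 17:40:00"),
                          tz = "UTC"),
  member_casual = c("casual", "member", "member", "casual")
)

# Zero-padded hour strings, turned into a factor with all 24 levels
hour_levels <- sprintf("%02d", 0:23)
rides$hour <- factor(format(rides$started_at, format = "%H"),
                     levels = hour_levels)

# Cross-tabulate hour against rider type; hours with no rides stay as zero rows
hour_counts <- table(rides$hour, rides$member_casual)
print(hour_counts[c("00", "08", "17"), ])
```

Generating the levels with sprintf("%02d", 0:23) avoids typing 24 strings by hand and keeps the padding consistent.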
Here is my code for counting ridership by day of the week, comparing casual riders and members:
##day of the week
df_mc_day <- df_clean_distances %>%
mutate(weekday = weekdays(df_clean_distances$started_at),mc=c(member_casual)) %>%
tabyl(weekday,member_casual)
df_mc_day <- df_clean_distances %>%
mutate(weekday = factor(weekdays(df_clean_distances$started_at),levels=c("Monday","Tuesday","Wednesday","Thursday","Friday","Saturday","Sunday")),mc=c(member_casual)) %>%
tabyl(weekday,member_casual)
df_mc_day
weekday casual member
Monday 150953 267013
Tuesday 145036 284001
Wednesday 158168 304679
Thursday 166103 300030
Friday 208234 306058
Saturday 334592 322765
Sunday 261855 264997
Plotting the day-of-week comparison:
df_mc_day %>% adorn_totals("row")
p <- ggplot() + geom_col( data=df_mc_day,aes(x=weekday,y=member))
p
df_mc_day %>%
pivot_longer(cols =-weekday) %>%
ggplot(aes(x=weekday,y=value,fill=name)) +
geom_col( position = 'dodge') + theme_light() +
scale_y_continuous(labels = function(x) format(x,scientific = FALSE)) +
labs( title ="Rider Membership by Day of the week") +
scale_fill_brewer(palette = "Spectral")
Solution
So the answer is the third attempt shown above, as suggested by @r2evans. Some experimenting and a look at str() gave me the clues I needed.
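To get from the hourly table to the plot that was the original goal, the day-of-week plotting pipeline should carry over almost unchanged. The sketch below assumes ggplot2 and tidyr are installed; `df_hour` and its counts are made-up toy values shaped like the tabyl output, and the column names `rider_type` and `rides` are chosen here for illustration.

```r
library(ggplot2)
library(tidyr)

# Toy hourly counts shaped like the tabyl output (hypothetical numbers)
df_hour <- data.frame(
  hour   = factor(c("07", "08", "09"), levels = sprintf("%02d", 0:23)),
  casual = c(10, 20, 15),
  member = c(30, 40, 25)
)

# Long format: one row per hour and rider type
df_long <- pivot_longer(df_hour, cols = -hour,
                        names_to = "rider_type", values_to = "rides")

# Dodged bars per hour, one bar per rider type
p <- ggplot(df_long, aes(x = hour, y = rides, fill = rider_type)) +
  geom_col(position = "dodge") +
  scale_x_discrete(drop = FALSE) +  # keep all 24 hours on the axis
  theme_light() +
  labs(title = "Rides by hour of day")
```

Typing p at the console draws the plot. Because hour is a factor with ordered levels, the bars appear in clock order rather than alphabetical order.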