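Rearranging rows and columns using mean scores in R

Problem description

I have a dataset that currently looks like this, where each row represents one subject: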

score   item1   item2   item3
50     always   never   some
60     some     always  never
70     never    some    always
80     always   never   some
90     some     never   always

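I am trying to create a table that shows the mean score for each response level of each item (i.e., the mean score of subjects who answered "always" on item1, the mean score of subjects who answered "some" on item1, and so on).

Any suggestions on how to reshape the dataset so that it looks like this: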

       item1    item2   item3
always  mean#   mean#   mean#
some    mean#   mean#   mean#
never   mean#   mean#   mean#

Thanks!

Solution

This is a good use case for tidyr's pivot_longer() and pivot_wider() functions.

library(tidyr)
library(dplyr) # for the pipe (%>%) and tibble()
# here is the data (same values as the table in the question)
df <- tibble(
  score = c(50, 60, 70, 80, 90),
  item1 = c("always", "some", "never", "always", "some"),
  item2 = c("never", "always", "some", "never", "never"),
  item3 = c("some", "never", "always", "some", "always")
)

df %>%
  # one row per subject/item pair; item names go to the default "name" column
  pivot_longer(starts_with("item"), values_to = "response") %>%
  # spread the items back out as columns, averaging score within each cell
  pivot_wider(id_cols = response, names_from = name, values_from = score, values_fn = mean)

# A tibble: 3 x 4
  response item1 item2 item3
  <chr>    <dbl> <dbl> <dbl>
1 always      65  60      80
2 never       70  73.3    60
3 some        75  70      65
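
To see why the two pivots produce the means, it may help to look at the intermediate long table. The following is only a sketch under the same assumptions as above (the data are retyped from the question, and the names long and means are introduced here for illustration). It also shows a group_by() + summarise() route to the same numbers in long form.

```r
library(tidyr)
library(dplyr)

# data retyped from the table in the question
df <- tibble(
  score = c(50, 60, 70, 80, 90),
  item1 = c("always", "some", "never", "always", "some"),
  item2 = c("never", "always", "some", "never", "never"),
  item3 = c("some", "never", "always", "some", "always")
)

# long format: 5 subjects x 3 items = 15 rows,
# each carrying the subject's score, the item name and the response
long <- df %>%
  pivot_longer(starts_with("item"), names_to = "name", values_to = "response")

nrow(long)

# the same means, kept in long form (one row per item/response pair)
means <- long %>%
  group_by(name, response) %>%
  summarise(mean_score = mean(score), .groups = "drop")

means
```

pivot_wider() with values_fn = mean performs this grouping and averaging and then spreads name across the columns in a single step.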

Here are some base R options:

  • Using stack + tapply
with(
  cbind(score = df$score, stack(df[-1])),
  tapply(score, list(values, ind), mean, na.rm = TRUE)
)

which gives

        ind
values   item1    item2 item3
  always    65 60.00000    80
  never     70 73.33333    60
  some      75 70.00000    65
  • Using lapply + tapply
do.call(
  cbind,
  lapply(df[-1], function(k) tapply(df$score, k, mean, na.rm = TRUE))
)

which gives

       item1    item2 item3
always    65 60.00000    80
never     70 73.33333    60
some      75 70.00000    65
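
One caveat with the lapply + tapply approach: cbind lines the per-item results up by position, so it relies on every item containing the same set of responses. A minimal sketch of one way to guard against that is to convert each item to a factor with a shared set of levels first. The helper name mean_by_level and the vector lv are made up for this illustration and are not part of any package; the data are retyped from the question.

```r
# data retyped from the table in the question
df <- data.frame(
  score = c(50, 60, 70, 80, 90),
  item1 = c("always", "some", "never", "always", "some"),
  item2 = c("never", "always", "some", "never", "never"),
  item3 = c("some", "never", "always", "some", "always")
)

# shared response levels, in the row order asked for in the question
lv <- c("always", "some", "never")

# hypothetical helper: mean of score_col per response level, for every other column
mean_by_level <- function(data, score_col = "score", levels = lv) {
  items <- setdiff(names(data), score_col)
  sapply(data[items], function(k) {
    # a level that never occurs in an item yields NA instead of shifting rows
    tapply(data[[score_col]], factor(k, levels = levels), mean, na.rm = TRUE)
  })
}

res <- mean_by_level(df)
res

# spot check of one cell by hand: item2 == "never" covers scores 50, 80 and 90
mean(df$score[df$item2 == "never"])
```

The result is a matrix with one row per response level and one column per item, so single cells can be read with, for example, res["never", "item2"].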