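Convert a vector with group_by into a matrix with the groups as row names

Problem description

I can't seem to reshape one column of a data frame into the right shape while keeping the unique identifier for each row. I have the following data: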

   id     x        y   indicator
1   1 249.6  1.124985        1 
2   1 250.9  1.124756        1 
3   1 252.2  1.124125        1 
4   1 253.5  1.124598        1 
5   1 254.8  1.127745        1 
6   1 256.1  1.129102        1 
7   2 249.6  2.167348        0   
8   2 250.9  2.165804        0   
9   2 252.2  2.164578        0  
10  2 253.5  2.163828        0  
11  2 254.8  2.164260        0   
12  2 256.1  2.166293        0 
13  3 249.6  0.04647765      0
14  3 250.9  0.04932262      0
15  3 252.2  0.05245448      0
16  3 253.5  0.05692405      0
17  3 254.8  0.06184551      0
18  3 256.1  0.06751989      0

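I want to reshape the y vector into a matrix in which each row corresponds to one y vector (one id), with additional columns for id and indicator, and with the value columns labelled by the x values. Like this: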

id indicator  249.6      250.9      252.2      ...
1  1          1.124985   1.124756   1.124125   ...
2  0          2.167348   2.165804   2.164578   ...
3  0          0.04647765 0.04932262 0.05245448 ...

I tried using the reshape function like this:

reshape(df[c('id','x','y')],direction = "wide",idvar = "id",timevar = "x")

In this case I simply left out the indicator variable to see whether it would work, but the data frame I get back has only two columns: the first is id and the second is named y.c(249.6,250.9,252.2,253.5,254.8,256.1,257.4,258.7,260,261.3,262.6,263.9,265.2,266.5,267.8,269.1,270.4,271.7,273,274.3,275.6,276.9,278.2,279.5,280.8,282.1,283.4,284.7,286,287.3,288.6,289.9,291.2,292.5,293.8,295.1,[etc].

I also tried the xtabs function: a = xtabs(formula = y ~ id + indicator+ x,data=df), but that just returned a table that looks very similar to the one I started with.

Solution

Reshape the data from long to wide format.



library(dplyr)
library(tidyr)

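# df is the data frame from the question (the answer originally called it df1)
df %>%
  pivot_wider(
    id_cols = c(id, indicator),  # columns that identify each output row
    names_from = x,              # x values become the new column names
    values_from = y              # y values fill the cells
  )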
## A tibble: 3 x 8
#     id indicator `249.6` `250.9` `252.2` `253.5` `254.8` `256.1`
#  <int>     <int>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
#1     1         1  1.12    1.12    1.12    1.12    1.13    1.13  
#2     2         0  2.17    2.17    2.16    2.16    2.16    2.17  
#3     3         0  0.0465  0.0493  0.0525  0.0569  0.0618  0.0675

Regarding the code in the question, indicator should be part of idvar. Also, if df has no columns other than the four listed, df[...] can be shortened to df.

reshape(df[c('id','x','y','indicator')],direction = "wide",idvar = c("id","indicator"),timevar = "x")

giving:

   id indicator    y.249.6    y.250.9    y.252.2    y.253.5    y.254.8    y.256.1
1   1         1 1.12498500 1.12475600 1.12412500 1.12459800 1.12774500 1.12910200
7   2         0 2.16734800 2.16580400 2.16457800 2.16382800 2.16426000 2.16629300
13  3         0 0.04647765 0.04932262 0.05245448 0.05692405 0.06184551 0.06751989
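If what you ultimately need is a plain numeric matrix with id as the row names (as the title suggests), the reshape() result can be converted in two more steps. This is only a sketch, assuming df holds exactly the four columns shown in the question; the names x_vals, wide and m are made up for illustration.

```r
# Rebuild the question's data (same values as the table in the question).
x_vals <- c(249.6, 250.9, 252.2, 253.5, 254.8, 256.1)
df <- data.frame(
  id = rep(1:3, each = 6),
  x = rep(x_vals, times = 3),
  y = c(1.124985, 1.124756, 1.124125, 1.124598, 1.127745, 1.129102,
        2.167348, 2.165804, 2.164578, 2.163828, 2.164260, 2.166293,
        0.04647765, 0.04932262, 0.05245448, 0.05692405, 0.06184551, 0.06751989),
  indicator = rep(c(1L, 0L, 0L), each = 6)
)

# Long to wide: one row per (id, indicator), one y column per x value.
wide <- reshape(df, direction = "wide",
                idvar = c("id", "indicator"), timevar = "x")

# Drop the "y." prefix that reshape() adds, so columns are labelled by x alone.
names(wide) <- sub("^y\\.", "", names(wide))

# Keep only the y columns as a numeric matrix and use id as the row names.
m <- as.matrix(wide[, -(1:2)])
rownames(m) <- wide$id
```

After this, m["2", "249.6"] picks out the y value for id 2 at x = 249.6, and wide still carries the id and indicator columns if you need them alongside.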

Note

Input in reproducible form:

df <- structure(list(id = c(1L,1L,1L,1L,1L,1L,2L,2L,2L,2L,2L,2L,3L,3L,3L,3L,3L,3L),x = c(249.6,250.9,252.2,253.5,254.8,256.1,249.6,250.9,252.2,253.5,254.8,256.1,249.6,250.9,252.2,253.5,254.8,256.1),y = c(1.124985,1.124756,1.124125,1.124598,1.127745,1.129102,2.167348,2.165804,2.164578,2.163828,2.16426,2.166293,0.04647765,0.04932262,0.05245448,0.05692405,0.06184551,0.06751989),indicator = c(1L,1L,1L,1L,1L,1L,0L,0L,0L,0L,0L,0L,0L,0L,0L,0L,0L,0L)),class = "data.frame",row.names = c("1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18"))

So I quickly worked out the answer myself using tidyr:

z = pivot_wider(df,id_cols = c("id","indicator"),names_from = "x",values_from = "y")

Rui Barradas's answer is the same, and it works well.
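As for the xtabs attempt in the question: with three terms on the right-hand side, xtabs builds a three-way table, which is why the result did not look like a matrix. A two-way formula gives the id by x layout directly. The sketch below is not taken from either answer. It assumes each (id, x) pair occurs exactly once, because xtabs sums y within each cell and fills missing combinations with 0. The name tab is made up for illustration.

```r
# Rebuild the question's data (same values as the table in the question).
df <- data.frame(
  id = rep(1:3, each = 6),
  x = rep(c(249.6, 250.9, 252.2, 253.5, 254.8, 256.1), times = 3),
  y = c(1.124985, 1.124756, 1.124125, 1.124598, 1.127745, 1.129102,
        2.167348, 2.165804, 2.164578, 2.163828, 2.164260, 2.166293,
        0.04647765, 0.04932262, 0.05245448, 0.05692405, 0.06184551, 0.06751989),
  indicator = rep(c(1L, 0L, 0L), each = 6)
)

# Two-way table: rows are id, columns are x, cells hold the (summed) y values.
tab <- xtabs(y ~ id + x, data = df)

# The row and column labels come from id and x respectively.
rownames(tab)
colnames(tab)
```

The indicator column is not part of this table; if you need it, look it up per id afterwards, for example with unique(df[c("id", "indicator")]).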
