Gathering columns into two separate columns based on a key in R

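Problem description

I have a very basic problem. I tried a few things that are already on Stack Overflow, but somehow they don't work.

Here is the head of my data: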

  STATE STATEFP GW_ratio.2000 GW_ratio.2005 GW_ratio.2010 GW_ratio.2015 SW_ratio.2000 SW_ratio.2005 SW_ratio.2010 SW_ratio.2015
1    AL       1    0.04247763    0.04742443    0.04685309    0.05630994     0.9575224     0.9525756     0.9531469     0.9436901
2    AR       5    0.62710731    0.64621860    0.67762854    0.68590438     0.3728927     0.3537814     0.3223715     0.3140956
3    AZ       4    0.50738010    0.48620606    0.41704896    0.45948459     0.4926199     0.5137939     0.5829510     0.5405154
4    CA       6    0.39589058    0.32538360    0.39230956    0.67370799     0.6041094     0.6746164     0.6076904     0.3262920
5    CO       8    0.18116832    0.18174957    0.13833921    0.14408643     0.8188317     0.8182504     0.8616608     0.8559136
6    CT       9    0.13035722    0.18318935    0.25172450    0.20938673     0.8696428     0.8168107     0.7482755     0.7906133

I would like a data frame that looks like this:

STATE - STATEFP - YEAR - GW_ratio - SW_ratio

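I'm lost, and it would be great if someone could help me!

Solution

We can use pivot_longer with names_sep set to the dot (.) as the separator. The special ".value" entry in names_to turns the part of each name before the dot (GW_ratio, SW_ratio) into output column names, while the part after the dot goes into YEAR.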

library(dplyr)
library(tidyr)
df1 %>%
   # reshape every column except the two id columns
   pivot_longer(cols = -c(STATE, STATEFP),
                names_to = c(".value", "YEAR"),
                names_sep = "\\.")

-output

# A tibble: 24 x 5
#   STATE STATEFP YEAR  GW_ratio SW_ratio
#   <chr>   <int> <chr>    <dbl>    <dbl>
# 1 AL          1 2000    0.0425    0.958
# 2 AL          1 2005    0.0474    0.953
# 3 AL          1 2010    0.0469    0.953
# 4 AL          1 2015    0.0563    0.944
# 5 AR          5 2000    0.627     0.373
# 6 AR          5 2005    0.646     0.354
# 7 AR          5 2010    0.678     0.322
# 8 AR          5 2015    0.686     0.314
# 9 AZ          4 2000    0.507     0.493
#10 AZ          4 2005    0.486     0.514
# … with 14 more rows
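
Note that YEAR in the output above is character, because it is built from the column names. If a numeric year is wanted, one possible tweak is the names_transform argument of pivot_longer (available in tidyr 1.1.0 and later, as far as I know). A minimal sketch on a two-state subset of the data; the names small and long are only illustrative and not part of the original answer:

```r
library(tidyr)

# Illustrative two-row subset of the data from the question
small <- data.frame(
  STATE = c("AL", "AR"),
  STATEFP = c(1L, 5L),
  GW_ratio.2000 = c(0.04247763, 0.62710731),
  GW_ratio.2005 = c(0.04742443, 0.64621860),
  SW_ratio.2000 = c(0.9575224, 0.3728927),
  SW_ratio.2005 = c(0.9525756, 0.3537814)
)

long <- pivot_longer(
  small,
  cols = -c(STATE, STATEFP),
  names_to = c(".value", "YEAR"),  # text before the dot becomes the column name
  names_sep = "\\.",               # split the old names at the literal dot
  names_transform = list(YEAR = as.integer)  # store YEAR as integer, not character
)

str(long)
```

Converting afterwards with as.integer(long$YEAR) should work just as well; names_transform only keeps the conversion inside the reshaping step.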

data

df1 <- structure(list(STATE = c("AL","AR","AZ","CA","CO","CT"),STATEFP = c(1L,5L,4L,6L,8L,9L),GW_ratio.2000 = c(0.04247763,0.62710731,0.5073801,0.39589058,0.18116832,0.13035722
    ),GW_ratio.2005 = c(0.04742443,0.6462186,0.48620606,0.3253836,0.18174957,0.18318935),GW_ratio.2010 = c(0.04685309,0.67762854,0.41704896,0.39230956,0.13833921,0.2517245),GW_ratio.2015 = c(0.05630994,0.68590438,0.45948459,0.67370799,0.14408643,0.20938673
    ),SW_ratio.2000 = c(0.9575224,0.3728927,0.4926199,0.6041094,0.8188317,0.8696428),SW_ratio.2005 = c(0.9525756,0.3537814,0.5137939,0.6746164,0.8182504,0.8168107),SW_ratio.2010 = c(0.9531469,0.3223715,0.582951,0.6076904,0.8616608,0.7482755),SW_ratio.2015 = c(0.9436901,0.3140956,0.5405154,0.326292,0.8559136,0.7906133)),class = "data.frame",row.names = c("1","2","3","4","5","6"))

A base R option using reshape

reshape(
  df1, direction = "long", idvar = c("STATE", "STATEFP"), timevar = "YEAR", varying = -(1:2)
)

gives

          STATE STATEFP YEAR   GW_ratio  SW_ratio
AL.1.2000    AL       1 2000 0.04247763 0.9575224
AR.5.2000    AR       5 2000 0.62710731 0.3728927
AZ.4.2000    AZ       4 2000 0.50738010 0.4926199
CA.6.2000    CA       6 2000 0.39589058 0.6041094
CO.8.2000    CO       8 2000 0.18116832 0.8188317
CT.9.2000    CT       9 2000 0.13035722 0.8696428
AL.1.2005    AL       1 2005 0.04742443 0.9525756
AR.5.2005    AR       5 2005 0.64621860 0.3537814
AZ.4.2005    AZ       4 2005 0.48620606 0.5137939
CA.6.2005    CA       6 2005 0.32538360 0.6746164
CO.8.2005    CO       8 2005 0.18174957 0.8182504
CT.9.2005    CT       9 2005 0.18318935 0.8168107
AL.1.2010    AL       1 2010 0.04685309 0.9531469
AR.5.2010    AR       5 2010 0.67762854 0.3223715
AZ.4.2010    AZ       4 2010 0.41704896 0.5829510
CA.6.2010    CA       6 2010 0.39230956 0.6076904
CO.8.2010    CO       8 2010 0.13833921 0.8616608
CT.9.2010    CT       9 2010 0.25172450 0.7482755
AL.1.2015    AL       1 2015 0.05630994 0.9436901
AR.5.2015    AR       5 2015 0.68590438 0.3140956
AZ.4.2015    AZ       4 2015 0.45948459 0.5405154
CA.6.2015    CA       6 2015 0.67370799 0.3262920
CO.8.2015    CO       8 2015 0.14408643 0.8559136
CT.9.2015    CT       9 2015 0.20938673 0.7906133
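
The reshape result above is grouped by year and carries composite row names such as AL.1.2000. If the layout of the pivot_longer output is preferred (one block per state, plain row numbers), a small post-processing step could look like the sketch below. It runs on a two-state subset of the data; the names small and long are only illustrative and not part of the original answer:

```r
# Illustrative two-row subset of the data from the question
small <- data.frame(
  STATE = c("AL", "AR"),
  STATEFP = c(1L, 5L),
  GW_ratio.2000 = c(0.04247763, 0.62710731),
  GW_ratio.2005 = c(0.04742443, 0.64621860),
  SW_ratio.2000 = c(0.9575224, 0.3728927),
  SW_ratio.2005 = c(0.9525756, 0.3537814)
)

long <- reshape(
  small,
  direction = "long",
  idvar = c("STATE", "STATEFP"),
  timevar = "YEAR",
  varying = -(1:2)  # every column except the two id columns
)

# Sort by state, then year, and drop the composite row names
long <- long[order(long$STATE, long$YEAR), ]
rownames(long) <- NULL

long
```

This relies on reshape guessing the variable names and years from the column names, which it does by splitting at its default separator, the dot.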