Using pivot_longer to reshape data into multiple value columns

Question

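I am using pivot_longer to reshape my data from wide to long format with multiple value columns. I know there are related questions ("Pivot_longer 6 columns to 3 columns" and "Tidy dataset with pivot_longer: Multiple columns into two columns"), but so far I have not found a solution, probably because my two groups of columns have different classes: the first is POSIXct and the second is numeric.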

Here is a minimal working example:

    structure(list(compid = c("AT9130162999","AT9090003478","AT9070005375","AT9130048156"),iso2c = c("AT","AT","AT"),nace4 = c("7010","4211","2452","7010"),lastyear = c("2018","2019","2019"),`Closing date
                          Last avail. yr` = structure(c(1546214400,1577750400,1585612800,1577750400),tzone = "UTC",class = c("POSIXct","POSIXt")),`Closing date
                          Year - 1` = structure(c(1514678400,1546214400,1553990400,1546214400),`Closing date
                          Year - 2` = structure(c(NA,1514678400,1522454400,1514678400),`Closing date
                          Year - 3` = structure(c(NA,1483142400,1490918400,1483142400),`Closing date
                          Year - 4` = structure(c(NA,1451520000,1459382400,1451520000),`Closing date
                          Year - 5` = structure(c(NA,1419984000,1427760000,1419984000),`Closing date
                          Year - 6` = structure(c(NA,1388448000,1396224000,1388448000),`Closing date
                          Year - 7` = structure(c(NA,1356912000,1364688000,1356912000),`Closing date
                          Year - 8` = structure(c(NA,1325289600,1333152000,1325289600),`Closing date
                          Year - 9` = structure(c(NA,1293753600,1301529600,1293753600),operatinginc_last = c(NA,482813,-94300,NA),operatinginc_year1 = c(NA,423482,780400,operatinginc_year2 = c(NA,404694,1210300,ebit_last = c(1060000,351292),ebit_year1 = c(1501000,331415),ebit_year2 = c(NA,305492),operatingrev_last = c(28463000,15842418,13009700,11742884),operatingrev_year1 = c(NA,13734462,13146300,10682889
),operatingrev_year2 = c(NA,10682889)),row.names = c(NA,-4L),class = c("tbl_df","tbl","data.frame"))

So far I have tried:

    df_l <- df %>%
      pivot_longer(cols = -starts_with(c("compid", "iso2c", "nace4", "lastyear", "Closing")),
                   names_to = c("variable", "year"), names_sep = "_",
                   values_to = "value", values_drop_na = TRUE)

But now I also want to reshape all the columns that start with Closing. How can I do that (ideally in a single pivot_longer step)?

The expected output should contain the columns variable, year, and value, plus the columns closingdate and date:

      compid iso2c nace4 lastyear closingdate                   date       variable year  value
      <chr>  <chr> <chr> <chr>    <chr>                         <dttm>     <chr>    <chr> <dbl>
    1 AT913~ AT    7010  2018     `Closing date Last avail. yr` 2018-12-31 ebit     last  28463000
    2 AT913~ AT    7010  2018     `Closing date Year - 1`       2017-12-31 ebit     year1 15362687
    3 AT913~ AT    7010  2018     `Closing date Year - 1`       2016-12-31 ebit     year2 404694

Solution

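I don't know how you would do this in a single call to pivot_longer, because your variables follow different naming schemes and you also want to lengthen the closing-date variables. So here it is done in two calls, with some cleanup of the closing variable.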

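    library(tidyverse)

    df_l <- df %>%
      # First pass: stack the POSIXct "Closing date ..." columns into closing/date
      pivot_longer(cols = starts_with("Closing"), names_to = "closing",
                   values_to = "date", values_drop_na = TRUE) %>%
      # Second pass: split the numeric columns (e.g. ebit_last) into variable/year
      pivot_longer(cols = contains("_"), names_to = c("variable", "year"),
                   names_sep = "_", values_to = "value") %>%
      # Strip the "Closing date" prefix, embedded line breaks, and extra whitespace
      mutate(closing = closing %>%
               str_remove_all("Closing date") %>%
               str_remove_all("[:cntrl:]") %>%
               str_squish())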
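
One thing to be aware of with two independent pivots: every closing date ends up paired with every variable/year combination of the same company, because the second pivot runs on rows that were already multiplied by the first. If the intent is instead to pair each closing date with its matching year (as the expected output in the question seems to suggest), a possible alternative is to first rename the closing columns so that they follow the same `<variable>_<year>` scheme, and then use the `.value` sentinel in `names_to`. The sketch below is only an illustration on made-up toy data: the target names `closingdate_last` / `closingdate_year1` and the helper `clean_closing()` are my own assumptions, not something from the original data.

```r
library(tidyverse)

# Toy data (invented values), mimicking the structure of the question's data
toy <- tibble(
  compid = c("A", "B"),
  closing_a = as.POSIXct(c("2018-12-31", "2019-12-31"), tz = "UTC"),
  closing_b = as.POSIXct(c("2017-12-31", "2018-12-31"), tz = "UTC"),
  ebit_last = c(10, 30),
  ebit_year1 = c(20, 40),
  operatingrev_last = c(100, 300),
  operatingrev_year1 = c(200, 400)
)
# Give the date columns multi-line names like the ones in the question
names(toy)[2:3] <- c("Closing date\n    Last avail. yr",
                     "Closing date\n    Year - 1")

# Hypothetical helper: map "Closing date ..." names onto <variable>_<year>
clean_closing <- function(x) {
  x %>%
    str_squish() %>%  # collapse the line break and padding into one space
    str_replace("^Closing date Last avail\\. yr$", "closingdate_last") %>%
    str_replace("^Closing date Year - (\\d+)$", "closingdate_year\\1")
}

# Step 1: ".value" creates one output column per variable, so the POSIXct
# and numeric columns are never combined into a single value column
by_year <- toy %>%
  rename_with(clean_closing, starts_with("Closing")) %>%
  pivot_longer(cols = -compid,
               names_to = c(".value", "year"),
               names_sep = "_")

# Step 2 (optional): stack only the numeric columns into variable/value
long <- by_year %>%
  pivot_longer(cols = c(ebit, operatingrev),
               names_to = "variable",
               values_to = "value",
               values_drop_na = TRUE)
```

Here `by_year` has one row per company and year, with `closingdate`, `ebit`, and `operatingrev` as separate columns, and `long` additionally has `variable` and `value`. On the real data, years that only have a closing date (Year - 3 and older) would produce NA in the numeric columns, which `values_drop_na = TRUE` in the second step removes.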