在R中一次重新格式化多列中的日期

问题描述

我正在尝试将多列从“字符”转换为日期,但也要重新格式化日期。我可以逐列完成,但我希望编写某种循环来迭代所有变量。

例如,我可以做到

dates_test$date.d.m.y <- format(as.Date(dates_test$date.d.m.y,"%d/%m/%Y),"%m/%d/%Y")

如何编写代码,同时将 date.d.m.y 和 d.m.y.test 的格式更改为 %m/%d/%Y 格式?

这里的数据集:

dput(head(dates_test,20)

structure(list(date.m.d.y = c("5/13/2013","5/14/2013","5/15/2013","5/16/2013","5/17/2013","5/18/2013","5/19/2013","5/20/2013","5/21/2013","5/22/2013","5/23/2013","5/24/2013","5/25/2013","5/26/2013","5/27/2013","5/28/2013","5/29/2013","5/30/2013","5/31/2013","6/1/2013"),date.d.m.y = c("2/2/2012","2/2/2012","9/2/2012","16/2/2012","23/2/2012","23/2/2012"),date.y.m.d = c("2010-12-11","2010-12-12","2010-12-13","2010-12-14","2010-12-15","2010-12-16","2010-12-17","2010-12-18","2010-12-19","2010-12-20","2010-12-21","2010-12-22","2010-12-23","2010-12-24","2010-12-25","2010-12-26","2010-12-27","2010-12-28","2010-12-29","2010-12-30"),date.d.m.y.2 = c("13.5.2013","14.5.2013","15.5.2013","16.5.2013","17.5.2013","18.5.2013","19.5.2013","20.5.2013","21.5.2013","22.5.2013","23.5.2013","24.5.2013","25.5.2013","26.5.2013","27.5.2013","28.5.2013","29.5.2013","30.5.2013","31.5.2013","1.6.2013"),date.m.d.y.2 = c("13-May-2013","14-May-2013","15-May-2013","16-May-2013","17-May-2013","18-May-2013","19-May-2013","20-May-2013","21-May-2013","22-May-2013","23-May-2013","24-May-2013","25-May-2013","26-May-2013","27-May-2013","28-May-2013","29-May-2013","30-May-2013","31-May-2013","1-Jun-2013"),d.m.y.test = c("2/2/2012",new = c("13/05/2013","14/05/2013","15/05/2013","16/05/2013","17/05/2013","18/05/2013","19/05/2013","20/05/2013","21/05/2013","22/05/2013","23/05/2013","24/05/2013","25/05/2013","26/05/2013","27/05/2013","28/05/2013","29/05/2013","30/05/2013","31/05/2013","01/06/2013")),row.names = c(NA,20L),class = "data.frame")

解决方法

我们可以使用 anydateacross 不同日期格式的列转换为 Date 类,然后应用 format 返回所需的格式 - {{1} }

%m/%d/%Y

-输出

library(anytime)
library(dplyr)
addFormats("%d.%m.%Y")
dates_test1 <- dates_test %>% 
     mutate(across(everything(),~ format(anydate(.),"%m/%d/%Y")))
,

这里有一个包 lubridate 函数 parse_date_time 的方法。主要技巧是首先从列名中获取格式。

library(lubridate)

fmt <- names(dates_test)
fmt <- sub("date","",fmt)
fmt <- unique(gsub("[^dmy]",fmt))
fmt <- fmt[nchar(fmt) == 3]

dates_new <- lapply(dates_test,parse_date_time,orders = fmt)
dates_new <- do.call(cbind.data.frame,dates_new)

str(dates_new)
#'data.frame':  20 obs. of  7 variables:
# $ date.m.d.y  : POSIXct,format: "2013-05-13" ...
# $ date.d.m.y  : POSIXct,format: "2012-02-02" ...
# $ date.y.m.d  : POSIXct,format: "2010-12-11" ...
# $ date.d.m.y.2: POSIXct,format: "2013-05-13" ...
# $ date.m.d.y.2: POSIXct,format: "2013-05-13" ...
# $ d.m.y.test  : POSIXct,format: "2012-02-02" ...
# $ new         : POSIXct,format: "2013-05-13" ...

既然列都是 "POSIXct" 类,lapply 是适当的 format 方法。 R 的 S3 类机制会自动调用它。

dates_new[] <- lapply(dates_new,format,format = "%m/%d/%Y")
str(dates_new)
#'data.frame':  20 obs. of  7 variables:
# $ date.m.d.y  : chr  "05/13/2013" "05/14/2013" "05/15/2013" "05/16/2013" ...
# $ date.d.m.y  : chr  "02/02/2012" "02/02/2012" "02/02/2012" "02/02/2012" ...
# $ date.y.m.d  : chr  "12/11/2010" "12/12/2010" "12/13/2010" "12/14/2010" ...
# $ date.d.m.y.2: chr  "05/13/2013" "05/14/2013" "05/15/2013" "05/16/2013" ...
# $ date.m.d.y.2: chr  "05/13/2013" "05/14/2013" "05/15/2013" "05/16/2013" ...
# $ d.m.y.test  : chr  "02/02/2012" "02/02/2012" "02/02/2012" "02/02/2012" ...
# $ new         : chr  "05/13/2013" "05/14/2013" "05/15/2013" "05/16/2013" ...
,

您可以使用 parsedate 包并根据需要格式化输出。

library(parsedate)

for (i in colnames(dates_test)) {
  dates_test[,i] <- format(parse_date(dates_test[,i]),"%m/%d/%Y")
}

dates_test
   date.m.d.y date.d.m.y date.y.m.d date.d.m.y.2 date.m.d.y.2 d.m.y.test        new
1  05/13/2013 02/02/2012 12/11/2010   05/13/2013   05/13/2013 02/02/2012 05/13/2013
2  05/14/2013 02/02/2012 12/12/2010   05/14/2013   05/14/2013 02/02/2012 05/14/2013
3  05/15/2013 02/02/2012 12/13/2010   05/15/2013   05/15/2013 02/02/2012 05/15/2013
4  05/16/2013 02/02/2012 12/14/2010   05/16/2013   05/16/2013 02/02/2012 05/16/2013
5  05/17/2013 02/02/2012 12/15/2010   05/17/2013   05/17/2013 02/02/2012 05/17/2013
6  05/18/2013 09/02/2012 12/16/2010   05/18/2013   05/18/2013 09/02/2012 05/18/2013
7  05/19/2013 09/02/2012 12/17/2010   05/19/2013   05/19/2013 09/02/2012 05/19/2013
8  05/20/2013 09/02/2012 12/18/2010   05/20/2013   05/20/2013 09/02/2012 05/20/2013
9  05/21/2013 09/02/2012 12/19/2010   05/21/2013   05/21/2013 09/02/2012 05/21/2013
10 05/22/2013 09/02/2012 12/20/2010   05/22/2013   05/22/2013 09/02/2012 05/22/2013
11 05/23/2013 02/16/2012 12/21/2010   05/23/2013   05/23/2013 02/16/2012 05/23/2013
12 05/24/2013 02/16/2012 12/22/2010   05/24/2013   05/24/2013 02/16/2012 05/24/2013
13 05/25/2013 02/16/2012 12/23/2010   05/25/2013   05/25/2013 02/16/2012 05/25/2013
14 05/26/2013 02/16/2012 12/24/2010   05/26/2013   05/26/2013 02/16/2012 05/26/2013
15 05/27/2013 02/16/2012 12/25/2010   05/27/2013   05/27/2013 02/16/2012 05/27/2013
16 05/28/2013 02/23/2012 12/26/2010   05/28/2013   05/28/2013 02/23/2012 05/28/2013
17 05/29/2013 02/23/2012 12/27/2010   05/29/2013   05/29/2013 02/23/2012 05/29/2013
18 05/30/2013 02/23/2012 12/28/2010   05/30/2013   05/30/2013 02/23/2012 05/30/2013
19 05/31/2013 02/23/2012 12/29/2010   05/31/2013   05/31/2013 02/23/2012 05/31/2013
20 06/01/2013 02/23/2012 12/30/2010   06/01/2013   06/01/2013 02/23/2012 01/06/2013