R - One-week gap in parse_date_time(): wrong Monday returned from 2018 onward

Problem description

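I use the function parse_date_time() from the lubridate package with orders = 'YAU' to convert a year and a week number into the date of that week's Monday. For example, '2017Monday1' gives '2017-01-02', the first Monday of 2017.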

But starting in 2018, the result is off by one week.

parse_date_time('2018Monday1',orders = 'YAU')
#"2018-01-08 UTC"

But the first Monday of 2018 is '2018-01-01', so the result is one week late. Every later week shows the same gap, for example:

parse_date_time('2020Monday1',orders = 'YAU')
#"2020-01-06 UTC"      # wrong, it should be 2019-12-30

parse_date_time('2020Monday52',orders = 'YAU')
#"2020-12-28 UTC"      # wrong, it should be 2020-12-21

parse_date_time('2020Monday53',orders = 'YAU')
# NA                   # wrong, it should be 2020-12-28; 2020 has 53 weeks (leap year).

Does anyone understand what is happening here? Thanks.

Solution

From ?parse_date_time:

     'U' Week of the year as decimal number (00-53 or 0-53) using
          Sunday as the first day 1 of the week (and typically with the
          first Sunday of the year as day 1 of week 1).  The US
          convention.

Week numbers here start at 0, not 1: week 1 begins on the first Sunday of the year, so any days before that Sunday belong to week 0.

lubridate::parse_date_time('2018Monday0',orders = 'YAU')
# [1] "2018-01-01 UTC"
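
The week numbering can be cross-checked without lubridate. The sketch below uses only base R and assumes that format() with "%U" follows the same definition as the 'U' order quoted above. It prints the %U week number for the dates discussed so far.

```r
# Base R only: "%U" is the week of the year with Sunday as the first day,
# and days before the first Sunday of the year are in week 00.
dates <- as.Date(c("2017-01-02", "2018-01-01", "2018-01-08"))
week_u <- format(dates, "%U")
print(data.frame(date = dates, week_U = week_u))
# week_U is "01", "00", "01"
```

2017 starts on a Sunday, so its first Monday is already in week 1. 2018 starts on a Monday, so that Monday falls in week 0 and week 1 only begins on January 7.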

Unfortunately, this does not seem to be entirely consistent:

lubridate::parse_date_time(paste0(1980:2020,"Monday",0),"YAU")
# Warning:  36 failed to parse.
#  [1] NA               NA               NA               NA               NA              
#  [6] NA               NA               NA               NA               NA              
# [11] "1990-01-01 UTC" NA               NA               NA               NA              
# [16] NA               "1996-01-01 UTC" NA               NA               NA              
# [21] NA               "2001-01-01 UTC" NA               NA               NA              
# [26] NA               NA               "2007-01-01 UTC" NA               NA              
# [31] NA               NA               NA               NA               NA              
# [36] NA               NA               NA               "2018-01-01 UTC" NA              
# [41] NA              
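
One possible reading of this pattern, checked here with base R only and not against lubridate's internals: the five years that parse are the years whose January 1 falls on a Monday. Under the %U definition quoted above, week 0 holds only the days before the first Sunday, so in any other year week 0 would contain no Monday at all. The sketch below lists the years in 1980:2020 whose January 1 is a Monday.

```r
years <- 1980:2020
jan1 <- as.Date(paste0(years, "-01-01"))
# POSIXlt $wday counts from Sunday = 0, so Monday is 1.
monday_years <- years[as.POSIXlt(jan1)$wday == 1]
print(monday_years)
# [1] 1990 1996 2001 2007 2018
```

These are the same five years that return a date in the output above.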

It looks like this may be a logic failure that needs manual intervention.

# build strings such as "2007Monday0" and "2007Monday1" to match the "YAU" order
mondays0 <- paste0(2007:2018,"Monday",0)
mondays1 <- paste0(2007:2018,"Monday",1)

lubridate::parse_date_time(mondays0,"YAU")
# Warning:  10 failed to parse.
#  [1] "2007-01-01 UTC" NA               NA               NA               NA              
#  [6] NA               NA               NA               NA               NA              
# [11] NA               "2018-01-01 UTC"
### okay, we cannot rely on mondays0

(dates <- lubridate::parse_date_time(mondays1,"YAU"))
#  [1] "2007-01-08 UTC" "2008-01-07 UTC" "2009-01-05 UTC" "2010-01-04 UTC" "2011-01-03 UTC"
#  [6] "2012-01-02 UTC" "2013-01-07 UTC" "2014-01-06 UTC" "2015-01-05 UTC" "2016-01-04 UTC"
# [11] "2017-01-02 UTC" "2018-01-08 UTC"
# if the result is later than January 7, shift it back one week (7 * 86400 seconds)
(dates <- dates - ifelse(lubridate::day(dates) > 7,7*86400,0))
#  [1] "2007-01-01 UTC" "2008-01-07 UTC" "2009-01-05 UTC" "2010-01-04 UTC" "2011-01-03 UTC"
#  [6] "2012-01-02 UTC" "2013-01-07 UTC" "2014-01-06 UTC" "2015-01-05 UTC" "2016-01-04 UTC"
# [11] "2017-01-02 UTC" "2018-01-01 UTC"

(The first and last entries were the problem cases before and are now fixed.)

I don't know whether this is a bug, or whether there are edge cases (leap years, etc.) where this can't be relied on.
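
The dates the question expects (2019-12-30 for week 1 of 2020, 2020-12-28 for week 53) match ISO 8601 week numbering rather than %U. If that is the intent, a different technique from the adjustment above is to compute the Monday directly. The sketch below is base R only, and iso_week_monday() is a hypothetical helper name, not part of lubridate. It relies on the ISO rule that January 4 always lies in week 1.

```r
# Hypothetical helper: Monday of a given ISO 8601 week.
# It does not check that the requested week exists in that year.
iso_week_monday <- function(year, week) {
  jan4 <- as.Date(paste0(year, "-01-04"))
  # $wday is 0 for Sunday; convert to days elapsed since Monday (0..6).
  days_since_monday <- (as.POSIXlt(jan4)$wday + 6) %% 7
  # Monday of ISO week 1, then step forward whole weeks.
  jan4 - days_since_monday + 7 * (week - 1)
}

print(iso_week_monday(2018, 1))
# [1] "2018-01-01"
print(iso_week_monday(2020, c(1, 52, 53)))
# [1] "2019-12-30" "2020-12-21" "2020-12-28"
```

This returns Date objects rather than the POSIXct values that parse_date_time() produces, so wrap the result in as.POSIXct() if the rest of the code needs that type.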