Error when replacing NA values in R 4.0

Problem description

With R 3.6 I can perform the following NA replacement:

> d <- zoo(data.frame(a = NA,b = 1),Sys.Date())
> d[is.na(d)] <- 1
> d
           a b
2021-03-03 1 1

With R 4.0 I get the following error:

> d <- zoo(data.frame(a = NA, b = 1), Sys.Date())
> d[is.na(d)] <- 1
Error in as.Date.default(e) : 
  do not know how to convert 'e' to class “Date”

Did some default behavior change in R 4.0?

R 3.6 session:

Microsoft Windows [Version 10.0.19041.804]
(c) 2020 Microsoft Corporation. All rights reserved.

C:\>R --no-site-file

R version 3.6.1 (2019-07-05) -- "Action of the Toes"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: i386-w64-mingw32/i386 (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(zoo)

Attaching package: 'zoo'

The following objects are masked from 'package:base':

    as.Date, as.Date.numeric

Warning message:
package 'zoo' was built under R version 4.0.4
> d <- zoo(data.frame(a = NA, b = 1), Sys.Date())
> d[is.na(d)] <- 1
> d
           a b
2021-03-03 1 1

R 4.0 session:

Microsoft Windows [Version 10.0.19041.804]
(c) 2020 Microsoft Corporation. All rights reserved.

C:\>R --no-site-file

R version 4.0.4 (2021-02-15) -- "Lost Library Book"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: i386-w64-mingw32/i386 (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

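Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(zoo)

Attaching package: 'zoo'

The following objects are masked from 'package:base':

    as.Date, as.Date.numeric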

> d <- zoo(data.frame(a = NA, b = 1), Sys.Date())
> d[is.na(d)] <- 1
Error in as.Date.default(e) :
  do not know how to convert 'e' to class "Date"

Session info (3.6):

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 10 x64 (build 19041)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] zoo_1.8-8

loaded via a namespace (and not attached):
[1] compiler_3.6.1  grid_3.6.1      lattice_0.20-38

Session info (4.0):

> sessionInfo()
R version 4.0.4 (2021-02-15)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 10 x64 (build 19041)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] zoo_1.8-8

loaded via a namespace (and not attached):
[1] compiler_4.0.4  tools_4.0.4     grid_4.0.4      lattice_0.20-41

Solution

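Thanks for raising this issue, which is a bug in the zoo package. In the [.zoo and [<-.zoo methods we check whether the index i is a matrix via

if (all(class(i) == "matrix")) ...

This worked correctly in R 3.x.y because matrix objects had only the class "matrix". In R 4.0.0, however, matrix objects started to additionally inherit from "array", so the comparison is no longer TRUE for every element of class(i) and the check fails. See: https://developer.R-project.org/Blog/public/2019/11/09/when-you-think-class.-think-again/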
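The difference between the two checks can be seen without zoo at all. The following is a minimal base-R sketch; the matrix `i` and the variable names `old_check` and `new_check` are made up for illustration and are not taken from the zoo sources.

```r
# A plain logical matrix, similar in shape to what is.na() gives for a
# one-row, two-column series (illustrative data only)
i <- matrix(c(TRUE, FALSE), nrow = 1, dimnames = list(NULL, c("a", "b")))

# c("matrix", "array") in R >= 4.0.0; just "matrix" in R 3.x
print(class(i))

# Old-style test: TRUE only if every element of class(i) equals "matrix"
old_check <- all(class(i) == "matrix")

# inherits()-based test: TRUE whenever "matrix" is among the classes
new_check <- inherits(i, "matrix")

cat("all(class(i) == \"matrix\"):", old_check, "\n")
cat("inherits(i, \"matrix\"):    ", new_check, "\n")
```

Under R 3.x both checks agree; under R 4.x only the `inherits()` form still recognizes the matrix, which is why the matrix-index branch was being skipped.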

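In the development version of zoo on R-Forge (https://R-Forge.R-project.org/R/?group_id=18) I have now fixed the issue by replacing the above code with

if (inherits(i, "matrix")) ...

So you can already install zoo 1.8-9 from R-Forge and your code will work as intended again. Alternatively, you can wait for that version to arrive on CRAN, which will hopefully happen within the next few days, once the reverse dependencies have been checked. In the meantime, you can work around the issue with

coredata(d)[is.na(d)] <- 1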
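If the same script has to run on machines with different zoo versions, one possible approach is to branch on the installed version. This is only a sketch: it assumes zoo is installed, and it takes 1.8-9 as the first fixed release, as stated above.

```r
library(zoo)

# Same one-row series as in the question
d <- zoo(data.frame(a = NA, b = 1), Sys.Date())

if (packageVersion("zoo") >= "1.8.9") {
  # Fixed releases: direct replacement on the zoo object works again
  d[is.na(d)] <- 1
} else {
  # Older releases: replace inside the underlying data matrix instead
  coredata(d)[is.na(d)] <- 1
}

print(d)
```

The `coredata()` branch sidesteps the `[<-.zoo` method entirely, so it does not depend on how the index is classified.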

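I have this problem too!! Here is some more odd behavior, plus a few breadcrumbs about what might be going on. I'm still not sure why, but it looks like an indexing problem with zoo rather than something specific to is.na(). Logical indexing works if the logical structure has the same row names/index as the zoo object:

  1. Printing d[is.na(d)] (without assignment) gives an empty zoo object, which suggests the problem is in the indexing

  2. Wrapping d in coredata() works

d <- zoo(data.frame(a = NA,b = 1),Sys.Date())
coredata(d)[is.na(d)] <- 1
d
           a b
2021-03-05 1 1

  3. If the logical returned by is.na() is converted so that it has the same row names/index as the zoo object, the logical indexing works.

d <- zoo(data.frame(a = NA, b = 1), Sys.Date())
changes <- is.na(d) #storing logical in a variable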
> d
            a b
2021-03-05 NA 1

> changes #d[changes] won't work, so change rownames
        a     b
[1,] TRUE FALSE

> changes <- as.zoo(changes,index(d))
> changes
              a     b
2021-03-05 TRUE FALSE

> d[as.logical(changes)] #changing zoo back to a logical, returns something
            a b
2021-03-05 NA 1

> d[as.logical(changes)] <- 1
            a b
2021-03-05 NA 1

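Now for the breadcrumbs... does anyone know what changes were made to R's Date class in version 4? The zoo NEWS file mentions a change to merge.zoo that explicitly addresses "the new behavior of c.Date() in R >= 4.1.0".

https://cran.r-project.org/web/packages/zoo/NEWS (see the second item at the top)

I searched and searched and found no mention of these changes...

My guess is that something changed in the zoo class to enforce Date indexes more strictly... not sure... I also can't quite work out where to report an issue about zoo.

More breadcrumbs... Apparently, according to this thread, https://github.com/joshuaulrich/xts/issues/331, this feature still worked for zoo objects as recently as April 2020.

R 4.0.0 was released later that month, and the latest zoo package, 1.8-8, was released in May 2020, so running version 1.8-7 of the package could show whether the different behavior comes from a change in R or a change in zoo.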


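@neilfws you are right, this issue is due to the change in what class() returns in R 4.0.

For now, the best option is to use either

d <- na.fill(d,1)

or

coredata(d)[is.na(d)] <- 1

The old usage, d[is.na(d)] <- 1, requires an updated zoo package.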
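For comparison, here are the two workarounds side by side. This is a rough sketch and assumes zoo is installed; the three-row series and the names `d_fill` and `d_core` are invented for illustration and differ from the one-row example in the question.

```r
library(zoo)

# Illustrative series with one missing value in column a
d <- zoo(data.frame(a = c(1, NA, 3), b = 1), as.Date("2021-03-01") + 0:2)

# Option 1: na.fill() returns a new series with the NAs replaced by the fill value
d_fill <- na.fill(d, 1)

# Option 2: replace the NAs directly in the underlying data matrix
d_core <- d
coredata(d_core)[is.na(d_core)] <- 1

print(d_fill)
print(d_core)
```

Note that na.fill() returns a modified copy, so its result has to be assigned back, whereas the coredata() form changes the object in place.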