Writing a function that references specific columns

Problem

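I regularly have to pull different datasets from the same API, each time for a different reason, so I end up writing code for many separate pulls. I'd like to create some functions to help with this, but I need some help.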

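I haven't been able to figure out how to set up a function so that I can change the dataset but still pull from the same columns every time. In this example, I have four timestamp columns that each represent something different (the data here is made up). I need to change the time zone of these columns to my local time zone. The column names stay the same across all my datasets, but the name of the dataset changes. There are several places in my code where I need to do this and I haven't been able to work it out, so any suggestions are much appreciated!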

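The second part of this example code is not in my actual code; it is only there to set the data up correctly. The data comes out of the API in the format shown, in GMT.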

df <- data.frame(
  col_1 = c(1, 2, 3, 4),
  time_1 = c("2021-01-20 23:58:21", "2021-01-20 21:21:00", "2021-01-20 17:14:04", "2021-01-20 01:05:18"),
  time_2 = c("2021-01-19 23:58:21", "2021-01-19 21:21:00", "2021-01-19 17:14:04", "2021-01-19 01:05:18"),
  # NOTE: "36:21:00" below is not a valid time of day (hours run 00-23); kept as posted in the made-up data
  time_3 = c("2021-01-18 23:46:21", "2021-01-18 36:21:00", "2021-01-18 15:14:04", "2021-01-18 01:05:18"),
  time_4 = c("2021-01-17 23:58:21", "2021-01-17 20:21:00", "2021-01-17 18:14:04", "2021-01-17 02:05:18")
)

# Not part of actual code 
df$time_1 <- as.POSIXlt(df$time_1,tz = "GMT")
df$time_2 <- as.POSIXlt(df$time_2,tz = "GMT")
df$time_3 <- as.POSIXlt(df$time_3,tz = "GMT")
df$time_4 <- as.POSIXlt(df$time_4,tz = "GMT")

# What I want it to do
# df$time_1 <- lubridate::with_tz(df$time_1,tz = "America/Los_Angeles")
# df$time_2 <- lubridate::with_tz(df$time_2,tz = "America/Los_Angeles")
# df$time_3 <- lubridate::with_tz(df$time_3,tz = "America/Los_Angeles")
# df$time_4 <- lubridate::with_tz(df$time_4,tz = "America/Los_Angeles")

# Attempted function
timezone_cleanup <- function(my_df){
  my_df$time_1 <- lubridate::with_tz(my_df$time_1,tz = "America/Los_Angeles")
  my_df$time_2 <- lubridate::with_tz(my_df$time_2,tz = "America/Los_Angeles")
  my_df$time_3 <- lubridate::with_tz(my_df$time_3,tz = "America/Los_Angeles")
  my_df$time_4 <- lubridate::with_tz(my_df$time_4,tz = "America/Los_Angeles")
}

# How I'd like to use this function. Not working now: even if I wrap it with data.frame(), the result is not what I wanted.
new_df <- timezone_cleanup(df)

Solution

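I think you need to return my_df at the end of the function to get the modified data frame back. As written, the last expression in the function is an assignment, so the function returns the converted time_4 column rather than the data frame. Beyond that, you can use lapply or across to apply the same function to multiple columns.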

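library(dplyr)

timezone_cleanup <- function(my_df){
  # Convert every column whose name starts with "time"; the result of the
  # pipe is the last expression, so the modified data frame is returned.
  my_df %>%
    mutate(across(starts_with('time'),
                  ~ lubridate::with_tz(.x, tzone = "America/Los_Angeles")))
}

new_df <- timezone_cleanup(df)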
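
For comparison, here is a rough base R sketch of the same two ideas (return the data frame, and loop over the columns with lapply). The function name timezone_cleanup_base, the tz_out argument, and the small demo data frame are made up for illustration, and the sketch assumes the time columns hold GMT timestamps as strings or date-times.

```r
# Hypothetical helper: converts every column whose name starts with "time".
# Assumes the incoming values are GMT (character, POSIXct or POSIXlt).
timezone_cleanup_base <- function(my_df, tz_out = "America/Los_Angeles") {
  time_cols <- grep("^time", names(my_df), value = TRUE)
  my_df[time_cols] <- lapply(my_df[time_cols], function(x) {
    x <- as.POSIXct(x, tz = "GMT")  # parse strings / normalise to POSIXct
    attr(x, "tzone") <- tz_out      # same instant, displayed in tz_out
    x
  })
  my_df  # explicitly return the modified data frame
}

# Made-up demo data in the same shape as the question's data
demo_df <- data.frame(
  col_1  = c(1, 2),
  time_1 = c("2021-01-20 23:58:21", "2021-01-20 21:21:00"),
  time_2 = c("2021-01-19 23:58:21", "2021-01-19 21:21:00")
)

demo_out <- timezone_cleanup_base(demo_df)
```

Because only the "tzone" attribute changes, the underlying instants stay the same; only the displayed clock time differs. Columns that do not start with "time" (col_1 here) are left untouched.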

By the way, I did get a warning message, Unrecognized time zone 'America/Los_Angeles', when using this. Are you sure you are using the correct tz value?
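
One way to look into that warning is to check whether the name is known to the time zone database R is using on that machine. This is only a sketch of a possible diagnostic; the answer does not say what caused the warning, and the variable names tz_wanted and tz_known are made up here.

```r
tz_wanted <- "America/Los_Angeles"

# OlsonNames() lists the time zone names R can find on this system;
# FALSE here would suggest the local tz database is missing or incomplete.
tz_known <- tz_wanted %in% OlsonNames()
print(tz_known)

# The time zone the current R session considers local, for reference
print(Sys.timezone())
```

If tz_known is FALSE on one machine but TRUE on another, the value itself is probably fine and the difference lies in the system's time zone data.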