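Problem description
I'm new to R, but I'm really getting into it. I'm currently working through some exercises on cleaning imported CSV data, from which I've created a data.table. Here is a sample of my data.table: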
> head(sample_dt)
date_of_birth date_employed Total AVC numtrans firstfundingdate minAccountingdte
1 10/01/1988 16/08/2013 490909.6 0 61 25/11/2014 31/10/2014
2 26/12/1971 08/01/2001 4400292.1 0 175 19/08/2006 28/02/2006
3 15/10/1979 14/01/2005 92240.0 0 44 25/10/2006 31/01/2005
4 04/04/1973 30/04/2002 1594627.9 0 158 18/09/2012 30/04/2007
5 22/02/1972 22/02/1996 627662.7 0 126 27/02/2007 31/10/2006
6 07/06/1976 01/03/2010 3735319.2 0 129 13/05/2010 31/03/2010
gender client_status Balance
1 F C 626567.9
2 M C 9955518.3
3 F C 385284.5
4 M C 3097565.4
5 M C 1815569.6
6 M C 7132986.0
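I have tried several data.table commands without success; this is one of them, and it is probably confusing. I am trying to add four new date columns, generated by formatting four existing columns, preferably using chaining syntax: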
sample_dt[,`:=` (dob=(as.Date(date_of_birth))),(employment_dte=(as.Date(date_employed))),(firstfunding_dte=(as.Date(firstfundingdate))),(minAccounting_dte=(as.Date(minAccountingdte))),'%d/%m/%Y']
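As far as possible, I would like to keep the original data.table and produce this kind of output, which keeps the original date columns and creates the new columns using data.table code, NOT data.frame code: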
setDT(df)[,(date_cols) := lapply(.SD,anytime::anydate),.SDcols = date_cols]
> head(sample_dt)
date_of_birth date_employed Total AVC numtrans firstfundingdate minAccountingdte
1 10/01/1988 16/08/2013 490909.6 0 61 25/11/2014 31/10/2014
2 26/12/1971 08/01/2001 4400292.1 0 175 19/08/2006 28/02/2006
3 15/10/1979 14/01/2005 92240.0 0 44 25/10/2006 31/01/2005
4 04/04/1973 30/04/2002 1594627.9 0 158 18/09/2012 30/04/2007
5 22/02/1972 22/02/1996 627662.7 0 126 27/02/2007 31/10/2006
6 07/06/1976 01/03/2010 3735319.2 0 129 13/05/2010 31/03/2010
gender client_status Balance dob employment_dte firstfunding_dte minAccounting_dte
1 F C 626567.9 1988-01-10 2013-08-16 2014-11-25 2014-10-31
2 M C 9955518.3 1971-12-26 2001-01-08 2006-08-19 2006-02-28
3 F C 385284.5 1979-10-15 2005-01-14 2006-10-25 2005-01-31
4 M C 3097565.4 1973-04-04 2002-04-30 2012-09-18 2007-04-30
5 M C 1815569.6 1972-02-22 1996-02-22 2007-02-27 2006-10-31
6 M C 7132986.0 1976-06-07 2010-03-01 2010-05-13 2010-03-31
Additional information:
> dput(head(sample_dt))
structure(list(date_of_birth = c("10/01/1988","26/12/1971","15/10/1979","04/04/1973","22/02/1972","07/06/1976"),date_employed = c("16/08/2013","08/01/2001","14/01/2005","30/04/2002","22/02/1996","01/03/2010"
),Total = c(490909.59,4400292.09,92240,1594627.95,627662.74,3735319.25),AVC = c(0,0,0,0,0,0),numtrans = c(61L,175L,44L,158L,126L,129L),firstfundingdate = c("25/11/2014","19/08/2006","25/10/2006","18/09/2012","27/02/2007","13/05/2010"),minAccountingdte = c("31/10/2014","28/02/2006","31/01/2005","30/04/2007","31/10/2006","31/03/2010"
),gender = c("F","M","F","M","M","M"),client_status = c("C","C","C","C","C","C"),Balance = c(626567.94,9955518.35,385284.46,3097565.35,1815569.61,7132985.99)),row.names = c(NA,6L),class = c("data.table","data.frame"))
Any help would be great!
Solution
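One possible approach, offered as a sketch rather than a confirmed answer: in the attempt above, the parenthesis after the first as.Date(...) closes the `:=`() call early, so the other three assignments and the '%d/%m/%Y' string end up as extra arguments to `[` and the format never reaches as.Date(). Passing the format to each as.Date() call inside a single `:=`() adds the four columns by reference and leaves the original character columns untouched. The small table below is a stand-in built from the first three rows of the question, and alt_dt, old_cols and new_cols are names made up for this illustration.

```r
library(data.table)

# Stand-in for sample_dt: first three rows of the four date columns only.
sample_dt <- data.table(
  date_of_birth    = c("10/01/1988", "26/12/1971", "15/10/1979"),
  date_employed    = c("16/08/2013", "08/01/2001", "14/01/2005"),
  firstfundingdate = c("25/11/2014", "19/08/2006", "25/10/2006"),
  minAccountingdte = c("31/10/2014", "28/02/2006", "31/01/2005")
)

# Hypothetical copy, used only to show the second option side by side.
alt_dt <- copy(sample_dt)

# Option 1: one `:=`() call, one named argument per new column.
# The format string belongs to as.Date(), not to `[`.
sample_dt[, `:=`(
  dob               = as.Date(date_of_birth,    format = "%d/%m/%Y"),
  employment_dte    = as.Date(date_employed,    format = "%d/%m/%Y"),
  firstfunding_dte  = as.Date(firstfundingdate, format = "%d/%m/%Y"),
  minAccounting_dte = as.Date(minAccountingdte, format = "%d/%m/%Y")
)]

# Option 2: the same conversion driven by .SDcols.
# old_cols are read, new_cols are created; the originals are kept.
old_cols <- c("date_of_birth", "date_employed",
              "firstfundingdate", "minAccountingdte")
new_cols <- c("dob", "employment_dte",
              "firstfunding_dte", "minAccounting_dte")
alt_dt[, (new_cols) := lapply(.SD, as.Date, format = "%d/%m/%Y"),
       .SDcols = old_cols]

print(sample_dt)
```

Option 2 is shorter when many columns share one format; Option 1 is easier to read when each column needs its own handling. Either call can be followed by another `[...]` if further chained steps are wanted.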