Calculate the number of days from the start of the year/month to today

Problem description

Suppose we have a data frame (df):

opendate
2020-08-04
2018-06-24
2011-03-17
2019-11-20

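I want to do two things:

  1. For each date, count the days from the start of that date's year to the date
  2. For each date, count the days from the start of that date's month to the date

In R, I can do this with the following code: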

Year_Month_Diff <- function(x,start) as.numeric(x - as.Date(cut(start,"year")));
df = transform(df,Year_day_played = Year_Month_Diff(opendate,opendate));

Month_Diff <- function(x,start) as.numeric(x - as.Date(cut(start,"month")));
df = transform(df,Month_day_played = Month_Diff(opendate,opendate));

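Any help with a Python equivalent would be greatly appreciated.

Solution

The month case is really simple: just call .dt.day

For the year case, subtract January 1 of the same year from the date and count the days.

Assuming opendate is already of Timestamp type: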

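# 1-based day of the month; subtract 1 to get elapsed days as in the R code
df['Days since BOM'] = df['opendate'].dt.day
# elapsed days since January 1 (a January 1 date rolls back to the previous year's start)
df['Days since BOY'] = (df['opendate'] - (df['opendate'] - pd.tseries.offsets.YearBegin())).dt.days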

Thanks to @ChrisA, there is a simpler solution for the year case (note that dayofyear is 1-based, so January 1 gives 1):

df['Days since BOY'] = df['opendate'].dt.dayofyear 
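
For reference, here is a minimal end-to-end sketch built on the sample data from the question. It assumes the column starts out as strings and needs pd.to_datetime, and it subtracts 1 from both accessors so the results match the zero-based differences the R code produces. The column name Year_day_check is hypothetical and only there to cross-check the result against an explicit subtraction.

```python
import pandas as pd

# Sample data from the question; the column starts out as plain strings.
df = pd.DataFrame(
    {"opendate": ["2020-08-04", "2018-06-24", "2011-03-17", "2019-11-20"]}
)
df["opendate"] = pd.to_datetime(df["opendate"])

# .dt.day and .dt.dayofyear are 1-based, so subtract 1 to get elapsed days,
# which is what the R subtraction returns.
df["Month_day_played"] = df["opendate"].dt.day - 1
df["Year_day_played"] = df["opendate"].dt.dayofyear - 1

# Cross-check with an explicit subtraction from January 1 of the same year.
# Going through a yearly Period avoids the January 1 rollback of YearBegin().
year_start = df["opendate"].dt.to_period("Y").dt.start_time
df["Year_day_check"] = (df["opendate"] - year_start).dt.days

print(df)
```

If 1-based counts are what you want (the first day of the period counts as day 1), drop the "- 1" on both lines.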

This is more verbose than the other answer, but it works too.

from time import mktime,strptime
from datetime import datetime,timedelta

date = '2020-05-05'
time_format = '%Y-%m-%d'

def string_to_date(string,time_format):
    string = string.split(' ')[0]
    struct = strptime(string,time_format)
    obj = datetime.fromtimestamp(mktime(struct))
    return obj

def get_start_of_month(date):
    month_day = date.day
    to_remove = timedelta(days=month_day-1)
    new_date = date - to_remove
    return new_date

def get_start_of_year(date):
    new_date = datetime(date.year,1,1)
    return new_date

def time_from_month(date):
    start = get_start_of_month(date)
    obj = date - start
    return obj.days

def time_from_year(date):
    start = get_start_of_year(date)
    obj = date - start
    return obj.days

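# parse the sample date first; obj was previously undefined at this point
obj = string_to_date(date,time_format)
print(time_from_month(obj))
print(time_from_year(obj))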
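
The same idea can probably be written more compactly with only the datetime module. This is a sketch rather than a drop-in replacement; the function names days_since_month_start and days_since_year_start are made up for illustration, and the input is assumed to be in '%Y-%m-%d' format.

```python
from datetime import datetime


def days_since_month_start(d):
    """Elapsed days between the first of d's month and d."""
    return (d - d.replace(day=1)).days


def days_since_year_start(d):
    """Elapsed days between January 1 of d's year and d."""
    return (d - d.replace(month=1, day=1)).days


opened = datetime.strptime("2020-05-05", "%Y-%m-%d")
print(days_since_month_start(opened))  # 4
print(days_since_year_start(opened))   # 125
```

Using replace() avoids the round trip through mktime and a POSIX timestamp, so the result does not depend on the local time zone.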