Problem description
I have a long list of dates and numbers like this:
1.1.2018 0:00;2590
3.1.2018 1:00;2530
4.2.2018 2:00;1700
6.2.2018 3:00;2340
18.3.2018 4:00;1800
15.4.2018 5:00;2850
...
I need to add up all the numbers that share the same week number and output one total per week, like this:
0;0
1;549730
2;645010
3;681320
4;677060
5;698450
...etc
52;576280
53;81640
This is my code so far. I have split the dates and numbers into separate lists, but I am not sure how to continue from here.
import datetime

def main():
    file = open("2018Electricity.txt","r")
    line = file.readline()
    time_list = []
    electricity_list = []
    total = []
    for i in file:
        time = i.strip().split(';')[0]
        electricity = i.strip().split(';')[1]
        time_list.append(datetime.strptime(time,'%d.%m.%Y %H:%M'))
        electricity_list.append(electricity)
    file.close()

main()
The assignment requires that I cover weeks 0-53 and that I use lists and strftime with %W.
Solution
Here is the complete code (the comments in the code explain it):
from datetime import datetime  # The question's "import datetime" makes datetime.strptime fail; import the class from the module instead

def main():
    file = open("2018Electricity.txt", "r")
    line = file.readline()  # Reads and discards the first line (this assumes the file starts with a header row)
    time_list = []
    electricity_list = []
    for i in file:
        time = i.strip().split(';')[0]
        electricity = i.strip().split(';')[1]
        datee = datetime.strptime(time, '%d.%m.%Y %H:%M')
        if datee.month != 12:
            time_list.append(datee.isocalendar()[1])  # ISO week number
        else:
            if datee.isocalendar()[1] == 1:
                # Late-December days that ISO assigns to week 1 of the next year count as week 53
                time_list.append(53)
            else:
                time_list.append(datee.isocalendar()[1])
        electricity_list.append(int(electricity))  # Converts electricity to an integer and appends it to electricity_list
    file.close()
    week_numbers = sorted(set(time_list))  # Removes repeated week numbers and sorts them
    for week_number in week_numbers:  # Iterates over the distinct week numbers
        curr_elec = 0
        for week, elec in zip(time_list, electricity_list):  # Pairs each week number with its reading
            if week == week_number:
                curr_elec += elec  # Running total of the electricity for the current week
        print(f"{week_number};{curr_elec}")

main()
Output:
1;5120
5;1700
6;2340
11;1800
15;2850
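
The question asks for weeks 0 to 53 computed with strftime and %W, while the code above relies on ISO week numbers. For 2018 the two agree, because 1 January 2018 is a Monday, but they differ in other years. A minimal sketch of the %W variant follows; the function name weekly_totals and the in-memory sample list are assumptions made for illustration, not part of the original code.

```python
from datetime import datetime


def weekly_totals(lines):
    """Sum the values per %W week number; the result is indexed by week (0-53)."""
    totals = [0] * 54  # one slot for every possible %W value, 0 through 53
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        stamp, value = line.split(';')
        moment = datetime.strptime(stamp, '%d.%m.%Y %H:%M')
        week = int(moment.strftime('%W'))  # Monday-based week of the year
        totals[week] += int(value)
    return totals


# Hypothetical sample input: the six rows shown in the question
sample = [
    "1.1.2018 0:00;2590",
    "3.1.2018 1:00;2530",
    "4.2.2018 2:00;1700",
    "6.2.2018 3:00;2340",
    "18.3.2018 4:00;1800",
    "15.4.2018 5:00;2850",
]

totals = weekly_totals(sample)
for week, total in enumerate(totals):
    print(f"{week};{total}")
```

Because the list is pre-filled with zeros, weeks without data (such as week 0 in 2018) still appear in the output as 0, which matches the format requested in the question. To run this against the real file, an open file object can be passed instead of the sample list; if the file starts with a header row, that row would need to be skipped first.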
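To me, a pandas DataFrame looks like the right tool for this job. Read the CSV into a DataFrame, parse the date/time column to datetime, group by week number, and use sum as the aggregation function: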
from io import StringIO  # for demo only
import pandas as pd

data = """datetime;values
1.1.2018 0:00;2590
3.1.2018 1:00;2530
4.2.2018 2:00;1700
6.2.2018 3:00;2340
18.3.2018 4:00;1800
15.4.2018 5:00;2850"""

df = pd.read_csv(StringIO(data), sep=';', parse_dates=['datetime'], dayfirst=True)
df.groupby(df.datetime.dt.isocalendar().week)['values'].sum()
Out[8]:
week
1     5120
5     1700
6     2340
11    1800
15    2850
Name: values, dtype: int64
You can easily write this result to a CSV file; see DataFrame.to_csv.
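
As a rough sketch of that last step, the snippet below groups by the %W week number the question asks for, fills in the weeks that have no data, and produces the semicolon-separated text. The variable names (weeks, weekly, csv_text) are assumptions for illustration, and the explicit format string passed to pd.to_datetime is a swapped-in alternative to parse_dates with dayfirst=True.

```python
from io import StringIO  # stands in for the real file in this demo

import pandas as pd

data = """datetime;values
1.1.2018 0:00;2590
3.1.2018 1:00;2530
4.2.2018 2:00;1700
6.2.2018 3:00;2340
18.3.2018 4:00;1800
15.4.2018 5:00;2850"""

df = pd.read_csv(StringIO(data), sep=';')
# Parse with an explicit day-first format so there is no ambiguity
df['datetime'] = pd.to_datetime(df['datetime'], format='%d.%m.%Y %H:%M')

# %W is the Monday-based week of the year, ranging from 0 to 53
weeks = df['datetime'].dt.strftime('%W').astype(int).rename('week')

weekly = (
    df.groupby(weeks)['values'].sum()
    .reindex(range(54), fill_value=0)  # weeks without data get a total of 0
)

# With no path given, to_csv returns the CSV text as a string
csv_text = weekly.to_csv(sep=';', header=False)
print(csv_text)
```

Passing a file name as the first argument, for example weekly.to_csv("weekly_totals.csv", sep=';', header=False) with a hypothetical file name, writes the same text to disk instead of returning it.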