Adding up rows from a CSV into a new file

Problem description

Hi, I want to summarize a document read from a CSV file by adding up the numbers in it.

For example, my CSV looks like this

Date,Customer number,Customer,Project number,Project,Worked time
2020,2020010,Apple,12345,Buying laptops,1,00
2020,4,3,Nokia,98738,Buying phones,00

I want to output this to a CSV file and have the script total the worked time for each customer, like this

Apple, 11
Nokia, 5

So far I only have this

 
import csv

results = []
with open('Time_export.csv') as File:
    reader = csv.DictReader(File)
    for row in reader:
        results.append(row)

print(results)

I'm a newbie at this :) I've been trying to Google it but can't figure it out :( Any ideas?

Solution

Use a dictionary to store the customer names and totals:

import csv

data = '''
Date,Customer number,Customer,Project number,Project,Worked time
2020,2020010,Apple,12345,Buying laptops,1,00
2020,4,3,Nokia,98738,Buying phones,00
'''.strip()

with open('Time_export.csv','w') as f: f.write(data)  # write test file

################################

cust = {}  # customer totals

with open('Time_export.csv') as File:
    reader = csv.DictReader(File)
    for row in reader:
        if row['Customer'] in cust:
            cust[row['Customer']] += int(row['Worked time'])
        else:
            cust[row['Customer']] = int(row['Worked time'])

print(cust)

Output

{'Apple': 11, 'Nokia': 5}
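
The question also asks for the totals to end up in a CSV file, while the snippet above only prints them. The following is a minimal sketch of that last step, not a definitive implementation. It assumes a well-formed input in which "Worked time" holds whole numbers; the sample rows and the helper names total_by_customer and write_totals are made up for illustration, and io.StringIO stands in for real files.

```python
import csv
import io

# Hypothetical, well-formed sample rows (not the asker's real export).
SAMPLE = """Customer,Worked time
Apple,6
Nokia,5
Apple,5
"""

def total_by_customer(lines):
    """Sum the 'Worked time' column per 'Customer'."""
    totals = {}
    for row in csv.DictReader(lines):
        name = row["Customer"]
        # dict.get supplies 0 the first time a customer is seen.
        totals[name] = totals.get(name, 0) + int(row["Worked time"])
    return totals

def write_totals(totals, out):
    """Write a header plus one 'customer,total' row per customer, sorted by name."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Customer", "Worked time"])
    for name in sorted(totals):
        writer.writerow([name, totals[name]])

totals = total_by_customer(io.StringIO(SAMPLE))
buffer = io.StringIO()  # stands in for an output file
write_totals(totals, buffer)
print(buffer.getvalue(), end="")
```

With real files, the same two functions could be given open('Time_export.csv', newline='') for reading and a file opened with mode 'w' and newline='' for writing.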

If you want to try pandas, the code gets shorter:

import pandas
df = pandas.read_csv('Time_export.csv',index_col=False )
df['Worked time'] = df['Worked time'].astype(int)
gb = df.groupby('Customer')["Worked time"].sum().reset_index()
print(gb.to_string(index=False))

Output

Customer  Worked time
   Apple           11
   Nokia            5

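I find collections.defaultdict useful for this kind of thing. It automatically creates new key/value pairs as needed. Here, using int as the default factory means a missing key starts at 0.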

import csv
import collections

with open('Time_export.csv') as File:
    results = collections.defaultdict(int)
    reader = csv.DictReader(File)
    for row in reader:
        results[row['Customer']] += int(row['Worked time'])

for name,num in sorted(results.items()):
    print(f"{name}: {num}")

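pandas is a powerful library for working with tables. It has a steep learning curve, but it is worth the effort. Your data uses a comma inside the "Worked time" column, which makes it invalid CSV. If you change that to "." or quote the field properly, you can do the job in a few lines of code.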


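This groups the rows by customer, keeps only the "Worked time" column, and then sums each group. The result is a Series object, which behaves much like a dictionary: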

import pandas as pd
df = pd.read_csv('Time_export.csv')
sums = df.groupby("Customer")["Worked time"].sum()

I tried

import csv
import collections

with open('Time.csv') as File:
    results = collections.defaultdict(int)
    reader = csv.DictReader(File)
    for row in reader:
        results[row['Customer']] += int(row['Worked time'])

for name,num in sorted(results.items()):
    print(f"{name}: {num}")

But I got this result

Traceback (most recent call last):
  File "/Users/stoffe/Desktop/Python/Time.py", line 8, in <module>
    results[row['Customer']] += int(row['Worked time'])
ValueError: invalid literal for int() with base 10: '1,00'
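
The traceback shows int() being handed '1,00', a number written with a decimal comma, which int() cannot parse. One possible way around it is sketched below. It assumes the value is quoted in the file so that the csv module keeps it as a single field; the sample rows and the helper name parse_hours are hypothetical. Decimal is used instead of int so that fractional hours are not lost.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Hypothetical sample: the decimal-comma values are quoted, so the csv
# module keeps "1,00" together as one field.
SAMPLE = '''Customer,Worked time
Apple,"1,00"
Nokia,"2,50"
Apple,"3,25"
'''

def parse_hours(text):
    """Turn a decimal-comma string such as '1,00' into a Decimal."""
    return Decimal(text.strip().replace(",", "."))

hours = defaultdict(Decimal)  # Decimal() is 0, so missing keys start at zero
for row in csv.DictReader(io.StringIO(SAMPLE)):
    hours[row["Customer"]] += parse_hours(row["Worked time"])

for name, total in sorted(hours.items()):
    print(f"{name}: {total}")
```

If the export does not quote the field, the extra comma splits it into two columns, and the file has to be repaired (or re-exported with a different delimiter) before any parser can read it reliably.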

I have most things working now, but I'm still having some trouble

import csv
import collections

f = open('./Time_export.csv','r')
a = [',00']
lst = []
for line in f:
    for word in a:
        if word in line:
            line = line.replace(word,'')
    lst.append(line)
f.close()
f = open('./Time_export.csv','w')
for line in lst:
    f.write(line)
f.close()

with open('Time_export.csv') as File:
    results = collections.defaultdict(int)
    reader = csv.DictReader(File)
    for row in reader:
        print(row['Project'],row['Service'],row['Worked time'])
        
        f = open('./Time.csv','w')
for name,num in sorted(results.items()):
    f.write(f"{name}: {num}")
    f.close()

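I open the file to strip the ,00 after the hours, but for some reason I get one line per entry instead of the numbers being added up for each project. The results show up in the terminal window, but the Time.csv file is still empty.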

It looks like this

Apple Cleaning  6
Volvo Installing 4
AFRY Window Cleaning 5
Apple Cleaning 1
Apple Building 1
AFRY Window Cleaning 2
Donald Duck Writing 12
Donald Duck  Reading 2

Any ideas?
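
In the last script, the loop only prints each row and never adds anything to results, so the final loop has nothing to write. Time.csv is also reopened in 'w' mode for every row, which truncates it each time. A minimal sketch of one way to restructure it follows: accumulate per (Project, Service) pair first, then write once after reading. The rows are copied from the listing above, the column names are the ones used in the print() call, and the helper names summarize and write_summary are hypothetical. It assumes "Worked time" holds whole numbers once the ,00 has been stripped.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the listing in the question; the column names
# are the ones used in the question's print() call.
SAMPLE = """Project,Service,Worked time
Apple,Cleaning,6
Volvo,Installing,4
AFRY,Window Cleaning,5
Apple,Cleaning,1
Apple,Building,1
AFRY,Window Cleaning,2
Donald Duck,Writing,12
Donald Duck,Reading,2
"""

def summarize(lines):
    """Add up worked time per (project, service) pair."""
    totals = defaultdict(int)
    for row in csv.DictReader(lines):
        key = (row["Project"].strip(), row["Service"].strip())
        totals[key] += int(row["Worked time"])
    return totals

def write_summary(totals, out):
    """Write the totals once, after all rows have been read."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Project", "Service", "Worked time"])
    for (project, service), hours in sorted(totals.items()):
        writer.writerow([project, service, hours])

totals = summarize(io.StringIO(SAMPLE))
report = io.StringIO()  # stands in for Time.csv
write_summary(totals, report)
print(report.getvalue(), end="")
```

With real files, the output would be opened a single time, for example in a with block using mode 'w' and newline='', so that it is closed only after every total has been written.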

python sum summary
