Adding up rows from a CSV into a new file

Problem description

Hi, I'd like to add up numbers from a document that I read from a CSV file.

For example, my CSV looks like this

Date,Customer number,Customer,Project number,Project,Worked time
2020,2020010,Apple,12345,Buying laptops,1,00
2020,4,3,Nokia,98738,Buying phones,00

I want to output it to a CSV file and have the script sum up the worked time for each customer, like this

Apple, 11 Nokia, 5

So far I only have this

 
import csv

results = []
with open('Time_export.csv') as File:
    reader = csv.DictReader(File)
    for row in reader:
        results.append(row)
    print(results)

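I'm a newbie at this :) I've been trying to google it but can't figure it out :( Any ideas?

Solutions

Use a dictionary to store the customer names and totals: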

import csv

data = '''
Date,Customer number,Customer,Project number,Project,Worked time
2020,2020010,Apple,12345,Buying laptops,1,00
2020,4,3,Nokia,98738,Buying phones,00
'''.strip()

with open('Time_export.csv','w') as f: f.write(data)  # write test file

################################

cust = {}  # customer totals

with open('Time_export.csv') as File:
    reader = csv.DictReader(File)
    for row in reader:
        if row['Customer'] in cust:
            cust[row['Customer']] += int(row['Worked time'])
        else:
            cust[row['Customer']] = int(row['Worked time'])

    print(cust)

Output

{'Apple': 11, 'Nokia': 5}

If you want to try pandas, the code gets shorter:

import pandas
df = pandas.read_csv('Time_export.csv',index_col=False )
df['Worked time'] = df['Worked time'].astype(int)
gb = df.groupby('Customer')["Worked time"].sum().reset_index()
print(gb.to_string(index=False))

Output

Customer  Worked time
   Apple           11
   Nokia            5

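I find collections.defaultdict useful for this kind of thing. It automatically creates new key/value pairs as needed. In this case, using int as the default factory creates a 0 whenever a key is missing.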

import csv
import collections

with open('Time_export.csv') as File:
    results = collections.defaultdict(int)
    reader = csv.DictReader(File)
    for row in reader:
        results[row['Customer']] += int(row['Worked time'])

for name,num in sorted(results.items()):
    print(f"{name}: {num}")
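
To make the "creates a 0 as needed" behaviour concrete, here is a minimal sketch. The hour values are made up for illustration and are not read from the asker's file.

```python
import collections

totals = collections.defaultdict(int)

# Illustrative values only, not taken from Time_export.csv.
totals["Apple"] += 6  # "Apple" is missing, so int() supplies 0 first
totals["Apple"] += 5
totals["Nokia"] += 5
print(dict(totals))  # {'Apple': 11, 'Nokia': 5}

# Caveat: merely reading a missing key also inserts it with value 0.
print(totals["Volvo"])    # 0
print("Volvo" in totals)  # True
```

If that auto-insert on read is unwanted, totals.get("Volvo", 0) looks the key up without adding it.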

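pandas is a powerful library for working with tables. It has a steep learning curve, but it is worth the effort. Your data uses a comma inside the "Worked time" column, which makes it invalid CSV. If you change the comma to "." or escape it properly, you can do the job in a few lines of code.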

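The code below groups by customer, drops everything except the "Worked time" column, and then sums each group. The result is a Series object, which behaves much like a dictionary: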

import pandas as pd
df = pd.read_csv('Time_export.csv')
sums = df.groupby("Customer")["Worked time"].sum()
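
On the point about the comma making the file invalid CSV: a small standard-library sketch of the difference between an unquoted and a quoted decimal comma. The two-column layout and the sample values are hypothetical, chosen only to keep the example short.

```python
import csv
import io

header = "Customer,Worked time\n"  # hypothetical two-column layout
unquoted = header + "Apple,1,00\n"
quoted = header + 'Apple,"1,00"\n'

def fields_per_row(text):
    """Return the number of fields csv.reader finds on each line."""
    return [len(row) for row in csv.reader(io.StringIO(text))]

print(fields_per_row(unquoted))  # [2, 3]  the data row gains a stray field
print(fields_per_row(quoted))    # [2, 2]  the quotes keep "1,00" together

row = next(csv.DictReader(io.StringIO(quoted)))
print(row["Worked time"])  # 1,00
```

With the field quoted, the value arrives as the single string '1,00', which still has to be converted to a number before it can be summed.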

I tried

import csv
import collections

with open('Time.csv') as File:
    results = collections.defaultdict(int)
    reader = csv.DictReader(File)
    for row in reader:
        results[row['Customer']] += int(row['Worked time'])

for name,num in sorted(results.items()):
    print(f"{name}: {num}")

but got this result

Traceback (most recent call last):
  File "/Users/stoffe/Desktop/Python/Time.py", line 8, in <module>
    results[row['Customer']] += int(row['Worked time'])
ValueError: invalid literal for int() with base 10: '1,00'
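
The traceback says int() was handed the string '1,00', which it cannot parse. One possible workaround, assuming the comma is a decimal separator and no thousands separators appear in the column, is a small helper (parse_hours is a name invented for this sketch):

```python
def parse_hours(text):
    """Convert a decimal-comma string such as '1,00' to a float.

    Assumes the comma is a decimal separator, not a thousands separator.
    """
    return float(text.strip().replace(",", "."))

try:
    int("1,00")
except ValueError as exc:
    print(exc)  # the same error message as in the traceback

print(parse_hours("1,00"))  # 1.0
print(parse_hours("2,5"))   # 2.5
```

In the loop above, parse_hours(row['Worked time']) would replace int(row['Worked time']), and defaultdict(float) would be the matching default.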

Most things work now, but I'm still having some trouble

import csv
import collections

f = open('./Time_export.csv','r')
a = [',00']
lst = []
for line in f:
    for word in a:
        if word in line:
            line = line.replace(word,'')
    lst.append(line)
f.close()
f = open('./Time_export.csv','w')
for line in lst:
    f.write(line)
f.close()

with open('Time_export.csv') as File:
    results = collections.defaultdict(int)
    reader = csv.DictReader(File)
    for row in reader:
        print(row['Project'],row['Service'],row['Worked time'])
        
        f = open('./Time.csv','w')
for name,num in sorted(results.items()):
    f.write(f"{name}: {num}")
    f.close()

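I open the file to strip the ,00 after the hours, but for some reason I get one line per entry instead of the numbers being added up per project. The results show up in the terminal window, but the Time.csv file is still empty.

It looks like this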

Apple Cleaning  6
Volvo Installing 4
AFRY Window Cleaning 5
Apple Cleaning 1
Apple Building 1
AFRY Window Cleaning 2
Donald Duck Writing 12
Donald Duck  Reading 2

Any ideas?
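
A likely cause, judging only from the code shown: the loop prints each row but never adds anything to results, so the dictionary stays empty and the final loop has nothing to write. Time.csv is also reopened in 'w' mode on every row, which truncates it each time. Below is a rough sketch of one way to restructure it. The sample data, the quoted "6,00" style values, and the function names summarize and write_totals are assumptions for illustration; io.StringIO stands in for the real files so the example runs on its own.

```python
import collections
import csv
import io

# Hypothetical sample standing in for Time_export.csv. The column names
# come from the post; the quoted decimal-comma values are an assumption.
sample = '''Project,Service,Worked time
Apple,Cleaning,"6,00"
Volvo,Installing,"4,00"
Apple,Cleaning,"1,00"
'''

def summarize(infile):
    """Sum worked time per (Project, Service) pair."""
    totals = collections.defaultdict(float)
    for row in csv.DictReader(infile):
        hours = float(row["Worked time"].replace(",", "."))
        totals[(row["Project"], row["Service"])] += hours
    return totals

def write_totals(totals, outfile):
    """Write a header and one row per (Project, Service) pair, sorted."""
    writer = csv.writer(outfile)
    writer.writerow(["Project", "Service", "Worked time"])
    for (project, service), hours in sorted(totals.items()):
        writer.writerow([project, service, hours])

totals = summarize(io.StringIO(sample))
out = io.StringIO()
write_totals(totals, out)
print(out.getvalue())
```

With real files, the same two functions could be called inside with open('Time_export.csv', newline='') and with open('Time.csv', 'w', newline='') blocks, so the output file is opened once and closed automatically, and the input file is never rewritten.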
