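How do I find and extract values from a txt file?

Problem description

Write a program that prompts for a file name, then opens that file and reads through it, looking for lines of the form: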

X-DSPAM-Confidence: 0.8475

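Count these lines, extract the floating-point value from each one, then compute the average of those values and produce output as shown below. Do not use the sum() function or a variable named sum in your solution.

Here is my code: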

fname = input("Enter a file name:",)
fh = open(fname)
count = 0
# this variable is to add together all the 0.8745's in every line
num = 0
for ln in fh:
    ln = ln.rstrip()
    count += 1
    if not ln.startswith("X-DSPAM-Confidence:    ") : continue
    for num in fh:
        if ln.find(float(0.8475)) == -1:
            num += float(0.8475)
        if not ln.find(float(0.8475)) : break
    # problem: values aren't adding together and gq variable ends up being zero
gq = int(num)
jp = int(count)
avr = (gq)/(jp)
print ("Average spam confidence:",float(avr))

The problem is that when I run the code, it reports an error because the value of num is zero. So I get this:

ZeroDivisionError: division by zero

When I change the initial value of num to None, a similar problem occurs:

int() argument must be a string or a number, not 'NoneType'

The Coursera Python autograder also rejects the code when I put this at the top:

from __future__ import division

The sample data file they gave us is named "mbox-short.txt". Here is the link: http://www.py4e.com/code3/mbox-short.txt

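Solution

I edited your code as shown below. As I understand it, your task is to find the number next to X-DSPAM-Confidence:. I kept your check for identifying the X-DSPAM-Confidence: lines, but without the trailing spaces in the prefix, and moved the count increment after that check so that only matching lines are counted. I then split each matching line on ':', took the element at index 1, and converted it to float.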

fname = input("Enter a file name:")
fh = open(fname)
count = 0
# running total of the confidence values found on matching lines
num = 0
for ln in fh:
    ln = ln.rstrip()
    # skip every line that is not an X-DSPAM-Confidence line
    if not ln.startswith("X-DSPAM-Confidence:"):
        continue
    count += 1
    # the text after the colon is the value; float() ignores the leading space
    num += float(ln.split(":")[1])
avr = num / count
print("Average spam confidence:", avr)
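To illustrate the split step in isolation, here is a minimal sketch on a single hard-coded line, using the sample value from the question. It relies on float() ignoring leading whitespace in the string it parses; the variable names are only for illustration.

```python
# One sample line in the format described in the question.
line = "X-DSPAM-Confidence: 0.8475"

# Splitting on ":" gives the header name and the value (with a leading space).
parts = line.split(":")

# float() accepts surrounding whitespace, so no extra strip() is needed here.
value = float(parts[1])

print(parts)
print(value)
```

Splitting on ':' works here because the line contains exactly one colon; a line with more colons would need a different approach, such as slicing off the prefix.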
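  • Open the file using with, so the file is closed automatically.
  • See the inline comments.
  • The lines of interest have the form X-DSPAM-Confidence: 0.6961, so split them on the space.
    • 'X-DSPAM-Confidence: 0.6961'.split(' ') creates a list, with the number at list index 1.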
fname = input("Enter a file name:")
with open(fname) as fh:
    count = 0
    num = 0  # collect and add each found value
    for ln in fh:
        ln = ln.rstrip()
        if not ln.startswith("X-DSPAM-Confidence:"):  # find this string or continue to next ln
            continue
        num += float(ln.split(' ')[1])  # split on the space and add the float
        count += 1  # increment count for each matching line
    avr = num / count  # compute average
    print(f"Average spam confidence: {avr}")  # print value
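Both versions above divide by count, which raises ZeroDivisionError if the file contains no matching lines. As a rough sketch of one way to guard against that, the logic can be wrapped in a function that returns None when nothing matches. The function name average_confidence and its prefix parameter are hypothetical, not part of the assignment, and the sample lines are made up for the demonstration.

```python
def average_confidence(lines, prefix="X-DSPAM-Confidence:"):
    """Return the mean of the values on matching lines, or None if none match.

    Hypothetical helper for illustration; it avoids sum() and any variable
    named sum, as the assignment requires.
    """
    total = 0.0
    count = 0
    for line in lines:
        line = line.strip()
        if not line.startswith(prefix):
            continue
        # Everything after the prefix is the number; float() ignores the space.
        total += float(line[len(prefix):])
        count += 1
    if count == 0:
        return None  # no matching lines, so there is no average to report
    return total / count


# Made-up sample lines standing in for the contents of a file.
sample = [
    "From: someone@example.com",
    "X-DSPAM-Confidence: 0.8475",
    "X-DSPAM-Probability: 0.0000",
    "X-DSPAM-Confidence: 0.6178",
]
print("Average spam confidence:", average_confidence(sample))
```

Because an open file object is iterable line by line, the same function could be called as average_confidence(fh) inside the with block.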