Finding differences between files without checking line by line in Python

Problem description

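I am trying to check for differences between two output files that contain IP addresses and subnets. The entries have been stripped out of source files and stored in output1.txt and output2.txt. I am struggling with the comparison. The files do not always have the same number of lines, so a line-by-line comparison does not seem to be an option. For example, both files may contain the IP address 192.168.1.1, but it may be on line 1 in output1.txt and on line 60 in output2.txt. How can I find the strings that are not present in both files?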

Code below:

import difflib


with open('input1.txt','r') as f:
    with open('output1.txt','w') as g:
        for line in f:
            ipaddress = line.split(None,1)[0]
            g.write(ipaddress + "\n")
with open('input2.txt','r') as f:
    with open('output2.txt','w') as g:
        for line in f:
            ipaddress = line.split(None,1)[0]
            g.write(ipaddress + "\n")

with open('output1.txt','r') as output1,open('output2.txt','r') as output2:
    output1_text = output1.read()
    output2_text = output2.read()
    d = difflib.Differ()
    diff = d.compare(output1_text,output2_text)
    print(''.join(diff))

Eventually I want to write the differences to a file, but for now just printing the result is fine.

Any help is appreciated.

Thanks.

Solution

You probably want a set comparison:

with open('output1.txt') as fh1,open('output2.txt') as fh2:
    # collect lines into sets
    set1,set2 = set(fh1),set(fh2)
    
diff = set1.symmetric_difference(set2)

print(''.join(diff))

where symmetric_difference will:

Return a new set with elements that are in either one set or the other, but not in both.
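
Since the question also asks to write the differences to a file eventually, here is a minimal sketch of one way to extend the set approach. The helper names (read_entries, split_difference, write_report) and the report format are assumptions made for illustration and are not part of the original answer. The sketch strips whitespace from each line, so an entry on a last line without a trailing newline still matches the same entry elsewhere, and it uses two one-way set differences so the report shows which file each entry came from.

```python
def read_entries(path):
    """Return the set of non-empty lines in a file, with whitespace stripped."""
    with open(path) as fh:
        return {line.strip() for line in fh if line.strip()}


def split_difference(path1, path2):
    """Return (only_in_first, only_in_second) as sorted lists of strings."""
    first, second = read_entries(path1), read_entries(path2)
    # Set subtraction ignores line order, so position in the file does not matter.
    # sorted() orders the entries as plain strings, not numerically by octet.
    return sorted(first - second), sorted(second - first)


def write_report(path1, path2, report_path):
    """Write one line per differing entry, labelled with the file it came from."""
    only_first, only_second = split_difference(path1, path2)
    with open(report_path, "w") as out:
        out.writelines(f"only in {path1}: {entry}\n" for entry in only_first)
        out.writelines(f"only in {path2}: {entry}\n" for entry in only_second)
```

Under these assumptions, calling write_report('output1.txt', 'output2.txt', 'diff.txt') would produce the report, where diff.txt is a hypothetical output name. The union of the two returned lists is the same set that symmetric_difference yields once the lines are stripped.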