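Problem description
I am trying to check the differences between two output files that contain IP addresses and subnets. The addresses have been stripped out of two input files and stored in output1.txt and output2.txt. I am struggling with the comparison. The files do not always have the same number of lines, so a line-by-line comparison does not seem to be an option. For example, both files can contain the IP address 192.168.1.1, but it may be on line 1 of output1.txt and on line 60 of output2.txt. How can I find the strings that are missing from one file or the other?
Code below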
import difflib

# Extract the first whitespace-separated field (the IP address) from each line
with open('input1.txt', 'r') as f:
    with open('output1.txt', 'w') as g:
        for line in f:
            ipaddress = line.split(None, 1)[0]
            g.write(ipaddress + "\n")

with open('input2.txt', 'r') as f:
    with open('output2.txt', 'w') as g:
        for line in f:
            ipaddress = line.split(None, 1)[0]
            g.write(ipaddress + "\n")

# Attempted comparison of the two extracted files
with open('output1.txt', 'r') as output1, open('output2.txt', 'r') as output2:
    output1_text = output1.read()
    output2_text = output2.read()
    d = difflib.Differ()
    diff = d.compare(output1_text, output2_text)
    print(''.join(diff))
Eventually I want to write the differences to a file, but for now just printing the result is fine.
Any help is appreciated. Thanks.
Solution
You probably want a set comparison:
with open('output1.txt') as fh1, open('output2.txt') as fh2:
    # collect lines into sets
    set1, set2 = set(fh1), set(fh2)

diff = set1.symmetric_difference(set2)
print(''.join(diff))
Here, symmetric_difference will:
Return a new set with elements in either set1 or set2, but not in both.
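Since the question mentions eventually writing the differences to a file, here is a minimal sketch of one way that might look. It is only an illustration under a few assumptions: the helper names diff_lines and write_diff and the "<" / ">" prefixes are made up for this example, and each line is stripped of surrounding whitespace so that a missing trailing newline on the last line does not show up as a false difference. It uses two one-sided set differences instead of symmetric_difference so the result also says which file each address came from.

```python
def diff_lines(path1, path2):
    """Return (only_in_first, only_in_second) as sorted lists of lines.

    Hypothetical helper: lines are stripped and blank lines are skipped.
    """
    with open(path1) as fh1, open(path2) as fh2:
        set1 = {line.strip() for line in fh1 if line.strip()}
        set2 = {line.strip() for line in fh2 if line.strip()}
    # Plain string sort, so the order is lexicographic, not numeric by IP
    return sorted(set1 - set2), sorted(set2 - set1)


def write_diff(path1, path2, out_path):
    """Write the differences to out_path, one entry per line.

    Assumed format: "< " marks entries found only in the first file,
    "> " marks entries found only in the second file.
    """
    only_in_first, only_in_second = diff_lines(path1, path2)
    with open(out_path, 'w') as out:
        for entry in only_in_first:
            out.write(f"< {entry}\n")
        for entry in only_in_second:
            out.write(f"> {entry}\n")
```

With the files from the question, this could be called as write_diff('output1.txt', 'output2.txt', 'diff.txt'), where diff.txt is just a placeholder name for the result file.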