python - How to ignore a field's data when comparing two files in Python

Reposted. Author: 太空宇宙. Updated: 2023-11-03 16:24:52
The input files are shown below. Their field schema is Mode|Date|Count|timestamp|status|insertTimeStamp

test1.txt:
HR|06/08/2016|3000|Thu Jun 09 2016|Complete|20160627020300
HR|06/08/2016|2000|Thu Jun 09 2016|Complete|20160627020400
HR|06/08/2016|1000|Thu Jun 09 2016|Complete|20160627020500
test2.txt:
HR|06/08/2016|3010|Thu Jun 09 2016|Complete|20160627070300
HR|06/08/2016|2000|Fri Jun 09 2016|Complete|20160627080300
HR|06/08/2016|1500|Thu Jun 09 2016|Complete|20160627090300

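My requirement is to find the lines that differ between the two files, but the comparison should ignore the insertTimeStamp field (the last column).

I tried the code below. It works, but it compares each line as a whole, so the timestamp is included. Can someone suggest how to skip the insertTimeStamp field during the comparison?

Thanks in advance for your help.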

import difflib
import sys

with open('/tmp/test1.txt', 'r') as hosts0:
    with open('/tmp/test2.txt', 'r') as hosts1:
        diff = difflib.unified_diff(
            hosts0.readlines(),
            hosts1.readlines(),
            fromfile='hosts0',
            tofile='hosts1',
            n=0,
        )
        for line in diff:
            # Skip the diff headers and hunk markers
            for prefix in ('---', '+++', '@@'):
                if line.startswith(prefix):
                    break
            else:
                # Drop the leading '+' or '-' and print the line itself
                sys.stdout.write(line[1:])

Best answer

You can slice off the last field of each line before passing the lines to the diff function:

diff = difflib.unified_diff(
    ['|'.join(x.split('|')[:-1]) for x in hosts0.readlines()],
    ['|'.join(x.split('|')[:-1]) for x in hosts1.readlines()],
    fromfile='hosts0',
    tofile='hosts1',
    n=0,
)
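
One detail of the slicing approach is easy to miss: the last field carries the trailing newline, so removing it also removes the line ending, and output written with sys.stdout.write would run together. The following is a minimal sketch of one way to handle that, not the answerer's own code. The helper names strip_last_field and diff_ignoring_last_field are hypothetical, and the sample rows are held in lists here instead of being read from /tmp.

```python
import difflib
from itertools import islice


def strip_last_field(line, sep='|'):
    """Return the line without its final sep-delimited field or trailing newline."""
    return line.rstrip('\n').rsplit(sep, 1)[0]


def diff_ignoring_last_field(lines_a, lines_b, sep='|'):
    """Return changed lines, prefixed with '-' or '+', ignoring the last field."""
    a = [strip_last_field(x, sep) for x in lines_a]
    b = [strip_last_field(x, sep) for x in lines_b]
    # lineterm='' keeps the header lines free of newlines, matching our data lines
    diff = difflib.unified_diff(a, b, fromfile='hosts0', tofile='hosts1',
                                n=0, lineterm='')
    # The first two lines are the '---'/'+++' file headers; '@@' lines mark hunks
    return [d for d in islice(diff, 2, None) if not d.startswith('@@')]


# Sample rows copied from test1.txt and test2.txt in the question
test1 = [
    'HR|06/08/2016|3000|Thu Jun 09 2016|Complete|20160627020300\n',
    'HR|06/08/2016|2000|Thu Jun 09 2016|Complete|20160627020400\n',
    'HR|06/08/2016|1000|Thu Jun 09 2016|Complete|20160627020500\n',
]
test2 = [
    'HR|06/08/2016|3010|Thu Jun 09 2016|Complete|20160627070300\n',
    'HR|06/08/2016|2000|Fri Jun 09 2016|Complete|20160627080300\n',
    'HR|06/08/2016|1500|Thu Jun 09 2016|Complete|20160627090300\n',
]

for changed in diff_ignoring_last_field(test1, test2):
    print(changed)
```

With the sample data every row differs in a compared field (Count or timestamp), so the sketch prints the three rows of test1.txt prefixed with - followed by the three rows of test2.txt prefixed with +. Rows that differ only in insertTimeStamp produce no output.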

A line-by-line comparison without difflib:

with open('/tmp/test1.txt', 'r') as fh:
    hosts1 = fh.readlines()
with open('/tmp/test2.txt', 'r') as fh:
    hosts2 = fh.readlines()

# Compare corresponding lines on every field except the last one
for h1, h2 in zip(hosts1, hosts2):
    if h1.split('|')[:-1] != h2.split('|')[:-1]:
        print('Lines are not the same!')
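
The loop above reports that a mismatch exists but not where, and zip stops at the end of the shorter file. A possible extension, offered only as a sketch, reports the line number of each mismatch and treats a missing line as a difference. The names key_fields and differing_lines are hypothetical and not part of the original answer.

```python
from itertools import zip_longest


def key_fields(line, sep='|'):
    """Return every field except the last; None (a missing line) stays None."""
    if line is None:
        return None
    return line.rstrip('\n').split(sep)[:-1]


def differing_lines(lines_a, lines_b, sep='|'):
    """Yield (line_number, line_a, line_b) where the compared fields differ."""
    pairs = zip_longest(lines_a, lines_b)  # pads the shorter input with None
    for number, (a, b) in enumerate(pairs, start=1):
        if key_fields(a, sep) != key_fields(b, sep):
            yield number, a, b


# Sample rows copied from test1.txt and test2.txt in the question
test1 = [
    'HR|06/08/2016|3000|Thu Jun 09 2016|Complete|20160627020300\n',
    'HR|06/08/2016|2000|Thu Jun 09 2016|Complete|20160627020400\n',
    'HR|06/08/2016|1000|Thu Jun 09 2016|Complete|20160627020500\n',
]
test2 = [
    'HR|06/08/2016|3010|Thu Jun 09 2016|Complete|20160627070300\n',
    'HR|06/08/2016|2000|Fri Jun 09 2016|Complete|20160627080300\n',
    'HR|06/08/2016|1500|Thu Jun 09 2016|Complete|20160627090300\n',
]

for number, old, new in differing_lines(test1, test2):
    print('Line %d differs' % number)
```

Unlike the difflib version, this compares lines strictly by position, so a single inserted row makes every following row count as different.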

Regarding python - How to ignore a field's data when comparing two files in Python, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/38081143/
