Python difflib deltas and comparing with ndiff

Reposted. Author: 太空狗. Updated: 2023-10-30 02:00:20

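I am trying to do something that I believe change control systems do: compare two files and save a small diff each time the file changes. I have been reading this page: http://docs.python.org/library/difflib.html and it is apparently not sinking in.

I tried to recreate this in the fairly simple program shown below, but what I seem to be missing is that the deltas contain at least as much as the original file, and more.

Is it not possible to get just the pure changes? The reason I ask is hopefully obvious - to save disk space.
I could save the entire chunk of code each time, but it would be better to save the current code once, then small diffs of the changes.

I'm also still trying to figure out why many difflib functions return a generator instead of a list, what's the advantage there?

Will difflib work for me, or do I need to find a more professional package with more features?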

# Python difflib demo
# Author: Neal Walters
# loosely based on http://ahlawat.net/wordpress/?p=371
# 01/17/2011

import difflib

# build the files here - later we will probably just read the files
file1Contents = """
for j = 1 to 10:
print "ABC"
print "DEF"
print "HIJ"
print "JKL"
print "Hello World"
print "j=" + j
print "XYZ"
"""

file2Contents = """
for j = 1 to 10:
print "ABC"
print "DEF"
print "HIJ"
print "JKL"
print "Hello World"
print "XYZ"
print "The end"
"""

filename1 = "diff_file1.txt"
filename2 = "diff_file2.txt"

with open(filename1, "w") as file1:
    file1.write(file1Contents)
with open(filename2, "w") as file2:
    file2.write(file2Contents)
# end of file build

with open(filename1, "r") as file1:
    lines1 = file1.readlines()
with open(filename2, "r") as file2:
    lines2 = file2.readlines()

print("\n FILE 1 \n")
for line in lines1:
    print(line)

print("\n FILE 2 \n")
for line in lines2:
    print(line)

diffSequence = difflib.ndiff(lines1, lines2)

print("\n ----- SHOW DIFF ----- \n")
for i, line in enumerate(diffSequence):
    print(line)

diffObj = difflib.Differ()
deltaSequence = diffObj.compare(lines1, lines2)
deltaList = list(deltaSequence)

print("\n ----- SHOW DELTALIST ----- \n")
for i, line in enumerate(deltaList):
    print(line)

# let's suppose we store just the diffSequence in the database
# then we want to take the current file (file2) and recreate the original (file1) from it
# by backward applying the diff

restoredFile1Lines = difflib.restore(diffSequence, 1)  # 1 indicates file1 of 2 used to create the diff

restoreFileList = list(restoredFile1Lines)

print("\n ----- SHOW REBUILD OF FILE1 ----- \n")
# this is not showing anything!
for i, line in enumerate(restoreFileList):
    print(line)

Thanks!

Update:

contextDiffSeq = difflib.context_diff(lines1, lines2)
contextDiffList = list(contextDiffSeq)

print("\n ----- SHOW CONTEXTDIFF ----- \n")
for i, line in enumerate(contextDiffList):
    print(line)

 ----- SHOW CONTEXTDIFF -----

***

---

***************

*** 5,9 ****

  print "HIJ"

  print "JKL"

  print "Hello World"

- print "j=" + j

  print "XYZ"

--- 5,9 ----

  print "HIJ"

  print "JKL"

  print "Hello World"

  print "XYZ"

+ print "The end"

Another update:

In the old days of Panvalet and Librarian, source management tools for the mainframe, you could create a changeset like this:

++ADD 9
print "j=" + j

This simply means adding one line (or several lines) after line 9. There were also keywords like ++REPLACE or ++UPDATE. http://www4.hawaii.gov/dags/icsd/ppmo/Stds_Web_Pages/pdf/it110401.pdf
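A rough sketch of what a line-numbered changeset of this kind could look like with difflib, assuming the opcodes from `difflib.SequenceMatcher.get_opcodes` are an acceptable stand-in for ++ADD and ++REPLACE records. The helper names `make_changeset` and `apply_changeset`, the tuple layout, and the sample lines are hypothetical and chosen only for illustration; this is not the Panvalet format.

```python
import difflib


def make_changeset(old, new):
    """Hypothetical helper: keep only the non-equal opcodes plus the new lines they need."""
    matcher = difflib.SequenceMatcher(None, old, new)
    return [
        (tag, i1, i2, new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_changeset(old, changeset):
    """Hypothetical helper: rebuild the new sequence from the old one plus the changeset."""
    result = []
    pos = 0  # next index in old that has not been copied yet
    for tag, i1, i2, new_lines in changeset:
        result.extend(old[pos:i1])  # unchanged lines before this change
        result.extend(new_lines)    # empty for a 'delete'
        pos = i2                    # skip the old lines that were deleted or replaced
    result.extend(old[pos:])        # unchanged tail
    return result


# Made-up sample data, shortened from the files above.
old = ["ABC\n", "Hello World\n", "j=1\n", "XYZ\n"]
new = ["ABC\n", "Hello World\n", "XYZ\n", "The end\n"]

changeset = make_changeset(old, new)
print(changeset)
print(apply_changeset(old, changeset) == new)
```

The changeset stores only line positions and the lines that are new, so unchanged lines are never duplicated. To go backward from the current file to the original, the same sketch can be run with the two arguments swapped.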

Best answer

I'm also still trying to figure out why many difflib functions return a generator instead of a list, what's the advantage there?

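Well, think about it - if you compare files, those files can in theory (and will in practice) be quite large. Returning the delta as a list, for example, means reading the complete data into memory, which is not a smart thing to do.

As for returning only the differences, there is another advantage to using a generator - simply iterate over the delta and keep whichever lines you are interested in.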
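The generator behaviour is also the likely reason the rebuild step in the question prints nothing: a generator can be consumed only once, so after the first loop over the `ndiff` result there is nothing left for `difflib.restore` to read. A minimal sketch, assuming short in-memory line lists (the sample lines are made up for illustration):

```python
import difflib

# Made-up sample data, shortened from the question's files.
lines1 = ["ABC\n", "j=1\n", "XYZ\n"]
lines2 = ["ABC\n", "XYZ\n", "The end\n"]

# ndiff returns a generator: delta lines are produced lazily, one at a time.
delta_gen = difflib.ndiff(lines1, lines2)
first_pass = list(delta_gen)   # consumes the generator
second_pass = list(delta_gen)  # already exhausted, so this list is empty

# Materialize the delta once if it has to be used more than once.
delta = list(difflib.ndiff(lines1, lines2))
restored1 = list(difflib.restore(delta, 1))  # rebuild sequence 1
restored2 = list(difflib.restore(delta, 2))  # rebuild sequence 2

print(len(first_pass), len(second_pass))
print(restored1 == lines1, restored2 == lines2)
```

Calling `list()` gives up the memory advantage, so it only makes sense when the delta is small enough to hold in memory or really is needed twice.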

If you read the difflib documentation for Differ-style deltas, you will see a paragraph that reads:

Each line of a Differ delta begins with a two-letter code:

Code    Meaning
'- '    line unique to sequence 1
'+ '    line unique to sequence 2
'  '    line common to both sequences
'? '    line not present in either input sequence

So, if you only want the differences, you can easily filter them out using str.startswith.
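A minimal sketch of that filtering, assuming the delta comes from `difflib.ndiff` and that only the `'- '` and `'+ '` lines are of interest (the sample lines are made up for illustration):

```python
import difflib

# Made-up sample data, shortened from the question's files.
lines1 = ["ABC\n", "j=1\n", "XYZ\n"]
lines2 = ["ABC\n", "XYZ\n", "The end\n"]

# Keep only lines unique to one side; drop common ('  ') and hint ('? ') lines.
changes = [
    line
    for line in difflib.ndiff(lines1, lines2)
    if line.startswith(("- ", "+ "))
]

for line in changes:
    print(line, end="")
```

Note that the filtered list drops the common lines and carries no line positions, so on its own it is no longer enough for `difflib.restore` to rebuild either file.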

You can also use difflib.context_diff to obtain a compact delta that shows only the changes.
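As a hedged illustration, the `n` parameter of `difflib.context_diff` controls how many unchanged context lines surround each change (the default is 3), so `n=0` leaves only the changed lines and the range headers. The file labels and sample lines below are assumptions made for the example:

```python
import difflib

# Made-up sample data, shortened from the question's files.
lines1 = ["ABC\n", "DEF\n", "HIJ\n", "j=1\n", "XYZ\n"]
lines2 = ["ABC\n", "DEF\n", "HIJ\n", "XYZ\n", "The end\n"]

# Default: three lines of context around each change.
with_context = list(
    difflib.context_diff(lines1, lines2, fromfile="file1", tofile="file2")
)

# n=0: no context lines, only headers and changed lines.
no_context = list(
    difflib.context_diff(lines1, lines2, fromfile="file1", tofile="file2", n=0)
)

print(len(with_context), len(no_context))
print("".join(no_context), end="")
```

A context diff is meant for people and for external patch tools; difflib itself has no function that applies one, so rebuilding a file from it needs a separate step.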

Regarding Python difflib deltas and comparing with ndiff, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/4743359/
