
python - Robustly using Git blame to retrieve SHA and line content (Python3)

Reposted. Author: 行者123. Updated: 2023-12-04 04:17:06

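I'm contributing to a package (Python >= 3.5) that uses git blame to retrieve information about files. I'm working on replacing the GitPython dependency with custom code that supports only the small subset of functionality we actually need (and provides the data in the form we actually need).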

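I found that git blame -lts comes closest to what I need, namely retrieving the commit SHA and the line content for each line of a file. It gives me output like this: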

82a3e5021b7131e31fc5b110194a77ebee907955 books/main/docs/index.md  5) Softwareplattform [ILIAS](https://www.ilias.de/), die an zahlreichen

which I have processed with

    import re

    # Expected line shape: <sha> <path> <lineno>) <content>
    line_pattern = re.compile(r'(.*?)\s.*\s*\d\)(\s*.*)')

    for line in cmd.stdout():
        m = line_pattern.match(line)
        if m:
            sha = m.group(1)
            content = m.group(2).strip()

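This works well. However, the package's maintainer rightly warned that "this might introduce hard-to-debug errors for a very specific group of users. It would probably require extensive unit testing across multiple operating systems and Git versions."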
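To make that concern concrete, here is a small sketch that runs the pattern above on two lines. The first is the sample line from the question; the second is a made-up line (an assumption, not real output) whose content happens to contain a digit followed by `)`. Because the middle `.*` is greedy, the match runs up to the last such pair and the content is silently truncated. The helper name `parse` is hypothetical.

```python
import re

# Pattern from the question, written as a raw string.
line_pattern = re.compile(r'(.*?)\s.*\s*\d\)(\s*.*)')


def parse(line):
    """Return (sha, content) for one `git blame -lts` line, or None."""
    m = line_pattern.match(line)
    if m:
        return m.group(1), m.group(2).strip()
    return None


sha = '82a3e5021b7131e31fc5b110194a77ebee907955'

# The sample line from the question parses as intended.
ok = (sha + ' books/main/docs/index.md  5) Softwareplattform '
      '[ILIAS](https://www.ilias.de/), die an zahlreichen')
print(parse(ok))

# Hypothetical line: the content itself contains "1)".
# The greedy ".*" skips past the real line number, so only
# the text after "1)" is returned as content.
bad = sha + ' books/main/docs/index.md  7) see footnote 1) for details'
print(parse(bad))
```

The second call returns `'for details'` as the content instead of the full line, which is the kind of input-dependent failure the maintainer was worried about.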

I went with this approach because I found the output of git blame --porcelain somewhat tedious to parse:

    30ed8daf1c48e4a7302de23b6ed262ab13122d31 1 1 1
    author XY
    author-mail <XY>
    author-time 1580742131
    author-tz +0100
    committer XY
    committer-mail <XY>
    committer-time 1580742131
    committer-tz +0100
    summary Stub-Outline-Dateien
    filename home/docs/README.md
    	hero: abcdefghijklmnopqrstuvwxyz
    82a3e5021b7131e31fc5b110194a77ebee907955 18 18
    	
    82a3e5021b7131e31fc5b110194a77ebee907955 19 19
    	---
    82a3e5021b7131e31fc5b110194a77ebee907955 20 20
    	
    ...

I don't like the bookkeeping involved in iterating over a list of strings like this.
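One way to reduce that bookkeeping, as a sketch only: `git blame` also has a `--line-porcelain` option, which repeats the full commit header for every line instead of only the first time a commit is seen. Each record is then self-contained, and a small generator can turn the output into one dict per source line. The function name `iter_line_porcelain` and the dict layout are assumptions made for illustration, and the sample lines are abbreviated by hand rather than captured from a real run.

```python
def iter_line_porcelain(lines):
    """Yield one dict per source line of `git blame --line-porcelain` output.

    `lines` is an iterable of output lines without trailing newlines.
    """
    record = None
    for line in lines:
        if line.startswith('\t'):
            # A TAB marks the content line, which closes the current record.
            if record is not None:
                record['content'] = line[1:]
                yield record
            record = None
        elif not line:
            # Skip a trailing empty string left over from splitting on "\n".
            continue
        elif record is None:
            # First line of a record: "<sha> <orig-line> <final-line> [<count>]"
            sha, _orig, final = line.split(' ')[:3]
            record = {'sha': sha, 'lineno': int(final)}
        else:
            # Header line: "<key> <value>"; the value may be absent.
            key, _, value = line.partition(' ')
            record[key] = value


# Abbreviated, hand-written sample in the shape of --line-porcelain output.
sample = [
    '30ed8daf1c48e4a7302de23b6ed262ab13122d31 1 1 2',
    'author XY',
    'summary Stub-Outline-Dateien',
    'filename home/docs/README.md',
    '\thero: abc',
    '30ed8daf1c48e4a7302de23b6ed262ab13122d31 2 2',
    'author XY',
    'summary Stub-Outline-Dateien',
    'filename home/docs/README.md',
    '\t',
]
records = list(iter_line_porcelain(sample))
for rec in records:
    print(rec['lineno'], rec['sha'][:8], repr(rec['content']))
```

The trade-off is more output to read per line in exchange for not having to remember which commits were already seen.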

My questions are:

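1) Would I be better off using the --porcelain output, since it is explicitly intended for machine consumption?
2) Can I expect this format to be robust across Git versions and operating systems? Can I rely on the assumption that a line starting with a TAB character is the content line, that it is the last output line for a given source line, and that everything after the TAB is the original line content?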

Best answer

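Not knowing whether this is the best solution, I gave it a try without waiting for an answer here. I'm assuming the answer to both of my questions is "yes".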

The following code can be seen in context here: https://github.com/uliska/mkdocs-git-authors-plugin/blob/6f5822c641452cea3edb82c2bbb9ed63bd254d2e/mkdocs_git_authors_plugin/repo.py#L466-L565

    def _process_git_blame(self):
        """
        Execute git blame and parse the results.

        This retrieves all data we need, also for the Commit object.
        Each line will be associated with a Commit object and counted
        to its author's "account".
        Whether empty lines are counted is determined by the
        count_empty_lines configuration option.

        git blame --porcelain will produce output like the following
        for each line in a file:

        When a commit is first seen in that file:
            30ed8daf1c48e4a7302de23b6ed262ab13122d31 1 2 1
            author John Doe
            author-mail <j.doe@example.com>
            author-time 1580742131
            author-tz +0100
            committer John Doe
            committer-mail <j.doe@example.com>
            committer-time 1580742131
            summary Fancy commit message title
            filename home/docs/README.md
                line content (indicated by TAB. May be empty after that)

        When a commit has already been seen *in that file*:
            82a3e5021b7131e31fc5b110194a77ebee907955 4 5
                line content

        In this case the metadata is not repeated, but it is guaranteed that
        a Commit object with that SHA has already been created so we don't
        need that information anymore.

        When a line has not been committed yet:
            0000000000000000000000000000000000000000 1 1 1
            author Not Committed Yet
            author-mail <not.committed.yet>
            author-time 1583342617
            author-tz +0100
            committer Not Committed Yet
            committer-mail <not.committed.yet>
            committer-time 1583342617
            committer-tz +0100
            summary Version of books/main/docs/index.md from books/main/docs/index.md
            previous 1f0c3455841488fe0f010e5f56226026b5c5d0b3 books/main/docs/index.md
            filename books/main/docs/index.md
                uncommitted line content

        In this case exactly one Commit object with the special SHA and fake
        author will be created and counted.

        Args:
            ---
        Returns:
            --- (this method works through side effects)
        """

        re_sha = re.compile(r'^\w{40}')

        cmd = GitCommand('blame', ['--porcelain', str(self._path)])
        cmd.run()

        commit_data = {}
        for line in cmd.stdout():
            key = line.split(' ')[0]
            m = re_sha.match(key)
            if m:
                commit_data = {
                    'sha': key
                }
            elif key in [
                'author',
                'author-mail',
                'author-time',
                'author-tz',
                'summary'
            ]:
                commit_data[key] = line[len(key)+1:]
            elif line.startswith('\t'):
                # assign the line to a commit
                # and create the Commit object if necessary
                commit = self.repo().get_commit(
                    commit_data.get('sha'),
                    # The following values are guaranteed to be present
                    # when a commit is seen for the first time,
                    # so they can be used for creating a Commit object.
                    author_name=commit_data.get('author'),
                    author_email=commit_data.get('author-mail'),
                    author_time=commit_data.get('author-time'),
                    author_tz=commit_data.get('author-tz'),
                    summary=commit_data.get('summary')
                )
                if len(line) > 1 or self.repo().config('count_empty_lines'):
                    author = commit.author()
                    if author not in self._authors:
                        self._authors.append(author)
                    author.add_lines(self, commit)
                    self.add_total_lines()
                    self.repo().add_total_lines()
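The method above depends on the plugin's own `GitCommand`, `Commit` and repo classes. For readers who want to try the same idea in isolation, here is a minimal standalone sketch. It is not the plugin's code: the names `blame_porcelain` and `iter_blame` are hypothetical, decoding the output as UTF-8 is an assumption about the file's encoding, and the 40-character check assumes SHA-1 object names as in the original. The sample lines are abbreviated by hand.

```python
import subprocess

HEX_DIGITS = set('0123456789abcdef')


def blame_porcelain(path):
    """Run `git blame --porcelain` on `path` and return its output lines.

    Decoding as UTF-8 is an assumption about the file's encoding.
    """
    result = subprocess.run(
        ['git', 'blame', '--porcelain', path],
        stdout=subprocess.PIPE, check=True,
    )
    return result.stdout.decode('utf-8', errors='replace').split('\n')


def iter_blame(lines):
    """Yield (commit, content) pairs; `commit` is one dict shared per SHA."""
    commits = {}    # SHA -> metadata, filled when a commit is first seen
    current = None
    for line in lines:
        if line.startswith('\t'):
            # Content line: everything after the TAB is the source line.
            if current is not None:
                yield current, line[1:]
        elif line:
            key, _, value = line.partition(' ')
            if len(key) == 40 and set(key) <= HEX_DIGITS:
                # Header line: reuse the metadata of an already seen commit.
                current = commits.setdefault(key, {'sha': key})
            elif current is not None:
                current[key] = value


# Abbreviated, hand-written sample in the shape of --porcelain output.
sample = [
    '30ed8daf1c48e4a7302de23b6ed262ab13122d31 1 1 1',
    'author XY',
    'summary Stub-Outline-Dateien',
    'filename home/docs/README.md',
    '\thero: abcdefghijklmnopqrstuvwxyz',
    '82a3e5021b7131e31fc5b110194a77ebee907955 18 18 2',
    'author Z',
    'summary Other',
    'filename home/docs/README.md',
    '\t',
    '82a3e5021b7131e31fc5b110194a77ebee907955 19 19',
    '\t---',
    '',
]
pairs = list(iter_blame(sample))
for commit, content in pairs:
    print(commit['sha'][:8], commit.get('author'), repr(content))
```

In real use the lines would come from `blame_porcelain(path)`; the sample list only stands in for that call so the parsing logic can be checked without a repository.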

Regarding "python - Robustly using Git blame to retrieve SHA and line content (Python3)", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/60523415/
