
git - What does the Git Smart HTTP(S) protocol look like in all its glory?

Reposted. Author: 行者123. Updated: 2023-12-05 03:40:35

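I am trying to implement a web server that emulates a Git remote. Users should be able to clone or pull from my server, edit files, commit, and push (with authentication): all the normal Git operations. On the server side, however, there is no bare Git repository or anything like it; the data is stored in another format and converted only on request.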

I have spent a lot of time trying to work out how the Git smart HTTP protocol works. Here is what I know so far.

From the Git docs on http-protocol, I know that GET $GIT_URL/info/refs?service=git-upload-pack HTTP/1.1 should elicit the following (example) response:

HTTP/1.1 200 OK<CRLF>
Content-Type: application/x-git-upload-pack-advertisement<CRLF>
Cache-Control: no-cache<CRLF>
<CRLF>
001e# service=git-upload-pack<LF>
0000<no LF>
004895dcfa3633004da0049d3d0fa03f80589cbcaf31 refs/heads/maint<NUL>multi_ack<LF>
003fd049f6c27a2244e12041955e262a404c7faba355 refs/heads/master<LF>
003c2cb58b79488a98d2721cea644875a8dd0026b115 refs/tags/v1.0<LF>
003fa3c2e2402b99163d1d59756e5f207ae21cccba4c refs/tags/v1.0^{}<LF>
0000
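A minimal sketch of reading such a response body, assuming the framing shown above: four hex digits give the total length of each line (the four digits included), and 0000 is a flush packet. The function names read_pkt_lines and parse_advertisement are made up for illustration, and the sample bytes are the example from the docs.

```python
def read_pkt_lines(data: bytes):
    """Yield each pkt-line payload; yield None for a flush packet (0000)."""
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 4], 16)
        if length == 0:
            yield None
            pos += 4
            continue
        if length < 4:
            raise ValueError("bad pkt-line length %d" % length)
        yield data[pos + 4:pos + length]
        pos += length


def parse_advertisement(body: bytes):
    """Return (capabilities, {ref name: object id}) for an info/refs body.

    Hypothetical helper: handles only the simple layout shown in the post.
    """
    packets = list(read_pkt_lines(body))
    if len(packets) < 2 or packets[0] is None or packets[1] is not None:
        raise ValueError("expected a service line followed by a flush packet")
    if not packets[0].startswith(b"# service="):
        raise ValueError("missing service announcement")
    capabilities, refs = [], {}
    for payload in packets[2:]:
        if payload is None:  # final flush packet
            break
        line = payload.rstrip(b"\n")
        if b"\x00" in line:  # capabilities follow a NUL on the first ref line
            line, cap_blob = line.split(b"\x00", 1)
            capabilities = cap_blob.decode().split()
        oid, name = line.decode().split(" ", 1)
        refs[name] = oid
    return capabilities, refs


SAMPLE = (
    b"001e# service=git-upload-pack\n"
    b"0000"
    b"004895dcfa3633004da0049d3d0fa03f80589cbcaf31 refs/heads/maint\x00multi_ack\n"
    b"003fd049f6c27a2244e12041955e262a404c7faba355 refs/heads/master\n"
    b"003c2cb58b79488a98d2721cea644875a8dd0026b115 refs/tags/v1.0\n"
    b"003fa3c2e2402b99163d1d59756e5f207ae21cccba4c refs/tags/v1.0^{}\n"
    b"0000"
)

capabilities, refs = parse_advertisement(SAMPLE)
```

Reading strictly by the length prefix, rather than splitting on newlines, is what keeps the NUL-separated capability list and the missing LF after the first 0000 from confusing the parser.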

From my own experiments on a repo of mine with very few commits, GitHub so far appears to stay entirely within the protocol limits described in the docs:

HTTP/1.1 200 OK<CRLF>
Server: GitHub Babel 2.0<CRLF>
Content-Type: application/x-git-upload-pack-advertisement<CRLF>
Content-Security-Policy: default-src 'none'; sandbox<CRLF>
Transfer-Encoding: chunked<CRLF>
expires: Fri, 01 Jan 1980 00:00:00 GMT<CRLF>
pragma: no-cache<CRLF>
Cache-Control: no-cache, max-age=0, must-revalidate<CRLF>
Vary: Accept-Encoding<CRLF>
X-Frame-Options: DENY<CRLF>
X-GitHub-Request-Id: [redacted]<CRLF>
<CRLF>
001e# service=git-upload-pack<LF>
0000<no LF>0156feee8d0aeff172f5b39e3175175d027f3fd5ecc1 HEAD<NUL>multi_ack thin-pack side-band side-band-64k ofs-delta shallow deepen-since deepen-not deepen-relative no-progress include-tag multi_ack_detailed allow-tip-sha1-in-want allow-reachable-sha1-in-want no-done symref=HEAD:refs/heads/master filter object-format=sha1 agent=git/github-g69d6dd5d35d8<LF>
003ffeee8d0aeff172f5b39e3175175d027f3fd5ecc1 refs/heads/master<LF>
0000
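The capability list after the NUL on the first ref line is a run of space-separated tokens, some of them in name=value form. A rough sketch of splitting it up, assuming that layout; parse_capabilities is a hypothetical helper, and the string below is a shortened copy of the GitHub response above.

```python
def parse_capabilities(blob: str):
    """Map each capability name to the list of values given for it.

    Hypothetical helper. Bare flags such as side-band-64k map to an
    empty list; a list is used because a name may appear more than once.
    """
    caps = {}
    for token in blob.split():
        name, sep, value = token.partition("=")
        caps.setdefault(name, [])
        if sep:
            caps[name].append(value)
    return caps


caps = parse_capabilities(
    "multi_ack thin-pack side-band side-band-64k ofs-delta "
    "symref=HEAD:refs/heads/master filter object-format=sha1 "
    "agent=git/github-g69d6dd5d35d8"
)
# The symref value has the form <symbolic ref>:<target ref>.
head_target = caps["symref"][0].split(":", 1)[1]
```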

That is where the easy part ends, though. What if I want to actually fetch commit data? The Git docs on the matter give an example POST request to send and some grammar, and then say "TODO: Document this further".

I tried experimenting by cURLing GitHub with a request in the format I saw in the docs.

(cwd)>curl https://github.com/Kenny2github/ConvoSplit.git/git-upload-pack -o - -i -X POST -d @-
0032want feee8d0aeff172f5b39e3175175d027f3fd5ecc1
0032have 941ea62275547bcbfb78fd97d29be18d09a78190
0009done
0000
^Z
HTTP/1.1 200 OK
Server: GitHub Babel 2.0
Content-Type: application/x-git-upload-pack-result
Content-Security-Policy: default-src 'none'; sandbox
Transfer-Encoding: chunked
expires: Fri, 01 Jan 1980 00:00:00 GMT
pragma: no-cache
Cache-Control: no-cache, max-age=0, must-revalidate
Vary: Accept-Encoding
X-GitHub-Request-Id: [redacted]
X-Frame-Options: DENY

curl: (18) transfer closed with outstanding read data remaining

What?

I tried using Python:

>>> import requests
>>> requests.post('https://github.com/Kenny2github/ConvoSplit.git/git-upload-pack', data=b'''
0032want feee8d0aeff172f5b39e3175175d027f3fd5ecc1
0032have 941ea62275547bcbfb78fd97d29be18d09a78190
0009done
0000
'''.strip())
Traceback (most recent call last):
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\response.py", line 572, in _update_chunk_length
self.chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: b''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\response.py", line 331, in _error_catcher
yield
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\response.py", line 637, in read_chunked
self._update_chunk_length()
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\response.py", line 576, in _update_chunk_length
raise httplib.IncompleteRead(line)
http.client.IncompleteRead: IncompleteRead(0 bytes read)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\models.py", line 751, in generate
for chunk in self.raw.stream(chunk_size, decode_content=True):
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\response.py", line 461, in stream
for line in self.read_chunked(amt, decode_content=decode_content):
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\response.py", line 665, in read_chunked
self._original_response.close()
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\contextlib.py", line 130, in __exit__
self.gen.throw(type, value, traceback)
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\response.py", line 349, in _error_catcher
raise ProtocolError('Connection broken: %r' % e, e)
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "<pyshell#17>", line 1, in <module>
requests.post('https://github.com/Kenny2github/ConvoSplit.git/git-upload-pack', data=b'0032want feee8d0aeff172f5b39e3175175d027f3fd5ecc1\n0032have 941ea62275547bcbfb78fd97d29be18d09a78190\n0009done\n0000')
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\api.py", line 119, in post
return request('post', url, data=data, json=json, **kwargs)
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 530, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 685, in send
r.content
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\models.py", line 829, in content
self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b''
File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\models.py", line 754, in generate
raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))

The rest of the http-protocol docs are no help: six more TODOs appear. The pack-protocol docs at least tell me what I should receive, but not how to receive it.

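The Transfer Protocols docs told me nothing new, and then said to "take a look at the Git source code". I tried, but it is hardcore C, and I would basically have to understand the entire infrastructure of Git itself. (I may attempt that eventually, but not right now.)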

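I did manage to gather that git upload-pack is involved, and running git upload-pack --stateless-rpc --advertise-refs .git does give me the same listing that /info/refs did before. However, my attempts to get an actual pack out of it failed, and failed inconsistently between platforms.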

On Windows:

(cwd)>git upload-pack --stateless-rpc .git
0032want feee8d0aeff172f5b39e3175175d027f3fd5ecc1
0009done # I hit Enter and nothing else
fatal: protocol error: bad line length character:
000

(cwd)>git upload-pack --stateless-rpc .git
0032want feee8d0aeff172f5b39e3175175d027f3fd5ecc1
0000 # likewise
fatal: protocol error: bad line length character:
000

(cwd)>py -c "print('0032want feee8d0aeff172f5b39e3175175d027f3fd5ecc1\n0009done\n0000')" | git upload-pack --stateless-rpc .git
fatal: protocol error: bad line length character:
000

Suspecting that carriage returns were causing the problem, I tried WSL:

$ git upload-pack --stateless-rpc .git
0032want feee8d0aeff172f5b39e3175175d027f3fd5ecc1
0000 # I hit Enter and then ^D after 0000
fatal: The remote end hung up unexpectedly

$ git upload-pack --stateless-rpc .git
0032want feee8d0aeff172f5b39e3175175d027f3fd5ecc1
0009done # I hit Enter and did NOT hit ^D
fatal: git upload-pack: protocol error, expected to get sha, not 'done'

$ # using Python to pipe each of the above inputs yielded the same results

What am I doing wrong? How do I get GitHub/git-upload-pack to respect me?

Best answer

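First, it is not possible to explain the entire protocol in a Stack Overflow answer; the explanation would be far too long. However, I will try to point out a few things to watch for.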

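To begin with, when you speak the protocol you need to be very exact; this is not a situation that tolerates line-ending differences or extra bytes. So if you are synthesizing data to pass to the remote, do it with printf(1) or a programming language. Do not type things in at the shell.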
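To see why a single stray byte matters, here is a small sketch that walks the length prefixes the way a reader of the stream would. It assumes (this is an assumption, not something confirmed above) that the Windows attempts in the question ended up sending CRLF instead of LF; first_two_headers is a made-up helper name.

```python
# "0032" = 0x32 = 50 bytes: 4 (prefix) + 5 ("want ") + 40 (hex id) + 1 (LF).
WANT = b"0032want feee8d0aeff172f5b39e3175175d027f3fd5ecc1"


def first_two_headers(stream: bytes):
    """Return the first length prefix and whatever sits where the next one should be."""
    length = int(stream[:4], 16)
    return stream[:4], stream[length:length + 4]


lf_stream = WANT + b"\n" + b"0000"
# Assumption: the console or print() turned "\n" into "\r\n".
crlf_stream = WANT + b"\r\n" + b"0000"

good = first_two_headers(lf_stream)
bad = first_two_headers(crlf_stream)
```

With LF the second prefix is read as 0000. With CRLF the declared 50 bytes end on the carriage return, so the next four bytes are a newline followed by 000, which looks like the "bad line length character" output in the question. Similarly, curl's -d option strips newlines from data read via @, so --data-binary is the safer choice when posting hand-built bodies.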

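Git uses the pkt-line format, which means that each line or chunk of data is prefixed with a sequence of four hex characters giving the length of the data plus the four-character prefix itself. If the sequence is 0000, that is a flush packet, which marks the end of that chunk of data. If the sequence is 0001, that is a delimiter packet, which protocol v2 uses to separate the sections of a chunk of data. Otherwise, the hex sequence must have a value of no more than 65520 (the 4-byte prefix plus at most 65516 bytes of payload).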
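A minimal sketch of the framing just described. The names encode_pkt and decode_pkts are made up for illustration; the 65520-byte ceiling (4 length bytes plus up to 65516 payload bytes) is the figure given in Git's protocol-common documentation.

```python
FLUSH = b"0000"   # flush packet: end of a chunk of data
DELIM = b"0001"   # delimiter packet (protocol v2)
MAX_PKT_LEN = 65520  # prefix + payload, per Git's protocol-common docs


def encode_pkt(payload: bytes) -> bytes:
    """Prefix payload with its total length as four lowercase hex digits."""
    length = len(payload) + 4
    if length > MAX_PKT_LEN:
        raise ValueError("payload too long for one pkt-line")
    return b"%04x" % length + payload


def decode_pkts(data: bytes):
    """Split a byte string into payloads, with 'flush'/'delim' markers."""
    out, pos = [], 0
    while pos < len(data):
        length = int(data[pos:pos + 4], 16)
        if length == 0:
            out.append("flush")
            pos += 4
        elif length == 1:
            out.append("delim")
            pos += 4
        elif length < 4:
            raise ValueError("bad pkt-line length %d" % length)
        else:
            out.append(data[pos + 4:pos + length])
            pos += length
    return out


stream = encode_pkt(b"done\n") + FLUSH
```

Here encode_pkt(b"done\n") produces b"0009done\n": five payload bytes plus the four prefix bytes.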

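In the case where you send want and have lines, you need to go through multiple iterations until the server sends you a pack. Over HTTP, that means multiple requests. The server will send you acknowledgements of the have arguments you specify. The server wants to find a path from each want line to an object that both sides have (unless the client has nothing at all, in which case its repository is empty).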

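Note that this task is actually very complex. There is now a v2 of the protocol for fetching (the old one is v0, and there is also a v1, which is the same but with a version header). You should also expect to support SHA-256 repositories, which currently do not interoperate with SHA-1 repositories but are otherwise supported. Git also offers a large number of extensions that you will in practice want to support, such as the side-band capabilities, which are required if you want to give the user output about what is happening on your side.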
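To illustrate the side-band idea: once side-band or side-band-64k has been negotiated, each pkt-line payload in the pack stream starts with one byte naming a band. Per Git's protocol documentation, band 1 carries pack data, band 2 carries progress messages, and band 3 carries a fatal error message. A minimal sketch, assuming the payloads have already been split out of their pkt-lines and any leading NAK/ACK lines removed; demux_sideband is a made-up name.

```python
def demux_sideband(payloads):
    """Separate side-band payloads into (pack bytes, progress messages)."""
    pack = bytearray()
    progress = []
    for payload in payloads:
        if not payload:
            raise ValueError("empty side-band payload")
        band, rest = payload[0], payload[1:]
        if band == 1:      # pack data
            pack += rest
        elif band == 2:    # progress text meant for the user
            progress.append(rest.decode("utf-8", "replace"))
        elif band == 3:    # fatal error reported by the remote
            raise RuntimeError("remote error: " + rest.decode("utf-8", "replace"))
        else:
            raise ValueError("unknown side-band channel %d" % band)
    return bytes(pack), progress


pack_bytes, messages = demux_sideband(
    [b"\x02Counting objects\n", b"\x01PACK", b"\x01\x00\x00"]
)
```

A server emulating a remote would do the reverse: prepend the band byte to each chunk before framing it as a pkt-line.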

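The documentation is mostly located in Documentation/technical in the Git repository. It is incomplete in places, but you should be able to work it out with some reading and testing.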

Regarding "git - What does the Git Smart HTTP(S) protocol look like in all its glory?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/68062812/
