git - Why doesn't Git use a more modern SHA?

Reposted. Author: 行者123. Updated: 2023-12-01 19:05:10

I read that Git uses SHA-1 digests as the IDs for revisions. Why doesn't it use a more modern version of SHA?

Best answer

Why does it not use a more modern version of SHA?



December 2017: It will. Git 2.16 (Q1 2018) is the first release to state and implement that intent.

Note: see Git 2.19 below: the new hash will be SHA-256.

Git 2.16 introduces an infrastructure to define which hash function is used in Git, and starts an effort to plumb it through the various code paths.

See commit c250e02 (28 Nov 2017) by Ramsay Jones (``).
See commit eb0ccfd, commit 78a6766, commit f50e766, commit abade65 (12 Nov 2017) by brian m. carlson (bk2204).
(Merged by Junio C Hamano -- gitster -- in commit 721cc43, 13 Dec 2017)

Add structure representing hash algorithm

Since in the future we want to support an additional hash algorithm, add a structure that represents a hash algorithm and all the data that must go along with it.
Add a constant to allow easy enumeration of hash algorithms.
Implement function typedefs to create an abstract API that can be used by any hash algorithm, and wrappers for the existing SHA1 functions that conform to this API.

Expose a value for hex size as well as binary size.
While one will always be twice the other, the two values are both used extremely commonly throughout the codebase and providing both leads to improved readability.

Don't include an entry in the hash algorithm structure for the null object ID.
As this value is all zeros, any suitably sized all-zero object ID can be used, and there's no need to store a given one on a per-hash basis.

The current hash function transition plan envisions a time when we will accept input from the user that might be in SHA-1 or in the NewHash format.
Since we cannot know which the user has provided, add a constant representing the unknown algorithm to allow us to indicate that we must look the correct value up.
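The structure described in this commit message can be pictured with a small sketch. It is written in Python rather than Git's C, and every name in it (HashAlgo, HASH_UNKNOWN, HASH_SHA1, HASH_ALGOS, null_oid) is a hypothetical stand-in chosen for illustration, not Git's actual API.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical identifiers; index 0 plays the role of the "unknown" algorithm.
HASH_UNKNOWN = 0
HASH_SHA1 = 1


@dataclass(frozen=True)
class HashAlgo:
    name: str
    rawsz: int                       # digest size in bytes
    hexsz: int                       # digest size in hex characters (2 * rawsz)
    new: Optional[Callable] = None   # factory for a hashing context


# A table of algorithms that is easy to enumerate.
HASH_ALGOS = [
    HashAlgo("unknown", 0, 0),
    HashAlgo("sha1", 20, 40, hashlib.sha1),
]


def null_oid(algo: HashAlgo) -> str:
    # The null object ID is all zeros, so it is derived from the size
    # instead of being stored once per algorithm.
    return "0" * algo.hexsz


if __name__ == "__main__":
    for algo in HASH_ALGOS:
        print(algo.name, algo.rawsz, algo.hexsz)
```

Both sizes are stored even though one is always twice the other, which mirrors the readability argument made in the commit message.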



Integrate hash algorithm support with repo setup

In future versions of Git, we plan to support an additional hash algorithm.
Integrate the enumeration of hash algorithms with repository setup, and store a pointer to the enumerated data in struct repository.
Of course, we currently only support SHA-1, so hard-code this value in read_repository_format.
In the future, we'll enumerate this value from the configuration.

Add a constant, the_hash_algo, which points to the hash_algo structure pointer in the repository global.
Note that this is the hash which is used to serialize data to disk, not the hash which is used to display items to the user.
The transition plan anticipates that these may be different.
We can add an additional element in the future (say, ui_hash_algo) to provide for this case.
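As a rough illustration of this repository integration, here is a Python sketch under stated assumptions. The names echo the commit message (read_repository_format, the_repository, the_hash_algo), but the classes and function bodies are invented for this example and do not reproduce Git's C code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HashAlgo:
    name: str
    hexsz: int


SHA1 = HashAlgo("sha1", 40)


@dataclass
class Repository:
    # Hash used to serialize data to disk, not necessarily the one shown to users.
    hash_algo: Optional[HashAlgo] = None


def read_repository_format(repo: Repository) -> None:
    # Only SHA-1 exists at this stage, so the value is hard-coded here;
    # a later version could read it from the repository configuration.
    repo.hash_algo = SHA1


the_repository = Repository()


def the_hash_algo() -> HashAlgo:
    # Convenience accessor for the global repository's storage hash.
    return the_repository.hash_algo


read_repository_format(the_repository)
```

Keeping the algorithm on the repository object rather than in a global constant is what later lets a second field (the commit message suggests ui_hash_algo) sit beside it.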



Update August 2018: with Git 2.19 (Q3 2018), Git appears to have picked SHA-256 as NewHash.

See commit 0ed8d8d (04 Aug 2018) by Jonathan Nieder (artagnon).
See commit 13f5e09 (25 Jul 2018) by Ævar Arnfjörð Bjarmason (avar).
(Merged by Junio C Hamano -- gitster -- in commit 34f2297, 20 Aug 2018)

doc hash-function-transition: pick SHA-256 as NewHash

From a security perspective, it seems that SHA-256, BLAKE2, SHA3-256, K12, and so on are all believed to have similar security properties.
All are good options from a security point of view.

SHA-256 has a number of advantages:

  • It has been around for a while, is widely used, and is supported by just about every single crypto library (OpenSSL, mbedTLS, CryptoNG, SecureTransport, etc).

  • When you compare against SHA1DC, most vectorized SHA-256 implementations are indeed faster, even without acceleration.

  • If we're doing signatures with OpenPGP (or even, I suppose, CMS), we're going to be using SHA-2, so it doesn't make sense to have our security depend on two separate algorithms when either one of them alone could break the security when we could just depend on one.

So SHA-256 it is.
Update the hash-function-transition design doc to say so.

After this patch, there are no remaining instances of the string "NewHash", except for an unrelated use from 2008 as a variable name in t/t9700/test.pl.



You can see the transition to SHA-256 in progress with Git 2.20 (Q4 2018):

See commit 0d7c419, commit dda6346, commit eccb5a5, commit 93eb00f, commit d8a3a69, commit fbd0e37, commit f690b6b, commit 49d1660, commit 268babd, commit fa13080, commit 7b5e614, commit 58ce21b, commit 2f0c9e9, commit 825544a (15 Oct 2018) by brian m. carlson (bk2204).
See commit 6afedba (15 Oct 2018) by SZEDER Gábor (szeder).
(Merged by Junio C Hamano -- gitster -- in commit d829d49, 30 Oct 2018)

replace hard-coded constants

Replace several 40-based constants with references to GIT_MAX_HEXSZ or the_hash_algo, as appropriate.
Convert all uses of the GIT_SHA1_HEXSZ to use the_hash_algo so that they are appropriate for any given hash length.
Instead of using a hard-coded constant for the size of a hex object ID, switch to use the computed pointer from parse_oid_hex that points after the parsed object ID.
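The last point, advancing past the parsed object ID instead of adding a constant 40, can be sketched as follows. This is a simplified Python analogue, not Git's C function: the name parse_oid_hex is borrowed from the commit message, while the signature, the lowercase-only check, and the returned tuple are assumptions made for the example.

```python
from typing import Tuple

HEX_DIGITS = set("0123456789abcdef")


def parse_oid_hex(line: str, hexsz: int) -> Tuple[str, str]:
    """Return (object ID, remainder of the line after the ID)."""
    oid = line[:hexsz]
    if len(oid) != hexsz or not set(oid) <= HEX_DIGITS:
        raise ValueError("line does not start with a full object ID")
    # The caller continues from the remainder, so it never needs to know
    # whether the ID was 40 or 64 characters long.
    return oid, line[hexsz:]


if __name__ == "__main__":
    oid, rest = parse_oid_hex("a1" * 20 + " refs/heads/main", 40)
    print(len(oid), repr(rest))
```

Because the parser hands back the position after the ID, the same calling code works for any hash length supplied by the active algorithm.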


GIT_SHA1_HEXSZ is further removed/replaced with Git 2.22 (Q2 2019) and commit d4e568b.

That transition continues with Git 2.21 (Q1 2019), which adds a SHA-256 hash and plugs it through the code to allow building Git with the "NewHash".

See commit 4b4e291, commit 27dc04c, commit 13eeedb, commit c166599, commit 37649b7, commit a2ce0a7, commit 50c817e, commit 9a3a0ff, commit 0dab712, commit 47edb64 (14 Nov 2018), and commit 2f90b9d, commit 1ccf07c (22 Oct 2018) by brian m. carlson (bk2204).
(Merged by Junio C Hamano -- gitster -- in commit 33e4ae9, 29 Jan 2019)

Add a base implementation of SHA-256 support (Feb. 2019)

SHA-1 is weak and we need to transition to a new hash function.
For some time, we have referred to this new function as NewHash.
Recently, we decided to pick SHA-256 as NewHash.
The reasons behind the choice of SHA-256 are outlined in this thread and in the commit history for the hash function transition document.

Add a basic implementation of SHA-256 based off libtomcrypt, which is in the public domain.
Optimize it and restructure it to meet our coding standards.
Pull in the update and final functions from the SHA-1 block implementation, as we know these function correctly with all compilers. This implementation is slower than SHA-1, but more performant implementations will be introduced in future commits.

Wire up SHA-256 in the list of hash algorithms, and add a test that the algorithm works correctly.

Note that with this patch, it is still not possible to switch to using SHA-256 in Git.
Additional patches are needed to prepare the code to handle a larger hash algorithm and further test fixes are needed.

hash: add an SHA-256 implementation using OpenSSL

We already have OpenSSL routines available for SHA-1, so add routines for SHA-256 as well.

On a Core i7-6600U, this SHA-256 implementation compares favorably to the SHA1DC SHA-1 implementation:

SHA-1: 157 MiB/s (64 byte chunks); 337 MiB/s (16 KiB chunks)
SHA-256: 165 MiB/s (64 byte chunks); 408 MiB/s (16 KiB chunks)
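A throughput comparison of this kind can be approximated with a short Python sketch. It is only a rough analogue: Python's hashlib is typically backed by OpenSSL rather than SHA1DC, interpreter overhead dominates at 64-byte chunks, and the helper name throughput_mib_s and the 8 MiB default are choices made for this example. It will not reproduce the figures quoted above.

```python
import hashlib
import time


def throughput_mib_s(algo_name: str, chunk_size: int,
                     total_bytes: int = 8 * 1024 * 1024) -> float:
    """Feed total_bytes to the hash in chunk_size pieces; return MiB/s."""
    chunk = b"\0" * chunk_size
    rounds = total_bytes // chunk_size
    ctx = hashlib.new(algo_name)
    start = time.perf_counter()
    for _ in range(rounds):
        ctx.update(chunk)
    ctx.digest()
    elapsed = time.perf_counter() - start
    return (rounds * chunk_size) / (1024 * 1024) / elapsed


if __name__ == "__main__":
    for name in ("sha1", "sha256"):
        for size in (64, 16 * 1024):
            rate = throughput_mib_s(name, size)
            print(f"{name}: {rate:.0f} MiB/s ({size} byte chunks)")
```

Measuring both small and large chunks matters because per-call overhead, not the compression function, tends to dominate for small inputs.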

sha256: add an SHA-256 implementation using libgcrypt

Generally, one gets better performance out of cryptographic routines written in assembly than C, and this is also true for SHA-256.
In addition, most Linux distributions cannot distribute Git linked against OpenSSL for licensing reasons.

Most systems with GnuPG will also have libgcrypt, since it is a dependency of GnuPG.
libgcrypt is also faster than the SHA1DC implementation for messages of a few KiB and larger.

For comparison, on a Core i7-6600U, this implementation processes 16 KiB chunks at 355 MiB/s while SHA1DC processes equivalent chunks at 337 MiB/s.

In addition, libgcrypt is licensed under the LGPL 2.1, which is compatible with the GPL. Add an implementation of SHA-256 that uses libgcrypt.



Git 2.24 (Q4 2019) continues the upgrade effort:

See commit aaa95df, commit be8e172, commit 3f34d70, commit fc06be3, commit 69fa337, commit 3a4d7aa, commit e0cb7cd, commit 8d4d86b, commit f6ca67d, commit dd336a5, commit 894c0f6, commit 4439c7a, commit 95518fa, commit e84f357, commit fe9fec4, commit 976ff7e, commit 703d2d4, commit 9d958cc, commit 7962e04, commit fee4930 (18 Aug 2019) by brian m. carlson (bk2204).
(Merged by Junio C Hamano -- gitster -- in commit 676278f, 11 Oct 2019)

Instead of using GIT_SHA1_HEXSZ and hard-coded constants, switch to using the_hash_algo.



With Git 2.26 (Q1 2020), the test scripts are ready for the day when object names use SHA-256.

See commit 277eb5a, commit 44b6c05, commit 7a868c5, commit 1b8f39f, commit a8c17e3, commit 8320722, commit 74ad99b, commit ba1be1a, commit cba472d, commit 82d5aeb, commit 3c5e65c, commit 235d3cd, commit 1d86c8f, commit 525a7f1, commit 7a1bcb2, commit cb78f4f, commit 717c939, commit 08a9dd8, commit 215b60b, commit 194264c (21 Dec 2019) by brian m. carlson (bk2204).
(Merged by Junio C Hamano -- gitster -- in commit f52ab33, 05 Feb 2020)

Example:

t4204: make hash size independent

Signed-off-by: brian m. carlson

Use $OID_REGEX instead of a hard-coded regular expression.



So, instead of using:
grep "^[a-f0-9]\{40\} $(git rev-parse HEAD)$" output

the tests now use:
grep "^$OID_REGEX $(git rev-parse HEAD)$" output

OID_REGEX comes from commit bdee9cd (13 May 2018) by brian m. carlson (bk2204).
(Merged by Junio C Hamano -- gitster -- in commit 9472b13, 30 May 2018, Git v2.18.0-rc0)

t/test-lib: introduce OID_REGEX

Signed-off-by: brian m. carlson

Currently we have a variable, $_x40, which contains a regex that matches a full 40-character hex constant.

However, with NewHash, we'll have object IDs that are longer than 40 characters.

In such a case, $_x40 will be a confusing name.

Create a $OID_REGEX variable which will always reflect a regex matching the appropriate object ID, regardless of the length of the current hash.
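The difference between a fixed 40-character pattern and a length-aware one can be shown with a small Python sketch. The helpers oid_regex and matches_log_line are hypothetical names for this illustration; the real $OID_REGEX is a shell variable defined in Git's test library, and its exact definition is not reproduced here.

```python
import re


def oid_regex(hexsz: int) -> str:
    # A pattern matching a full object ID of the given hex length.
    return "[0-9a-f]{%d}" % hexsz


def matches_log_line(line: str, head: str, hexsz: int) -> bool:
    # Expects "<some object ID> <HEAD object ID>" on a single line.
    pattern = "^" + oid_regex(hexsz) + " " + re.escape(head) + "$"
    return re.search(pattern, line) is not None


if __name__ == "__main__":
    head = "b" * 64                    # stand-in for a SHA-256 object ID
    line = "a" * 64 + " " + head
    print(matches_log_line(line, head, 40))   # fixed 40-character pattern
    print(matches_log_line(line, head, 64))   # pattern sized for the hash
```

Deriving the pattern from the hash length lets one test body serve both SHA-1 and SHA-256 repositories.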



And, still for tests:

See commit f303765, commit edf0424, commit 5db24dc, commit d341e08, commit 88ed241, commit 48c10cc, commit f7ae8e6, commit e70649b, commit a30f93b, commit a79eec2, commit 796d138, commit 417e45e, commit dfa5f53, commit f743e8f, commit 72f936b, commit 5df0f11, commit 07877f3, commit 6025e89, commit 7b1a182, commit 94db7e3, commit db12505 (07 Feb 2020) by brian m. carlson (bk2204).
(Merged by Junio C Hamano -- gitster -- in commit 5af345a, 17 Feb 2020)

t5703: make test work with SHA-256

Signed-off-by: brian m. carlson

This test used an object ID which was 40 hex characters in length, causing the test not only not to pass, but to hang, when run with SHA-256 as the hash.

Change this value to a fixed dummy object ID using test_oid_init and test_oid.

Furthermore, ensure we extract an object ID of the appropriate length using cut with fields instead of a fixed length.
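To see why extracting by field is safer than extracting by fixed width, consider this Python sketch. The function names and the sample line are made up for illustration, and the shell equivalents mentioned in the comments are only the closest analogues, not the commands used in t5703.

```python
def oid_by_fixed_width(line: str) -> str:
    # Comparable to cutting the first 40 characters: silently truncates
    # a longer object ID.
    return line[:40]


def oid_by_field(line: str) -> str:
    # Comparable to taking the first space-separated field: returns the
    # whole object ID whatever its length.
    return line.split(" ", 1)[0]


if __name__ == "__main__":
    line = "c" * 64 + " refs/heads/main"   # stand-in for a SHA-256 ref line
    print(len(oid_by_fixed_width(line)), len(oid_by_field(line)))
```

With a 40-character ID the two approaches agree, which is how the fixed-width assumption could go unnoticed until the tests were run with SHA-256.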



Some code paths were given a repository instance as a parameter to work on, but passed the_repository to their callees instead; this has been cleaned up (to some extent) with Git 2.26 (Q1 2020).

See commit b98d188, commit 2dcde20, commit 7ad5c44, commit c8123e7, commit 5ec9b8a, commit a651946, commit eb999b3 (30 Jan 2020) by Matheus Tavares (matheustavares).
(Merged by Junio C Hamano -- gitster -- in commit 78e67cd, 14 Feb 2020)

sha1-file: allow check_object_signature() to handle any repo

Signed-off-by: Matheus Tavares

Some callers of check_object_signature() can work on arbitrary repositories, but the repo does not get passed to this function. Instead, the_repository is always used internally.
To fix possible inconsistencies, allow the function to receive a struct repository and make those callers pass on the repo being handled.



This is based on:

sha1-file: pass git_hash_algo to hash_object_file()

Signed-off-by: Matheus Tavares

Allow hash_object_file() to work on arbitrary repos by introducing a git_hash_algo parameter. Change callers which have a struct repository pointer in their scope to pass on the git_hash_algo from the said repo.
For all other callers, pass on the_hash_algo, which was already being used internally at hash_object_file().
This functionality will be used in the following patch to make check_object_signature() be able to work on arbitrary repos (which, in turn, will be used to fix an inconsistency at object.c:parse_object()).
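The idea of passing the algorithm explicitly instead of reading a global can be sketched in Python. This is a loose analogue only: the function name is borrowed from the commit message, but the signature, the algo_name string, and the use of hashlib are assumptions, and Git's C function takes different arguments. The sketch relies on the object ID of a loose object being the hash of a "<type> <size>" header, a NUL byte, and the content.

```python
import hashlib


def hash_object_file(algo_name: str, data: bytes, obj_type: str = "blob") -> str:
    # The caller chooses the algorithm (for example, from the repository it
    # is working on) rather than the function consulting a global.
    header = f"{obj_type} {len(data)}".encode() + b"\0"
    ctx = hashlib.new(algo_name)
    ctx.update(header)
    ctx.update(data)
    return ctx.hexdigest()


if __name__ == "__main__":
    print(hash_object_file("sha1", b""))
    print(hash_object_file("sha256", b""))
```

For the empty blob, the SHA-1 call yields the familiar e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, while the SHA-256 call yields a 64-character ID for the same content.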

For "git - Why doesn't Git use a more modern SHA?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/28159071/
