git - Is there any distributed revision control system that supports partial checkout/clone?

Reposted. Author: IT王子. Updated: 2023-10-29 01:30:08

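As far as I know, all distributed revision control systems require you to clone the whole repository. For that reason, it is unwise to put very large amounts of content into one single repository (thanks to this answer). I know this is not a bug but a feature, but I wonder whether it is a requirement for all distributed revision control systems.

In a distributed RCS, the history of a file (or a chunk of content) is a directed acyclic graph, so why can't you clone just that single DAG instead of the set of all graphs in the repository? Maybe I am missing something, but the following use cases are hard to do:

  • clone only a part of a repository
  • merge two repositories (preserving their histories!)
  • copy some files, with their history, from one repository into another

If I reuse parts of other people's code from multiple projects, I cannot preserve their complete history. At least in git, I can think of a (rather complex) workaround:

  • clone a full repository
  • delete all the content I'm not interested in
  • rewrite the history to delete everything that is not in master
  • merge the remaining repository into an existing repository

I don't know whether this is also possible with Mercurial or Bazaar, but at the very least it is not easy at all. So is there any distributed RCS that supports partial checkout/clone by design? It should support one simple command to fetch a single file with its history from one repository and merge it into another. That way you wouldn't need to think about how to structure your content into repositories and submodules; you could happily split and merge repositories as needed (the extreme being one repository per file).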
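The git workaround described in the question can be sketched roughly as follows. This is only one possible reading of those steps: it assumes `git filter-branch --subdirectory-filter` as the history-rewriting tool (the Git documentation now recommends git-filter-repo instead) and Git 2.9 or later for `--allow-unrelated-histories`. The repository names `donor`, `extract` and `target`, the file names and the demo identity are all hypothetical.

```shell
set -eu
# Hypothetical identity so commits work in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's startup warning

work=$(mktemp -d)

# A donor repository with two directories; only lib/ is of interest.
git init -q "$work/donor"
mkdir "$work/donor/lib" "$work/donor/app"
echo 'v1' > "$work/donor/lib/util.txt"
echo 'main' > "$work/donor/app/main.txt"
git -C "$work/donor" add .
git -C "$work/donor" commit -q -m 'add lib and app'
echo 'v2' > "$work/donor/lib/util.txt"
git -C "$work/donor" commit -q -am 'update lib'

# An unrelated target repository.
git init -q "$work/target"
echo 'target' > "$work/target/README.txt"
git -C "$work/target" add .
git -C "$work/target" commit -q -m 'target start'

# Clone everything, then rewrite history so that only lib/ remains.
git clone -q "$work/donor" "$work/extract"
git -C "$work/extract" filter-branch --subdirectory-filter lib HEAD >/dev/null

# Merge the rewritten history into the target repository.
git -C "$work/target" fetch -q "$work/extract" HEAD
git -C "$work/target" merge -q --no-edit --allow-unrelated-histories FETCH_HEAD

# The file now lives in the target together with its two commits.
history=$(git -C "$work/target" log --format=%s -- util.txt)
echo "$history"
```

Even in this toy form the procedure needs a full clone and a history rewrite, which is the overhead the question complains about.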

    Best answer

    With Git 2.17 (Q2 2018, 10 years later), it is possible to do what Mercurial planned to implement: a "narrow clone", i.e. a clone where you retrieve data only for a specific subdirectory.
    This is also called a "partial clone".
    That differs from the existing options:

  • a shallow clone
  • copying what you need from a cloned repository into another working folder.

    See commit 3aa6694, commit aa57b87, commit 35a7ae9, commit 1e1e39b, commit acb0c57, commit bc2d0c3, commit 640d8b7, commit 10ac85c (8 Dec 2017) by Jeff Hostetler (jeffhostetler).
    See commit a1c6d7c, commit c0c578b, commit 548719f, commit a174334, commit 0b6069f (8 Dec 2017) by Jonathan Tan (jhowtan).
    (Merged by Junio C Hamano -- gitster -- in commit 6bed209, 13 Feb 2018)
    Here are the tests for a partial clone:
    git clone --no-checkout --filter=blob:none "file://$(pwd)/srv.bare" pc1 

    There are other commits involved in that implementation of a narrow/partial clone,
    in particular commit 8b4c010:

    sha1_file: support lazily fetching missing objects


    Teach sha1_file to fetch objects from the remote configured in extensions.partialclone whenever an object is requested but missing.
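A minimal sketch of a blob-less partial clone followed by a lazy fetch, assuming a reasonably recent Git on both sides. The repository names `src` and `pc1`, the file `README.txt` and the demo identity are hypothetical; `uploadpack.allowFilter` must be enabled on the serving repository, and `uploadpack.allowAnySHA1InWant` is set here as well, as the Git test suite does for its partial-clone tests.

```shell
set -eu
# Hypothetical identity so commits work in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)

# The repository that will act as the server.
git init -q "$work/src"
echo 'hello' > "$work/src/README.txt"
git -C "$work/src" add README.txt
git -C "$work/src" commit -q -m 'initial'

# The serving side must opt in to object filtering.
git -C "$work/src" config uploadpack.allowFilter true
git -C "$work/src" config uploadpack.allowAnySHA1InWant true

# Partial clone: commits and trees are transferred, blobs are not.
git clone -q --no-checkout --filter=blob:none "file://$work/src" "$work/pc1"

# Missing objects are printed with a leading "?"; this does not fetch them.
missing_before=$(git -C "$work/pc1" rev-list --objects --missing=print HEAD | grep -c '^?' || true)

# Reading the blob makes Git fetch it on demand from the promisor remote.
content=$(git -C "$work/pc1" show HEAD:README.txt)

missing_after=$(git -C "$work/pc1" rev-list --objects --missing=print HEAD | grep -c '^?' || true)
echo "missing before: $missing_before, after: $missing_after, content: $content"
```

The `file://` URL matters: a plain local path would make `git clone` use its local shortcut instead of the transport that understands filters.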



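    Warning about Git 2.17/2.18: the recently added experimental "partial clone" feature kicked in when it shouldn't, namely when no partial-clone filter was defined even though extensions.partialclone was set.
    See commit cac1137 (11 Jun 2018) by Jonathan Tan (jhowtan).
    (Merged by Junio C Hamano -- gitster -- in commit 92e1bbc, 28 Jun 2018)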

    upload-pack: disable object filtering when disabled by config


    When upload-pack gained partial clone support (v2.17.0-rc0~132^2~12, 2017-12-08), it was guarded by the uploadpack.allowFilter config item to allow server operators to control when they start supporting it.

    That config item didn't go far enough, though: it controls whether the 'filter' capability is advertised, but if a (custom) client ignores the capability advertisement and passes a filter specification anyway, the server would handle that despite allowFilter being false.

    This is particularly significant if a security bug is discovered in this new experimental partial clone code.
    Installations without uploadpack.allowFilter ought not to be affected since they don't intend to support partial clone, but they would be swept up into being vulnerable.
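As a rough illustration of the server-side switch, the sketch below serves a repository with `uploadpack.allowFilter` explicitly set to false and clones it with a stock client that asks for `--filter=blob:none`. The stock client respects the missing capability and falls back to a full transfer; the commit above is about custom clients that do not. The names `src` and `full` and the demo identity are hypothetical.

```shell
set -eu
# Hypothetical identity so commits work in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q "$work/src"
echo 'hello' > "$work/src/README.txt"
git -C "$work/src" add README.txt
git -C "$work/src" commit -q -m 'initial'

# Server operator opts out of partial clone support.
git -C "$work/src" config uploadpack.allowFilter false

# The client asks for a filter; the server does not advertise it,
# so Git warns and transfers every object.
git clone -q --no-checkout --filter=blob:none "file://$work/src" "$work/full" 2>/dev/null

# No object is reported as missing in the resulting clone.
missing=$(git -C "$work/full" rev-list --objects --missing=print HEAD | grep -c '^?' || true)
echo "missing objects: $missing"
```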



    This was enhanced in Git 2.20 (Q4 2018): "git fetch $repo $object" in a partial clone did not correctly fetch the requested object when it was referenced by an object in a promisor packfile, which has been fixed.
    See commit 35f9e3e, commit 4937291 (21 Sep 2018) by Jonathan Tan (jhowtan).
    (Merged by Junio C Hamano -- gitster -- in commit a1e9dff, 19 Oct 2018)

    fetch: in partial clone, check presence of targets


    When fetching an object that is known as a promisor object to the local repository, the connectivity check in quickfetch() in builtin/fetch.c succeeds, causing object transfer to be bypassed.
    However, this should not happen if that object is merely promised and not actually present.

    Because this happens, when a user invokes "git fetch origin <sha-1>" on the command-line, the <sha-1> object may not actually be fetched even though the command returns an exit code of 0. This is a similar issue (but with a different cause) to the one fixed by a0c9016 ("upload-pack: send refs' objects despite "filter"", 2018-07-09, Git v2.19.0-rc0).

    Therefore, update quickfetch() to also directly check for the presence of all objects to be fetched.



    You can list the objects of a partial clone, excluding the "promisor" objects, with git rev-list --exclude-promisor-objects:

    (For internal use only.) Prefilter object traversal at promisor boundary.
    This is used with partial clone.
    This is stronger than --missing=allow-promisor because it limits the traversal, rather than just silencing errors about missing objects.


    But make sure to use Git 2.21 (Q1 2019) to avoid a segfault.
    See commit 4cf6786 (5 Dec 2018) by Matthew DeVore (matvore).
    (Merged by Junio C Hamano -- gitster -- in commit c333fe7, 14 Jan 2019)

    "git rev-list --exclude-promisor-objects" had to take an object that does not exist locally (and is lazily available) from the command line without barfing, but the code dereferenced NULL.

    list-objects.c: don't segfault for missing cmdline objects

    When a command is invoked with both --exclude-promisor-objects, --objects-edge-aggressive, and a missing object on the command line, the rev_info.cmdline array could get a NULL pointer for the value of an 'item' field.
    Prevent dereferencing of a NULL pointer in that situation.



    Note that Git 2.21 (Q1 2019) also fixes a bug:
    See commit bbcde41 (3 Dec 2018) by Matthew DeVore (matvore).
    (Merged by Junio C Hamano -- gitster -- in commit 6e5be1f, 14 Jan 2019)

    exclude-promisor-objects: declare when option is allowed


    The --exclude-promisor-objects option causes some funny behavior in at least two commands: log and blame.
    It causes a BUG crash:

    $ git log --exclude-promisor-objects
    BUG: revision.c:2143: exclude_promisor_objects can only be used
    when fetch_if_missing is 0
    Aborted
    [134]

    Fix this such that the option is treated like any other unknown option.
    The commands that must support it are limited, so declare in those commands that the flag is supported.
    In particular:

    pack-objects
    prune
    rev-list

    The commands were found by searching for logic which parses --exclude-promisor-objects outside of revision.c.
    Extra logic outside of revision.c is needed because fetch_if_missing must be turned on before revision.c sees the option or it will BUG-crash. The above list is supported by the fact that no other command is introspectively invoked by another command passing --exclude-promisor-object.



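    Git 2.22 (Q2 2019) optimizes narrow clones:
    while running "git diff" in a lazy clone, we can know upfront which missing blobs we will need, instead of waiting for the on-demand machinery to discover them one by one.
    The aim is better performance, achieved by batching the requests for these promised blobs.
    See commit 7fbbcb2 (5 Apr 2019) and commit 0f4a4fb (29 Mar 2019) by Jonathan Tan (jhowtan).
    (Merged by Junio C Hamano -- gitster -- in commit 32dc15d, 25 Apr 2019)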

    diff: batch fetching of missing blobs


    When running a command like "git show" or "git diff" in a partial clone, batch all missing blobs to be fetched as one request.

    This is similar to c0c578b ("unpack-trees: batch fetching of missing blobs", 2017-12-08, Git v2.17.0-rc0), but for another command.



    Git 2.23 (Q3 2019) future-proofs the batched fetching of missing blobs.
    See commit 31f5256 (28 May 2019) by Derrick Stolee (derrickstolee).
    (Merged by Junio C Hamano -- gitster -- in commit 5d5c46b, 17 Jun 2019)

    sha1-file: split OBJECT_INFO_FOR_PREFETCH


    The OBJECT_INFO_FOR_PREFETCH bitflag was added to sha1-file.c in 0f4a4fb (sha1-file: support OBJECT_INFO_FOR_PREFETCH, 2019-03-29, Git v2.22.0-rc0) and is used to prevent the fetch_objects() method when enabled.

    However, there is a problem with the current use.
    The definition of OBJECT_INFO_FOR_PREFETCH is given by adding 32 to OBJECT_INFO_QUICK.
    This is clearly stated above the definition (in a comment) that this is so OBJECT_INFO_FOR_PREFETCH implies OBJECT_INFO_QUICK.
    The problem is that using "flag & OBJECT_INFO_FOR_PREFETCH" means that OBJECT_INFO_QUICK also implies OBJECT_INFO_FOR_PREFETCH.

    Split out the single bit from OBJECT_INFO_FOR_PREFETCH into a new OBJECT_INFO_SKIP_FETCH_OBJECT as the single bit and keep OBJECT_INFO_FOR_PREFETCH as the union of two flags.
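The bit arithmetic behind this fix can be reproduced with plain shell arithmetic. The sketch assumes OBJECT_INFO_QUICK has the value 8 (an assumption about the Git source; the commit message only says the prefetch flag was defined by adding 32 to it); the shell variable names are illustrative only.

```shell
set -eu
OBJECT_INFO_QUICK=8                               # assumed value of the existing flag

# Old definition: one constant carrying two bits (8 + 32 = 40).
OLD_FOR_PREFETCH=$(( OBJECT_INFO_QUICK + 32 ))

# A caller that only asked for QUICK.
flags=$OBJECT_INFO_QUICK

# Old test "flag & OBJECT_INFO_FOR_PREFETCH": the shared bit makes it non-zero.
if [ $(( flags & OLD_FOR_PREFETCH )) -ne 0 ]; then
    old_check=matches
else
    old_check=no-match
fi

# New definition: a dedicated single bit, with FOR_PREFETCH as a union of two flags.
OBJECT_INFO_SKIP_FETCH_OBJECT=32
NEW_FOR_PREFETCH=$(( OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK ))

# New test checks only the dedicated bit, so QUICK alone no longer triggers it.
if [ $(( flags & OBJECT_INFO_SKIP_FETCH_OBJECT )) -ne 0 ]; then
    new_check=matches
else
    new_check=no-match
fi

echo "old check: $old_check, new check: $new_check"
```

The union keeps the same numeric value as before, so callers that pass OBJECT_INFO_FOR_PREFETCH are unaffected; only the test for "skip the fetch" changes.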


    In addition, "git fetch" into a lazy clone forgot to fetch the base objects that are necessary to complete the deltas in a thin packfile, which has been corrected.
    See commit 810e193, commit 5718c53, commit 8a30a1e (11 Jun 2019) and commit 385d1bf (14 May 2019) by Jonathan Tan (jhowtan).
    (Merged by Junio C Hamano -- gitster -- in commit 8867aa8, 21 Jun 2019)

    index-pack: prefetch missing REF_DELTA bases


    When fetching, the client sends "have" commit IDs indicating that the server does not need to send any object referenced by those commits, reducing network I/O.
    When the client is a partial clone, the client still sends "have"s in this way, even if it does not have every object referenced by a commit it sent as "have".

    If a server omits such an object, it is fine: the client could lazily fetch that object before this fetch, and it can still do so after.

    The issue is when the server sends a thin pack containing an object that is a REF_DELTA against such a missing object: index-pack fails to fix the thin pack.
    When support for lazily fetching missing objects was added in 8b4c010 ("sha1_file: support lazily fetching missing objects", 2017-12-08, Git v2.17.0-rc0), support in index-pack was turned off in the belief that it accesses the repo only to do hash collision checks.
    However, this is not true: it also needs to access the repo to resolve REF_DELTA bases.

    Support for lazy fetching should still generally be turned off in index-pack because it is used as part of the lazy fetching process itself (if not, infinite loops may occur), but we do need to fetch the REF_DELTA bases.
    (When fetching REF_DELTA bases, it is unlikely that those are REF_DELTA themselves, because we do not send "have" when making such fetches.)

    To resolve this, prefetch all missing REF_DELTA bases before attempting to resolve them.
    This both ensures that all bases are attempted to be fetched, and ensures that we make only one request per index-pack invocation, and not one request per missing object.



    Git 2.24 (Q4 2019) fixes on-demand object fetching in a lazy clone, which incorrectly tried to fetch commits from submodule projects while still working in the superproject.
    See commit a63694f (20 Aug 2019) by Jonathan Tan (jhowtan).
    (Merged by Junio C Hamano -- gitster -- in commit d8b1ce7, 9 Sep 2019)

    diff: skip GITLINK when lazy fetching missing objs


    In 7fbbcb2 ("diff: batch fetching of missing blobs", 2019-04-08, Git v2.22.0-rc0), diff was taught to batch the fetching of missing objects when operating on a partial clone, but was not taught to refrain from fetching GITLINKs.
    Teach diff to check if an object is a GITLINK before including it in the set to be fetched.



    Git 2.24 (Q4 2019) also introduces the concept of a promisor remote.
    See commit 4ca9474, commit 60b7a92, commit db27dca, commit 75de085, commit 7e154ba, commit 9a4c507, commit 5e46139, commit fa3d1b6, commit b14ed5a, commit faf2abf, commit 9cfebc1, commit 9e27bea, commit 48de315, commit 2e86067, commit c59c7c8 and commit b9ac6c5 by Christian Couder (chriscool), and the partial-clone documentation.
    (Merged by Junio C Hamano -- gitster --, 18 Sep 2019; see also commit 90d21f9 and commit 5a133e8.)
    commit 489fc9e defines a promisor remote as:

    A remote that can later provide the missing objects is called a promisor remote, as it promises to send the objects when requested.

    Initially Git supported only one promisor remote, the origin remote from which the user cloned and that was configured in the "extensions.partialClone" config option.
    Later support for more than one promisor remote has been implemented.

    Many promisor remotes can be configured and used.

    This allows for example a user to have multiple geographically-close cache servers for fetching missing blobs while continuing to do filtered git-fetch commands from the central server.

    Remotes that are considered "promisor" remotes are those specified by the following configuration variables:

    • extensions.partialClone = <name>
    • remote.<name>.promisor = true
    • remote.<name>.partialCloneFilter = ...

    Only one promisor remote can be configured using the extensions.partialClone config variable. This promisor remote will be the last one tried when fetching objects.
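A minimal sketch of how these variables might be set, assuming Git 2.24 or later. The remote names `origin` and `cache` and the example.invalid URLs are hypothetical, and nothing is fetched: the commands only write configuration. The repository format version is raised to 1 first, since `extensions.*` settings belong to that format.

```shell
set -eu
repo=$(mktemp -d)/client
git init -q "$repo"

# Two hypothetical remotes: a central server and a nearby cache.
git -C "$repo" remote add origin https://example.invalid/central.git
git -C "$repo" remote add cache https://example.invalid/cache.git

# Mark the cache as a promisor remote with its own default filter.
git -C "$repo" config remote.cache.promisor true
git -C "$repo" config remote.cache.partialCloneFilter blob:none

# The single promisor remote named by extensions.partialClone
# is the last one tried when fetching missing objects.
git -C "$repo" config core.repositoryFormatVersion 1
git -C "$repo" config extensions.partialClone origin

# Show every remote explicitly flagged as a promisor.
promisors=$(git -C "$repo" config --get-regexp '^remote\..*\.promisor$')
echo "$promisors"
```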



    Git 2.24 (Q4 2019) also improves the concept of filters in a partial clone.
    See commit c269495, commit cf9ceb5, commit f56f764, commit e987df5, commit 842b005, commit 7a7c7f4, commit 9430147, commit 627b826 (27 Jun 2019) and commit 95acf11 by Matthew DeVore (matvore).
    (Merged by Junio C Hamano -- gitster --, 18 Sep 2019; see also commit c14b6f8 and commit 1c37e86.)
    It allows:
    • combining filters such that only objects accepted by all filters are shown.
      The motivation for this is to allow getting directory listings without also fetching blobs. This can be done by combining blob:none with tree:<depth>.
      There are massive repositories that have larger-than-expected trees - even if you include only a single commit.

      A combined filter supports any number of subfilters, and is written in the following form:

      combine:<filter 1>+<filter 2>+<filter 3>

    • combining of multiple filters by simply repeating the --filter flag.
      Before, the user had to combine them in a single flag somewhat awkwardly (e.g. --filter=combine:FOO+BAR), including URL-encoding the individual filters.
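The effect of a combined filter can be observed locally with `git rev-list --objects --filter=...`, without any remote. The sketch below assumes Git 2.24 or later; the repository layout (one file at the top level, one in `dir/`) and the demo identity are hypothetical.

```shell
set -eu
# Hypothetical identity so commits work in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)/repo
git init -q "$repo"
mkdir "$repo/dir"
echo 'a' > "$repo/a.txt"
echo 'b' > "$repo/dir/b.txt"
git -C "$repo" add .
git -C "$repo" commit -q -m 'two files'

# Count the objects reachable from HEAD under a given set of rev-list options.
count() { git -C "$repo" rev-list --objects "$@" HEAD | wc -l | tr -d ' '; }

# No filter: commit, root tree, dir tree and two blobs.
all=$(count)

# blob:none drops the blobs but keeps every tree.
no_blobs=$(count --filter=blob:none)

# Combined with tree:1, only the commit and its root tree remain.
combined=$(count --filter=combine:blob:none+tree:1)

echo "all=$all no_blobs=$no_blobs combined=$combined"
```

Neither `blob:none` nor `tree:1` contains a character that needs URL-encoding inside a `combine:` spec, which is why they can be written literally here.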


    With Git 2.27 (Q2 2020), "git diff" in a partial clone learned to avoid lazily loading blob objects in more cases where they are not needed.
    See commit db7ed74, commit 8f5dc5a (7 Apr 2020) and commit 23547c4 (2 Apr 2020) by Jonathan Tan (jhowtan).
    (Merged by Junio C Hamano -- gitster -- in commit 625e7f1, 28 Apr 2020)

    diff: restrict when prefetching occurs

    Helped-by: Jeff King
    Signed-off-by: Jonathan Tan


    Commit 7fbbcb21b1 ("diff: batch fetching of missing blobs", 2019-04-08, Git v2.22.0-rc0 -- merge listed in batch #7) optimized "diff" by prefetching blobs in a partial clone, but there are some cases wherein blobs do not need to be prefetched.
    In these cases, any command that uses the diff machinery will unnecessarily fetch blobs.


    diffcore_std() may read blobs when it calls the following functions:

    1. diffcore_skip_stat_unmatch() (controlled by the config variable diff.autorefreshindex)
    2. diffcore_break() and diffcore_merge_broken() (for break-rewrite detection)
    3. diffcore_rename() (for rename detection)
    4. diffcore_pickaxe() (for detecting addition/deletion of specified string)

    Instead of always prefetching blobs, teach diffcore_skip_stat_unmatch(), diffcore_break(), and diffcore_rename() to prefetch blobs upon the first read of a missing object.
    This covers (1), (2), and (3): to cover the rest, teach diffcore_std() to prefetch if the output type is one that includes blob data (and hence blob data will be required later anyway), or if it knows that (4) will be run.



    Note that the lazy fetching done internally to make missing objects available in a partial clone incorrectly made permanent damage to the partial clone filter in the repository; this has been corrected with Git 2.29 (Q4 2020).
    See commit e68f0a4 (28 Sep 2020) and a further commit (21 Sep 2020).
    (Merged 5 Oct 2020)

    fetch: do not override partial clone filter

    Signed-off-by: Jonathan Tan


    When a fetch with the --filter argument is made, the configured default filter is set even if one already exists. This change was made in 5e46139376 ("builtin/fetch: remove unique promisor remote limitation", 2019-06-25, Git v2.24.0-rc0 -- merge listed in batch #3) - in particular, changing from:

     * If this is the FIRST partial-fetch request, we enable partial
     * on this repo and remember the given filter-spec as the default
     * for subsequent fetches to this remote.

    to:

     * If this is a partial-fetch request, we enable partial on
     * this repo if not already enabled and remember the given
     * filter-spec as the default for subsequent fetches to this
     * remote.

    (The given filter-spec is "remembered" even if there is already an existing one.)

    This is problematic whenever a lazy fetch is made, because lazy fetches are made using "git fetch --filter=blob:none", but this will also happen if the user invokes "git fetch --filter=<filter>" manually. Therefore, restore the behavior prior to 5e46139376, which writes a filter-spec only if the current fetch request is the first partial-fetch one (for that remote).

    A similar question about "git - Is there any distributed revision control system that supports partial checkout/clone?" can be found on Stack Overflow: https://stackoverflow.com/questions/3098029/
