Re: RFC: Design and code of partial clones (now, missing commits and trees OK)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 09/22/2017 12:42 AM, Jonathan Tan wrote:
> On Thu, 21 Sep 2017 13:57:30 Jeff Hostetler <git@xxxxxxxxxxxxxxxxx> wrote:
> [...]
>> I struggled with the terms here a little when looking at the source.
>> () A remote responding to a partial-clone is termed a
>> "promisor-remote". () Packfiles received from a promisor-remote are
>> marked with "<name>.promisor" like "<name>.keep" names.
>> () An object actually contained in such packfiles is called a
>> "promisor-object". () An object not-present but referenced by one of
>> the above promisor-objects is called a "promised-object" (aka a
>> "missing-object").
>>
>> I think the similarity of the words "promisOR" and "promisED" threw
>> me here and as I was looking at the code.  The code in is_promised()
>> [1] looked like it was adding all promisor- and promised-objects to
>> the "promised" OIDSET, but I could be mistaken.
>>
>> [1]
>> https://github.com/jonathantanmy/git/commit/7a9c2d9b6e2fce293817b595dee29a7eede0dddd#diff-5d5d5dc185ef37dc30bb7d9a7ae0c4e8R1960
> 
> I was struggling a bit with the terminology, true.
> 
> Right now I'm thinking of:
>  - promisor remote (as you defined)
>  - promisor packfile (as you defined)
>  - promisor object is an object known to belong to the promisor (whether
>    because we have it in a promisor packfile or because it is referenced
>    by an object in a promisor packfile)

Maybe naming has been discussed at length before, and I am jumping into
a long-settled topic. And admittedly this is bikeshedding.

But I find these names obscure, even as a developer. And terms like this
will undoubtedly bleed into the UI and documentation, so it would be
good to put some effort into choosing the best names possible.

I suppose that the term "promisor" comes from the computer science term
"promise" [1]. In that sense it is apt, because, say, a promisor object
is something that is known to be obtainable, but we don't have it yet.

But from the user's point of view, I think this isn't a very
illuminating term. I think the user's mental model will be that there is
a distinguished remote repository that holds the project's entire
published history, and she has to remain connected to it for certain Git
operations to work [2]. Another interesting aspect of this remote is
that it has to be trusted never (well, almost never) to discard any
objects [3].

Let me brainstorm about other names or concepts that seem closer to the
user's mental model:

* "backing remote", "backing repository"

* "lazy remote", "live remote", "cloud remote", "shared remote",
  "on-demand remote"

* "full remote", "deep remote", "permanent remote"

* "attached remote", "bound remote", "linked remote"

* "trusted remote", "authoritative remote", "official remote"
  (these terms probably imply more than we want)

* "upstream", "upstream remote" (probably too confusing with how
  the term "upstream" is currently used, even if in practice they
  will often be the same remote)

* "object depot", "depot remote", "depot repository", "remote
  object depot" (I don't like the term "object" that much, because
  it is rather remote from the Git user's daily life)

* "repository of record", "remote of record" (too wordy)

* "history depot", "history warehouse" (meh)

* (dare I say it?) "central repository"

* "object archive", "archival remote" (not so good because we already
  use "archive" for other concepts)

* depository (sounds too much like "repository")

* The thesaurus suggests nouns like: store, bank, warehouse, library,
  chronicle, annals, registry, locker, strongbox, attic, bunker

Personally I think "lazy remote" and "backing remote" are not too bad.

Michael

[1] https://en.wikipedia.org/wiki/Futures_and_promises

[2] I haven't checked whether the current proposal allows for
    multiple "promisor remotes". It's certainly thinkable, if not
    now then in the future. But I suppose that even then, 99% of
    users will configure a single "promisor remote" for each
    repository.

[3] For those rare occasions where the server has to discard objects,
    it might make sense for the server to remember the names of the
    objects that were deleted, so that it can tell clients "no, you're
    not insane. I used to have that object but it has intentionally
    been obliterated", and possibly even a reason: "it is now taboo"
    vs. "I got tired of carrying it around".



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux