On 2019-12-23 at 13:00:46, Arnaud Bertrand wrote:
> Hello,
>
> According to my understanding, git has only 3 kinds of objects:
> (excluding the packed version)
> - the blobs
> - the trees
> - the commits

There are also tags.

> Today to parse all objects of the same type, it is necessary to parse
> all the objects and test them one by one.

This isn't a behavior we often want.  Can you say more about why you
want to do this?

> May be due to my limited knowledge of git, I don't see any advantage
> to put everything together.
> By splitting the objects directory, the gain in performance could be
> important, the scripts simplified, the representation more clear.

Oftentimes, we want to look up an item that we would refer to as a
tree-ish.  That means that any tag, commit, or tree can be used in this
case and it will automatically be resolved to the appropriate tree.

Currently, we can look for any loose object, and then look for any
packed object, which is a limited number of lookups (at most, the number
of packs plus one).  Your proposal would have us look up at most the
number of packs plus six.

In addition, we sometimes know that we need to look up an object, but
don't know its type.  We would incur additional costs in this case as
well.

I'm not sure that we would gain a lot other than conceptual tidiness,
but we would incur additional performance costs.  We can currently
distinguish between the type of all of these objects by simply reading
the object header, which on a 64-bit system cannot exceed 28 bytes,
which we do in some cases, such as `git cat-file --batch`.
--
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204
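
To illustrate the point above about reading only the object header, here
is a minimal Python sketch.  It is not Git's implementation; the function
names `make_loose_object` and `read_object_header` are hypothetical, and
it assumes the loose-object layout of a zlib stream whose inflated bytes
start with `<type> <size>` followed by a NUL byte.  The 32-byte read
limit is an assumption chosen to sit just above the 28-byte bound
mentioned above.

```python
import zlib


def make_loose_object(obj_type, payload):
    """Build the compressed bytes of a loose object (hypothetical helper).

    Assumed layout: zlib("<type> <size>\\0" + payload).
    """
    header = f"{obj_type} {len(payload)}".encode("ascii") + b"\0"
    return zlib.compress(header + payload)


def read_object_header(compressed, max_header=32):
    """Return (type, size) while inflating at most max_header bytes.

    The payload itself is never fully decompressed, so the cost of
    learning an object's type does not grow with the object's size.
    """
    inflater = zlib.decompressobj()
    # The second argument caps how many inflated bytes are returned.
    prefix = inflater.decompress(compressed, max_header)
    nul = prefix.index(b"\0")  # raises ValueError if no header terminator
    obj_type, size = prefix[:nul].split(b" ")
    return obj_type.decode("ascii"), int(size)


if __name__ == "__main__":
    data = make_loose_object("blob", b"hello\n")
    print(read_object_header(data))
```

The same function works regardless of whether the object is a blob, tree,
commit, or tag, which is why the type does not need to be encoded in the
directory layout to be found cheaply.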