On 2024-10-18 at 09:50:00, Aswin Benny wrote:
> I would like to request the following features to git :
> 1. A command or feature to get the size of the repo without cloning it
> to the system
> 2. An option to know the size of objects that will be downloaded forehand.

These are actually really difficult to know without actually performing the operation. For example, GitHub and many other forges store all of the objects in a repository network in a single alternate, but only a part of those objects (the ones in the repository you're cloning or fetching) are included.

In addition, to know the size of the pack being generated, there's no more efficient way than generating the pack. For example, a repository with identical structure but containing 500 MB of text files (source code, literature, etc.) will be much smaller than a repository with 500 MB of random data because the former deltifies and compresses much better than the latter. We don't know the size of the pack file being sent until we've actually compressed and deltified the objects.

We can, of course, make estimations of this data based on what's on disk on the server side. But, just like with GitHub's API, it's not always possible to know exactly, and some users will be unhappy with a value that's not exactly correct. (I can confirm there are users who feel this way about GitHub's API functionality, and I understand their concerns.)

Given this, I'm not super excited about adding this feature to Git, because I think it will set us up for a lot of complaints when the data isn't exactly correct, especially when the data is far off from the actual value, and I don't think the utility is worth it. But perhaps you or someone else can write a patch and it will be accepted, with the proviso that the data might not be correct, and users will still find it useful.

-- 
brian m. carlson (they/them or he/him)
Toronto, Ontario, CA
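
As a rough illustration of the compressibility point above, the following minimal sketch compresses two equally sized inputs with zlib, the compression library Git uses for its objects. It is not how Git builds a pack: deltification is not modeled at all, and the sample line, the 1 MB size, and the variable names are all assumptions made up for the example.

```python
import os
import zlib

SIZE = 1_000_000  # raw bytes in each hypothetical input (assumed value)

# Repetitive text stands in for source code or prose.
line = b"int main(void) { return 0; } /* example source line */\n"
text_like = (line * (SIZE // len(line) + 1))[:SIZE]

# Random bytes stand in for already-compressed or encrypted content.
random_like = os.urandom(SIZE)

# Compress both inputs with zlib's default settings and compare sizes.
text_packed = len(zlib.compress(text_like))
random_packed = len(zlib.compress(random_like))

print(f"text-like:   {SIZE} -> {text_packed} bytes")
print(f"random-like: {SIZE} -> {random_packed} bytes")
```

The two inputs have the same raw size, yet the compressed sizes differ widely, and the only way to learn either number here is to run the compression. Real repositories fall somewhere between these two extremes.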