On Tue, Dec 13, 2016 at 9:05 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Christian Couder <christian.couder@xxxxxxxxx> writes:
>
>> In general I think that having a lot of refs is really a big problem
>> right now in Git as many big organizations using Git are facing this
>> problem in one form or another.
>> So I think that support for a big number of refs is a separate and
>> important problem that should and hopefully will be solved.
>
> But you do not have to make it worse.
>
> Is "refs" a good match for the problem you are solving? Or is it
> merely an expedient thing to use? I think it is the latter, judging
> by your mentioning RefTree. Whatever mechanism we choose, that will
> be carved into stone in users' repositories and you'd end up having
> to support it, and devise the migration path out of it if the initial
> selection is too problematic.
>
> That is why people (not just me) pointed out upfront that using refs
> for this purpose would not scale.

What I should perhaps have clarified in my previous answer, and also
in the documentation of the patch series, is that in what I have done
and what I propose, the external odb helper is responsible for using
and creating the refs in refs/odbs/<odbname>/. So this helper is free
to create just one ref, as it is also free to create many refs. Git is
just transmitting the refs that have been created by this helper.

Right now people are already free to use whatever external script or
software to create whatever refs/stuff/* they want, pointing to
whatever objects they want, and have Git transmit that. And indeed I
know that it is already a problem out there, as people then often get
into trouble related to having many refs. But it is a different
problem, one that is not going to be solved in this patch series
anyway.

So if some people want to use a specific external odb, it's their
responsibility to use a helper that will not create too many refs.
If they know that they just need their external odb to handle around
10 big files, why wouldn't they use a simple helper that creates one
odb ref per big file/blob?

On the contrary, if they know that they will need to handle thousands
of big files, then, yeah, they should find or implement a helper that
will, as I suggested in my previous email, just create one ref in
refs/odbs/<odbname>/ that points to a blob that contains a list (maybe
a JSON list with information attached to each item) of the blobs
stored in the external odb.

For testing purposes in the patch series, I use only simple helpers
that create one odb ref per big file/blob. So yes, it sets a bad
example, because if people just copy this design while they need the
e-odb to handle a big number of files, then they will be in trouble.
But this does not by itself carve anything into stone.

One thing that could help is perhaps to put big warnings into the
simple helpers saying "Be careful!!! This will not scale if you want
to handle more than a small number of large files!!! You'd better use
a helper that does <this and that> if you want to handle many large
files!!! You have been warned!!!".

So I am reluctant at this point to write a complex helper just for the
purpose of showing a good example to people who want to use e-odb to
store a big number of files, as these people would probably need
something like Lars' "filter process protocol" too anyway.
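
To make the single-ref variant a bit more concrete, here is a rough
sketch of what such a helper could do. It is not code from the patch
series. The ref name "manifest" and the "oid"/"size" fields are made
up for illustration; only the refs/odbs/<odbname>/ prefix comes from
the proposal. It relies on the existing `git hash-object -w --stdin`
and `git update-ref` commands.

```python
import json
import subprocess


def build_manifest(entries):
    """Serialize the list of externally stored blobs as a JSON document."""
    # Sort by object id so the same set of blobs always yields the same bytes.
    # The "oid" key is a hypothetical field name, not part of the proposal.
    items = sorted(entries, key=lambda e: e["oid"])
    return json.dumps(items, indent=1, sort_keys=True) + "\n"


def manifest_ref(odb_name):
    """Return the name of the single ref used for this external odb."""
    # "manifest" is an assumed name; the proposal only fixes the prefix.
    return "refs/odbs/{}/manifest".format(odb_name)


def publish_manifest(odb_name, entries, repo="."):
    """Store the manifest as a blob and point one ref at it."""
    data = build_manifest(entries).encode("utf-8")
    # `git hash-object -w --stdin` writes the blob and prints its object id.
    result = subprocess.run(
        ["git", "-C", repo, "hash-object", "-w", "--stdin"],
        input=data, stdout=subprocess.PIPE, check=True,
    )
    oid = result.stdout.decode().strip()
    # A ref outside refs/heads/ may point at a blob, so one ref is enough.
    subprocess.run(
        ["git", "-C", repo, "update-ref", manifest_ref(odb_name), oid],
        check=True,
    )
    return oid
```

With something like this, adding a big file to the external odb means
rewriting the manifest blob and updating that one ref, so the number
of refs Git has to transmit stays constant however many blobs the
helper handles.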