Han Xin <chiyutianyi@xxxxxxxxx> writes:

> From: Han Xin <hanxin.hx@xxxxxxxxxxxxxxx>
>
> Although we do not recommend users push large binary files to the git
> repositories, it's difficult to prevent them from doing so. Once, we
> found a problem with a surge in memory usage on the server. The source
> of the problem is that a user submitted a single object with a size of
> 15GB. Once someone initiates a git push, the git process will
> immediately allocate 15G of memory, resulting in an OOM risk.
>
> Through further analysis, we found that when we execute git
> unpack-objects, in unpack_non_delta_entry(), "void *buf =
> get_data(size);" will directly allocate memory equal to the size of
> the object. This is quite a scary thing, because the pre-receive hook
> has not been executed at this time, and we cannot avoid this by hooks.
>
> I got inspiration from the deflate process of zlib, maybe it would be
> a good idea to change unpack-objects to stream deflate.

Hi, Jeff.

I hope you can share how GitHub solves this problem. As you said in
your reply at
https://lore.kernel.org/git/YVaw6agcPNclhws8@xxxxxxxxxxxxxxxxxxxxxxx/:
"we don't have a match in unpack-objects, but we always run index-pack
on incoming packs".

In the existing implementation of "index-pack", objects larger than
big_file_threshold are run through a "fixed_buf" of 8192 bytes to
compute their "oid", so the whole object never has to be held in
memory.

I tried the implementation on jk/no-more-unpack-objects, where you
noted:

  /* XXX This will expand too-large objects! */
  if (!data)
          data = new_data = get_data_from_pack(obj_entry);

So if "--unpack" is given, the same risk is still there. When I create
an object larger than 1GB and run index-pack on it, the result is:

  $ GIT_ALLOC_LIMIT=1024m git index-pack --unpack --stdin <large.pack
  fatal: attempting to allocate 1228800001 over limit 1073741824

For reference, I have appended below a sketch of the allocation path in
question and of the kind of chunked inflate loop I have in mind.

Looking forward to your reply.
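For anyone following along, the non-delta path in
builtin/unpack-objects.c that the quoted cover letter refers to boils
down to roughly the following (my paraphrase from memory, so the
details around get_data() may not match the tree exactly):

  static void unpack_non_delta_entry(enum object_type type,
                                     unsigned long size, unsigned nr)
  {
          /*
           * get_data() inflates the whole entry into a single buffer,
           * so a 15GB blob means a single 15GB allocation before any
           * hook has had a chance to reject the push.
           */
          void *buf = get_data(size);

          if (!dry_run && buf)
                  write_object(nr, type, buf, size);
          else
                  free(buf);
  }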
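And this is a minimal, self-contained sketch (plain zlib, not git code;
the names inflate_in_chunks, chunk_fn and the 8192-byte CHUNK are only
illustrative) of what I mean by streaming the inflate: each
decompressed chunk is handed to a consumer, e.g. an incremental hash
update, instead of being accumulated in one big buffer.

  #include <stdio.h>
  #include <zlib.h>

  #define CHUNK 8192

  typedef void (*chunk_fn)(const unsigned char *buf, size_t len, void *ctx);

  /*
   * Inflate a zlib stream from 'in' in CHUNK-sized pieces and hand each
   * piece to 'consume', so peak memory stays at O(CHUNK) no matter how
   * large the uncompressed object is.
   */
  static int inflate_in_chunks(FILE *in, chunk_fn consume, void *ctx)
  {
          unsigned char inbuf[CHUNK], outbuf[CHUNK];
          z_stream zs = { 0 };
          int ret = Z_OK;

          if (inflateInit(&zs) != Z_OK)
                  return -1;

          while (ret != Z_STREAM_END) {
                  zs.avail_in = fread(inbuf, 1, sizeof(inbuf), in);
                  if (ferror(in) || !zs.avail_in)
                          break;  /* read error or truncated stream */
                  zs.next_in = inbuf;

                  do {    /* drain all output for this input chunk */
                          zs.avail_out = sizeof(outbuf);
                          zs.next_out = outbuf;
                          ret = inflate(&zs, Z_NO_FLUSH);
                          if (ret == Z_STREAM_ERROR || ret == Z_NEED_DICT ||
                              ret == Z_DATA_ERROR || ret == Z_MEM_ERROR) {
                                  inflateEnd(&zs);
                                  return -1;
                          }
                          consume(outbuf, sizeof(outbuf) - zs.avail_out, ctx);
                  } while (zs.avail_out == 0);
          }

          inflateEnd(&zs);
          return ret == Z_STREAM_END ? 0 : -1;
  }

The consumer here would be the incremental oid computation (and, on the
writing side, a streaming loose-object writer), which is more or less
what "index-pack" already does for over-threshold blobs with its
8192-byte "fixed_buf".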