On Mon, Mar 21, 2022 at 4:02 PM Neeraj Singh <nksingh85@xxxxxxxxx> wrote:
>
> On Mon, Mar 21, 2022 at 10:55 AM Junio C Hamano <gitster@xxxxxxxxx> wrote:
> >
> > "Neeraj Singh via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
> >
> > > From: Neeraj Singh <neerajsi@xxxxxxxxxxxxx>
> > >
> > > The unpack-objects functionality is used by fetch, push, and fast-import
> > > to turn the transfered data into object database entries when there are
> > > fewer objects than the 'unpacklimit' setting.
> > >
> > > By enabling bulk-checkin when unpacking objects, we can take advantage
> > > of batched fsyncs.
> >
> > This feels confused in that we dispatch to unpack-objects (instead
> > of index-objects) only when the number of loose objects should not
> > matter from performance point of view, and bulk-checkin should shine
> > from performance point of view only when there are enough objects to
> > batch.
> >
> > Also if we ever add "too many small loose objects is wasteful, let's
> > send them into a single 'batch pack'" optimization, it would create
> > a funny situation where the caller sends the contents of a small
> > incoming packfile to unpack-objects, but the command chooses to
> > bunch them all together in a packfile anyway ;-)
> >
> > So, I dunno.
> >
>
> I'd be happy to just drop this patch. I originally added it to answer Avarab's
> question: how does batch mode compare to packfiles? [1] [2].
>
> [1] https://lore.kernel.org/git/87mtp5cwpn.fsf@xxxxxxxxxxxxxxxxxxx/
> [2] https://lore.kernel.org/git/pull.1076.v5.git.git.1632514331.gitgitgadget@xxxxxxxxx/

Well looking back again at the spreadsheet [3], at 90 objects, which is
below the default transfer.unpackLimit, we see a 3x difference in
performance between batch mode and the default fsync mode. That's a
different interaction class (230 ms versus 760 ms). I'll include a small
table in the commit description with these performance numbers to help
justify it.

[3] https://docs.google.com/spreadsheets/d/1uxMBkEXFFnQ1Y3lXKqcKpw6Mq44BzhpCAcPex14T-QQ