Re: [PATCH v2 4/7] unpack-objects: use the bulk-checkin infrastructure

On Mon, Mar 21, 2022 at 4:02 PM Neeraj Singh <nksingh85@xxxxxxxxx> wrote:
>
> On Mon, Mar 21, 2022 at 10:55 AM Junio C Hamano <gitster@xxxxxxxxx> wrote:
> >
> > "Neeraj Singh via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
> >
> > > From: Neeraj Singh <neerajsi@xxxxxxxxxxxxx>
> > >
> > > The unpack-objects functionality is used by fetch, push, and fast-import
> > > to turn the transfered data into object database entries when there are
> > > fewer objects than the 'unpacklimit' setting.
> > >
> > > By enabling bulk-checkin when unpacking objects, we can take advantage
> > > of batched fsyncs.
> >
> > This feels confused in that we dispatch to unpack-objects (instead
> > of index-objects) only when the number of loose objects should not
> > matter from performance point of view, and bulk-checkin should shine
> > from performance point of view only when there are enough objects to
> > batch.
> >
> > Also if we ever add "too many small loose objects is wasteful, let's
> > send them into a single 'batch pack'" optimization, it would create
> > a funny situation where the caller sends the contents of a small
> > incoming packfile to unpack-objects, but the command chooses to
> > bunch them all together in a packfile anyway ;-)
> >
> > So, I dunno.
> >
>
> I'd be happy to just drop this patch.  I originally added it to answer Avarab's
> question: how does batch mode compare to packfiles? [1] [2].
>
> [1] https://lore.kernel.org/git/87mtp5cwpn.fsf@xxxxxxxxxxxxxxxxxxx/
> [2] https://lore.kernel.org/git/pull.1076.v5.git.git.1632514331.gitgitgadget@xxxxxxxxx/

Well, looking back at the spreadsheet [3]: at 90 objects, which is
below the default transfer.unpackLimit, we see a 3x difference in
performance between batch mode and the default fsync mode.  That's a
different interaction class (230 ms versus 760 ms).
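
For anyone who wants to check the arithmetic, here is a rough sketch
that derives the ratio and a per-object cost from the two timings
quoted above.  The helper names are made up for illustration, and the
per-object figure assumes the cost is spread evenly across the 90
objects, which the spreadsheet does not claim.

```python
# Timings quoted above for the 90-object case (spreadsheet [3]).
OBJECTS = 90
BATCH_MS = 230          # batch mode
DEFAULT_FSYNC_MS = 760  # default fsync mode


def speedup(slow_ms, fast_ms):
    """How many times faster the fast run is than the slow run."""
    return slow_ms / fast_ms


def per_object_ms(total_ms, objects):
    """Average cost per object, assuming the cost is evenly spread."""
    return total_ms / objects


ratio = speedup(DEFAULT_FSYNC_MS, BATCH_MS)
print(f"speedup: {ratio:.1f}x")
print(f"default fsync: {per_object_ms(DEFAULT_FSYNC_MS, OBJECTS):.1f} ms/object")
print(f"batch mode:    {per_object_ms(BATCH_MS, OBJECTS):.1f} ms/object")
```

The ratio comes out slightly above 3, consistent with the "3x" figure.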

I'll include a small table with these performance numbers in the
commit description to help justify the patch.

[3] https://docs.google.com/spreadsheets/d/1uxMBkEXFFnQ1Y3lXKqcKpw6Mq44BzhpCAcPex14T-QQ
