Re: [PATCH v4 4/6] update-index: use the bulk-checkin infrastructure

On Mon, Sep 20 2021, Neeraj Singh via GitGitGadget wrote:

> From: Neeraj Singh <neerajsi@xxxxxxxxxxxxx>
>
> The update-index functionality is used internally by 'git stash push' to
> setup the internal stashed commit.
>
> This change enables bulk-checkin for update-index infrastructure to
> speed up adding new objects to the object database by leveraging the
> pack functionality and the new bulk-fsync functionality. This mode
> is enabled when passing paths to update-index via the --stdin flag,
> as is done by 'git stash'.
>
> There is some risk with this change, since under batch fsync, the object
> files will not be available until the update-index is entirely complete.
> This usage is unlikely, since any tool invoking update-index and
> expecting to see objects would have to snoop the output of --verbose to
> find out when update-index has actually processed a given path.
> Additionally the index is locked for the duration of the update.

Would you really need to sniff the verbose output? Before this change,
if I was streaming data to update-index, it looks like I could assume
that update-index had done the work for a path once I'd managed to
fflush() that line to it, since it processes its input a line at a time
and does the work inside that line-at-a-time loop.

I.e. you could print lines to it, and then do concurrent object lookups
knowing the data was written already...
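
To make the difference concrete, here is a toy model of the two behaviors.
It is not git's code: ObjectStore and both process_* functions are
hypothetical names, and "visible" simply stands in for "a concurrent
reader can find the object in the object database".

```python
class ObjectStore:
    """Toy stand-in for an object database (hypothetical, not git's API)."""

    def __init__(self):
        self.visible = set()  # objects a concurrent reader can find
        self.staged = set()   # objects written but not yet published


def process_line_at_a_time(store, paths):
    """Publish each object as soon as its input line is handled."""
    seen_after_each = []
    for path in paths:
        store.visible.add(path)
        # A reader looking right after this line was consumed finds it.
        seen_after_each.append(path in store.visible)
    return seen_after_each


def process_batched(store, paths):
    """Stage every object and publish them all when the input ends."""
    seen_after_each = []
    for path in paths:
        store.staged.add(path)
        # A reader looking now finds nothing: the object is only staged.
        seen_after_each.append(path in store.visible)
    store.visible |= store.staged
    store.staged.clear()
    return seen_after_each
```

In the first function a writer that has fed a line can look the object up
straight away; in the second, the same lookup fails until the whole
stream has been consumed.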

I think this is probably fine, but that case seems far likelier than
someone reading back the verbose output (presumably the "add" line
printed from update_one()), which is itself called from the
getline_fn() loop...

All of this makes me wonder why this isn't using tmp-objdir.c, i.e. we
could have our cake and eat it too by writing the "real" objects, and
then just renaming them between directories instead. But perhaps the
answer has something to do with the metadata issues I raised.
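
As a rough illustration of the staging-then-rename idea (this is not how
tmp-objdir.c is implemented; write_via_staging and the directory layout
are made up for this sketch), the complete files are written in a staging
directory and only then moved into place:

```python
import os
import shutil
import tempfile


def write_via_staging(objdir, objects):
    """Write complete files into a staging dir, then rename them into objdir.

    `objects` maps a file name to its bytes. The staging directory is
    created inside objdir so that the renames stay on one filesystem.
    """
    staging = tempfile.mkdtemp(prefix="tmp_objdir_", dir=objdir)
    try:
        for name, data in objects.items():
            with open(os.path.join(staging, name), "wb") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())  # make the contents durable first
        # Only fully written files are ever moved into the real directory.
        for name in objects:
            os.replace(os.path.join(staging, name),
                       os.path.join(objdir, name))
    finally:
        # Remove the staging directory, including leftovers after an error.
        shutil.rmtree(staging, ignore_errors=True)
```

A reader of objdir never sees a partially written file under its final
name, but it also sees nothing at all until the rename pass runs.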

And well, tmp-objdir.c isn't going to help anyone who in practice is
relying on this "update-index --stdin" behavior, as they won't know
where we staged the temporary files...



