Re: Distinguishing FF vs non-FF updates in the reflog?

On Wed, Mar 17, 2021 at 09:06:06PM +0100, Han-Wen Nienhuys wrote:

> I'm working on some extensions to Gerrit for which it would be very
> beneficial if we could tell from the reflog if an update is a
> fast-forward or not: if we find a SHA1 in the reflog, and see there
> were only FF updates since, we can be sure that the SHA1 is reachable
> from the branch, without having to open packfiles and decode commits.

I left some numbers in another part of the thread, but IMHO performance
isn't that compelling a reason to do this these days, if you are using
commit-graphs.

Just walking the reflog might be _slightly_ faster, though not
necessarily (it depends on which is deeper: the stretch of the object
graph you would have to walk, or the reflog chain). It might matter more
if you are
using a more exotic storage scheme, where switching from accessing
reflogs to objects implies extra round-trips to a server (e.g., custom
storage backends with JGit; I don't know the state of the art in what
Google is doing there).

> For the reftable format, I think we could store this easily by
> introducing more record types. Today we have 0 = deletion, 1 = update,
> and we could add 2 = FF update, 3 = non-FF update.
> 
> However, the textual reflog format doesn't easily allow for this.
> However, we might add a convention, eg. have the message start with
> 'FF' or 'NFF' depending on the nature of the update.
> 
> Does this make sense, and if yes is it worth proposing a change?
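
To make the quoted idea concrete, here is a minimal sketch of how a reader
of the textual reflog could use such a message prefix. It assumes the
proposed (and purely hypothetical) convention that the message starts with
the word 'FF' or 'NFF'; Git does not write these markers today, and the
names parse_reflog_line and reachable_via_ff are made up for illustration.
Lines are assumed to be in file order, oldest first.

```python
from typing import Iterable, NamedTuple


class ReflogEntry(NamedTuple):
    old: str
    new: str
    message: str


def parse_reflog_line(line: str) -> ReflogEntry:
    # Textual reflog line: "<old> <new> <ident> <timestamp> <tz>\t<message>"
    header, _, message = line.rstrip("\n").partition("\t")
    old, new, _ident_and_time = header.split(" ", 2)
    return ReflogEntry(old, new, message)


def reachable_via_ff(lines: Iterable[str], sha: str) -> bool:
    """Return True if `sha` was the ref value at some point and every
    update recorded after that is marked as a fast-forward.

    Assumption: the message starts with the word "FF" or "NFF".
    Unmarked entries are treated as non-FF, which is the conservative choice.
    """
    seen = False
    for line in lines:
        entry = parse_reflog_line(line)
        is_ff = entry.message.split(None, 1)[:1] == ["FF"]
        # The ref points at sha right after this update, or sha was already
        # known to be reachable and this update only fast-forwarded past it.
        seen = entry.new == sha or (is_ff and (seen or entry.old == sha))
    return seen
```

A False result only means the reflog cannot prove reachability; the caller
would then fall back to a normal ancestry check against the object graph.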

At GitHub we do something similar. We don't generally use reflogs much
at all, but we keep a custom "audit log": a single append-only file that
records every ref update in the repository. And its format just happens
to be one reflog entry per line, prefixed by the updated ref.

And there we do generally annotate the FF-ness of an update by stuffing
it into the free-form message field (in fact, we shove in a small JSON
object, so we record multiple fields like the pushing id, IP, etc).
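
As a rough illustration of that layout (not the actual GitHub format, which
is not spelled out here), the sketch below writes and reads one such line:
the ref name, then the usual reflog fields, then a tab and a compact JSON
object as the message. The function names and the JSON keys used in the
usage note ("ff", "ip") are assumptions made up for the example.

```python
import json
from typing import Any, Dict


def format_audit_line(ref: str, old: str, new: str, ident: str,
                      timestamp: int, tz: str, fields: Dict[str, Any]) -> str:
    # json.dumps escapes tabs and newlines inside strings, so the message
    # stays on one line and cannot be confused with the field separator.
    message = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return f"{ref} {old} {new} {ident} {timestamp} {tz}\t{message}"


def parse_audit_line(line: str) -> Dict[str, Any]:
    header, _, message = line.rstrip("\n").partition("\t")
    # Ref names cannot contain spaces, so the first three splits are safe;
    # the identity may contain spaces, so peel the time fields off the end.
    ref, old, new, rest = header.split(" ", 3)
    ident, timestamp, tz = rest.rsplit(" ", 2)
    return {
        "ref": ref,
        "old": old,
        "new": new,
        "ident": ident,
        "timestamp": int(timestamp),
        "tz": tz,
        "fields": json.loads(message),
    }
```

For example, format_audit_line("refs/heads/main", old, new, ident, ts, tz,
{"ff": False, "ip": "192.0.2.1"}) would record a force-push, and
parse_audit_line() on the result gives the same fields back.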

But the main goal there isn't performance (and in fact we don't
generally consult it for anything outside of debugging). The reason we
record FF-ness is for later debugging or analysis. We don't prune from
the audit log, and we don't consider it for reachability when we prune
objects (since otherwise you'd never be able to prune anything!). So the
objects sometimes aren't available later to compute FF-ness from, but we
still want to know if the user did a force-push, etc.

I don't think that really applies to regular reflogs, because they do
imply reachability (and they are not great for later analysis, because
we may selectively expire unreachable entries).

-Peff


