Re: [PATCH 1/3] retain reflogs for deleted refs

On 07/19/2012 11:33 PM, Jeff King wrote:
[...]
This cannot be done by simply leaving the reflog files in
place. The ref namespace does not allow D/F conflicts, so a
ref "foo" would block the creation of another ref "foo/bar",
and vice versa. This limitation is acceptable for two refs
to exist simultaneously, but should not have an impact if
one of the refs is deleted.

This is a great feature.

This patch moves reflog entries into a special "graveyard"
namespace, and appends a tilde (~) character, which is
not allowed in a valid ref name. This means that the deleted
reflogs of these refs:

    refs/heads/a
    refs/heads/a/b
    refs/heads/a/b/c

will be stored in:

    logs/graveyard/refs/heads/a~
    logs/graveyard/refs/heads/a/b~
    logs/graveyard/refs/heads/a/b/c~

Putting them in the graveyard namespace ensures they will
not conflict with live refs, and the tilde prevents D/F
conflicts within the graveyard namespace.

I agree with Junio that long-term, it would be nice to allow references "foo" and "foo/bar" to exist simultaneously. To get there, we would have to redesign the mapping between reference names and the filenames used for the references and for the reflogs.

The easiest thing would be to mark files and directories differently; something like

    $GIT_DIR/{,logs/}refs/heads/a/b/c~

or

    $GIT_DIR/{,logs/}refs/heads~/a~/b~/c

i.e., munging either directory or file names to strings that are illegal in refnames such that it is unambiguous from the name whether a path is a file or directory.
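The second variant can be sketched as a small string transformation. This is only an illustration: `refname_to_munged_path` is a hypothetical helper, not anything in git, and leaving the leading "refs" component unmarked is an assumption read off the example path above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): rewrite a refname such as
 * "refs/heads/a/b/c" into "refs/heads~/a~/b~/c". The first component
 * ("refs") and the final component (the file) are copied unchanged;
 * every directory component in between gets a trailing '~'.
 *
 * Returns 0 on success, -1 if the output buffer is too small.
 */
int refname_to_munged_path(const char *refname, char *out, size_t outlen)
{
	size_t n = 0;
	int seen_slash = 0;
	const char *p;

	if (outlen == 0)
		return -1;

	for (p = refname; *p; p++) {
		if (*p == '/') {
			/* Mark the directory we are leaving, except the first one. */
			if (seen_slash) {
				if (n + 1 >= outlen)
					return -1;
				out[n++] = '~';
			}
			seen_slash = 1;
		}
		/* Always keep one byte free for the terminating NUL. */
		if (n + 1 >= outlen)
			return -1;
		out[n++] = *p;
	}
	out[n] = '\0';
	return 0;
}
```

Under this scheme "refs/heads/a" maps to the file "refs/heads~/a" and "refs/heads/a/b" to "refs/heads~/a~/b", so the file "a" and the directory "a~" no longer compete for the same name.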

And *if* we did that, then we wouldn't need a separate "graveyard" namespace, would we? The reflogs for dead references could live among those for living references.

Therefore, I think it would be good if we would choose a convention now for dead reflogs that is compatible with this hoped-for future.

The first convention, "logs/refs/heads/a/b/c~", is not usable: a reflog for a dead reference with this name needs "a" and "a/b" to be directories, so it would conflict with a reflog for a live reference "heads/a" or "heads/a/b" that uses the current filename convention.

But the second convention, "logs/refs/heads~/a~/b~/c", cannot conflict with current reflog files. And it would be a step towards allowing "foo" and "foo/bar" at the same time. What do you think about using a convention like this instead of the one that you proposed?


Another minor concern is the choice of a trailing tilde in file or directory names. Emacs creates backup files by appending a tilde to the filename, so (1) it would be easy to create such files inadvertently, and git might then try to interpret them as reflogs; and (2) some tools innately "know" to skip such files in their processing. ack-grep, a replacement for grep, is an example that springs to mind. I know that I have written backup scripts that ignore files matching "*~", and a garbage-removal script that removes files matching "*~". It is probably less precarious to put the trailing tilde on directory names than on file names, but either one could be a surprise for sysadmins.
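To make the concern concrete, here is a rough sketch of the kind of test such a script might apply. `looks_like_editor_backup` is a hypothetical name, and it assumes the tool matches "*~" against the final path component only, using POSIX fnmatch().

```c
#include <fnmatch.h>
#include <string.h>

/*
 * Hypothetical check, as a backup or cleanup script might do it:
 * does the final path component match the glob "*~"?
 * Returns 1 if it matches, 0 otherwise.
 */
int looks_like_editor_backup(const char *path)
{
	const char *base = strrchr(path, '/');

	/* Look only at the last component, like a per-file ignore rule. */
	base = base ? base + 1 : path;
	return fnmatch("*~", base, 0) == 0;
}
```

With this test, "logs/graveyard/refs/heads/a~" is treated as a backup file, while "logs/refs/heads~/a~/b~/c" is not, because only its directories carry the tilde. A tool that also applies the pattern to directory names would still be affected.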

Other possibilities (according to git-check-ref-format(1)):

    refs/.heads/.a/.b/c
    refs/heads./a./b./c (problematic on some Windows filesystems?)
    refs/heads../a../b../c
    refs/heads~dir/a~dir/b~dir/c (or some other suffix)
    refs/heads..a..b..c (not recommended because it flattens directory hierarchy)

The implementation is fairly straightforward, but it's worth
noting a few things:

   1. Updates to "logs/graveyard/refs/heads/foo~" happen
      under the ref-lock for "refs/heads/foo". So deletion
      still takes a single lock, and anyone touching the
      reflog directly needs to reverse the transformation to
      find the correct lockfile.

This should be documented in the code.

   2. We append entries to the graveyard reflog rather than
      simply renaming the file into place. This means that
      if you create and delete a branch repeatedly, the
      graveyard will contain the concatenation of all
      iterations.

Good.

   3. We do not resurrect dead entries when a new ref is
      created with the same name. However, it would be
      possible to build an "undelete" feature on top of this
      if one was so inclined.

Nice prospect.

[...]

diff --git a/refs.c b/refs.c
index da74a2b..553de77 100644
--- a/refs.c
+++ b/refs.c
[...]
@@ -2552,3 +2553,63 @@ char *shorten_unambiguous_ref(const char *refname, int strict)
  	free(short_name);
  	return xstrdup(refname);
  }
+
+char *refname_to_graveyard_reflog(const char *ref)
+{
+	return git_path("logs/graveyard/%s~", ref);
+}
+
+char *graveyard_reflog_to_refname(const char *log)
+{
+	static struct strbuf buf = STRBUF_INIT;
+
+	if (!prefixcmp(log, "graveyard/"))
+		log += 10;
+
+	strbuf_reset(&buf);
+	strbuf_addstr(&buf, log);
+	if (buf.len > 0 && buf.buf[buf.len-1] == '~')
+		strbuf_setlen(&buf, buf.len - 1);
+
+	return buf.buf;
+}

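Given the names of these two functions, I was surprised that they aren't inverses of each other: refname_to_graveyard_reflog() returns a full path from git_path() that includes "logs/graveyard/", whereas graveyard_reflog_to_refname() strips only a leading "graveyard/" and the trailing tilde.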

Function comments would be nice, too, especially for the latter.
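The mismatch may be easier to see in a standalone sketch. The two functions below are simplified stand-ins, not the patch's code: git_path() and strbuf are replaced by snprintf() and plain buffers, the `git_dir` parameter is an assumption added for illustration, and the `sketch_` names are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Stand-in for refname_to_graveyard_reflog(): builds
 * "<git_dir>/logs/graveyard/<ref>~". Returns 0 on success, -1 if
 * the result does not fit in the output buffer.
 */
int sketch_refname_to_graveyard_reflog(const char *git_dir, const char *ref,
				       char *out, size_t outlen)
{
	int n = snprintf(out, outlen, "%s/logs/graveyard/%s~", git_dir, ref);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/*
 * Stand-in for graveyard_reflog_to_refname(), same logic as the patch:
 * drop a leading "graveyard/" if present, then drop one trailing '~'.
 * Returns 0 on success, -1 if the result does not fit.
 */
int sketch_graveyard_reflog_to_refname(const char *log, char *out,
				       size_t outlen)
{
	size_t len;

	if (!strncmp(log, "graveyard/", 10))
		log += 10;

	len = strlen(log);
	if (len > 0 && log[len - 1] == '~')
		len--;

	if (len >= outlen)
		return -1;
	memcpy(out, log, len);
	out[len] = '\0';
	return 0;
}
```

Feeding the output of the first function into the second only removes the tilde, because the path starts with the git directory and "logs/" rather than "graveyard/". The second function recovers "refs/heads/foo" only when given a name relative to the logs directory, such as "graveyard/refs/heads/foo~".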

Michael

--
Michael Haggerty
mhagger@xxxxxxxxxxxx
http://softwareswirl.blogspot.com/