On 07/19/2012 11:33 PM, Jeff King wrote:
[...]
This cannot be done by simply leaving the reflog files in
place. The ref namespace does not allow D/F conflicts, so a
ref "foo" would block the creation of another ref "foo/bar",
and vice versa. This limitation is acceptable for two refs
to exist simultaneously, but should not have an impact if
one of the refs is deleted.
This is a great feature.
This patch moves reflog entries into a special "graveyard"
namespace, and appends a tilde (~) character, which is
not allowed in a valid ref name. This means that the deleted
reflogs of these refs:
refs/heads/a
refs/heads/a/b
refs/heads/a/b/c
will be stored in:
logs/graveyard/refs/heads/a~
logs/graveyard/refs/heads/a/b~
logs/graveyard/refs/heads/a/b/c~
Putting them in the graveyard namespace ensures they will
not conflict with live refs, and the tilde prevents D/F
conflicts within the graveyard namespace.
I agree with Junio that long-term, it would be nice to allow references
"foo" and "foo/bar" to exist simultaneously. To get there, we would
have to redesign the mapping between reference names and the filenames
used for the references and for the reflogs.
The easiest thing would be to mark files and directories differently;
something like
$GIT_DIR/{,logs/}refs/heads/a/b/c~
or
$GIT_DIR/{,logs/}refs/heads~/a~/b~/c
i.e., munging either directory or file names to strings that are illegal
in refnames such that it is unambiguous from the name whether a path is
a file or directory.
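
To make the second variant concrete, here is a rough sketch of the name
munging in plain C. The helper munge_ref_dirs() is hypothetical (nothing
like it exists in git), and leaving the top-level "refs" component
untouched is an assumption read off the example path above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git: write refname into out with a
 * '~' appended to every directory component, e.g.
 * "refs/heads/a/b/c" -> "refs/heads~/a~/b~/c".
 * Assumption: the leading "refs" component is left unmarked.
 * Returns 0 on success, -1 if out is too small.
 */
int munge_ref_dirs(const char *refname, char *out, size_t outlen)
{
	size_t n = 0;
	const char *p = refname;

	assert(refname && out);

	/* Copy the top-level "refs/" prefix unchanged, if present. */
	if (!strncmp(p, "refs/", 5)) {
		if (outlen < 6)
			return -1;
		memcpy(out, p, 5);
		n = 5;
		p += 5;
	}

	for (; *p; p++) {
		/* A '/' ends a directory component: mark it with '~'. */
		if (*p == '/') {
			if (n + 1 >= outlen)
				return -1;
			out[n++] = '~';
		}
		/* Keep one byte in reserve for the terminating NUL. */
		if (n + 1 >= outlen)
			return -1;
		out[n++] = *p;
	}
	if (n >= outlen)
		return -1;
	out[n] = '\0';
	return 0;
}
```

Only directory components gain the suffix, so the final component (the
file) keeps its plain name and a path can be classified from its name
alone.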
And *if* we did that, then we wouldn't need a separate "graveyard"
namespace, would we? The reflogs for dead references could live among
those for living references.
Therefore, I think it would be good to choose a convention now
for dead reflogs that is compatible with this hoped-for future.
The first convention, "logs/refs/heads/a/b/c~", is not usable because a
reflog for a dead reference with this name would conflict with a reflog
for a live reference "heads/a" or "heads/a/b" that uses the current
filename convention.
But the second convention, "logs/refs/heads~/a~/b~/c", cannot conflict
with current reflog files. And it would be a step towards allowing
"foo" and "foo/bar" at the same time. What do you think about using a
convention like this instead of the one that you proposed?
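
The difference between the two conventions can be checked mechanically.
The sketch below is only an illustration: df_conflict() is a made-up
helper that treats two paths as conflicting when one of them would have
to be a directory for the other to exist as a file.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if `dir` would have to be a directory for `path` to exist. */
static int is_dir_prefix(const char *dir, const char *path)
{
	size_t len = strlen(dir);

	return !strncmp(dir, path, len) && path[len] == '/';
}

/*
 * Hypothetical helper, not part of git: return 1 if the two paths
 * cannot both exist as plain files (a D/F conflict), 0 otherwise.
 */
int df_conflict(const char *a, const char *b)
{
	assert(a && b);
	return is_dir_prefix(a, b) || is_dir_prefix(b, a);
}
```

With this, a live reflog "logs/refs/heads/a" conflicts with the dead
reflog "logs/refs/heads/a/b/c~" of the first convention, but not with
"logs/refs/heads~/a~/b~/c" of the second.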
Another minor concern is the choice of trailing tilde in the file or
directory names. Given that emacs creates backup files by appending a
tilde to the filename, (1) it would be easy to inadvertently create such
files, which git might try to interpret as reflogs and (2) there might
be tools that innately "know" to skip such files in their processing.
ack-grep, a replacement for grep, is an example that springs to mind. I
know that I have written backup scripts that ignore files matching "*~",
and a garbage-removal script that removes files matching "*~". Probably
it is less precarious to name directories rather than files with
trailing tildes, but either one could be a surprise for sysadmins.
Other possibilities (according to git-check-ref-format(1)):
refs/.heads/.a/.b/c
refs/heads./a./b./c (problematic on some Windows filesystems?)
refs/heads../a../b../c
refs/heads~dir/a~dir/b~dir/c (or some other suffix)
refs/heads..a..b..c (not recommended because it flattens directory
hierarchy)
The implementation is fairly straightforward, but it's worth
noting a few things:
1. Updates to "logs/graveyard/refs/heads/foo~" happen
under the ref-lock for "refs/heads/foo". So deletion
still takes a single lock, and anyone touching the
reflog directly needs to reverse the transformation to
find the correct lockfile.
This should be documented in the code.
2. We append entries to the graveyard reflog rather than
simply renaming the file into place. This means that
if you create and delete a branch repeatedly, the
graveyard will contain the concatenation of all
iterations.
Good.
3. We do not resurrect dead entries when a new ref is
created with the same name. However, it would be
possible to build an "undelete" feature on top of this
if one was so inclined.
Nice prospect.
[...]
diff --git a/refs.c b/refs.c
index da74a2b..553de77 100644
--- a/refs.c
+++ b/refs.c
[...]
@@ -2552,3 +2553,63 @@ char *shorten_unambiguous_ref(const char *refname, int strict)
free(short_name);
return xstrdup(refname);
}
+
+char *refname_to_graveyard_reflog(const char *ref)
+{
+ return git_path("logs/graveyard/%s~", ref);
+}
+
+char *graveyard_reflog_to_refname(const char *log)
+{
+ static struct strbuf buf = STRBUF_INIT;
+
+ if (!prefixcmp(log, "graveyard/"))
+ log += 10;
+
+ strbuf_reset(&buf);
+ strbuf_addstr(&buf, log);
+ if (buf.len > 0 && buf.buf[buf.len-1] == '~')
+ strbuf_setlen(&buf, buf.len - 1);
+
+ return buf.buf;
+}
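Given the names of these two functions, I was surprised that they aren't
inverses of each other: refname_to_graveyard_reflog() returns a full
path from git_path() ("logs/graveyard/<ref>~" under $GIT_DIR), whereas
graveyard_reflog_to_refname() expects a name that starts at
"graveyard/", without the "logs/" prefix.
Function comments would be nice, too, especially for the latter, which
returns a pointer into a static buffer.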
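
For illustration, a pair that round-trips could look roughly like the
sketch below. It is plain C with caller-supplied buffers rather than
git's strbuf and git_path(), and the names refname_to_graveyard() and
graveyard_to_refname() are invented for this example. Having both
functions work on paths relative to $GIT_DIR/logs is my assumption, not
something the patch specifies.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define GRAVEYARD_PREFIX "graveyard/"

/*
 * Hypothetical: map a refname to its graveyard reflog path relative to
 * $GIT_DIR/logs, e.g. "refs/heads/a" -> "graveyard/refs/heads/a~".
 * Returns 0 on success, -1 if out is too small.
 */
int refname_to_graveyard(const char *ref, char *out, size_t outlen)
{
	int n;

	assert(ref && out);
	n = snprintf(out, outlen, GRAVEYARD_PREFIX "%s~", ref);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/*
 * Hypothetical exact inverse of refname_to_graveyard(): fails with -1
 * unless both the "graveyard/" prefix and the trailing '~' are present.
 */
int graveyard_to_refname(const char *log, char *out, size_t outlen)
{
	size_t plen = strlen(GRAVEYARD_PREFIX);
	size_t len;

	assert(log && out);
	if (strncmp(log, GRAVEYARD_PREFIX, plen))
		return -1;
	log += plen;
	len = strlen(log);
	/* Dropping '~' and adding the NUL needs exactly len bytes. */
	if (len == 0 || log[len - 1] != '~' || len > outlen)
		return -1;
	memcpy(out, log, len - 1);
	out[len - 1] = '\0';
	return 0;
}
```

Rejecting input that lacks the prefix or the tilde, instead of passing
it through unchanged, is what makes the second function a true inverse
of the first.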
Michael
--
Michael Haggerty
mhagger@xxxxxxxxxxxx
http://softwareswirl.blogspot.com/