> On Fri, 2016-02-19 at 09:54 +0700, Duy Nguyen wrote: >> On Fri, Feb 19, 2016 at 3:23 AM, David Turner < >> dturner@xxxxxxxxxxxxxxxx> wrote: >> > > > +static int read_per_worktree_ref(const char *submodule, const >> > > > char >> > > > *refname, >> > > > + struct MDB_val *val, int >> > > > *needs_free) >> > > >> > > From what I read, I suspect these _per_worktree functions will be >> > > identical for the next backend. Should we just hand over the job >> > > for >> > > files backend? For all entry points that may deal with per >> > > -worktree >> > > refs, e.g. lmdb_resolve_ref_unsafe, can we check ref_type() first >> > > thing, if it's per-worktree we call >> > > refs_be_files.resolve_ref_unsafe() >> > > instead? It could even be done at frontend level, >> > > e.g. refs.c:resolve_ref_unsafe(). >> > > >> > > Though I may be talking rubbish here because I don't know how >> > > whether >> > > it has anything to do with transactions. >> > >> > The reason I did it this way is that some ref chains cross backend >> > boundaries (e.g. HEAD -> refs/heads/master). But if we have other >> > backends later, we could generalize. >> >> Crossing backends should go through frontend again, imo. But I don't >> really know if it's efficient. > > It's pretty tricky to maintain state (e.g. count of symref redirects) > across that barrier. So I'm not sure how to do it cleanly. I notice files backend does pretty much the same thing. "files" backend looks more like two backends combined in one, one is files, the other is packed-refs. And it looks like we could solve it by providing a lower level api, read_raw_ref() or something, that retrieves the ref without any validation or link following. More on this later. >> > > I'm not sure I get this comment. D/F conflicts are no longer a >> > > thing >> > > for lmdb backend, right? >> > >> > I'm trying to avoid the lmdb backend creating a set of refs that >> > the >> > files backend can't handle. This would make collaboration with >> > other >> > versions of git more difficult. >> >> It already is. If you create refs "foo" and "FOO" on case sensitive >> file system and clone it on a case-insensitive one, you face the same >> problem. We may have an optional configuration knob to prevent >> incompatibilities with files backend, but I think that should be done >> (and enforced if necessary) outside backends. > > Sure, the current state isn't perfect, but why make it worse? I see it from a different angle. The current state isn't perfect, but we will be moving to a better future where "files" backend may eventually be deprecated. Why hold back? But this line of reasoning only works if we have a new backend capable of replacing "files" without regressions or introducing new dependency. Which is why I suggest a new backend [1] (or implement Shawn's RefTree if it's proven as good with small repos) I have no problem if you want to stay strictly compatible with "files" though. [1] http://thread.gmane.org/gmane.comp.version-control.git/285893/focus=286654 On Fri, Feb 19, 2016 at 05:49:46PM -0500, David Turner wrote: > > > > This code looks a lot like near the end of resolve_ref_1(). Maybe we > > could share the code in refs/backend-common.c or something and call > > here instead? > > Something like the following? > > commit aad6b84fd1869f6e1cf6ed15bcece0c2f6429e9d > Author: David Turner <dturner@xxxxxxxxxxxxxxxx> > Date: Thu Feb 18 17:09:29 2016 -0500 > > refs: break out some functions from resolve_ref_1 > > A bunch of resolve_ref_1 is not backend-specific, so we can > break it out into separate internal functions that other > backends can use. ... > > I'm not sure I like it, because it breaks out these weird tiny > functions that take a lot of arguments. But maybe it's worth it? What > do you think? OK how about we keep resolve_ref_1() whole and split real backend code out? Something like these three patches (only built, did not test). A bit ugly with continue_symlink, but it's just demonstration. commit ef46fcdc62ef89fd5260ca054cd1d98f9f2d7c2b Author: Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx> Date: Sat Feb 20 09:18:45 2016 +0700 refs/files: move ref I/O code out of resolve_refs_1() diff --git a/refs/files-backend.c b/refs/files-backend.c index 4bddfb3..f54f2ae 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1407,6 +1407,95 @@ static int resolve_missing_loose_ref(const char *refname, } } +static const char *continue_normal_ref = "read_ref returns a normal ref"; +static const char *continue_symlink = "read_ref returns a symlink"; + +/* + * Read a ref from backend. Returning any values except + * continue_normal_ref or continue_symlink ends resolve_ref_1() + * execution. If the return value is not NULL, sha1 and flags must be + * updated correctly. except REF_ISBROKEN which is set by + * resolve_ref_1(). + * + * If continue_* is returned, sb_contents must contain the ref data. + */ +static const char *parse_ref(const char *refname, + int resolve_flags, + unsigned char *sha1, + int *flags, + struct strbuf *sb_path, + struct strbuf *sb_contents) +{ + const char *path; + struct stat st; + int fd; + + strbuf_reset(sb_path); + strbuf_git_path(sb_path, "%s", refname); + path = sb_path->buf; + + /* + * We might have to loop back here to avoid a race + * condition: first we lstat() the file, then we try + * to read it as a link or as a file. But if somebody + * changes the type of the file (file <-> directory + * <-> symlink) between the lstat() and reading, then + * we don't want to report that as an error but rather + * try again starting with the lstat(). + */ +stat_ref: + if (lstat(path, &st) < 0) { + if (errno != ENOENT) + return NULL; + if (resolve_missing_loose_ref(refname, resolve_flags, + sha1, flags)) + return NULL; + return refname; + } + + /* Follow "normalized" - ie "refs/.." symlinks by hand */ + if (S_ISLNK(st.st_mode)) { + strbuf_reset(sb_contents); + if (strbuf_readlink(sb_contents, path, 0) < 0) { + if (errno == ENOENT || errno == EINVAL) + /* inconsistent with lstat; retry */ + goto stat_ref; + else + return NULL; + } + return continue_symlink; + } + + /* Is it a directory? */ + if (S_ISDIR(st.st_mode)) { + errno = EISDIR; + return NULL; + } + + /* + * Anything else, just open it and try to use it as + * a ref + */ + fd = open(path, O_RDONLY); + if (fd < 0) { + if (errno == ENOENT) + /* inconsistent with lstat; retry */ + goto stat_ref; + else + return NULL; + } + strbuf_reset(sb_contents); + if (strbuf_read(sb_contents, fd, 256) < 0) { + int save_errno = errno; + close(fd); + errno = save_errno; + return NULL; + } + close(fd); + + return continue_normal_ref; +} + /* This function needs to return a meaningful errno on failure */ static const char *resolve_ref_1(const char *refname, int resolve_flags, @@ -1442,54 +1531,18 @@ static const char *resolve_ref_1(const char *refname, bad_name = 1; } for (;;) { - const char *path; - struct stat st; + const char *ret; char *buf; - int fd; if (--depth < 0) { errno = ELOOP; return NULL; } - strbuf_reset(sb_path); - strbuf_git_path(sb_path, "%s", refname); - path = sb_path->buf; + ret = parse_ref(refname, resolve_flags, sha1, + flags, sb_path, sb_contents); - /* - * We might have to loop back here to avoid a race - * condition: first we lstat() the file, then we try - * to read it as a link or as a file. But if somebody - * changes the type of the file (file <-> directory - * <-> symlink) between the lstat() and reading, then - * we don't want to report that as an error but rather - * try again starting with the lstat(). - */ - stat_ref: - if (lstat(path, &st) < 0) { - if (errno != ENOENT) - return NULL; - if (resolve_missing_loose_ref(refname, resolve_flags, - sha1, flags)) - return NULL; - if (bad_name) { - hashclr(sha1); - if (flags) - *flags |= REF_ISBROKEN; - } - return refname; - } - - /* Follow "normalized" - ie "refs/.." symlinks by hand */ - if (S_ISLNK(st.st_mode)) { - strbuf_reset(sb_contents); - if (strbuf_readlink(sb_contents, path, 0) < 0) { - if (errno == ENOENT || errno == EINVAL) - /* inconsistent with lstat; retry */ - goto stat_ref; - else - return NULL; - } + if (ret == continue_symlink) { if (starts_with(sb_contents->buf, "refs/") && !check_refname_format(sb_contents->buf, 0)) { strbuf_swap(sb_refname, sb_contents); @@ -1502,35 +1555,10 @@ static const char *resolve_ref_1(const char *refname, } continue; } - } - - /* Is it a directory? */ - if (S_ISDIR(st.st_mode)) { - errno = EISDIR; - return NULL; - } - - /* - * Anything else, just open it and try to use it as - * a ref - */ - fd = open(path, O_RDONLY); - if (fd < 0) { - if (errno == ENOENT) - /* inconsistent with lstat; retry */ - goto stat_ref; - else - return NULL; - } - strbuf_reset(sb_contents); - if (strbuf_read(sb_contents, fd, 256) < 0) { - int save_errno = errno; - close(fd); - errno = save_errno; - return NULL; - } - close(fd); - strbuf_rtrim(sb_contents); + } else if (ret == refname) + break; + else if (ret != continue_normal_ref) + return ret; /* * Is it a symbolic ref? @@ -1547,12 +1575,7 @@ static const char *resolve_ref_1(const char *refname, errno = EINVAL; return NULL; } - if (bad_name) { - hashclr(sha1); - if (flags) - *flags |= REF_ISBROKEN; - } - return refname; + break; } if (flags) *flags |= REF_ISSYMREF; @@ -1578,6 +1601,13 @@ static const char *resolve_ref_1(const char *refname, bad_name = 1; } } + + if (bad_name) { + hashclr(sha1); + if (flags) + *flags |= REF_ISBROKEN; + } + return refname; } static const char *files_resolve_ref_unsafe(const char *refname, After this resolve_ref_1() is backend independent. So we can make it take parse_ref() as a function pointer instead. commit 50d96b6f79b30b5ba17fa00ec3ee42845546a123 Author: Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx> Date: Sat Feb 20 09:22:03 2016 +0700 refs/files-backend.c: let resolve_refs_1() accept parse_ref as callback diff --git a/refs/files-backend.c b/refs/files-backend.c index f54f2ae..9b4de9f 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1424,7 +1424,8 @@ static const char *parse_ref(const char *refname, unsigned char *sha1, int *flags, struct strbuf *sb_path, - struct strbuf *sb_contents) + struct strbuf *sb_contents, + void *cb_data) { const char *path; struct stat st; @@ -1497,13 +1498,21 @@ stat_ref: } /* This function needs to return a meaningful errno on failure */ -static const char *resolve_ref_1(const char *refname, - int resolve_flags, - unsigned char *sha1, - int *flags, - struct strbuf *sb_refname, - struct strbuf *sb_path, - struct strbuf *sb_contents) +const char *resolve_ref_1(const char *refname, + int resolve_flags, + unsigned char *sha1, + int *flags, + struct strbuf *sb_refname, + struct strbuf *sb_path, + struct strbuf *sb_contents, + const char *(*parse_ref)(const char *refname, + int resolve_flags, + unsigned char *sha1, + int *flags, + struct strbuf *sb_path, + struct strbuf *sb_contents, + void *cb_data), + void *cb_data) { int depth = MAXDEPTH; int bad_name = 0; @@ -1540,7 +1549,8 @@ static const char *resolve_ref_1(const char *refname, } ret = parse_ref(refname, resolve_flags, sha1, - flags, sb_path, sb_contents); + flags, sb_path, sb_contents, + cb_data); if (ret == continue_symlink) { if (starts_with(sb_contents->buf, "refs/") && @@ -1621,7 +1631,8 @@ static const char *files_resolve_ref_unsafe(const char *refname, const char *ret; ret = resolve_ref_1(refname, resolve_flags, sha1, flags, - &sb_refname, &sb_path, &sb_contents); + &sb_refname, &sb_path, &sb_contents, + parse_ref, NULL); strbuf_release(&sb_path); strbuf_release(&sb_contents); return ret; And finally we can make lmdb use resolve_ref_1(). lmdb-specific code is in the new retrieve_ref() function. commit 62a5df3117c0f825bc26fd09dda29e713f94d743 (HEAD -> lmdb) Author: Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx> Date: Sat Feb 20 09:33:01 2016 +0700 refs/lmdb-backend: use resolve_ref_1() diff --git a/refs/files-backend.c b/refs/files-backend.c index 9b4de9f..44b7136 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1407,7 +1407,7 @@ static int resolve_missing_loose_ref(const char *refname, } } -static const char *continue_normal_ref = "read_ref returns a normal ref"; +const char *continue_normal_ref = "read_ref returns a normal ref"; static const char *continue_symlink = "read_ref returns a symlink"; /* diff --git a/refs/lmdb-backend.c b/refs/lmdb-backend.c index 6c0d7fb..7f169bd 100644 --- a/refs/lmdb-backend.c +++ b/refs/lmdb-backend.c @@ -544,74 +544,64 @@ done: return ret; } +extern const char *continue_normal_ref; + +static const char *retrieve_ref(const char *refname, + int resolve_flags, + unsigned char *sha1, + int *flags, + struct strbuf *sb_path, + struct strbuf *sb_contents, + void *cb_data) +{ + struct lmdb_transaction *transaction = cb_data; + MDB_val key, val; + int needs_free; /* dont care, leak */ + + key.mv_data = (void *)refname; + key.mv_size = strlen(refname) + 1; + + val.mv_data = NULL; + val.mv_size = 0; + + if (mdb_get_or_die(transaction, &key, &val, &needs_free)) { + struct strbuf err = STRBUF_INIT; + + if (resolve_flags & RESOLVE_REF_READING) + return NULL; + + if (verify_refname_available_txn(transaction, + refname, NULL, NULL, &err)) { + error("%s", err.buf); + strbuf_release(&err); + return NULL; + } + + hashclr(sha1); + return refname; + } + + strbuf_reset(sb_contents); + strbuf_add(sb_contents, val.mv_data, val.mv_size); + return continue_normal_ref; +} + static const char *resolve_ref_unsafe_txn(struct lmdb_transaction *transaction, const char *refname, int resolve_flags, unsigned char *sha1, int *flags) { - int bad_name = 0; - char *ref_data; - struct MDB_val key, val; - struct strbuf err = STRBUF_INIT; - int needs_free = 0; + static struct strbuf sb_refname = STRBUF_INIT; + struct strbuf sb_contents = STRBUF_INIT; + struct strbuf sb_path = STRBUF_INIT; const char *ret; - val.mv_size = 0; - val.mv_data = NULL; - - if (flags) - *flags = 0; - - if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL)) { - if (flags) - *flags |= REF_BAD_NAME; - - if (!(resolve_flags & RESOLVE_REF_ALLOW_BAD_NAME) || - !refname_is_safe(refname)) { - errno = EINVAL; - return NULL; - } - /* - * dwim_ref() uses REF_ISBROKEN to distinguish between - * missing refs and refs that were present but invalid, - * to complain about the latter to stderr. - * - * We don't know whether the ref exists, so don't set - * REF_ISBROKEN yet. - */ - bad_name = 1; - } - - key.mv_data = (void *)refname; - key.mv_size = strlen(refname) + 1; - if (mdb_get_or_die(transaction, &key, &val, &needs_free)) { - if (bad_name) { - hashclr(sha1); - if (flags) - *flags |= REF_ISBROKEN; - } - - if (resolve_flags & RESOLVE_REF_READING) - return NULL; - - if (verify_refname_available_txn(transaction, refname, NULL, NULL, &err)) { - error("%s", err.buf); - strbuf_release(&err); - return NULL; - } - - hashclr(sha1); - return refname; - } - - ref_data = val.mv_data; - assert(ref_data[val.mv_size - 1] == 0); - - ret = parse_ref_data(transaction, refname, ref_data, sha1, - resolve_flags, flags, bad_name); - if (needs_free) - free(ref_data); + ret = resolve_ref_1(refname, resolve_flags, sha1, flags, + &sb_refname, &sb_path, &sb_contents, + retrieve_ref, transaction); + strbuf_release(&sb_path); + strbuf_release(&sb_contents); return ret; } lmdb-backend.c:retrieve_ref(), files-backend.c:parse_ref() can be made part of ref api that, given a ref name, returns the ref raw data and type. The frontend can decide what backend callback to use based on refname, so retrieve_ref() in the end does not have to call read_per_worktree_ref() internally anymore. Hmm? -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html