On Mon, Apr 4, 2011 at 5:29 AM, Erik Faye-Lund <kusmabite@xxxxxxxxx> wrote: > > We perfer this style: > > if (c < '0' || c > '7') > return NULL; > > i.e a line-break and a tab between the if-statement and the conditional code. I am well aware, I just wanted to make the code a bit less lengthy due to the repetition. On Mon, Apr 4, 2011 at 11:55 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote: > Dan McGee <dpmcgee@xxxxxxxxx> writes: > >> We know our mode entry in our tree objects should be 5 or 6 characters >> long. This change both enforces this fact... > > I find the implementation later shown in the thread is cleaner, Totally agree; I should have tried to do it this way in the first place. However, compiling the fixed-length 0 to 5 loop does not produce fully-unrolled assembly for me with CFLAGS="-march=native -mtune=native -O2 -pipe -g" on x86_64. I see two copies of the loop only, and even worse is the (lack of) performance (each is the mode of 3 runs). Compilers are stupid apparently. Hand-unrolled: $ time ../git/git-log -- zzzzz_not_exist > /dev/null real 0m28.616s user 0m28.428s sys 0m0.177s Static counter loop: $ time ../git/git-log -- zzzzz_not_exist > /dev/null real 0m26.393s user 0m26.185s sys 0m0.203s Diff between the two down at the bottom. > but I'd > comment on the word "enforces" here. > > It is more like "versions of git we know how to read from writes mode bits > with 5 or 6 characters---if we see something else, either the tree object > is corrupt, or we are too old to read the new type of entry in the tree > object". OK, sounds like we have opposite thinking on this, interesting. get_mode() is not get_permissions()- it is the same as the stat() st_mode field, which means it has to (as of now) include at least one of the S_IFREG, S_IFDIR, or S_IFGITLINK flags in its value. All of S_ISREG, S_ISDIR, and S_ISGITLINK are used fsck_walk_tree(). So I'm not seeing a change in the tree storage format happening anytime soon as there is little way to preserve both forward and backward compatibility. Currently git will happily parse both a "1" in the tree stream, or a "165324132132464677321513252351"- it doesn't care. The problem is in the former case, a mode of 01 doesn't have any significance later on. In the latter case, we will be left-shifting our mode to hell and it becomes worthless. Do we disagree on clamping the lower limit at 5, the upper limit at 6, or both? > So returning NULL is fine and it tells the caller that we do not > understand the tree object. ÂThe caller says "corrupt tree file" when we > do so here, but this change needs to rephrase it. ÂIf we stopped because > we saw something other than ' ' after the run of octal digits, then we > know the tree is corrupt. ÂIf we saw a three octal digits 644 in the mode > field, terminated with ' ', maybe we are seeing a new kind of tree entry > generated from later versions of git. > > Ideally, I'd rather see error checking done even higher layer than > decode_tree_entry() for the "we are too old" case, though. ÂWe should > return mode 0644 in such a case, and let the caller suggest that the > version of git we are running might be too old, or the tree may have been > written by a broken system that tried to mimic git in its error message. > diff --git a/tree-walk.c b/tree-walk.c index 63901f8..63ec130 100644 --- a/tree-walk.c +++ b/tree-walk.c @@ -7,34 +7,21 @@ static const char *get_mode(const char *str, unsigned int *modep) { unsigned char c; - unsigned int mode = 0; + unsigned int mode = 0, i; /* - * Unroll what looks like a loop since the bounds are + * Allow the compiler to unroll the loop since the bounds are * well-known. There should be at least 5 and at most 6 * characters available in any valid mode, as '40000' is the * shortest while '160000' (S_IFGITLINK) is the longest. */ - /* char 1 */ - c = *str++; - if (c < '0' || c > '7') return NULL; - mode = (mode << 3) + (c - '0'); - /* char 2 */ - c = *str++; - if (c < '0' || c > '7') return NULL; - mode = (mode << 3) + (c - '0'); - /* char 3 */ - c = *str++; - if (c < '0' || c > '7') return NULL; - mode = (mode << 3) + (c - '0'); - /* char 4 */ - c = *str++; - if (c < '0' || c > '7') return NULL; - mode = (mode << 3) + (c - '0'); - /* char 5 */ - c = *str++; - if (c < '0' || c > '7') return NULL; - mode = (mode << 3) + (c - '0'); + /* chars 1-5, optional */ + for (i = 0; i < 5; i++) { + c = *str++; + if (c < '0' || c > '7') + return NULL; + mode = (mode << 3) + (c - '0'); + } /* char 6, optional */ if (*str != ' ') { c = *str++; -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html