Quint Guvernator <quintus.public@xxxxxxxxx> writes: >> The result after the conversion, however, still have the same magic >> numbers, but one less of them each. Doesn't it make it harder to >> later spot the patterns to come up with a better abstraction that >> does not rely on the magic number? > > It is _not_ my goal to make the code harder to maintain down the road. Good. > So, at this point, which hunks (if any) are worth patching? Will, I am not going through all the mechanical hits to memcmp() and judge each and every one if it is a good idea to convert. Anybody who does so in order to tell you "which hunks are worth patching" would end up being the one doing the real work, and at that point there is nothing left to be credited as your work anymore ;-) But as Peff said, there are good bits, like these ones just for a few examples. diff --git a/builtin/apply.c b/builtin/apply.c index a7e72d5..16c20af 100644 --- a/builtin/apply.c +++ b/builtin/apply.c @@ -1631,7 +1631,7 @@ static int parse_fragment(const char *line, unsigned long size, * l10n of "\ No newline..." is at least that long. */ case '\\': - if (len < 12 || memcmp(line, "\\ ", 2)) + if (len < 12 || !starts_with(line, "\\ ")) return -1; break; } @@ -1646,7 +1646,7 @@ static int parse_fragment(const char *line, unsigned long size, * it in the above loop because we hit oldlines == newlines == 0 * before seeing it. */ - if (12 < size && !memcmp(line, "\\ ", 2)) + if (12 < size && starts_with(line, "\\ ")) offset += linelen(line, size); patch->lines_added += added; These two are about "An incomplete line marker begins with a backslash and a SP" and there is no other significance in the constant 2 (like, "after we recognise the match, we start scanning the remainder of the line starting at the offset 2"). It is a tangent but I notice that these two parts (both in the original and in the version after patch) contradict what the incomplete last line marker should look like in a minor detail. On the other hand, I think this one from nearby is iffy. @@ -846,8 +846,8 @@ static int has_epoch_timestamp(const char *nameline) * YYYY-MM-DD hh:mm:ss must be from either 1969-12-31 * (west of GMT) or 1970-01-01 (east of GMT) */ - if ((zoneoffset < 0 && memcmp(timestamp, "1969-12-31", 10)) || - (0 <= zoneoffset && memcmp(timestamp, "1970-01-01", 10))) + if ((zoneoffset < 0 && !starts_with(timestamp, "1969-12-31")) || + (0 <= zoneoffset && !starts_with(timestamp, "1970-01-01"))) return 0; hourminute = (strtol(timestamp + 11, NULL, 10) * 60 + If one looks at the post-context of the hunk, one would realize that this codepath very intimately knows how the timestamp should look like at the byte-offset level, not just "YYYY-MM-DD ought to be 10-byte long", but "there should be two-digit hour part after skipping one byte after that YYYY-MM-DD part, followed by two-digit minute part after further skipping one byte", knowing that these details are guaranteed by the stamp_regexp[] pattern it earlier made sure the given string would match. I do not think it is a good idea to reduce this kind of precise format knowledge from this function in the first place (after all, this is parsing a header line in a traditional diff whose format is known to us), and even if it were our eventual goal to reduce the precise format knowledge, it would not help very much to get rid of constant 10 only from these two memcmp() calls, and that is why I think this hunk is iffy. Hope the above shows what kind of thinking is needed to judge each change from memcmp() to !starts_with(). Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html