Re: [PATCH v5 2/4] read-cache: add strcmp_offset function

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 4/6/2017 10:19 AM, SZEDER Gábor wrote:
Add strcmp_offset() function to also return the offset of the
first change.

Signed-off-by: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx>
---
 cache.h      |  1 +
 read-cache.c | 29 +++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/cache.h b/cache.h
index 80b6372..4d82490 100644
--- a/cache.h
+++ b/cache.h
@@ -574,6 +574,7 @@ extern int write_locked_index(struct index_state *, struct lock_file *lock, unsi
 extern int discard_index(struct index_state *);
 extern int unmerged_index(const struct index_state *);
 extern int verify_path(const char *path);
+extern int strcmp_offset(const char *s1_in, const char *s2_in, int *first_change);
 extern int index_dir_exists(struct index_state *istate, const char *name, int namelen);
 extern void adjust_dirname_case(struct index_state *istate, char *name);
 extern struct cache_entry *index_file_exists(struct index_state *istate, const char *name, int namelen, int igncase);
diff --git a/read-cache.c b/read-cache.c
index 9054369..b3fc77d 100644
--- a/read-cache.c
+++ b/read-cache.c

read-cache.c is probably not the best place for such a general string
utility function, though I'm not sure where its most apropriate place
would be.

Yeah, I looked and didn't see any place so I left it here
near it's first usage for now.


@@ -887,6 +887,35 @@ static int has_file_name(struct index_state *istate,
 	return retval;
 }

+
+/*
+ * Like strcmp(), but also return the offset of the first change.

This comment doesn't tell us what will happen with the offset
parameter if it is NULL or if the two strings are equal.  I don't
really care about the safety check for NULL: a callsite not interested
in the offset should just call strcmp() instead.  I'm fine either way.
However, setting it to 0 when the strings are equal doesn't seem quite
right, does it.

Good catch. I'll add a null check.

When the strings are equal, I'm not sure any one value is
better than another.  I could leave it undefined or set it
to the length.  That might be more useful since we have it on
hand and it might save the caller a strlen() later.


+ */
+int strcmp_offset(const char *s1_in, const char *s2_in, int *first_change)
+{
+	const unsigned char *s1 = (const unsigned char *)s1_in;
+	const unsigned char *s2 = (const unsigned char *)s2_in;
+	int diff = 0;
+	int k;
+
+	*first_change = 0;
+	for (k=0; s1[k]; k++)
+		if ((diff = (s1[k] - s2[k])))
+			goto found_it;
+	if (!s2[k])
+		return 0;
+	diff = -1;
+
+found_it:
+	*first_change = k;
+	if (diff > 0)
+		return 1;
+	else if (diff < 0)
+		return -1;
+	else
+		return 0;
+}

The implementation seems to me a bit long-winded, with more
conditional statements than necessary.  How about something like this
instead?  Much shorter, no goto and only half the number of
conditionals to reason about, leaves *first_change untouched when the
two strings are equal, and deals with it being NULL.

int strcmp_offset(const char *s1, const char *s2, int *first_change)
{
	int k;

	for (k = 0; s1[k] == s2[k]; k++)
		if (s1[k] == '\0')
			return 0;

	if (first_change)
		*first_change = k;
	return ((unsigned char *)s1)[k] - ((unsigned char *)s2)[k];
}


Let me give that a try.  Thanks!
Jeff




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]