Re: [RFC PATCHv6 1/2] repack: rewrite the shell script in C

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 08/21/2013 10:56 PM, Junio C Hamano wrote:
> Stefan Beller <stefanbeller@xxxxxxxxxxxxxx> writes:
> 
>> The motivation of this patch is to get closer to a goal of being
>> able to have a core subset of git functionality built in to git.
>> That would mean
>>
>>  * people on Windows could get a copy of at least the core parts
>>    of Git without having to install a Unix-style shell
>>
>>  * people deploying to servers don't have to rewrite the #! line
>>    or worry about the PATH and quality of installed POSIX
>>    utilities, if they are only using the built-in part written
>>    in C
> 
> I am not sure what is meant by the latter.  Rewriting #! is part of
> any scripted Porcelain done by the top-level Makefile, and I do not
> think we have seen any problem reports on it.
> 
> As to "quality of ... utilities", I think the real issue some people
> in the thread had was not about "deploying to servers" but about
> installing in a minimalistic chrooted environment where standard
> tools may be lacking.
> 
>> diff --git a/builtin/repack.c b/builtin/repack.c
>> new file mode 100644
>> index 0000000..fb050c0
>> --- /dev/null
>> +++ b/builtin/repack.c
>> @@ -0,0 +1,376 @@
>> +/*
>> + * The shell version was written by Linus Torvalds (2005) and many others.
>> + * This is a translation into C by Stefan Beller (2013)
>> + */
> 
> I am not sure if we want to record "ownership" in the code like
> this; it will go stale over time.

I'll remove it. Initially I put it there as I found similar 
comments in other files as well.


>> +static int delta_base_offset = 1;
>> +char *packdir;
> 
> Does this have to be global?

We could pass it to all the functions, making it not global.
I'd be ok with that for the functions get_pack_filenames 
and remove_redundant_pack, but we also need to know
packdir in remove_temporary_files which is called from
the signal handler remove_pack_on_signal.

As the path is pretty obvious (get_object_directory() + "/pack"),
we could however also construct it again in the signal handler.


> So in summary:
> 
> 	dir = opendir(packdir);
>         if (!dir)
> 		return;
> 
> 	strbuf_addf(&buf, "%s-", packtmp);

packtmp is not yet a global variable, but could be passed to 
to this function. Currently we're reconstructing it here.

> 
>         /* Point at the slash at the end of ".../objects/pack/" */
> 	dirlen = strlen(packdir) + 1;
>         /* Point at the dash at the end of ".../.tmp-%d-pack-" */
>         prefixlen = buf.len - dirlen;
> 
> You would need to move the initialization of packdir and packtmp
> before sigchain_push() in cmd_repack() if you were to do this.

Ah ok, I'll do so.

>> +
>> +		if (!file_exists(mkpath("%s/%s.keep", packdir, fname)))
>> +			string_list_append_nodup(fname_list, fname);
> 
> mental note: this is getting names of non-kept packs, not all packs.

I should document that. ;)


>> +	while (strbuf_getline(&line, out, '\n') != EOF) {
>> +		if (line.len != 40)
>> +			die("repack: Expecting 40 character sha1 lines only from pack-objects.");
>> +		strbuf_addstr(&line, "");
> 
> What is this addstr() about?

According to the documentation of strbufs, we cannot assume to have sane 
strings, but anything. Adding an empty string however will make sure to
add a NUL-terminated string to the buffer, no?

In a previous roll of this patch, which operated on char* line,
there was just line[40] = '\0'; // replacing '\n' by '\0'
to have it sane in the string list.


> 
>> +		string_list_append(&names, line.buf);
>> +		count_packs++;
> 
> It probably is more in line with our naming convention to call this
> nr_packs, num_packs, etc.  "count_packs" sounds more like a boolean
> that instructs the code to either count or not bother counting,
> which this thing is not.

This is something subtle, but important to know. Thanks, will be fixed in
the reroll.


>> +
>> +			if (rename(fname, fname_old)) {
>> +				failed = 1;
>> +				break;
> 
> "break"-ing from here leaks fname_old.  As the only out-of-line call
> file_exists() is just a thin wrapper around lstat(), I think it is
> fine not to pathdup the fname_old here.

fixed

I'd really appreciate, if there was documentation on these functions.
(When is mkpath safe? What is better in which situation: mkpath or strbufs?)
Maybe I could start doing it (but only those functions I used so far,
there are many more in cache.h)

> 
>> +			}
>> +			string_list_append_nodup(&rollback, fname);
>> +			free(fname);
> 
> This looks bad, doesn't it?  append_nodup() lets &rollback string
> list to take the ownership of the piece of memory pointed at by
> fname, but then you free it here, no?
> 
> If you initialize &rollback with INIT_NODUP, you would not have to
> call append_nodup().

Removed the free.
Having rollback initialized with NODUP and then not explicitely 
using append_nodup() makes me feel unhappy, because now you need
to check different places to make sure there is no leaking memory,
(you need to know the list is NODUP). I changed it nevertheless,
maybe I feel enlightened later on. ;)

As Matthieu proposed, I also set
CFLAGS += -Wdeclaration-after-statement in config.mak now. Hopefully
I don't screw up again now.

Thanks,
Stefan

--8<--
From 79945f5ae45f08fa2dbabfa1f6b7cd0b344ec0b3 Mon Sep 17 00:00:00 2001
From: Stefan Beller <stefanbeller@xxxxxxxxxxxxxx>
Date: Thu, 22 Aug 2013 00:13:35 +0200
Subject: [PATCH] Suggestions by Junio

---
 builtin/repack.c | 68 ++++++++++++++++++++++++++------------------------------
 1 file changed, 31 insertions(+), 37 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index 1f13e0d..bb90f07 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -1,8 +1,3 @@
-/*
- * The shell version was written by Linus Torvalds (2005) and many others.
- * This is a translation into C by Stefan Beller (2013)
- */
-
 #include "builtin.h"
 #include "cache.h"
 #include "dir.h"
@@ -13,9 +8,8 @@
 #include "string-list.h"
 #include "argv-array.h"
 
-/* enabled by default since 22c79eab (2008-06-25) */
 static int delta_base_offset = 1;
-char *packdir;
+char *packdir, *packtmp;
 
 static const char *const git_repack_usage[] = {
 	N_("git repack [options]"),
@@ -41,18 +35,16 @@ static void remove_temporary_files(void)
 	DIR *dir;
 	struct dirent *e;
 
-	/* .git/objects/pack */
-	strbuf_addstr(&buf, get_object_directory());
-	strbuf_addstr(&buf, "/pack");
-	dir = opendir(buf.buf);
-	if (!dir) {
-		strbuf_release(&buf);
+	dir = opendir(packdir);
+	if (!dir)
 		return;
-	}
 
-	/* .git/objects/pack/.tmp-$$-pack-* */
+	strbuf_addstr(&buf, packdir);
+
+	/* dirlen holds the length of the path before the file name */
 	dirlen = buf.len + 1;
-	strbuf_addf(&buf, "/.tmp-%d-pack-", (int)getpid());
+	strbuf_addf(&buf, "%s", packtmp);
+	/* prefixlen holds the length of the prefix */
 	prefixlen = buf.len - dirlen;
 
 	while ((e = readdir(dir))) {
@@ -73,11 +65,16 @@ static void remove_pack_on_signal(int signo)
 	raise(signo);
 }
 
+/*
+ * Adds all packs hex strings to the fname list, which do not
+ * have a corresponding .keep file.
+ */
 static void get_pack_filenames(struct string_list *fname_list)
 {
 	DIR *dir;
 	struct dirent *e;
 	char *fname;
+	size_t len;
 
 	if (!(dir = opendir(packdir)))
 		return;
@@ -86,7 +83,7 @@ static void get_pack_filenames(struct string_list *fname_list)
 		if (suffixcmp(e->d_name, ".pack"))
 			continue;
 
-		size_t len = strlen(e->d_name) - strlen(".pack");
+		len = strlen(e->d_name) - strlen(".pack");
 		fname = xmemdupz(e->d_name, len);
 
 		if (!file_exists(mkpath("%s/%s.keep", packdir, fname)))
@@ -95,14 +92,14 @@ static void get_pack_filenames(struct string_list *fname_list)
 	closedir(dir);
 }
 
-static void remove_redundant_pack(const char *path, const char *sha1)
+static void remove_redundant_pack(const char *path_prefix, const char *hex)
 {
 	const char *exts[] = {".pack", ".idx", ".keep"};
 	int i;
 	struct strbuf buf = STRBUF_INIT;
 	size_t plen;
 
-	strbuf_addf(&buf, "%s/%s", path, sha1);
+	strbuf_addf(&buf, "%s/%s", path_prefix, hex);
 	plen = buf.len;
 
 	for (i = 0; i < ARRAY_SIZE(exts); i++) {
@@ -115,15 +112,14 @@ static void remove_redundant_pack(const char *path, const char *sha1)
 int cmd_repack(int argc, const char **argv, const char *prefix)
 {
 	const char *exts[2] = {".idx", ".pack"};
-	char *packtmp;
 	struct child_process cmd;
 	struct string_list_item *item;
 	struct argv_array cmd_args = ARGV_ARRAY_INIT;
 	struct string_list names = STRING_LIST_INIT_DUP;
-	struct string_list rollback = STRING_LIST_INIT_DUP;
+	struct string_list rollback = STRING_LIST_INIT_NODUP;
 	struct string_list existing_packs = STRING_LIST_INIT_DUP;
 	struct strbuf line = STRBUF_INIT;
-	int count_packs, ext, ret;
+	int nr_packs, ext, ret, failed;
 	FILE *out;
 
 	/* variables to be filled by option parsing */
@@ -173,11 +169,11 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, builtin_repack_options,
 				git_repack_usage, 0);
 
-	sigchain_push_common(remove_pack_on_signal);
-
 	packdir = mkpathdup("%s/pack", get_object_directory());
 	packtmp = mkpathdup("%s/.tmp-%d-pack", packdir, (int)getpid());
 
+	sigchain_push_common(remove_pack_on_signal);
+
 	argv_array_push(&cmd_args, "pack-objects");
 	argv_array_push(&cmd_args, "--keep-true-parents");
 	argv_array_push(&cmd_args, "--honor-pack-keep");
@@ -233,14 +229,14 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	if (ret)
 		return ret;
 
-	count_packs = 0;
+	nr_packs = 0;
 	out = xfdopen(cmd.out, "r");
 	while (strbuf_getline(&line, out, '\n') != EOF) {
 		if (line.len != 40)
 			die("repack: Expecting 40 character sha1 lines only from pack-objects.");
 		strbuf_addstr(&line, "");
 		string_list_append(&names, line.buf);
-		count_packs++;
+		nr_packs++;
 	}
 	fclose(out);
 	ret = finish_command(&cmd);
@@ -248,10 +244,10 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		return ret;
 	argv_array_clear(&cmd_args);
 
-	if (!count_packs && !quiet)
+	if (!nr_packs && !quiet)
 		printf("Nothing new to pack.\n");
 
-	int failed = 0;
+	failed = 0;
 	for_each_string_list_item(item, &names) {
 		for (ext = 0; ext < 2; ext++) {
 			char *fname, *fname_old;
@@ -262,7 +258,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 				continue;
 			}
 
-			fname_old = mkpathdup("%s/old-%s%s", packdir,
+			fname_old = mkpath("%s/old-%s%s", packdir,
 						item->string, exts[ext]);
 			if (file_exists(fname_old))
 				unlink(fname_old);
@@ -271,15 +267,13 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 				failed = 1;
 				break;
 			}
-			string_list_append_nodup(&rollback, fname);
-			free(fname);
-			free(fname_old);
+			string_list_append(&rollback, fname);
 		}
 		if (failed)
 			break;
 	}
 	if (failed) {
-		struct string_list rollback_failure;
+		struct string_list rollback_failure = STRING_LIST_INIT_DUP;
 		for_each_string_list_item(item, &rollback) {
 			char *fname, *fname_old;
 			fname = mkpathdup("%s/%s", packdir, item->string);
@@ -289,7 +283,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 			free(fname);
 		}
 
-		if (rollback.nr) {
+		if (rollback_failure.nr) {
 			int i;
 			fprintf(stderr,
 				"WARNING: Some packs in use have been renamed by\n"
@@ -299,10 +293,10 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 				"WARNING: attempt to rename them back to their\n"
 				"WARNING: original names also failed.\n"
 				"WARNING: Please rename them in %s manually:\n", packdir);
-			for (i = 0; i < rollback.nr; i++)
+			for (i = 0; i < rollback_failure.nr; i++)
 				fprintf(stderr, "WARNING:   old-%s -> %s\n",
-					rollback.items[i].string,
-					rollback.items[i].string);
+					rollback_failure.items[i].string,
+					rollback_failure.items[i].string);
 		}
 		exit(1);
 	}
-- 
1.8.4.rc3.1.gc1ebd90



Attachment: signature.asc
Description: OpenPGP digital signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]