Hi Chris
On 23/09/2022 19:55, Stefan Xenos via GitGitGadget wrote:
From: Stefan Xenos <sxenos@xxxxxxxxxx>
This patch adds the get_metacommit_content method, which can classify
commits as either metacommits or normal commits, determine whether they
are abandoned, and extract the content commit's object id from the
metacommit.
Signed-off-by: Stefan Xenos <sxenos@xxxxxxxxxx>
Signed-off-by: Chris Poucet <poucet@xxxxxxxxxx>
---
Makefile | 1 +
metacommit-parser.c | 110 ++++++++++++++++++++++++++++++++++++++++++++
metacommit-parser.h | 19 ++++++++
3 files changed, 130 insertions(+)
create mode 100644 metacommit-parser.c
create mode 100644 metacommit-parser.h
diff --git a/Makefile b/Makefile
index cac3452edb9..b2bcc00c289 100644
--- a/Makefile
+++ b/Makefile
@@ -999,6 +999,7 @@ LIB_OBJS += merge-ort.o
LIB_OBJS += merge-ort-wrappers.o
LIB_OBJS += merge-recursive.o
LIB_OBJS += merge.o
+LIB_OBJS += metacommit-parser.o
There seems to be a problem with the indent here
LIB_OBJS += midx.o
LIB_OBJS += name-hash.o
LIB_OBJS += negotiator/default.o
> diff --git a/metacommit-parser.h b/metacommit-parser.h
> new file mode 100644
> index 00000000000..1c74bd6d699
> --- /dev/null
> +++ b/metacommit-parser.h
> @@ -0,0 +1,19 @@
> +#ifndef METACOMMIT_PARSER_H
> +#define METACOMMIT_PARSER_H
> +
> +#include "commit.h"
> +#include "hash.h"
> +
> +/* Indicates a normal commit (non-metacommit) */
> +#define METACOMMIT_TYPE_NONE 0
> +/* Indicates a metacommit with normal content (non-abandoned) */
> +#define METACOMMIT_TYPE_NORMAL 1
> +/* Indicates a metacommit with abandoned content */
> +#define METACOMMIT_TYPE_ABANDONED 2
Is it possible to define these as an enum? It would make the signature
of get_metacommit_content() nicer.
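Something along these lines is what I have in mind. It is only a sketch:
the enum tag "metacommit_type" is a name I made up, and struct commit
and struct object_id are merely forward-declared so the snippet stands
on its own rather than pulling in commit.h and hash.h.

```c
/* Hypothetical enum tag; the enumerators mirror the patch's #defines. */
enum metacommit_type {
	/* Indicates a normal commit (non-metacommit) */
	METACOMMIT_TYPE_NONE = 0,
	/* Indicates a metacommit with normal content (non-abandoned) */
	METACOMMIT_TYPE_NORMAL = 1,
	/* Indicates a metacommit with abandoned content */
	METACOMMIT_TYPE_ABANDONED = 2
};

/* Forward declarations only, so this compiles outside the git tree. */
struct commit;
struct object_id;

/* The return type now says what is returned, not just "int". */
enum metacommit_type get_metacommit_content(struct commit *commit,
					    struct object_id *content);
```

With that, a caller can switch on the result and the compiler can warn
about unhandled types.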
> +struct commit;
What's this for? We're including commit.h above.
> +extern int get_metacommit_content(
> + struct commit *commit, struct object_id *content);
diff --git a/metacommit-parser.c b/metacommit-parser.c
new file mode 100644
index 00000000000..70c1428bfc6
--- /dev/null
+++ b/metacommit-parser.c
@@ -0,0 +1,110 @@
+#include "cache.h"
+#include "metacommit-parser.h"
+#include "commit.h"
+
+/*
+ * Search the commit buffer for a line starting with the given key. Unlike
+ * find_commit_header, this also searches the commit message body.
+ */
There is no explanation in the code or commit message as to why this
function is needed. The documentation added in the first commit says
that the "parent-type" header is a commit header. I think the answer is
that this series does not implement that header but uses the commit
message instead. That's perfectly fine for a proof of concept but it is
precisely the sort of detail that should be described in the commit
message and probably flagged up in the cover letter.
+static const char *find_key(const char *msg, const char *key, size_t *out_len)
+{
+ int key_len = strlen(key);
+ const char *line = msg;
+
+ while (line) {
+ const char *eol = strchrnul(line, '\n');
+
+ if (eol - line > key_len && !memcmp(line, key, key_len) &&
+ line[key_len] == ' ') {
+ *out_len = eol - line - key_len - 1;
+ return line + key_len + 1;
+ }
+ line = *eol ? eol + 1 : NULL;
+ }
+ return NULL;
+}
+
+static struct commit *get_commit_by_index(struct commit_list *to_search, int index)
+{
+ while (to_search && index) {
+ to_search = to_search->next;
+ index--;
+ }
+
+ if (!to_search)
+ return NULL;
+
+ return to_search->item;
+}
This function is a useful utility for struct commit_list and should live
in commit.c. It could be used to simplify object-name.c:get_parent() for
example.
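As a rough illustration of how that could look, here is a standalone
sketch. The struct definitions are stand-ins for git's real ones, and
both function names (commit_list_item_at() and nth_parent()) are
hypothetical; nth_parent() is only meant to suggest the shape of a
1-based parent lookup, not to reproduce get_parent().

```c
#include <stddef.h>

/* Stand-ins for git's types so the sketch compiles on its own. */
struct commit {
	int id;
};

struct commit_list {
	struct commit *item;
	struct commit_list *next;
};

/*
 * Hypothetical name for the helper once it lives in commit.c.
 * Returns the item at the 0-based index, or NULL if the list is
 * shorter than that (or the index is negative).
 */
static struct commit *commit_list_item_at(struct commit_list *list, int index)
{
	if (index < 0)
		return NULL;

	while (list && index) {
		list = list->next;
		index--;
	}

	return list ? list->item : NULL;
}

/* A 1-based parent lookup then reduces to a single call. */
static struct commit *nth_parent(struct commit_list *parents, int n)
{
	return n > 0 ? commit_list_item_at(parents, n - 1) : NULL;
}
```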
+/*
+ * Writes the index of the content parent to "result". Returns the metacommit
+ * type. See the METACOMMIT_TYPE_* constants.
+ */
+static int index_of_content_commit(const char *buffer, int *result)
I found the signature confusing as it is returning an int but that is
not the index. Switching to an enum for the metacommit types would
clarify that.
+{
+ int index = 0;
+ int ret = METACOMMIT_TYPE_NONE;
+ size_t parent_types_size;
+ const char *parent_types = find_key(buffer, "parent-type",
+ &parent_types_size);
+ const char *end;
+ const char *enum_start = parent_types;
+ int enum_length = 0;
+
+ if (!parent_types)
+ return METACOMMIT_TYPE_NONE;
+
+ end = &parent_types[parent_types_size];
+
+ while (1) {
+ char next = *parent_types;
+ if (next == ' ' || parent_types >= end) {
+ if (enum_length == 1) {
If enum_length != 1 then there is an error in the parent-type header and
we should probably bail out.
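To show the kind of check I mean, here is a standalone sketch that only
validates the shape of the value. The function name and the -1 error
convention are my own assumptions, and it deliberately does not
interpret the individual type characters.

```c
#include <stddef.h>

/*
 * Hypothetical helper: returns the number of entries in a
 * "parent-type" value of the given length, or -1 if any
 * space-separated entry is not exactly one character long.
 */
static int count_parent_types(const char *value, size_t len)
{
	size_t i, entry_len = 0;
	int count = 0;

	for (i = 0; i <= len; i++) {
		if (i == len || value[i] == ' ') {
			if (entry_len != 1)
				return -1; /* malformed header: bail out */
			count++;
			entry_len = 0;
		} else {
			entry_len++;
		}
	}

	return count;
}
```

An empty value, a doubled space or a multi-character entry all end up
as an error here instead of being silently skipped.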
+ char first_char_in_enum = *enum_start;
It's not just the first character, it's the only character. Do we really
need such a long variable name? How about just calling it "type"?
I'll try and take a look at the next couple of patches later in the week.
Best Wishes
Phillip