Re: [PATCH v4 4/8] packed-backend: add "packed-refs" header consistency check

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



shejialuo <shejialuo@xxxxxxxxx> writes:

> In "packed-backend.c::create_snapshot", if there is a header (the line
> which starts with '#'), we will check whether the line starts with "#
> pack-refs with:". Before we port this check into "packed_fsck", let's
> fix "create_snapshot" to check the prefix "# packed-ref with: " instead
> of "# packed-ref with:" due to that we will always write a single
> trailing space after the colon.
>

Okay. So we're extending the check to also include the trailing space.

>
> However, we need to consider other situations and discuss whether we
> need to add checks.
>
> 1. If the header does not exist, we should not report an error to the
>    user. This is because in older Git version, we never write header in
>    the "packed-refs" file. Also, we do allow no header in "packed-refs"
>    in runtime.

Makes sense.

> 2. If the header content does not start with "# packed-ref with: ", we
>    should report an error just like what "create_snapshot" does. So,
>    create a new fsck message "badPackedRefHeader(ERROR)" for this.
> 3. If the header content is not the same as the constant string
>    "PACKED_REFS_HEADER". This is expected because we make it extensible
>    intentionally. So, there is no need to report.

Do you think it's worthwhile adding a warning/info here? This would
allow users to re-run 'git pack-refs' to ensure that they have a more
up-to date version of 'packed-refs'.

>
> As we have analyzed, we only need to check the case 2 in the above. In
> order to do this, read the "packed-refs" file via "strbuf_read". Like
> what "create_snapshot" and other functions do, we could split the line
> by finding the next newline in the buffer. When we cannot find a
> newline, we could report an error.
>
> So, create a function "packed_fsck_ref_next_line" to find the next
> newline and if there is no such newline, use
> "packedRefEntryNotTerminated(ERROR)" to report an error to the user.
>
> Then, parse the first line to apply the checks. Update the test to
> exercise the code.
>
> Mentored-by: Patrick Steinhardt <ps@xxxxxx>
> Mentored-by: Karthik Nayak <karthik.188@xxxxxxxxx>
> Signed-off-by: shejialuo <shejialuo@xxxxxxxxx>
> ---
>  Documentation/fsck-msgids.txt |  8 ++++
>  fsck.h                        |  2 +
>  refs/packed-backend.c         | 75 ++++++++++++++++++++++++++++++++++-
>  t/t0602-reffiles-fsck.sh      | 52 ++++++++++++++++++++++++
>  4 files changed, 136 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
> index b14bc44ca4..11906f90fd 100644
> --- a/Documentation/fsck-msgids.txt
> +++ b/Documentation/fsck-msgids.txt
> @@ -16,6 +16,10 @@
>  `badObjectSha1`::
>  	(ERROR) An object has a bad sha1.
>
> +`badPackedRefHeader`::
> +	(ERROR) The "packed-refs" file contains an invalid
> +	header.
> +
>  `badParentSha1`::
>  	(ERROR) A commit object has a bad parent sha1.
>
> @@ -176,6 +180,10 @@
>  `nullSha1`::
>  	(WARN) Tree contains entries pointing to a null sha1.
>
> +`packedRefEntryNotTerminated`::
> +	(ERROR) The "packed-refs" file contains an entry that is
> +	not terminated by a newline.
> +
>  `refMissingNewline`::
>  	(INFO) A loose ref that does not end with newline(LF). As
>  	valid implementations of Git never created such a loose ref
> diff --git a/fsck.h b/fsck.h
> index a44c231a5f..67e3c97bc0 100644
> --- a/fsck.h
> +++ b/fsck.h
> @@ -30,6 +30,7 @@ enum fsck_msg_type {
>  	FUNC(BAD_EMAIL, ERROR) \
>  	FUNC(BAD_NAME, ERROR) \
>  	FUNC(BAD_OBJECT_SHA1, ERROR) \
> +	FUNC(BAD_PACKED_REF_HEADER, ERROR) \
>  	FUNC(BAD_PARENT_SHA1, ERROR) \
>  	FUNC(BAD_REF_CONTENT, ERROR) \
>  	FUNC(BAD_REF_FILETYPE, ERROR) \
> @@ -53,6 +54,7 @@ enum fsck_msg_type {
>  	FUNC(MISSING_TYPE, ERROR) \
>  	FUNC(MISSING_TYPE_ENTRY, ERROR) \
>  	FUNC(MULTIPLE_AUTHORS, ERROR) \
> +	FUNC(PACKED_REF_ENTRY_NOT_TERMINATED, ERROR) \
>  	FUNC(TREE_NOT_SORTED, ERROR) \
>  	FUNC(UNKNOWN_TYPE, ERROR) \
>  	FUNC(ZERO_PADDED_DATE, ERROR) \
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 6401cecd5f..ff74ab915e 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -694,7 +694,7 @@ static struct snapshot *create_snapshot(struct packed_ref_store *refs)
>
>  		tmp = xmemdupz(snapshot->buf, eol - snapshot->buf);
>
> -		if (!skip_prefix(tmp, "# pack-refs with:", (const char **)&p))
> +		if (!skip_prefix(tmp, "# pack-refs with: ", (const char **)&p))
>  			die_invalid_line(refs->path,
>  					 snapshot->buf,
>  					 snapshot->eof - snapshot->buf);
> @@ -1749,12 +1749,76 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
>  	return empty_ref_iterator_begin();
>  }
>
> +static int packed_fsck_ref_next_line(struct fsck_options *o,
> +				     unsigned long line_number, const char *start,
> +				     const char *eof, const char **eol)
> +{
> +	int ret = 0;
> +
> +	*eol = memchr(start, '\n', eof - start);
> +	if (!*eol) {
> +		struct strbuf packed_entry = STRBUF_INIT;
> +		struct fsck_ref_report report = { 0 };
> +
> +		strbuf_addf(&packed_entry, "packed-refs line %lu", line_number);
> +		report.path = packed_entry.buf;
> +		ret = fsck_report_ref(o, &report,
> +				      FSCK_MSG_PACKED_REF_ENTRY_NOT_TERMINATED,
> +				      "'%.*s' is not terminated with a newline",
> +				      (int)(eof - start), start);
> +
> +		/*
> +		 * There is no newline but we still want to parse it to the end of
> +		 * the buffer.
> +		 */
> +		*eol = eof;
> +		strbuf_release(&packed_entry);
> +	}
> +
> +	return ret;
> +}
> +
> +static int packed_fsck_ref_header(struct fsck_options *o,
> +				  const char *start, const char *eol)
> +{
> +	if (!starts_with(start, "# pack-refs with: ")) {
> +		struct fsck_ref_report report = { 0 };
> +		report.path = "packed-refs.header";
> +
> +		return fsck_report_ref(o, &report,
> +				       FSCK_MSG_BAD_PACKED_REF_HEADER,
> +				       "'%.*s' does not start with '# pack-refs with: '",
> +				       (int)(eol - start), start);
> +	}
> +
> +	return 0;
> +}
> +
> +static int packed_fsck_ref_content(struct fsck_options *o,
> +				   const char *start, const char *eof)
> +{
> +	unsigned long line_number = 1;
> +	const char *eol;
> +	int ret = 0;
> +
> +	ret |= packed_fsck_ref_next_line(o, line_number, start, eof, &eol);
> +	if (*start == '#') {
> +		ret |= packed_fsck_ref_header(o, start, eol);
> +
> +		start = eol + 1;
> +		line_number++;

Why do we increment `line_number` here? There is no usage beyond this.

> +	}
> +
> +	return ret;
> +}
> +
>  static int packed_fsck(struct ref_store *ref_store,
>  		       struct fsck_options *o,
>  		       struct worktree *wt)
>  {
>  	struct packed_ref_store *refs = packed_downcast(ref_store,
>  							REF_STORE_READ, "fsck");
> +	struct strbuf packed_ref_content = STRBUF_INIT;
>  	int ret = 0;
>  	int fd;
>
> @@ -1786,7 +1850,16 @@ static int packed_fsck(struct ref_store *ref_store,
>  		goto cleanup;
>  	}
>
> +	if (strbuf_read(&packed_ref_content, fd, 0) < 0) {
> +		ret = error_errno(_("unable to read %s"), refs->path);
> +		goto cleanup;
> +	}
> +

So we want to parse the whole ref content to a buffer, wonder if it
makes more sense to use `strbuf_read_line()` here instead. But let's
carry on.

> +	ret = packed_fsck_ref_content(o, packed_ref_content.buf,
> +				      packed_ref_content.buf + packed_ref_content.len);
> +

We pass the entire content and the EOF to the function.

>  cleanup:
> +	strbuf_release(&packed_ref_content);
>  	return ret;
>  }
>
> diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> index 42c8d4ca1e..30be1982df 100755
> --- a/t/t0602-reffiles-fsck.sh
> +++ b/t/t0602-reffiles-fsck.sh
> @@ -639,4 +639,56 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
>  	)
>  '
>
> +test_expect_success 'packed-refs header should be checked' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +		test_commit default &&
> +
> +		git refs verify 2>err &&
> +		test_must_be_empty err &&
> +
> +		for bad_header in "# pack-refs wit: peeled fully-peeled sorted " \
> +				  "# pack-refs with traits: peeled fully-peeled sorted " \
> +				  "# pack-refs with a: peeled fully-peeled" \
> +				  "# pack-refs with:peeled fully-peeled sorted"
> +		do
> +			printf "%s\n" "$bad_header" >.git/packed-refs &&
> +			test_must_fail git refs verify 2>err &&
> +			cat >expect <<-EOF &&
> +			error: packed-refs.header: badPackedRefHeader: '\''$bad_header'\'' does not start with '\''# pack-refs with: '\''
> +			EOF
> +			rm .git/packed-refs &&
> +			test_cmp expect err || return 1
> +		done
> +	)
> +'
> +
> +test_expect_success 'packed-refs missing header should not be reported' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +		test_commit default &&
> +
> +		printf "$(git rev-parse HEAD) refs/heads/main\n" >.git/packed-refs &&
> +		git refs verify 2>err &&
> +		test_must_be_empty err
> +	)
> +'
> +
> +test_expect_success 'packed-refs unknown traits should not be reported' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +		test_commit default &&
> +
> +		printf "# pack-refs with: peeled fully-peeled sorted foo\n" >.git/packed-refs &&
> +		git refs verify 2>err &&
> +		test_must_be_empty err
> +	)
> +'
> +
>  test_done
> --
> 2.48.1

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux