Re: [PATCH v2 3/4] ref: add symbolic ref content check for files backend

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 28, 2024 at 02:50:09PM +0200, Patrick Steinhardt wrote:
> On Wed, Aug 28, 2024 at 12:08:07AM +0800, shejialuo wrote:
> > We have already introduced the checks for regular refs. There is no need
> > to check the consistency of the target which the symbolic ref points to.
> > Instead, we just check the content of the symbolic ref itself.
> > 
> > In order to check the content of the symbolic ref, create a function
> > "files_fsck_symref_target". It will first check whether the "pointee" is
> > under the "refs/" directory and then we will check the "pointee" itself.
> > 
> > There is no specification about the content of the symbolic ref.
> > Although we do write "ref: %s\n" to create a symbolic ref by using
> > "git-symbolic-ref(1)" command. However, this is not mandatory. We still
> > accept symbolic refs with null trailing garbage. Put it more specific,
> > the following are correct:
> > 
> > 1. "ref: refs/heads/master   "
> > 2. "ref: refs/heads/master   \n  \n"
> > 3. "ref: refs/heads/master\n\n"
> 
> Now that we're talking about tightening the rules for direct refs, I
> wonder whether we'd also want to apply the same rules to symrefs.
> Namely, when there is trailing whitespace we should generate an
> INFO-level message about that, too. This is mostly for the sake of
> consistency.
> 

Yes, actually this patch does this. I think I need to mention we reuse
the "FSCK_INFO" message id defined in the [PATCH v2 2/4].

> [snip]
> > diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
> > index fc074fc571..85fd058c81 100644
> > --- a/Documentation/fsck-msgids.txt
> > +++ b/Documentation/fsck-msgids.txt
> > @@ -28,6 +28,9 @@
> >  `badRefName`::
> >  	(ERROR) A ref has an invalid format.
> >  
> > +`badSymrefPointee`::
> > +	(ERROR) The pointee of a symref is bad.
> 
> I think we'd want to clarify what "bad" is supposed to mean. Like, is a
> missing symref pointee bad? If this is about the format of the pointee's
> name, we might want to call this "badSymrefPointeeName".
> 

I agree, bad is too general here, we need to make it concrete.

> Also, I think we don't typically call the value of a symbolic ref
> "pointee", but "target". Searching for "pointee" in our codebase only
> gives a single hit, and that one is not related to symbolic refs.
> 

Thanks, I will fix this in the next version.

> > +/*
> > + * Check the symref "pointee_name" and "pointee_path". The caller should
> > + * make sure that "pointee_path" is absolute. For symbolic ref, "pointee_name"
> > + * would be the content after "refs:".
> > + */
> > +static int files_fsck_symref_target(struct fsck_options *o,
> > +				    struct fsck_ref_report *report,
> > +				    const char *refname,
> > +				    struct strbuf *pointee_name,
> > +				    struct strbuf *pointee_path)
> > +{
> > +	const char *newline_pos = NULL;
> > +	const char *p = NULL;
> > +	struct stat st;
> > +	int ret = 0;
> > +
> > +	if (!skip_prefix(pointee_name->buf, "refs/", &p)) {
> > +
> > +		ret = fsck_report_ref(o, report,
> > +				      FSCK_MSG_BAD_SYMREF_POINTEE,
> > +				      "points to ref outside the refs directory");
> > +		goto out;
> > +	}
> > +
> > +	newline_pos = strrchr(p, '\n');
> > +	if (!newline_pos || *(newline_pos + 1)) {
> > +		ret = fsck_report_ref(o, report,
> > +				      FSCK_MSG_REF_MISSING_NEWLINE,
> > +				      "missing newline");
> > +	}
> 
> The second condition `*(newline_pos + 1)` checks whether there is any
> data after the newline, doesn't it? That indicates a different kind of
> error than "missing newline", namely that there is trailing garbage. I
> guess we'd want to report a separate info-level message for this.
> 
> Also, shouldn't we use `strchr` instead of `strrchr()`? Otherwise, we're
> only checking for trailing garbage after the _last_ newline, not after
> the first one.
> 

Yes, I totally made a mistake here. I will try to think about a new
design. I have already replied to Junio like the following:

> > And strrchr() to find the last LF is not sufficient for that
> > purpose.  We would never write "refs:  refs/head/master \n",
> > but the above code will find the LF, be satisified that the LF is
> > followed by NUL, without realizing that SP there is not something we
> > would have written!

> I totally ignored this situation, and in current patch, we cannot check
> this. I know why Patrick lets me use "strchr" but not "strrchr". I think
> we should find the last '\n'. But instead we need to find the first
> '\n'. However, in this example, we will still fail by using "strchr".
> This part should be totally re-designed.

> > +	if (check_refname_format(pointee_name->buf, 0)) {
> > +		/*
> > +		 * When containing null-garbage, "check_refname_format" will
> > +		 * fail, we should trim the "pointee" to check again.
> > +		 */
> > +		strbuf_rtrim(pointee_name);
> > +		if (!check_refname_format(pointee_name->buf, 0)) {
> > +			ret = fsck_report_ref(o, report,
> > +					      FSCK_MSG_TRAILING_REF_CONTENT,
> > +					      "trailing null-garbage");
> > +			goto out;
> > +		}
> 
> Ah, I didn't get at first that we're doing the check a second time here.
> As mentioned above, I think we should check for trailing garbage further
> up already and more explicitly.
> 

Well, I guess the implementation about this is totally wrong, which will
make the reviewers hard to understand. I will drop this way to
explicitly check the garbage.

Thanks,
Jialuo




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux