On Wed, Aug 28, 2024 at 02:50:09PM +0200, Patrick Steinhardt wrote: > On Wed, Aug 28, 2024 at 12:08:07AM +0800, shejialuo wrote: > > We have already introduced the checks for regular refs. There is no need > > to check the consistency of the target which the symbolic ref points to. > > Instead, we just check the content of the symbolic ref itself. > > > > In order to check the content of the symbolic ref, create a function > > "files_fsck_symref_target". It will first check whether the "pointee" is > > under the "refs/" directory and then we will check the "pointee" itself. > > > > There is no specification about the content of the symbolic ref. > > Although we do write "ref: %s\n" to create a symbolic ref by using > > "git-symbolic-ref(1)" command. However, this is not mandatory. We still > > accept symbolic refs with null trailing garbage. Put it more specific, > > the following are correct: > > > > 1. "ref: refs/heads/master " > > 2. "ref: refs/heads/master \n \n" > > 3. "ref: refs/heads/master\n\n" > > Now that we're talking about tightening the rules for direct refs, I > wonder whether we'd also want to apply the same rules to symrefs. > Namely, when there is trailing whitespace we should generate an > INFO-level message about that, too. This is mostly for the sake of > consistency. > Yes, actually this patch does this. I think I need to mention we reuse the "FSCK_INFO" message id defined in the [PATCH v2 2/4]. > [snip] > > diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt > > index fc074fc571..85fd058c81 100644 > > --- a/Documentation/fsck-msgids.txt > > +++ b/Documentation/fsck-msgids.txt > > @@ -28,6 +28,9 @@ > > `badRefName`:: > > (ERROR) A ref has an invalid format. > > > > +`badSymrefPointee`:: > > + (ERROR) The pointee of a symref is bad. > > I think we'd want to clarify what "bad" is supposed to mean. Like, is a > missing symref pointee bad? If this is about the format of the pointee's > name, we might want to call this "badSymrefPointeeName". > I agree, bad is too general here, we need to make it concrete. > Also, I think we don't typically call the value of a symbolic ref > "pointee", but "target". Searching for "pointee" in our codebase only > gives a single hit, and that one is not related to symbolic refs. > Thanks, I will fix this in the next version. > > +/* > > + * Check the symref "pointee_name" and "pointee_path". The caller should > > + * make sure that "pointee_path" is absolute. For symbolic ref, "pointee_name" > > + * would be the content after "refs:". > > + */ > > +static int files_fsck_symref_target(struct fsck_options *o, > > + struct fsck_ref_report *report, > > + const char *refname, > > + struct strbuf *pointee_name, > > + struct strbuf *pointee_path) > > +{ > > + const char *newline_pos = NULL; > > + const char *p = NULL; > > + struct stat st; > > + int ret = 0; > > + > > + if (!skip_prefix(pointee_name->buf, "refs/", &p)) { > > + > > + ret = fsck_report_ref(o, report, > > + FSCK_MSG_BAD_SYMREF_POINTEE, > > + "points to ref outside the refs directory"); > > + goto out; > > + } > > + > > + newline_pos = strrchr(p, '\n'); > > + if (!newline_pos || *(newline_pos + 1)) { > > + ret = fsck_report_ref(o, report, > > + FSCK_MSG_REF_MISSING_NEWLINE, > > + "missing newline"); > > + } > > The second condition `*(newline_pos + 1)` checks whether there is any > data after the newline, doesn't it? That indicates a different kind of > error than "missing newline", namely that there is trailing garbage. I > guess we'd want to report a separate info-level message for this. > > Also, shouldn't we use `strchr` instead of `strrchr()`? Otherwise, we're > only checking for trailing garbage after the _last_ newline, not after > the first one. > Yes, I totally made a mistake here. I will try to think about a new design. I have already replied to Junio like the following: > > And strrchr() to find the last LF is not sufficient for that > > purpose. We would never write "refs: refs/head/master \n", > > but the above code will find the LF, be satisified that the LF is > > followed by NUL, without realizing that SP there is not something we > > would have written! > I totally ignored this situation, and in current patch, we cannot check > this. I know why Patrick lets me use "strchr" but not "strrchr". I think > we should find the last '\n'. But instead we need to find the first > '\n'. However, in this example, we will still fail by using "strchr". > This part should be totally re-designed. > > + if (check_refname_format(pointee_name->buf, 0)) { > > + /* > > + * When containing null-garbage, "check_refname_format" will > > + * fail, we should trim the "pointee" to check again. > > + */ > > + strbuf_rtrim(pointee_name); > > + if (!check_refname_format(pointee_name->buf, 0)) { > > + ret = fsck_report_ref(o, report, > > + FSCK_MSG_TRAILING_REF_CONTENT, > > + "trailing null-garbage"); > > + goto out; > > + } > > Ah, I didn't get at first that we're doing the check a second time here. > As mentioned above, I think we should check for trailing garbage further > up already and more explicitly. > Well, I guess the implementation about this is totally wrong, which will make the reviewers hard to understand. I will drop this way to explicitly check the garbage. Thanks, Jialuo