On Sun, Aug 18, 2024 at 11:01:52PM +0800, shejialuo wrote: > We have already introduced the checks for regular refs. There is no need > to check the consistency of the target which the symbolic ref points to. > Instead, we just check the content of the symbolic ref itself. > > In order to check the content of the symbolic ref, create a function > "files_fsck_symref_target". It will first check whether the "pointee" is > under the "refs/" directory and then we will check the "pointee" itself. > > There is no specification about the content of the symbolic ref. > Although we do write "ref: %s\n" to create a symbolic ref by using > "git-symbolic-ref(1)" command. However, this is not mandatory. We still > accept symbolic refs with null trailing garbage. Put it more specific, > the following are correct: > > 1. "ref: refs/heads/master " > 2. "ref: refs/heads/master \n \n" > 3. "ref: refs/heads/master\n\n" > > But we do not allow any non-null trailing garbage. The following are bad > symbolic contents. > > 1. "ref: refs/heads/master garbage\n" > 2. "ref: refs/heads/master \n\n\n garbage " > > In order to provide above checks, we will traverse the "pointee" to > report the user whether this is null-garbage or no newline. And if > symbolic refs contain non-null garbage, we will report > "FSCK_MSG_BAD_REF_CONTENT" to the user. > > Then, we will check the name of the "pointee" is correct by using > "check_refname_format". And then if we can access the "pointee_path" in > the file system, we should ensure that the file type is correct. > > Mentored-by: Patrick Steinhardt <ps@xxxxxx> > Mentored-by: Karthik Nayak <karthik.188@xxxxxxxxx> > Signed-off-by: shejialuo <shejialuo@xxxxxxxxx> > --- > Documentation/fsck-msgids.txt | 3 ++ > fsck.h | 1 + > refs/files-backend.c | 87 +++++++++++++++++++++++++++++++++++ > t/t0602-reffiles-fsck.sh | 52 +++++++++++++++++++++ > 4 files changed, 143 insertions(+) > > diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt > index 1688c2f1fe..73587661dc 100644 > --- a/Documentation/fsck-msgids.txt > +++ b/Documentation/fsck-msgids.txt > @@ -28,6 +28,9 @@ > `badRefName`:: > (ERROR) A ref has an invalid format. > > +`badSymrefPointee`:: > + (ERROR) The pointee of a symref is bad. > + > `badTagName`:: > (INFO) A tag has an invalid format. > > diff --git a/fsck.h b/fsck.h > index 975d9b9da9..985b674dd9 100644 > --- a/fsck.h > +++ b/fsck.h > @@ -34,6 +34,7 @@ enum fsck_msg_type { > FUNC(BAD_REF_CONTENT, ERROR) \ > FUNC(BAD_REF_FILETYPE, ERROR) \ > FUNC(BAD_REF_NAME, ERROR) \ > + FUNC(BAD_SYMREF_POINTEE, ERROR) \ > FUNC(BAD_TIMEZONE, ERROR) \ > FUNC(BAD_TREE, ERROR) \ > FUNC(BAD_TREE_SHA1, ERROR) \ > diff --git a/refs/files-backend.c b/refs/files-backend.c > index ae71692f36..bfb8d338d2 100644 > --- a/refs/files-backend.c > +++ b/refs/files-backend.c > @@ -3434,12 +3434,92 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store, > const char *refs_check_dir, > struct dir_iterator *iter); > > +/* > + * Check the symref "pointee_name" and "pointee_path". The caller should > + * make sure that "pointee_path" is absolute. For symbolic ref, "pointee_name" > + * would be the content after "refs:". > + */ > +static int files_fsck_symref_target(struct fsck_options *o, > + struct fsck_ref_report *report, > + const char *refname, > + struct strbuf *pointee_name, > + struct strbuf *pointee_path) > +{ > + unsigned int newline_num = 0; > + unsigned int space_num = 0; > + const char *p = NULL; > + struct stat st; > + int ret = 0; > + > + if (!skip_prefix(pointee_name->buf, "refs/", &p)) { > + > + ret = fsck_report_ref(o, report, > + FSCK_MSG_BAD_SYMREF_POINTEE, > + "points to ref outside the refs directory"); > + goto out; > + } > + > + while (*p != '\0') { We typically write this `while (*p)`. > + if ((space_num || newline_num) && !isspace(*p)) { > + ret = fsck_report_ref(o, report, > + FSCK_MSG_BAD_REF_CONTENT, > + "contains non-null garbage"); > + goto out; > + } > + > + if (*p == '\n') { > + newline_num++; > + } else if (*p == ' ') { > + space_num++; > + } > + p++; > + } Can't we replace this with a single `strchr('\n')` call to check for the newline and then verify that the next character is a `\0`? The check for spaces would then be handled by `check_refname_format()`. > + /* > + * Missing target should not be treated as any error worthy event and > + * not even warn. It is a common case that a symbolic ref points to a > + * ref that does not exist yet. If the target ref does not exist, just > + * skip the check for the file type. > + */ > + if (lstat(pointee_path->buf, &st) < 0) > + goto out; > + > + if (!S_ISREG(st.st_mode) && !S_ISLNK(st.st_mode)) { > + ret = fsck_report_ref(o, report, > + FSCK_MSG_BAD_SYMREF_POINTEE, > + "points to an invalid file type"); > + goto out; > + } What exactly are we guarding against here? Don't we already verify that files in `refs/` have the correct type? Or are we checking that it does not point to a directory? Patrick