On Tue, Jul 30, 2024 at 10:31:54AM +0200, Patrick Steinhardt wrote: > On Mon, Jul 29, 2024 at 09:27:56PM +0800, shejialuo wrote: > > Enhance the git-fsck(1) command by adding a check for reference content > > in the files backend. The new functionality ensures that symrefs, real > > symbolic link and regular refs are validated correctly. > > > > In order to check the trailing content of the regular refs, add a new > > parameter `trailing` to `parse_loose_ref_contents`. > > > > For symrefs, `parse_loose_ref_contents` will set the "referent". > > However, symbolic link could be either absolute or relative. Use > > "strbuf_add_real_path" to read the symbolic link and convert the > > relative path to absolute path. Then use "skip_prefix" to make it align > > with symref "referent". > > > > Thus, the symrefs and symbolic links could share the same interface. Add > > a new function "files_fsck_symref_target" which aims at checking the > > following things: > > > > 1. whether the pointee is under the `refs/` directory. > > 2. whether the pointee name is correct. > > 3. whether the pointee path is a wrong type in filesystem. > > > > Last, add the following FSCK MESSAGEs: > > > > 1. "badRefContent(ERROR)": A ref has a bad content > > 2. "badSymrefPointee(ERROR)": The pointee of a symref is bad. > > 3. "trailingRefContent(WARN)": A ref content has trailing contents. > > I think it would have been fine to stop at the preceding commit as it > clearly demonstrates how the whole infrastructure is supposed to work. > Additional checks like those you add here would then be a good candidate > for a separate patch series. This would help you get the first patch > series landed faster as you really only have to focus on setting up the > baseline infrastructure. > > Feel free to keep or drop this patch as you prefer though, I don't want > to discourage you aiming higher. Just keep in mind that the more you add > on top the longer it takes to land a patch series :) > I will drop this patch in the next version. Actually, in the very former version, I didn't realise that the effort to set up the infra is so much. > > + /* > > + * If the ref is a symref, we need to check the destination name and > > + * connectivity. > > + */ > > + if (referent.len && (type & REF_ISSYMREF)) { > > + strbuf_addf(&pointee_path, "%s/%s", gitdir, referent.buf); > > + strbuf_rtrim(&referent); > > + > > + ret = files_fsck_symref_target(o, &info, refname.buf, > > + referent.buf, pointee_path.buf); > > + goto clean; > > + } else { > > + if (trailing && (*trailing != '\0' && *trailing != '\n')) { > > In case the ref ends with a newline, should we check that the next > character is `\0`? Otherwise, it may contain multiple lines, which is > not allowed for a normal ref. > > Also, shouldn't the ref always end with a newline? > This is a very interesting question here. Based on my experiments, I have found the following things: It's OK that regular refs contain multiple newlines. And git totally allows such case. The current code does not handle multiple newlines. For symrefs, it allows spaces and newlines, for example: ref: refs/heads/master <space> ref: refs/heads/master \n\n\n But for such case, git will report error: ref: refs/heads/master garbage And ref can be end with a newline or not. Both will be accepted. Junio have told me that there is no spec really. So, I ignore multiple newlines for regular refs and also ignore multiple newlines and trailing spaces for symref. > Patrick