shejialuo <shejialuo@xxxxxxxxx> writes: > +static int files_fsck_refs_content(const char *refs_check_dir, > + struct dir_iterator *iter) > +{ > + struct strbuf ref_content = STRBUF_INIT; The caller makes sure that this gets called only on a regular file, so ... > + if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) { ... the use of strbuf_read_file() is fine (i.e. we do not have to worry about the iter->path to be pointing at a symbolic link) here. > + error(_("%s/%s: unable to read the ref"), refs_check_dir, iter->basename); > + goto clean; > + } > + > + /* > + * Case 1: check if the ref content length is valid and the last > + * character is a newline. > + */ > + if (ref_content.len != the_hash_algo->hexsz + 1 || > + ref_content.buf[ref_content.len - 1] != '\n') { Funny indentation. if (ref_content.len != the_hash_algo->hexsz + 1 || ref_content.buf[ref_content.len - 1] != '\n') { Also, we do not want {braces} around a single statement block. In any case, the two checks are good ONLY for regular refs and not for symbolic refs. The users are free to create symbolic refs next to their branches, e.g. here is a way to say, "among the maintenance tracks maint-2.30, maint-2.31, ... maint-2.44, maint-2.45, what I consider the primary maintenance track is currently maint-2.45": $ git symbolic-ref refs/heads/maint refs/heads/maint-2.45 and if your directory walk encounters such a symbolic ref, your strbuf_read_file() will read something like: $ cat .git/refs/heads/maint ref: refs/heads/maint-2.45 Also, the caller also needs to be prepared to find a real symbolic link that is used as a symbolic ref, but for them you'd need to do a readlink() and the expected contents would not have "ref: " prefix, so the code to implement actual check would have to be different. The way how refs/files-backend.c:read_ref_internal() handles S_ISLNK would be illuminating. > + goto failure; > + } > + /* > + * Case 2: the content should be range of [0-9a-f]. > + */ > + for (size_t i = 0; i < the_hash_algo->hexsz; i++) { > + if (!isdigit(ref_content.buf[i]) && > + (ref_content.buf[i] < 'a' || ref_content.buf[i] > 'f')) { > + goto failure; I do not think it is a good idea to suddenly redefine what a valid way to write object names in a loose ref file after ~20 years. Given the popularity of Git, I would not be surprised at all if a third-party tool or two have been writing their own refs with uppercase hexadecimal and we have been happily using them as everybody expects. It is a good idea to be strict when you are a producer, and we do write object names always in lowercase hex, but we are lenient when consuming what is possibly written by others. As long as the hexadecimal string names an existing object (otherwise we have a dangling ref which would be another kind of error, I presume), it should be at most FSCK_WARN when it does not look like what _we_ wrote (e.g., if it uses uppercase hex, or if it has extra bytes after the 40-hex (or 64-hex) other than the final LF). If it is shorter, of course that is an outright FSCK_ERROR. How well does this interact with the fsck error levels (aka fsck_msg_type), by the way? It should be made to work well if the current design does not. > + } > + } > + > + strbuf_release(&ref_content); > + return 0; > + > +failure: > + error(_("%s/%s: invalid ref content"), refs_check_dir, iter->basename); In addition, when we honor fsck error levels, such an unconditional call to "error" may become an issue. If the user says a certain kind of anomaly is tolerated by setting a specific finding to FSCK_IGNORE, we should be silent about our finding. > +clean: > + strbuf_release(&ref_content); > + return -1; > +} > + > +static int files_fsck_refs(struct ref_store *ref_store, > + const char* refs_check_dir, > + files_fsck_refs_fn *fsck_refs_fns) > +{ > + struct dir_iterator *iter; > + struct strbuf sb = STRBUF_INIT; > + int ret = 0; > + int iter_status; > + > + strbuf_addf(&sb, "%s/%s", ref_store->gitdir, refs_check_dir); > + > + iter = dir_iterator_begin(sb.buf, 0); > + > + /* > + * The current implementation does not care about the worktree, the worktree > + * may have no refs/heads or refs/tags directory. Simply return 0 now. > + */ > + if (!iter) { > + return 0; > + } > + > + while ((iter_status = dir_iterator_advance(iter)) == ITER_OK) { > + if (S_ISDIR(iter->st.st_mode)) { > + continue; > + } else if (S_ISREG(iter->st.st_mode)) { > + for (files_fsck_refs_fn *fsck_refs_fn = fsck_refs_fns; > + *fsck_refs_fn; fsck_refs_fn++) { Funny indentation (again---the patch 2/2 has funny "two extra HT" indentation style all over the place). Write it like so: for (files_fsck_refs_fn *fn = fsck_refs_fns; fn; fn++) > + ret |= (*fsck_refs_fn)(refs_check_dir, iter); A pointer to a function does not need to be dereferenced explicitly like so. Also, writing if (fn(refs_check_dir, iter)) ret = -1; would be more consistent with the rest of the code around here, which does not OR int ret but explicitly set it to -1 when any helper function detects an error. > + } > + } else { > + error(_("unexpected file type for '%s'"), iter->basename); > + ret = -1; This is wrong. A symbolic link is a valid symbolic ref, even though we no longer create such a symbolic ref by default. > + } > + } > + > + if (iter_status != ITER_DONE) { > + ret = -1; > + error(_("failed to iterate over '%s'"), sb.buf); > + } > + > + strbuf_release(&sb); > + > + return ret; > +} > + > +static int files_fsck(struct ref_store *ref_store) > +{ > + int ret = 0; > + > + files_fsck_refs_fn fsck_refs_fns[] = { > + files_fsck_refs_name, > + files_fsck_refs_content, > + NULL > + }; > + > + ret = files_fsck_refs(ref_store, "refs/heads",fsck_refs_fns) Missing SP after a comma. > + | files_fsck_refs(ref_store, "refs/tags", fsck_refs_fns); Why only these two hierarchies? Shouldn't it also be checking the remote tracking branches and notes? > + return ret; > +} > + > struct ref_storage_be refs_be_files = { > .name = "files", > .init = files_ref_store_create,