On Fri, Apr 30, 2021 at 4:38 AM Junio C Hamano <gitster@xxxxxxxxx> wrote: > > "Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > > > From: Han-Wen Nienhuys <hanwen@xxxxxxxxxx> > > > > A grep for EINVAL */*c reveals that no code inspects EINVAL after reading > > references. > > We often use a pattern (which is common) like this: > > if (some_func_in_ref_API(...) < 0) { > if (errno == ENOENT || errno == EISDIR) > ... it is OK for the file to be missing ... > else > ... error ... > } > > If a piece of code currently sets EINVAL to errno manually when > signalling a failure by returning a negative value to communicate > with such a caller, we wouldn't see EINVAL mentioned, so such a grep > alone would not help us guarantee the correctness of an update to > stop assignment of EINVAL at all. The callers must be vetted more > carefully than "we are happy that nobody explicitly mentions EINVAL". Sure. I looked at the callers, and documented further what I looked for. But how far should one go? It's a global variable, so transitively, almost all of the code could be observing the EINVAL value under very specific circumstances. But it would also be a terrible, fragile coding style and use of undocumented behavior. > > The files ref backend does use EINVAL so parse_loose_ref_contents() can > > communicate to lock_raw_ref() about garbage following the hex SHA1, or a short > > read in files_read_raw_ref(), but the files backend does not call into > > refs_read_raw_ref(), so its EINVAL sideband error is unused. > > This paragraph is confusing. It says EINVAL is used to signal > lock_raw_ref(), and it says EINVAL is not used by the same files > backend. Which is correct? If one part of the backend uses it, and > other parts don't, wouldn't the backend as a whole still use it? I tried to clarify the message. files-backend.c makes assumptions about the errno return for files_read_raw_ref, but it's not making assumptions about the abstracted API in refs.h > That is because there is a codeflow like this: > > if (files_read_raw_ref(...)) { > if (errno == ENOENT) { > ... do various things ... > } else if (errno == EISDIR) { > ... do different and various things ... > } else if (errno == EINVAL && (*type & REF_ISBROKEN)) { > ... deal with broken ref ... > } > ... > } as mentioned above, this isn't calling refs_read_raw_ref, so it's not affected by this patch. > > + * Return 0 on success. If the ref doesn't exist, set errno to ENOENT and return > > + * -1. If the ref exists but is neither a symbolic ref nor an object ID, it is > > + * broken; set REF_ISBROKEN in type, and return -1. If there is another error > > + * reading the ref, set errno appropriately and return -1. > > So, this is not sufficient to let caller correctly and safely handle > errors. "set REF_ISBROKEN in type, set errno to something other > than ENOENT or EISDIR, and then return -1" is necessary, I would > think. I tweaked the comment. Note that the function has only a handful of callers (and only one caller where this behavior is relevant), and it's changed in a follow-on patch in this series. Is it worth the effort to wordsmith this further? -- Han-Wen Nienhuys - Google Munich I work 80%. Don't expect answers from me on Fridays. -- Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg Geschäftsführer: Paul Manicle, Halimah DeLaine Prado