On Mon, Sep 30, 2024 at 3:04 PM brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx> wrote: > > There is work underway to move some of the Git code out into a reusable > library. In such a case, it's not very desirable to have the library > code write random errors to standard error, since this is an antipattern > which annoys terminal users. > > Instead, we will expect callers of our library function to return > errors. The reusability of our library will be substantially improved > if we return typed errors so that users can easily determine what kind > of error might have occurred and react differently based on different > contexts. For example, if we are looking up an object in a partial > clone and it's not found, we might decide to download it, but we might > return an error to the user if the problem is instead that the revision > specified is not syntactically valid. > > To help the libification process and make our code more generally > maintainable, add an error type. This consists of on 64-bit integer, > which contains bit flags and a 32-bit code, and a pointer, which depends > on the code. It is designed to be passed and returned by value, not > pointer, and it is possible to do so in two registers on 64-bit systems. > Similar functionality works well for error types in Rust and for the > standard library's lldiv_t, so this should not pose a problem. > > Provide the ability to specify either an errno value or a git error code > as the code. This allows us to use this type generically when handling > errno values such as processing files, as well as express a rich set of > possible error codes specific to Git. We pick an unsigned 32-bit code > because Windows can use the full set of 32 bits in its error values, > even though most Unix systems use only a small set of codes which > traditionally start at 1. 32 bits for Git errors also allows us plenty > of space to expand as we see fit. > > Allow multiple errors to be provided and wrapped in a single object, > which is useful in many situations, and add helpers to determine if any > error in the set matches a particular code. > > Additionally, provide error formatting functions that produce a suitable > localized string for ease of use. > > Signed-off-by: brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx> Left a few comments below about portability and readability, but general thoughts: of course this is quite limited, but I think it will actually be enough in most cases. It also seems like if we need more later, we can use this same code-OR-with-pointer thing to point to something more nuanced than `void* meta`. I'd be in favor of applying this model to somewhere with clear error codes and relatively few invocations (maybe to the internals of something like argparse or run-command, without initially modifying the public interface, for example?) and seeing where we run into friction. I have a little bit of concern about the idea of centralizing all the `meta` parsing and error string generation, but it seems like we might be able to adopt a model more similar to the one we use for `git_config()` callbacks and do more context-aware parsing instead to keep `git_error_strbuf` from getting too huge. Thanks for sending this. I'm looking forward to hearing others' opinions. - Emily > --- > Makefile | 1 + > error.c | 43 ++++++++++++++ > error.h | 168 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ > 3 files changed, 212 insertions(+) > create mode 100644 error.c > create mode 100644 error.h > > diff --git a/Makefile b/Makefile > index 7344a7f725..5d9bf992e9 100644 > --- a/Makefile > +++ b/Makefile > @@ -1013,6 +1013,7 @@ LIB_OBJS += dir.o > LIB_OBJS += editor.o > LIB_OBJS += entry.o > LIB_OBJS += environment.o > +LIB_OBJS += error.o > LIB_OBJS += ewah/bitmap.o > LIB_OBJS += ewah/ewah_bitmap.o > LIB_OBJS += ewah/ewah_io.o > diff --git a/error.c b/error.c > new file mode 100644 > index 0000000000..713bc42187 > --- /dev/null > +++ b/error.c > @@ -0,0 +1,43 @@ > +#include "git-compat-util.h" > +#include "gettext.h" > +#include "error.h" > +#include "hex.h" > +#include "strbuf.h" > + > +const char *git_error_string(struct git_error err) > +{ > + struct strbuf buf = STRBUF_INIT; > + if (!git_error_strbuf(&buf, err)) > + return NULL; > + return strbuf_detach(&buf, NULL); > +} > + > +const char *git_error_strbuf(struct strbuf *buf, struct git_error err) Is the idea that we continue to extend `git_error_strbuf` with more `git_error_code` values as we add this method of error handling across the codebase? I'm worried it could get quite unwieldy. Or are you suggesting that we could write `git_error_strbuf_object_store`, `git_error_strbuf_hook`, etc to keep the code->string conversion local? That would let us do custom processing of err.meta depending on context, too, wouldn't it? > +{ > + if (GIT_ERROR_SUCCESS(err)) { > + return NULL; > + } else if (GIT_ERROR_ERRNO(err) != -1) { > + return xstrdup(strerror(GIT_ERROR_ERRNO(err))); > + } else { > + struct git_error_multiple *me = err.meta; > + switch (GIT_ERROR_GITERR(err)) { > + case GIT_ERR_OBJECT_NOT_FOUND: > + if (err.meta) > + strbuf_addf(buf, _("object not found: %s"), oid_to_hex(err.meta)); > + else > + strbuf_addf(buf, _("object not found")); > + case GIT_ERR_NULL_OID: > + if (err.meta) > + strbuf_addf(buf, _("null object ID not allowed in this context: %s"), (char *)err.meta); > + else > + strbuf_addf(buf, _("null object ID not allowed")); > + case GIT_ERR_MULTIPLE: > + strbuf_addf(buf, _("multiple errors:\n")); > + for (size_t i = 0; i < me->count; i++) { > + git_error_strbuf(buf, me->errs[i]); > + strbuf_addstr(buf, "\n"); > + } > + } > + return buf->buf; > + } > +} > diff --git a/error.h b/error.h > new file mode 100644 > index 0000000000..485cca99e0 > --- /dev/null > +++ b/error.h > @@ -0,0 +1,168 @@ > +#ifndef ERROR_H > +#define ERROR_H > + > +#include "git-compat-util.h" > + > +/* Set if this value is initialized. */ > +#define GIT_ERROR_BIT_INIT (1ULL << 63) > +/* Set if the code is an errno code, clear if it's a git error code. */ > +#define GIT_ERROR_BIT_ERRNO (1ULL << 62) > +/* > + * Set if the memory in meta should be freed; otherwise, it's statically > + * allocated. > + */ > +#define GIT_ERROR_BIT_ALLOC (1ULL << 61) > +/* > + * Set if the memory in meta is a C string; otherwise, it's a metadata struct. > + */ > +#define GIT_ERROR_BIT_MSG (1ULL << 60) > + > +#define GIT_ERROR_BIT_MASK (0xf << 60) > + > +#define GIT_ERROR_OK (git_error_ok()) > + > +#define GIT_ERROR_SUCCESS(e) (((e).code == GIT_ERROR_BIT_INIT)) > +#define GIT_ERROR_SUCCESS_CONSUME(e) ((git_error_free(&(e)).code == GIT_ERROR_BIT_INIT) > + > +#define GIT_ERROR_ERRNO(e) (((e).code & GIT_ERROR_BIT_ERRNO) ? ((e).code & 0xffffffff) : -1) > +#define GIT_ERROR_GITERR(e) (((e).code & GIT_ERROR_BIT_ERRNO) ? -1 : ((e).code & 0xffffffff)) Aurgh, too bad we don't get bitfields before C11. :) (Although I am not sure if that helps with your register-level optimization at that point... but it probably helps with readability.) But, I do wonder if this gluing-together-two-types-into-one-value thing may break based on endianness? (And if we care? I don't think we have any tests running on POWER systems, so maybe this falls under the umbrella of "you should give us tests if you want us to not break you"?) > + > +/* > + * A value representing an error in Git. > + */ > +struct git_error { > + uint64_t code; > + void *meta; > +}; > + > +struct git_error_multiple { > + struct git_error *errs; > + size_t count; > +}; > + > +enum git_error_code { > + /* The operation was a success. */ > + GIT_ERR_SUCCESS = 0, > + /* An object ID was provided, but it was not found. > + * > + * meta will be NULL or a pointer to struct object ID. > + */ > + GIT_ERR_OBJECT_NOT_FOUND = 1, > + /* > + * An object ID was provided, but it is all zeros and that is not > + * allowed. > + * > + * meta will be NULL or a message explaining the context. > + */ > + GIT_ERR_NULL_OID = 2, > + /* > + * Multiple errors occurred. > + * > + * meta must be a pointer to struct git_error_multiple. > + */ > + GIT_ERR_MULTIPLE = 3, > +}; > + > +const char *git_error_string(struct git_error err); > +const char *git_error_strbuf(struct strbuf *buf, struct git_error err); > + > +/* > + * A successful error status. > + */ > +static inline struct git_error git_error_ok(void) { > + struct git_error e = { > + .code = 0 | GIT_ERROR_BIT_INIT, > + .meta = NULL, > + }; > + return e; > +} > + > +static inline struct git_error git_error_new_errno(uint32_t errnoval, const char *msg, int to_free) { > + struct git_error e = { > + .code = errnoval | GIT_ERROR_BIT_INIT | GIT_ERROR_BIT_ERRNO | > + GIT_ERROR_BIT_MSG | (to_free ? GIT_ERROR_BIT_ALLOC : 0), > + .meta = (void *)msg, > + }; > + return e; > +} > + > +static inline struct git_error git_error_new_git(uint32_t gitcode, const char *msg, int to_free) { > + struct git_error e = { > + .code = gitcode | GIT_ERROR_BIT_INIT | > + GIT_ERROR_BIT_MSG | (to_free ? GIT_ERROR_BIT_ALLOC : 0), > + .meta = (void *)msg, > + }; > + return e; > +} > + > +static inline struct git_error git_error_new_simple(uint32_t gitcode) { > + struct git_error e = { > + .code = gitcode | GIT_ERROR_BIT_INIT, > + .meta = NULL, > + }; > + return e; > +} > + > +static inline struct git_error git_error_new_multiple(struct git_error *errs, size_t count) > +{ > + struct git_error_multiple *me = xmalloc(sizeof(*me)); > + struct git_error e = { > + .code = GIT_ERR_MULTIPLE | GIT_ERROR_BIT_INIT | GIT_ERROR_BIT_ALLOC, > + .meta = me, > + }; > + me->errs = errs; > + me->count = count; > + return e; > +} > + > +/* > + * If this is a git error and the code matches the given code, or if this is a > + * multiple error and any of the contained errors are a git error whose code > + * matches, returns a pointer to that error. If there is no match, returns > + * NULL. > + */ > +static inline struct git_error *git_error_is_git(struct git_error *e, int code) { > + int64_t giterr = GIT_ERROR_GITERR(*e); > + if (giterr == code) { > + return e; > + } else if (giterr == GIT_ERR_MULTIPLE) { > + struct git_error_multiple *me = e->meta; > + for (size_t i = 0; i < me->count; i++) > + return git_error_is_git(me->errs + i, code); > + return NULL; > + } else { > + return NULL; > + } > +} > + > +/* > + * If this is an errno error and the code matches the given code, or if this is > + * a multiple error and any of the contained errors are an errno error whose > + * code matches, returns a pointer to that error. Otherwise, returns NULL. > + */ > +static inline struct git_error *git_error_is_errno(struct git_error *e, int code) { > + int64_t giterr = GIT_ERROR_GITERR(*e); > + if (GIT_ERROR_ERRNO(*e) == code) { > + return e; > + } else if (giterr == GIT_ERR_MULTIPLE) { > + struct git_error_multiple *me = e->meta; > + for (size_t i = 0; i < me->count; i++) > + return git_error_is_errno(me->errs + i, code); > + return NULL; > + } else { > + return NULL; > + } > +} > + > +/* Frees and deinitializes this error structure. */ > +static inline uint64_t git_error_free(struct git_error *e) > +{ > + uint64_t code = e->code; > + if (e->code & GIT_ERROR_BIT_ALLOC) > + free(e->meta); > + e->code = 0; > + e->meta = NULL; > + return code; > +} > + > +#endif