Re: [RFC PATCH 1/1] Add a type for errors

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Sep 30, 2024 at 3:04 PM brian m. carlson
<sandals@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> There is work underway to move some of the Git code out into a reusable
> library.  In such a case, it's not very desirable to have the library
> code write random errors to standard error, since this is an antipattern
> which annoys terminal users.
>
> Instead, we will expect callers of our library function to return
> errors.  The reusability of our library will be substantially improved
> if we return typed errors so that users can easily determine what kind
> of error might have occurred and react differently based on different
> contexts.  For example, if we are looking up an object in a partial
> clone and it's not found, we might decide to download it, but we might
> return an error to the user if the problem is instead that the revision
> specified is not syntactically valid.
>
> To help the libification process and make our code more generally
> maintainable, add an error type.  This consists of on 64-bit integer,
> which contains bit flags and a 32-bit code, and a pointer, which depends
> on the code.  It is designed to be passed and returned by value, not
> pointer, and it is possible to do so in two registers on 64-bit systems.
> Similar functionality works well for error types in Rust and for the
> standard library's lldiv_t, so this should not pose a problem.
>
> Provide the ability to specify either an errno value or a git error code
> as the code.  This allows us to use this type generically when handling
> errno values such as processing files, as well as express a rich set of
> possible error codes specific to Git.  We pick an unsigned 32-bit code
> because Windows can use the full set of 32 bits in its error values,
> even though most Unix systems use only a small set of codes which
> traditionally start at 1.  32 bits for Git errors also allows us plenty
> of space to expand as we see fit.
>
> Allow multiple errors to be provided and wrapped in a single object,
> which is useful in many situations, and add helpers to determine if any
> error in the set matches a particular code.
>
> Additionally, provide error formatting functions that produce a suitable
> localized string for ease of use.
>
> Signed-off-by: brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx>

Left a few comments below about portability and readability, but
general thoughts: of course this is quite limited, but I think it will
actually be enough in most cases. It also seems like if we need more
later, we can use this same code-OR-with-pointer thing to point to
something more nuanced than `void* meta`. I'd be in favor of applying
this model to somewhere with clear error codes and relatively few
invocations (maybe to the internals of something like argparse or
run-command, without initially modifying the public interface, for
example?) and seeing where we run into friction. I have a little bit
of concern about the idea of centralizing all the `meta` parsing and
error string generation, but it seems like we might be able to adopt a
model more similar to the one we use for `git_config()` callbacks and
do more context-aware parsing instead to keep `git_error_strbuf` from
getting too huge.

Thanks for sending this. I'm looking forward to hearing others' opinions.
 - Emily

> ---
>  Makefile |   1 +
>  error.c  |  43 ++++++++++++++
>  error.h  | 168 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 212 insertions(+)
>  create mode 100644 error.c
>  create mode 100644 error.h
>
> diff --git a/Makefile b/Makefile
> index 7344a7f725..5d9bf992e9 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1013,6 +1013,7 @@ LIB_OBJS += dir.o
>  LIB_OBJS += editor.o
>  LIB_OBJS += entry.o
>  LIB_OBJS += environment.o
> +LIB_OBJS += error.o
>  LIB_OBJS += ewah/bitmap.o
>  LIB_OBJS += ewah/ewah_bitmap.o
>  LIB_OBJS += ewah/ewah_io.o
> diff --git a/error.c b/error.c
> new file mode 100644
> index 0000000000..713bc42187
> --- /dev/null
> +++ b/error.c
> @@ -0,0 +1,43 @@
> +#include "git-compat-util.h"
> +#include "gettext.h"
> +#include "error.h"
> +#include "hex.h"
> +#include "strbuf.h"
> +
> +const char *git_error_string(struct git_error err)
> +{
> +       struct strbuf buf = STRBUF_INIT;
> +       if (!git_error_strbuf(&buf, err))
> +               return NULL;
> +       return strbuf_detach(&buf, NULL);
> +}
> +
> +const char *git_error_strbuf(struct strbuf *buf, struct git_error err)

Is the idea that we continue to extend `git_error_strbuf` with more
`git_error_code` values as we add this method of error handling across
the codebase? I'm worried it could get quite unwieldy. Or are you
suggesting that we could write `git_error_strbuf_object_store`,
`git_error_strbuf_hook`, etc to keep the code->string conversion
local? That would let us do custom processing of err.meta depending on
context, too, wouldn't it?

> +{
> +       if (GIT_ERROR_SUCCESS(err)) {
> +               return NULL;
> +       } else if (GIT_ERROR_ERRNO(err) != -1) {
> +               return xstrdup(strerror(GIT_ERROR_ERRNO(err)));
> +       } else {
> +               struct git_error_multiple *me = err.meta;
> +               switch (GIT_ERROR_GITERR(err)) {
> +               case GIT_ERR_OBJECT_NOT_FOUND:
> +                       if (err.meta)
> +                               strbuf_addf(buf, _("object not found: %s"), oid_to_hex(err.meta));
> +                       else
> +                               strbuf_addf(buf, _("object not found"));
> +               case GIT_ERR_NULL_OID:
> +                       if (err.meta)
> +                               strbuf_addf(buf, _("null object ID not allowed in this context: %s"), (char *)err.meta);
> +                       else
> +                               strbuf_addf(buf, _("null object ID not allowed"));
> +               case GIT_ERR_MULTIPLE:
> +                       strbuf_addf(buf, _("multiple errors:\n"));
> +                       for (size_t i = 0; i < me->count; i++) {
> +                               git_error_strbuf(buf, me->errs[i]);
> +                               strbuf_addstr(buf, "\n");
> +                       }
> +               }
> +               return buf->buf;
> +       }
> +}
> diff --git a/error.h b/error.h
> new file mode 100644
> index 0000000000..485cca99e0
> --- /dev/null
> +++ b/error.h
> @@ -0,0 +1,168 @@
> +#ifndef ERROR_H
> +#define ERROR_H
> +
> +#include "git-compat-util.h"
> +
> +/* Set if this value is initialized. */
> +#define GIT_ERROR_BIT_INIT (1ULL << 63)
> +/* Set if the code is an errno code, clear if it's a git error code. */
> +#define GIT_ERROR_BIT_ERRNO (1ULL << 62)
> +/*
> + * Set if the memory in meta should be freed; otherwise, it's statically
> + * allocated.
> + */
> +#define GIT_ERROR_BIT_ALLOC (1ULL << 61)
> +/*
> + * Set if the memory in meta is a C string; otherwise, it's a metadata struct.
> + */
> +#define GIT_ERROR_BIT_MSG (1ULL << 60)
> +
> +#define GIT_ERROR_BIT_MASK (0xf << 60)
> +
> +#define GIT_ERROR_OK (git_error_ok())
> +
> +#define GIT_ERROR_SUCCESS(e) (((e).code == GIT_ERROR_BIT_INIT))
> +#define GIT_ERROR_SUCCESS_CONSUME(e) ((git_error_free(&(e)).code == GIT_ERROR_BIT_INIT)
> +
> +#define GIT_ERROR_ERRNO(e) (((e).code & GIT_ERROR_BIT_ERRNO) ? ((e).code & 0xffffffff) : -1)
> +#define GIT_ERROR_GITERR(e) (((e).code & GIT_ERROR_BIT_ERRNO) ? -1 : ((e).code & 0xffffffff))

Aurgh, too bad we don't get bitfields before C11. :) (Although I am
not sure if that helps with your register-level optimization at that
point... but it probably helps with readability.)

But, I do wonder if this gluing-together-two-types-into-one-value
thing may break based on endianness? (And if we care? I don't think we
have any tests running on POWER systems, so maybe this falls under the
umbrella of "you should give us tests if you want us to not break
you"?)

> +
> +/*
> + * A value representing an error in Git.
> + */
> +struct git_error {
> +       uint64_t code;
> +       void *meta;
> +};
> +
> +struct git_error_multiple {
> +       struct git_error *errs;
> +       size_t count;
> +};
> +
> +enum git_error_code {
> +       /* The operation was a success. */
> +       GIT_ERR_SUCCESS = 0,
> +       /* An object ID was provided, but it was not found.
> +        *
> +        * meta will be NULL or a pointer to struct object ID.
> +        */
> +       GIT_ERR_OBJECT_NOT_FOUND = 1,
> +       /*
> +        * An object ID was provided, but it is all zeros and that is not
> +        * allowed.
> +        *
> +        * meta will be NULL or a message explaining the context.
> +        */
> +       GIT_ERR_NULL_OID = 2,
> +       /*
> +        * Multiple errors occurred.
> +        *
> +        * meta must be a pointer to struct git_error_multiple.
> +        */
> +       GIT_ERR_MULTIPLE = 3,
> +};
> +
> +const char *git_error_string(struct git_error err);
> +const char *git_error_strbuf(struct strbuf *buf, struct git_error err);
> +
> +/*
> + * A successful error status.
> + */
> +static inline struct git_error git_error_ok(void) {
> +       struct git_error e =  {
> +               .code = 0 | GIT_ERROR_BIT_INIT,
> +               .meta = NULL,
> +       };
> +       return e;
> +}
> +
> +static inline struct git_error git_error_new_errno(uint32_t errnoval, const char *msg, int to_free) {
> +       struct git_error e =  {
> +               .code = errnoval | GIT_ERROR_BIT_INIT | GIT_ERROR_BIT_ERRNO |
> +                       GIT_ERROR_BIT_MSG | (to_free ? GIT_ERROR_BIT_ALLOC : 0),
> +               .meta = (void *)msg,
> +       };
> +       return e;
> +}
> +
> +static inline struct git_error git_error_new_git(uint32_t gitcode, const char *msg, int to_free) {
> +       struct git_error e =  {
> +               .code = gitcode | GIT_ERROR_BIT_INIT |
> +                       GIT_ERROR_BIT_MSG | (to_free ? GIT_ERROR_BIT_ALLOC : 0),
> +               .meta = (void *)msg,
> +       };
> +       return e;
> +}
> +
> +static inline struct git_error git_error_new_simple(uint32_t gitcode) {
> +       struct git_error e =  {
> +               .code = gitcode | GIT_ERROR_BIT_INIT,
> +               .meta = NULL,
> +       };
> +       return e;
> +}
> +
> +static inline struct git_error git_error_new_multiple(struct git_error *errs, size_t count)
> +{
> +       struct git_error_multiple *me = xmalloc(sizeof(*me));
> +       struct git_error e =  {
> +               .code = GIT_ERR_MULTIPLE | GIT_ERROR_BIT_INIT | GIT_ERROR_BIT_ALLOC,
> +               .meta = me,
> +       };
> +       me->errs = errs;
> +       me->count = count;
> +       return e;
> +}
> +
> +/*
> + * If this is a git error and the code matches the given code, or if this is a
> + * multiple error and any of the contained errors are a git error whose code
> + * matches, returns a pointer to that error.  If there is no match, returns
> + * NULL.
> + */
> +static inline struct git_error *git_error_is_git(struct git_error *e, int code) {
> +       int64_t giterr = GIT_ERROR_GITERR(*e);
> +       if (giterr == code) {
> +               return e;
> +       } else if (giterr == GIT_ERR_MULTIPLE) {
> +               struct git_error_multiple *me = e->meta;
> +               for (size_t i = 0; i < me->count; i++)
> +                       return git_error_is_git(me->errs + i, code);
> +               return NULL;
> +       } else {
> +               return NULL;
> +       }
> +}
> +
> +/*
> + * If this is an errno error and the code matches the given code, or if this is
> + * a multiple error and any of the contained errors are an errno error whose
> + * code matches, returns a pointer to that error.  Otherwise, returns NULL.
> + */
> +static inline struct git_error *git_error_is_errno(struct git_error *e, int code) {
> +       int64_t giterr = GIT_ERROR_GITERR(*e);
> +       if (GIT_ERROR_ERRNO(*e) == code) {
> +               return e;
> +       } else if (giterr == GIT_ERR_MULTIPLE) {
> +               struct git_error_multiple *me = e->meta;
> +               for (size_t i = 0; i < me->count; i++)
> +                       return git_error_is_errno(me->errs + i, code);
> +               return NULL;
> +       } else {
> +               return NULL;
> +       }
> +}
> +
> +/* Frees and deinitializes this error structure. */
> +static inline uint64_t git_error_free(struct git_error *e)
> +{
> +       uint64_t code = e->code;
> +       if (e->code & GIT_ERROR_BIT_ALLOC)
> +               free(e->meta);
> +       e->code = 0;
> +       e->meta = NULL;
> +       return code;
> +}
> +
> +#endif





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux