Re: [RFC PATCH 1/1] Add a type for errors

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2024-10-02 at 09:54:52, Phillip Wood wrote:
> On 30/09/2024 23:03, brian m. carlson wrote:
> > There is work underway to move some of the Git code out into a reusable
> > library.  In such a case, it's not very desirable to have the library
> > code write random errors to standard error, since this is an antipattern
> > which annoys terminal users.
> > 
> > Instead, we will expect callers of our library function to return
> > errors.
> 
> Isn't it the callers that will expect the function to return an error?

Yes, that's correct.  I'll fix that.

> > To help the libification process and make our code more generally
> > maintainable, add an error type.  This consists of on 64-bit integer,
> > which contains bit flags and a 32-bit code, and a pointer, which depends
> > on the code.  It is designed to be passed and returned by value, not
> > pointer, and it is possible to do so in two registers on 64-bit systems.
> > Similar functionality works well for error types in Rust and for the
> > standard library's lldiv_t, so this should not pose a problem.
> 
> Part of the reason it works well in rust is that it supports discriminated
> unions with pattern matching and has the "?" macro for early returns. In C
> the code ends up being quite verbose compared to taking a pointer to error
> struct as a function parameter and returning a boolean success/fail flag.
> 
>     struct git_error e;
>     struct object_id oid;
> 
>     e = repo_get_oid(r, "HEAD", &oid);
>     if (!GIT_ERROR_SUCCESS(e))
>         return e;
> 
> With a boolean return we can have
> 
>     struct object_id oid;
> 
>     if (repo_get_oid(r, "HEAD", &oid, e))
>         return -1;
> 
> where "e" is a "struct git_error*" passed into the function.

Yes, I agree that this is more verbose than in Rust.  If we were using
Rust (which is a thing I'm working on), then we'd have nicer error
handling, but for now we don't.

However, Go still uses this kind of error handling, and many people use
it every day with this limitation, so I don't think it's too awful for
what we're getting.  I won't say that Go is my favourite language and I
do prefer the less verbose error handling in Rust, but the fact that
this design is widely used means that it's at least a defensible
decision.

> I think the design of the struct is fine. It does mean we need a global list
> of error values. GError [1] avoids this by having a "domain" field that is
> an interned string so that error codes only need to be unique within a given
> domain. I think that for a self-contained project like git a global list is
> probably fine - svn does this for example [2].

I think so, too.  libgit2 does the same thing, which seems to have
worked out okay.

> > Allow multiple errors to be provided and wrapped in a single object,
> > which is useful in many situations, and add helpers to determine if any
> > error in the set matches a particular code.
> 
> The api appears to require the caller know up front how many errors there
> will be which seems unlikely to be true in practice. I think a more
> realistic design would allow callers to push errors as they occur and grow
> the array accordingly. For example ref_transaction_prepare() would want to
> return a list of errors, one for each ref that it was unable to lock or
> which did not match the expected value but it would not know up-front how
> many errors there were going to be.

That seems reasonable.  We could add helpers to extend the number of
errors.

> It would be useful to be able to add context to an error as the stack is
> unwound. For example if unpack_trees() detects that it would overwrite
> untracked files it prints a error listing those files. The exact formatting
> of that message depends on the command being run. That is currently handled
> by calling setup_unpack_trees_porcelain() with the name of the command
> before calling unpack_trees(). In a world where unpack_trees() returns a
> error containing the list of files we would want to customize the message by
> adding some context before converting it to a string.

I think we could do that with either a particular error type (e.g.,
`GIT_ERR_WOULD_OVERWRITE`) that contains nested errors and text or a
generic `GIT_ERR_ANNOTATED`.

> > Additionally, provide error formatting functions that produce a suitable
> > localized string for ease of use.
> 
> I share Emily's concern that this function has to know the details of how to
> format every error. We could mitigate that somewhat using a switch that
> calls external helper functions that do the actual formatting
> 
>     switch (e.code) {
>     case GIT_ERR_OBJECT_NOT_FOUND:
>         format_object_not_found(buf, e); /* lives in another file */
>         break;
>     ...
> 
> I know this is an RFC but I couldn't resist one code comment

I'm fine with that approach.  This is mostly designed to solicit
comments about the overall design, and I don't have a strong opinion
about how we format.

> > +#define GIT_ERROR_SUCCESS(e) (((e).code == GIT_ERROR_BIT_INIT))
> > +#define GIT_ERROR_SUCCESS_CONSUME(e) ((git_error_free(&(e)).code == GIT_ERROR_BIT_INIT)
> 
> git_error_free() returns the code as in integer so we don't need ".code"
> here. Also our coding guidelines would suggest git_error_clear() for the
> name of that function.

Got it.

> In conclusion I like the general idea but have concerns about the verbosity
> of returning an error struct. It would be helpful to see some examples of
> this being used to see how it works in practice.

If I send a v2, I'll try to wire up some code so folks can see some
examples.
-- 
brian m. carlson (they/them or he/him)
Toronto, Ontario, CA

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux