Re: [PATCH 1/2] sequencer: truncate labels to accommodate loose refs

Junio C Hamano <gitster@xxxxxxxxx> · Wed, 16 Aug 2023 09:28:57 -0700

Johannes Schindelin <Johannes.Schindelin@xxxxxx> writes:

> It is not a performance-critical code path, so I erred on the side of
> simplicity (although I have to admit that the post image of the diff is
> not exactly for the faint of heart).
>
> Could we maybe form the plan to keep in the back of our heads that we
> already have a UTF-8-truncating functionality in sequencer, and in case
> another user should turn up, implement that optimized function in
> `utf8.[ch]`?

Yup, that is a good idea.  Even though this one only cares about the
bytecount, we'd eventually benefit from two variants, truncate by
bytecount and truncate by display width.  Both variants should
return an error when given a bytestring that is not a valid UTF-8
sequence, and leave it to the caller to truncate at a byte boundary
as a fallback, which is trivial.  The alternative would be to have
the callee do the fallback truncation, but then the caller cannot
tell whether the returned result is a known-valid UTF-8 substring or
a fallback result that the end user may need to be warned about, so
it would be suboptimal.
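
To illustrate the byte-count variant described above, here is a rough
sketch only.  The names utf8_seq_len() and utf8_truncate_bytes() and
their signatures are hypothetical; nothing like this exists in
`utf8.[ch]` today, and a real implementation would likely reuse the
existing helpers there.  The sketch validates the whole input and
reports an error instead of falling back on its own:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: length of the well-formed UTF-8 sequence that
 * starts at s, or 0 if the bytes there are malformed or cut short.
 */
static size_t utf8_seq_len(const unsigned char *s, size_t avail)
{
	unsigned char lo = 0x80, hi = 0xBF; /* allowed range of 2nd byte */
	size_t n, i;

	if (!avail)
		return 0;
	if (s[0] < 0x80)
		return 1;
	if (s[0] >= 0xC2 && s[0] <= 0xDF) {
		n = 2;
	} else if (s[0] >= 0xE0 && s[0] <= 0xEF) {
		n = 3;
		if (s[0] == 0xE0)
			lo = 0xA0; /* reject overlong forms */
		if (s[0] == 0xED)
			hi = 0x9F; /* reject UTF-16 surrogates */
	} else if (s[0] >= 0xF0 && s[0] <= 0xF4) {
		n = 4;
		if (s[0] == 0xF0)
			lo = 0x90; /* reject overlong forms */
		if (s[0] == 0xF4)
			hi = 0x8F; /* reject values above U+10FFFF */
	} else {
		return 0;
	}
	if (avail < n || s[1] < lo || s[1] > hi)
		return 0;
	for (i = 2; i < n; i++)
		if ((s[i] & 0xC0) != 0x80)
			return 0;
	return n;
}

/*
 * Hypothetical API: store in *out_len the length of the longest
 * prefix of s[0..len) that fits in max_bytes and ends on a character
 * boundary, and return 0.  Return -1 without touching *out_len if
 * the input is not valid UTF-8; the caller decides on the fallback.
 */
int utf8_truncate_bytes(const char *s, size_t len, size_t max_bytes,
			size_t *out_len)
{
	const unsigned char *p = (const unsigned char *)s;
	size_t pos = 0, cut = 0;

	assert(out_len);
	while (pos < len) {
		size_t n = utf8_seq_len(p + pos, len - pos);

		if (!n)
			return -1;
		if (pos + n <= max_bytes)
			cut = pos + n;
		pos += n;
	}
	*out_len = cut;
	return 0;
}
```

With this shape, a caller that gets -1 back can cut at `max_bytes`
itself and knows that it may need to warn the end user, while a 0
return guarantees the prefix is valid UTF-8.  A display-width variant
would presumably follow the same error convention.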

Thanks.