Re: [PATCH v3 4/4] commit: rewrite read_graft_line

Patryk Obara <patryk.obara@xxxxxxxxx> · Fri, 18 Aug 2017 13:30:23 +0200

Jeff King <peff@xxxxxxxx> wrote:
>
> So we're probably fine. The two parsing passes are right next to each
> other and are sufficiently simple and strict that we don't have to
> worry about them diverging.

That was my conclusion as well. I added comment before the first pass and
avoided any "cleverness" to make it perfectly clear to a reader.

> We'd reject such an input totally (though as an interesting side effect,
> you can convince the parser to allocate 20x as much RAM as you send it;
> one oid for each space).

Grafts are not populated during clone operation, so it really would be user
making his life miserable. I could allocate FLEXI_ARRAY of size
min(n, line->len / (GIT_*MIN*_HEXSZ+1)) instead… but I think it's not even
worth the cost of making the code more complicated (and I don't want
to reintroduce these size macros in here.

We _could_ put an artificial limit on graft parents, though (e.g. 10) and
display an error message urging the user to stop using grafts?

> The single-pass alternative would probably be to read into a dynamic
> structure like an oid_array, and then copy the result into the flex
> structure.

Before sending v3 I tried two other alternative implementations (perhaps I
should've listed them in the v3 cover letter):

  1. Using string_list_split_in_place. I resigned from this approach as soon
     as I noticed, that line->buf needs to be preserved for possible
     error message. string_list_split would have no benefits over using
     oid_array.

  2. Parsing into temporary oid_array and then copying memory to FLEXI_ARRAY.
     Throw-away oid_array still needs to be cleaned, which means we have
     new/different return path (one before xmalloc and one after xmalloc),
     which means "bad_graft_data" label needs to be changed into "cleanup"
     label (or removed), which means error description needs be conditionally
     put in earlier code… and at this point, I decided these changes are not
     making code cleaner nor more readable at all :)

Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> If I were doing the two-pass thing, I'd probably write a for loop
> that runs exactly twice, where the first iteration parses into a
> single throw-away oid struct only to count, and the second iteration
> parses the same input into the allocated array of oid struct.  That
> way, you do not have to worry about two phrases going out of sync.

Two passes would still differ in error handling due to xmalloc between them…

-- 
| ← Ceci n'est pas une pipe
Patryk Obara