On Fri, Aug 18, 2017 at 03:59:38AM +0200, Patryk Obara wrote: > Determine the number of object_id's to parse in a single graft line by > counting separators (whitespace characters) instead of dividing by > length of hash representation. > > This way graft parsing code can support different sizes of hashes > without any further code adaptations. Sounds like a reasonable approach, though I wonder what happens if our counting pass differs in its behavior from the actual parse. E.g., here: > + /* count number of blanks to determine size of array to allocate */ > + for (i = 0, n = 0; i < line->len; i++) > + if (isspace(line->buf[i])) > + n++; If we see multiple spaces like "1234abcd 5678abcd" we'll allocate a slot for each. I think that's OK because here: > + for (i = 0; i < graft->nr_parent; i++) > + if (!isspace(*tail++) || parse_oid_hex(tail, &graft->parent[i], &tail)) > goto bad_graft_data; We'd reject such an input totally (though as an interesting side effect, you can convince the parser to allocate 20x as much RAM as you send it; one oid for each space). So we're probably fine. The two parsing passes are right next to each other and are sufficiently simple and strict that we don't have to worry about them diverging. The single-pass alternative would probably be to read into a dynamic structure like an oid_array, and then copy the result into the flex structure. Or of course to stop using a flex structure, as your original pass did. I agree with Junio that the use of object_id's is orthogonal to using a FLEX_ARRAY. But I could also see an argument that the complexity the flex array adds here isn't worth the savings. The main benefits of a flex array are: 1. Less memory used. But we don't expect to see a large enough number of grafts for this to matter. 2. A more compact memory representation, which can be faster. But accessing the parent list of a graft isn't going to be the hot code path. It probably only happens once in a program run when we rewrite the parents (versus checking the grafted commit, which is looked up once per commit we access). 3. It's easier to free the struct and its associated resources in a single free(). But we never free the graft list. Reading your original, my thought was "why _not_ keep doing it as a FLEX_ARRAY, as it saves a little memory, which can't hurt". But seeing this pre-counting phase, I think it does make the code a little more complicated. But I'd be OK with doing it either way. -Peff