Re: [RFC/PATCH 0/5] remote: eliminate remote->{fetch,push}_refspec and lazy parsing of refspecs

On Fri, Jun 16, 2017 at 11:55 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> SZEDER Gábor <szeder.dev@xxxxxxxxx> writes:
>
>> 'struct remote' stores refspecs twice: once in their original string
>> form in remote->{fetch,push}_refspecs and once in their parsed form in
>> remote->{fetch,push}.  This is necessary, because we need the refspecs
>> for lazy parsing after we finished reading the configuration: we don't
>> want to die() on a bogus refspec while reading the configuration of a
>> remote we are not going to access at all.
>>
>> However, storing refspecs in both forms has some drawbacks:
>>
>>   - The same information is stored twice, wasting memory.
>
> True (but a few hundred bytes is nothing among friends ;-)

Indeed.  Even in my repos with close to 10k remotes the amount of
memory wasted by the duplicated refspecs is not a problem; there are
more pressing issues there.

>>   - remote->{fetch,push}_refspecs, i.e. the string arrays are
>>     conveniently ALLOC_GROW()-able with associated
>>     {fetch,push}_refspec_{nr,alloc} fields, but remote->{fetch,push}
>>     are not.
>
> This is a more real issue.
>
>>   - Wherever remote->{fetch,push} are accessed, the number of parsed
>>     refspecs in there is specified by remote->{fetch,push}_refspec_nr.
>>     This requires us to keep the two arrays in sync and makes adding
>>     additional refspecs cumbersome and error prone.
>
> You haven't told us which way you want to dedup.

Well, I actually did, right at the beginning.  The Subject:
specifically mentions which fields will be removed, and the first
line and a half says in more usual terms what their roles are.

Anyway, I made a note to use more natural language in the subjects
(and elsewhere) when we get there, maybe "remote.c: don't store
refspecs as strings in 'struct remote'" or something.

>  Are you keeping
> the original and removing the pre-parsed?  or are you only keeping
> the pre-parsed ones?  As long as you want ALLOC_GROW() ability, you
> need to maintain the invariants in three-tuple (foo, foo_alloc,
> foo_nr).
>
>>   - And worst of all, it pissed me off while working on
>>     sg/clone-refspec-from-command-line-config ;)
>
> Your feelings (or mine) do not count ;-).

Feelings have nothing to do with it: "it pissed me off" is a concise
way of saying "it made me waste quite some time debugging segfaults
and other nonsense resulting from ALLOC_GROW()ing remote->fetch,
misled by the previous point and the confusing order in which these
fields are listed in 'struct remote's definition".
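
To spell out the invariant in question, here is a minimal sketch of
the (foo, foo_nr, foo_alloc) idiom.  SKETCH_ALLOC_GROW() is a
simplified stand-in written for this mail, not git's actual
ALLOC_GROW(), and 'struct sketch_string_list' is a made-up type.  The
point is only that growing an array this way is safe when the array
has its own nr/alloc pair, which remote->{fetch,push} do not:

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Simplified stand-in for git's ALLOC_GROW(); NOT the real macro.
 * Makes sure 'arr' has room for at least 'nr' elements and keeps
 * 'alloc' in sync with the size that was actually allocated.
 */
#define SKETCH_ALLOC_GROW(arr, nr, alloc) \
	do { \
		if ((nr) > (alloc)) { \
			size_t new_alloc_ = ((alloc) + 16) * 3 / 2; \
			void *tmp_; \
			if (new_alloc_ < (nr)) \
				new_alloc_ = (nr); \
			tmp_ = realloc((arr), new_alloc_ * sizeof(*(arr))); \
			if (!tmp_) \
				abort(); \
			(arr) = tmp_; \
			(alloc) = new_alloc_; \
		} \
	} while (0)

/* Hypothetical growable array: the three fields must move together. */
struct sketch_string_list {
	const char **items;
	size_t nr, alloc;
};

static void sketch_append(struct sketch_string_list *list, const char *s)
{
	SKETCH_ALLOC_GROW(list->items, list->nr + 1, list->alloc);
	list->items[list->nr++] = s;
	/* The invariant that a borrowed 'nr' or 'alloc' would break. */
	assert(list->nr <= list->alloc);
}
```

If 'alloc' described a different array than the one being grown, the
macro would skip the realloc() and the store would land out of
bounds.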

> I do not think we would terribly mind if you only kept a list of
> pre-parsed form, with some mechanism to keep an "error" entry in
> that list with its original, so that an error can be reported with
> the refspec as the user originally gave us (which may mean the
> "error" entry may have to keep the original form, since it wasn't
> correctly parsable in the first place for it to trigger an error).
>
>> So here is my crack at getting rid of them.
>
> You still haven't told us what "them" are.  Parsed form, or the
> original?  Let's find out by reading on....
>
>> The idea is to parse refspecs gently while reading the configuration:
>> this way we won't need to store all refspecs as strings, and won't
>> die() on a bogus refspec right away.  A bogus refspec, if there's one,
>> will be stored in the remote it belongs to, so it will be available
>> later when that remote is accessed and can be used in the error
>> message.
>
> So normally we only have a list of parsed ones, but optionally there
> is a list of malformed originals that are before attempted (and
> failed) parsing used for error reporting?

For each remote there are two arrays of parsed refspecs, one for fetch
and one for push, and a single malformed original as string.

The reason for storing only a single malformed refspec per remote is
that I didn't want to change the behaviour in any noticeable way: the
current implementation die()s on the first malformed refspec it
encounters while parsing and reports only that one malformed refspec
in the error message.  This series essentially does the same as far
as observable behaviour goes, though it might happen that a different
malformed refspec is reported in the error message (if there is more
than one, depending on their order in the configuration).
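
A rough sketch of that layout and of the gentle parsing, to make the
above concrete.  This is not code from the series: all 'sketch_'
names and fields are made up, the parsed struct is heavily
simplified, the validity rule (more than one ':' is bogus) is a toy,
and the error text is only illustrative:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct sketch_refspec {		/* simplified parsed form */
	int force;
	char *src, *dst;
};

struct sketch_remote {		/* hypothetical field names */
	struct sketch_refspec *fetch, *push;
	size_t fetch_nr, fetch_alloc, push_nr, push_alloc;
	char *bogus_refspec;	/* first malformed one seen, or NULL */
};

static char *sketch_strndup(const char *s, size_t n)
{
	char *p = malloc(n + 1);
	if (!p)
		abort();
	memcpy(p, s, n);
	p[n] = '\0';
	return p;
}

/* Toy parser: returns -1 instead of dying on a malformed refspec. */
static int sketch_parse_gently(const char *s, struct sketch_refspec *out)
{
	const char *colon;

	memset(out, 0, sizeof(*out));
	if (*s == '+') {
		out->force = 1;
		s++;
	}
	colon = strchr(s, ':');
	if (colon && strchr(colon + 1, ':'))
		return -1;
	out->src = sketch_strndup(s, colon ? (size_t)(colon - s) : strlen(s));
	if (colon)
		out->dst = sketch_strndup(colon + 1, strlen(colon + 1));
	return 0;
}

/* Called while reading the configuration; never dies on bad input. */
static void sketch_add_refspec(struct sketch_remote *r, int is_push,
			       const char *s)
{
	struct sketch_refspec parsed;
	struct sketch_refspec **arr = is_push ? &r->push : &r->fetch;
	size_t *nr = is_push ? &r->push_nr : &r->fetch_nr;
	size_t *alloc = is_push ? &r->push_alloc : &r->fetch_alloc;

	if (sketch_parse_gently(s, &parsed) < 0) {
		if (!r->bogus_refspec)	/* keep only the first one */
			r->bogus_refspec = sketch_strndup(s, strlen(s));
		return;
	}
	if (*nr + 1 > *alloc) {
		*alloc = (*alloc + 16) * 3 / 2;
		*arr = realloc(*arr, *alloc * sizeof(**arr));
		if (!*arr)
			abort();
	}
	(*arr)[(*nr)++] = parsed;
}

/* Called only when the remote is actually accessed. */
static int sketch_check_remote(const struct sketch_remote *r,
			       char *err, size_t errlen)
{
	if (!r->bogus_refspec)
		return 0;
	snprintf(err, errlen, "Invalid refspec '%s'", r->bogus_refspec);
	return -1;
}
```

In this sketch the refspec that ends up in the message is simply the
first malformed one in configuration order, fetch or push.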

Of course, if we want to, then this could be extended to record all
malformed refspecs while reading the configuration and report all of
them in the error message.  But that's a behaviour change which I
think should come on top as a separate patch.

>  That sounds sensible,
> especially given that we can recreate the original textual form from
> correctly parsed result (which allows us to report on other kinds
> of errors as necessary).
>
>> This applies on top of a merge of master and the fresh reroll (v5) of
>> sg/clone-refspec-from-command-line-config:
>
> Thanks.  Will take a look (but not immediately).



