Re: [PATCH] revision: replace "struct cmdline_pathspec" with argv_array

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

Jeff King wrote:

> But mostly I am fundamentally against using UNLEAK() in a case like
> this, because it does not match either of the properties which justified
> adding UNLEAK() in the first place:
>
>   1. We are about to exit the program, so the "leak" is only caused by
>      the memory going out of scope at that exit.
>
>      By contrast, the revision machinery may be called many times in the
>      same program.
>
>   2. The memory remains useful until around the time of program exit.
>
>      This most certainly does not, as it would not even be reachable.
[...]
> On Wed, Sep 20, 2017 at 03:48:26PM -0700, Jonathan Nieder wrote:

>> In other words, proposed changes:
>>
>>  1. Could the commit message describe what effect this would have on
>>     maximum heap usage, if any?  (In qualitative terms is fine, though
>>     actual numbers would be even better if it's easy to get them.)
>>     That would make it easier to justify not using UNLEAK.
>
> What wording are you looking for? It was a leak, and now it's gone.  The
> size of the leak depends on how much you feed to --stdin. IMHO using
> UNLEAK is totally inappropriate for this case, and doesn't even seem
> like an alternative worth rejecting.
>
>>  2. Can parse_pathspec get a comment in pathspec.h saying that it
>>     defensively copies anything it needs from args so the caller is
>>     free to modify or free it?  That way, it should be more obvious
>>     to people in the future modifying parse_pathspec() that callers
>>     may rely on that.  (The current API comment describes argv as
>>     "command line arguments", which I fear would send the opposite
>>     message to implementors.)
>
> I certainly agree that the pathspec interface could use better
> documentation. Patches welcome? :)

I think I failed at communicating here.  That is not what I meant at
all.

The context is that Git (especially older parts of it) suffers from a
pretty severe lack of API documentation.  For newcomers that is
especially obvious --- long-time git contributors, on the other hand,
may get used to patterns and common interfaces and may have trouble
seeing just how bad the lack of clearly communicated API contracts is.

There is a bit of a "broken window" problem, too: authors of one-off
patches may reasonably assume from existing code that this is just the
way it is and, lacking examples of how to document an API, add more
underdocumented API contracts.

The patch I am replying to tightens the contract for parse_pathspec().
I am not asking for comprehensive documentation of that function ---
that would be clearly off-topic for the patch.  Instead, I am saying
that we should document what we are newly requiring of the function in
the patch.  That way, the documented contract becomes clearer over
time as people document what they are relying on.  I think of that as
generally a good practice.

In other words, it was not at all obvious that "(2) The memory remains
useful until around the time of program exit" did not hold.  To a
casual reader it instead looks like (2) does hold, and there's no
documentation short of delving into the implementation of
parse_pathspec() to convince a reader otherwise.  The documentation
that is present leads to the opposite conclusion.

The assertion (1) that this allocation is going to happen multiple
times in a program isn't true either.  As you noted, we only read
stdin once.  But that doesn't matter as long as (2) doesn't hold.

TBH saying that I should write the one-sentence doc patch feels like
a cop-out.  Just like it is not sustainable for those reviewers that
are interested in good test coverage to be the only ones who write
tests, I think we cannot avoid treating documentation of API contracts
as a shared responsibility.

Thanks and hope that clarifies,
Jonathan



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux