Re: [PATCH v2 05/25] sequencer: allow the sequencer to take custody of malloc()ed data

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



W dniu 06.10.2016 o 21:23, Junio C Hamano pisze:
> Johannes Schindelin <Johannes.Schindelin@xxxxxx> writes:
> 
>> If you prefer to accept such sloppy work, I will change it of course,
>> feeling dirty that it has my name on it.
> 
> I do want neither leaks nor sloppyness, and I thought that was
> understood by everybody, hence I thought the last round made it
> clear that _entrust() must not exist.

Also, the *_entrust() mechanism in v1 was quite sequencer-specific,
rather than be a generic mechanism.

> 
> Let's try again.
> 
> We manage lifetime of a field in a structure in one of three ways in
> our codebase [*1*].
> 
>  * A field can always point at a borrowed piece of memory that the
>    destruction of the containing structure or re-assignment to the
>    field do not have to free the current value of the field.  This
>    is a minority, simply because it is rare for a field to be always
>    able to borrow from others.  The destruction of the containing
>    structure or re-assignment to the field needs to do nothing
>    special.
> 
>  * A field can own the piece of memory that it points at.  The
>    destruction of the containing structure or re-assignment to the
>    field needs to free the current value of the field [*2*].  A
>    field that can be assigned a value from the configuration (which
>    is typically xstrdup()'ed) is an example of such a field, and if
>    a command line option also can assign to the field, we'd need to
>    xstrdup() it, even though we know we could borrow from argv[],
>    because cleaning-up needs to be able to free(3) it.
> 
>  * A field can sometimes own and sometimes borrow the memory, and it
>    is accompanied by another field to tell which case it is, so that
>    cleaning-up can tell when it needs to be free(3)d.  This is a
>    minority case, and we generally avoid it especially in modern
>    code for small allocation, as it makes the lifetime rule more
>    complex than it is worth.

Great writeup!

> The _entrust() thing may have been an excellent fourth option if it
> were cost-free.  xstrdup() that the second approach necessitates for
> literal strings (like part of argv[]) would become unnecessary and
> we can treat as if all the fields in a structure were all borrowing
> the memory from elsewhere (in fact, _entrust() creates the missing
> owners of pieces of memory that need to be freed later); essentially
> it turns a "mixed ownership" case into the first approach.
> 
> But it is not cost-free.  The mechanism needs to allocate memory to
> become the missing owner and keep track of which pieces of memory
> need to be freed later, and the resulting code does not become any
> easier to follow.  The programmer still needs to know which ones to
> _entrust() just like the programmer in the model #2 needs to make
> sure xstrdup() is done for appropriate pieces of memory--the only
> difference is that an _entrust() programmer needs to mark the pieces
> of memory to be freed, while a #2 programmer needs to xstrdup() the
> pieces of memory that are being "borrowed".
> 
> So let's not add the fourth way to our codebase.  "mixed ownership"
> case should be turned into "field owns the memory", i.e. approach #2.

On the other hand the _entrust() mechanism might be a good solution
if the amount of memory was large, for example order of magnitude more
than what would be needed to keep ownership info *and* borrowing would
not be possible for some reason.

But the _entrust() mechanism reminds me on one hand side of memory
debuggers like dmalloc, Electric Fence (eFence), or valgrind; combined
a bit with garbage collection.  Namely, when we are in a library call,
record all allocations (perhaps by redirecting alloc functions), then
free those resources at the exit of library call.
 
> 
> [Footnotes]
> 
> *1* It is normal for different fields in the same structure follow
>     different lifetime rules.
> 
> *2* Unless leaks are allowed, that is.  I think we have instances
>     where "git cmd --opt=A --opt=B" allocates and stores a new piece
>     of memory that is computed based on A and store it to a field,
>     and then overwrites the field with another allocation of a value
>     based on B without freeing old value, saying "the caller can
>     avoid passing the same thing twice, and such a leak is miniscule
>     anyway".  That is not nice, and we've been reducing them over
>     time.
> 




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]