On 2016-02-23 06:55, Ahmed S. Danish wrote:
> Hello everyone,
>
> As you probably know by now, the second patch in the memfd series
> was created to transform the global srbchannel mempool, shared by
> all clients, into a "per-client" one. [1]
>
> The reason for this transition is to avoid data leaks between
> clients. In a follow-up patch series, it's even hoped that the
> global core mempool will also be transformed to the per-client
> model. [2]
>
>
> ==> Current problem:
> --------------------
>
> By injecting faults at the system-call level, it was discovered
> that the current design of per-client mempools leads to a
> reproducible memory fault on certain error-recovery paths.
>
> The fault is caused by a design issue, rather than a small coding
> error.
>
>
> ==> Why are you dedicating a full mail to this?
> -----------------------------------------------
>
> The issue discovered implies that a bigger design change is
> required; your input is very much needed :-)
>
>
> ==> What was the current design?
> --------------------------------
>
> To do the per-client mempool transformation, a simple decision
> was taken: let 'pa_native_connection' own the per-client mempool
> in question.
>
> Why? Because when a client connects, a 'native connection' object
> is created for it by default. Inside this object are the pstream
> pipe and other per-client resources. So this seemed like the best
> place for a per-client mempool:
>
>     /* protocol-native.c - Called for each client connection */
>     void pa_native_protocol_connect(...) {
>         pa_native_connection *c;
>
>         ...
>         c = pa_msgobject_new(pa_native_connection);
>         c->pstream = pa_pstream_new(...);
>         c->mempool = pa_mempool_new(...); // per-client mempool
>         ...
>     }
>
> And 'naturally', the pool should be deallocated when the
> connection closes:
>
>     /* protocol-native.c */
>     native_connection_free(pa_object *o) {
>         ...
>         pa_pdispatch_unref(c->pdispatch);
>         pa_pstream_unref(c->pstream);
>
>         // Srbchannels allocate from this per-client pool,
>         // so free it only after the pstream is freed
>         pa_mempool_free(c->mempool);
>         ...
>     }
>
> All looked fine and dandy ..
>
>
> ==> And where is the problem exactly?
> -------------------------------------
>
> By injecting failures in sendmsg(2), thus reaching up to the
> iochannel and pstream layers, the following leak is shown:
>
>     E: Memory pool destroyed but not all mem blocks freed! 1 remain
>
> and a segfault is then produced due to a use-after-free operation
> on the leaked memblock.
>
> The sequence of failure, bottom-up, is as follows:
>
>     -> sendmsg(2) fails with -EACCES
>     -> pa_iochannel_write_with_creds() returns failure
>     -> pstream do_write() fails
>     -> do_pstream_read_write() fails, jumps to error recovery
>
> And the error recovery goes like this:
>
>     -> pstream die callback (p->callback) is called
>     -> That's protocol-native.c pstream_die_callback()
>     -> native_connection_unlink   // Stops srbchannel, etc.
>     -> pa_pstream_unlink          // Resets all pstream callbacks
>     -> native_connection_free     // Per-client mempool _freed!_
>     -> pa_pstream_unlink          // Second time, does nothing..
>     -> pa_pstream_free
>     -> pa_queue_free(p->send_queue, item_free)
>     -> 1. PACKET item found .. safely removed
>     -> 2. MEMBLOCK item found .. from freed srbchannel mempool
>        BOOM!!! segfault
>
> SUMMARY: As seen above, a per-client mempool's lifetime must be
> a superset of the pstream's lifetime. Putting the per-client pool
> in the native_connection object provided only an _illusion_ of
> such a superset. [*]
>
> During error recovery, stale memblocks remain in the pstream's
> send queue long after __all the per-client objects__ have been
> removed.
>
> [*] Check native_connection_free() under "What was the current
> design?" for why this illusion was maintained.
>
>
> ==> Suggested solutions?
> ------------------------
>
> The situation is a bit tricky.
>
> Where can we put the per-client pool while ensuring a correct
> lifecycle, especially during error recovery?
>
> The safest option, it seems, is to let the per-client pool be
> owned by the pstream object, so that the pool can be freed at the
> very end of pstream_free() itself. Is such a solution acceptable?

Having read your detailed problem description and Tanu's answers, I
think making mempools reference counted seems like the safest option.

From a quick glance it does not look like mempools own anything
reference counted (right?), so we shouldn't have a problem with
cycles if we did that refcounting.

-- 
David Henningsson, Canonical Ltd.
https://launchpad.net/~diwic