Re: Is it OK to pass non-acquired objects to kfree?

Dmitry Vyukov <dvyukov@xxxxxxxxxx> · Thu, 10 Sep 2015 14:08:20 +0200

On Thu, Sep 10, 2015 at 12:42 PM, Jesper Dangaard Brouer
<brouer@xxxxxxxxxx> wrote:
> On Thu, 10 Sep 2015 11:55:35 +0200 Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>
>> On Thu, Sep 10, 2015 at 1:31 AM, Christoph Lameter <cl@xxxxxxxxx> wrote:
>> > On Wed, 9 Sep 2015, Paul E. McKenney wrote:
>> >
>> >> Either way, Dmitry's tool got a hit on real code using the slab
>> >> allocators.  If that hit is a false positive, then clearly Dmitry
>> >> needs to fix his tool, however, I am not (yet) convinced that it is a
>> >> false positive.  If it is not a false positive, we might well need to
>> >> articulate the rules for use of the slab allocators.
>> >
>> > Could I get a clear definiton as to what exactly is positive? Was this
>> > using SLAB, SLUB or SLOB?
>> >
>> >> > This would all use per cpu data. As soon as a handoff is required within
>> >> > the allocators locks are being used. So I would say no.
>> >>
>> >> As in "no, it is not necessary for the caller of kfree() to invoke barrier()
>> >> in this example", right?
>> >
>> > Actually SLUB contains a barrier already in kfree(). Has to be there
>> > because of the way the per cpu pointer is being handled.
>>
>> The positive was reporting of data races in the following code:
>>
>> // kernel/pid.c
>>          if ((atomic_read(&pid->count) == 1) ||
>>               atomic_dec_and_test(&pid->count)) {
>>                  kmem_cache_free(ns->pid_cachep, pid);
>>                  put_pid_ns(ns);
>>          }
>>
>> //drivers/tty/tty_buffer.c
>> while ((next = buf->head->next) != NULL) {
>>      tty_buffer_free(port, buf->head);
>>      buf->head = next;
>> }
>>
>> Namely, the tool reported data races between usage of the object in
>> other threads before they released the object and kfree.
>>
>> I am not sure why we are so concentrated on details like SLAB vs SLUB
>> vs SLOB or cache coherency protocols. This looks like waste of time to
>> me. General kernel code should not be safe only when working with SLxB
>> due to current implementation details of SLxB, it should be safe
>> according to memory allocator contract. And this contract seem to be:
>> memory allocator can do arbitrary reads and writes to the object
>> inside of kmalloc and kfree.
>> Similarly for memory model. There is officially documented kernel
>> memory model, which all general kernel code must adhere to. Reasoning
>> about whether a particular piece of code works on architecture X, or
>> how exactly it can break on architecture Y in unnecessary in such
>> context. In the end, there can be memory allocator implementation and
>> new architectures.
>>
>> My question is about contracts, not about current implementation
>> details or specific architectures.
>>
>> There are memory allocator implementations that do reads and writes of
>> the object, and there are memory allocator implementations that do not
>> do any barriers on fast paths. From this follows that objects must be
>> passed in quiescent state to kfree.
>> Now, kernel memory model says "A load-load control dependency requires
>> a full read memory barrier".
>> From this follows that the following code is broken:
>>
>> // kernel/pid.c
>>          if ((atomic_read(&pid->count) == 1) ||
>>               atomic_dec_and_test(&pid->count)) {
>>                  kmem_cache_free(ns->pid_cachep, pid);
>>                  put_pid_ns(ns);
>>          }
>>
>> and it should be:
>>
>> // kernel/pid.c
>>          if ((smp_load_acquire(&pid->count) == 1) ||
>>               atomic_dec_and_test(&pid->count)) {
>>                  kmem_cache_free(ns->pid_cachep, pid);
>>                  put_pid_ns(ns);
>>          }
>>
>
> This reminds me of some code in the network stack[1] in kfree_skb()
> where we have a smp_rmb().  Should we have used smp_load_acquire() ?
>
>  void kfree_skb(struct sk_buff *skb)
>  {
>         if (unlikely(!skb))
>                 return;
>         if (likely(atomic_read(&skb->users) == 1))
>                 smp_rmb();
>         else if (likely(!atomic_dec_and_test(&skb->users)))
>                 return;
>         trace_kfree_skb(skb, __builtin_return_address(0));
>         __kfree_skb(skb);
>  }
>  EXPORT_SYMBOL(kfree_skb);

rmb is much better than nothing :)
I generally prefer to use smp_load_acquire just because it's more
explicit (you see what memory access the barrier relates to), fewer
lines of code, agrees with modern atomic APIs in C, C++, Java, etc,
and FWIW is much better for dynamic race detectors.
As for semantic difference between rmb and smp_load_acquire, rmb does
not order stores, so stores from __kfree_skb can hoist above the
atomic_read(&skb->users) == 1 check. The only architecture that can do
that is Alpha, I don't know enough about Alpha and barrier
implementation on Alpha (maybe rmb and smp_load_acquire do the same
hardware barrier on Alpha) to say whether it can break in real life or
not. But I would still consider smp_load_acquire as safer and cleaner
alternative.

-- 
Dmitry Vyukov, Software Engineer, dvyukov@xxxxxxxxxx
Google Germany GmbH, Dienerstraße 12, 80331, München
Geschäftsführer: Graham Law, Christine Elizabeth Flores
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
sind, leiten Sie diese bitte nicht weiter, informieren Sie den
Absender und löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
This e-mail is confidential. If you are not the right addressee please
do not forward it, please inform the sender, and please erase this
e-mail including any attachments. Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href