Is it OK to pass non-acquired objects to kfree?

Dmitry Vyukov <dvyukov@xxxxxxxxxx> · Tue, 8 Sep 2015 09:51:41 +0200

Hello mm-maintainers,

I have a question about kfree semantics. I can't find an answer in the
docs, and the opinions I hear differ.
Namely, is it OK to pass non-acquired objects to
kfree/kmem_cache_free? By non-acquired I mean objects unsafely passed
between threads without using proper release/acquire (wmb/rmb) memory
barriers.
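
For illustration, a properly acquired hand-off could look like the
following userspace sketch. It uses C11 atomics as a stand-in for the
kernel's release/acquire primitives; the names obj, slot, publish and
consume_and_free are hypothetical and not taken from any kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object handed from one thread to another. */
struct obj {
	int payload;
};

/* Hypothetical single-slot mailbox shared between the two threads. */
static _Atomic(struct obj *) slot;

/* Producer side: initialize the object, then publish it. */
static int publish(int value)
{
	struct obj *p = malloc(sizeof(*p));

	if (!p)
		return -1;
	p->payload = value;
	/* Release: the payload write is ordered before the pointer store. */
	atomic_store_explicit(&slot, p, memory_order_release);
	return 0;
}

/* Consumer side: take ownership of the object, read it, free it. */
static int consume_and_free(int *out)
{
	/* Acquire: pairs with the release store in publish(). */
	struct obj *p = atomic_exchange_explicit(&slot, NULL,
						 memory_order_acquire);

	if (!p)
		return -1;
	*out = p->payload;
	free(p);	/* the object is "acquired" at this point */
	return 0;
}
```

If both operations were relaxed instead, the consumer would be freeing
what I call a non-acquired object.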

The question arose during work on KernelThreadSanitizer, a kernel data
race detector, and was prompted in particular by the following existing code:

// kernel/pid.c
         if ((atomic_read(&pid->count) == 1) ||
              atomic_dec_and_test(&pid->count)) {
                 kmem_cache_free(ns->pid_cachep, pid);
                 put_pid_ns(ns);
         }

// drivers/tty/tty_buffer.c
while ((next = buf->head->next) != NULL) {
     tty_buffer_free(port, buf->head);
     buf->head = next;
}
// Here another thread can concurrently append to the buffer list, and
// tty_buffer_free eventually calls kfree.

Neither of these cases contains a proper memory barrier before handing
the object off to kfree. In my opinion the code should use
smp_load_acquire or READ_ONCE_CTRL ("control-dependency-acquire").
Otherwise there can be pending memory accesses to the object in other
threads that can interfere with the slab code or with the next use of
the object after it is reused.
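
As a rough sketch of the pid-style pattern with an acquire load on the
fast path, here is a userspace analogue written with C11 atomics. The
struct and the object_new/object_get/object_put helpers are hypothetical;
this is not the actual kernel/pid.c code, only an illustration of where
the acquire would go.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical reference-counted object. */
struct object {
	atomic_int count;
	int data;
};

static struct object *object_new(int data)
{
	struct object *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	atomic_init(&o->count, 1);	/* the creator holds one reference */
	o->data = data;
	return o;
}

static void object_get(struct object *o)
{
	/* Taking a reference needs no ordering of its own. */
	atomic_fetch_add_explicit(&o->count, 1, memory_order_relaxed);
}

/* Drops one reference; returns true if the object was freed. */
static bool object_put(struct object *o)
{
	/*
	 * Fast path: an acquire load instead of a plain read, so that it
	 * pairs with the release half of other threads' decrements.
	 * Slow path: acq_rel decrement, roughly the userspace counterpart
	 * of a fully ordered atomic_dec_and_test().
	 */
	if (atomic_load_explicit(&o->count, memory_order_acquire) == 1 ||
	    atomic_fetch_sub_explicit(&o->count, 1,
				      memory_order_acq_rel) == 1) {
		free(o);
		return true;
	}
	return false;
}
```

With a relaxed load on the fast path, the thread that observes
count == 1 could free the object without having acquired the other
threads' earlier accesses to it.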

Paul McKenney suggested that:

"
The maintainers probably want this sort of code to be allowed:
        p->a++;
        if (p->b) {
                kfree(p);
                p = NULL;
        }
And the users even more so.
So if the compiler really is free to reorder any scribbling/checking
by the caller with any scribbling/checking by kfree(), that should
be fixed in kfree() rather than in all the callers.
"

This does not look reasonable to me for 2 reasons:
- this incurs unnecessary cost for all kfree users: kfree would always
have to execute a memory barrier, while most callers already have the
object acquired (either single-threaded use, or mutex protected, or
the object was properly handed off to the freeing thread)
- as far as I understand, if an object is unsafely passed along a
chain of threads A->B->C->D and the last one then calls kfree, kfree
can't acquire visibility over the object by executing a memory
barrier. All threads in the chain must play by the rules to properly
hand off the object to kfree.

As far as I understand, that's why atomic_dec_and_test, as used for
reference counting, contains a full memory barrier; also, kfree does
not seem to contain any memory barriers on its fast path.

Can you please clarify the rules here?

Thank you

-- 
Dmitry Vyukov, Software Engineer, dvyukov@xxxxxxxxxx
Google Germany GmbH, Dienerstraße 12, 80331, München
