Adding FS people to figure out whether GFP_KERNEL allocations with
i_rwsem's held for writing are okay.

On Wed, Sep 19, 2018 at 9:10 AM Cyrill Gorcunov <gorcunov@xxxxxxxxx> wrote:
> On Wed, Sep 19, 2018 at 10:07:37AM +0300, Cyrill Gorcunov wrote:
> > Hi Oleg! While been testing criu with linux-next we've triggered a BUG.
> > https://api.travis-ci.org/v3/job/430308998/log.txt
> >
> > [ 2.461618] BUG: sleeping function called from invalid context at security/apparmor/include/cred.h:154
> > [ 2.461794] in_atomic(): 1, irqs_disabled(): 1, pid: 152, name: init
> > [ 2.461890] 1 lock held by init/152:
> > [ 2.461981] #0: 00000000f30c3fda (tasklist_lock){.+.+}, at: ptrace_traceme+0x1c/0x70
> > [ 2.462114] irq event stamp: 2524
> > [ 2.462242] hardirqs last enabled at (2523): [<ffffffff98002922>] do_syscall_64+0x12/0x190
> > [ 2.462363] hardirqs last disabled at (2524): [<ffffffff98b8b02f>] _raw_write_lock_irq+0xf/0x40
> > [ 2.462476] softirqs last enabled at (1904): [<ffffffff98ac79ef>] unix_sock_destructor+0x4f/0xc0
> > [ 2.462586] softirqs last disabled at (1902): [<ffffffff98ac79ef>] unix_sock_destructor+0x4f/0xc0
> > [ 2.462697] CPU: 1 PID: 152 Comm: init Not tainted 4.19.0-rc4-next-20180918+ #1
> >
> > Which is due to commit
> >
> > commit 4b105cbbaf7c06e01c27391957dc3c446328d087
> > Author: Oleg Nesterov <oleg@xxxxxxxxxx>
> > Date: Wed Jun 17 16:27:33 2009 -0700
> >
> >     ptrace: do not use task_lock() for attach
> >
> > because now after write_lock_irq(&tasklist_lock); apparmor calls for
> > traceme and
> >
> > static inline struct aa_label *begin_current_label_crit_section(void)
> > {
> >         struct aa_label *label = aa_current_raw_label();
> >
> > -->     might_sleep();
> >
> > Take a look please, once time permit.
>
> Heh, actually not :) It is due to commit
>
> commit 1f8266ff58840d698a1e96d2274189de1bdf7969
> Author: Jann Horn <jannh@xxxxxxxxxx>
> Date: Thu Sep 13 18:12:09 2018 +0200
>
> which introduced might_sleep. Seems it is bad idea to send bug report
> without having a cup of coffee at the morning :)

Yeah, I fixed one sleep-in-atomic bug and figured I'd throw a
might_sleep() in there for good measure... sigh.

I guess now I have to go through all the callers of
begin_current_label_crit_section() to see what else looks wrong...

apparmor_ptrace_traceme() is wrong, as reported...

apparmor_path_link() looks icky, but I'm not sure - from what I can
tell, it's called with an i_rwsem held for writing, and that probably
makes calling back into filesystem context from there a bad idea?
OTOH, it's just the i_rwsem of a newly-created path, so I don't know
whether that's actually an issue...

security_path_rename() is called with two i_rwsem's held, but again,
I'm not sure whether that's a problem.
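
For anyone following along, here is a rough userspace sketch of why the
might_sleep() annotation fires on the ptrace_traceme() path but not on a
caller that holds no spinning lock. Everything prefixed model_ is
hypothetical and only mimics the shape of the kernel primitives; it is
not the kernel's implementation, and the real check also looks at
irqs_disabled() and more.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical model state: >0 means "atomic context" (a spinning lock is held). */
static int atomic_depth;
/* Count of modelled "sleeping function called from invalid context" reports. */
static int violations;

/* Stand-in for write_lock_irq(&tasklist_lock): enters atomic context. */
static void model_write_lock_irq(void)
{
	atomic_depth++;
}

/* Stand-in for write_unlock_irq(&tasklist_lock): leaves atomic context. */
static void model_write_unlock_irq(void)
{
	assert(atomic_depth > 0);	/* unlock must pair with a lock */
	atomic_depth--;
}

/* Stand-in for might_sleep(): complains if called in atomic context. */
static void model_might_sleep(const char *where)
{
	if (atomic_depth > 0) {
		violations++;
		fprintf(stderr,
			"BUG (model): sleeping function called from invalid context at %s\n",
			where);
	}
}

/* Stand-in for begin_current_label_crit_section() with the new annotation. */
static void model_begin_label_crit_section(void)
{
	model_might_sleep("model_begin_label_crit_section");
}

/* Shape of the reported path: the hook runs with the lock held. */
static int model_traceme_hook_under_lock(void)
{
	int before = violations;

	model_write_lock_irq();
	model_begin_label_crit_section();
	model_write_unlock_irq();
	return violations - before;	/* reports raised by this call */
}

/* A caller that holds no spinning lock; the annotation stays silent. */
static int model_hook_without_lock(void)
{
	int before = violations;

	model_begin_label_crit_section();
	return violations - before;
}
```

In this model, model_traceme_hook_under_lock() raises one report and
model_hook_without_lock() raises none. The i_rwsem cases are a
different kind of question: a rwsem is a sleeping lock, so sleeping
under it does not trip this check at all; the concern there is
re-entering filesystem code while it is held.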