On Tue, Jan 28, 2020 at 9:27 AM Ondrej Mosnacek <omosnace@xxxxxxxxxx> wrote: > On Tue, Jan 28, 2020 at 2:44 PM Stephen Smalley <sds@xxxxxxxxxxxxx> wrote: > > On 1/28/20 8:39 AM, Paul Moore wrote: > > > On Tue, Jan 28, 2020 at 7:50 AM syzbot > > > <syzbot+61cba5033e2072d61806@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote: > > >> > > >> syzbot has found a reproducer for the following crash on: > > >> > > >> HEAD commit: b0be0eff Merge tag 'x86-pti-2020-01-28' of git://git.kerne.. > > >> git tree: upstream > > >> console output: https://syzkaller.appspot.com/x/log.txt?x=1432aebee00000 > > >> kernel config: https://syzkaller.appspot.com/x/.config?x=9784e57c96a92f20 > > >> dashboard link: https://syzkaller.appspot.com/bug?extid=61cba5033e2072d61806 > > >> compiler: gcc (GCC) 9.0.0 20181231 (experimental) > > >> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10088e95e00000 > > >> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13fa605ee00000 > > >> > > >> IMPORTANT: if you fix the bug, please add the following tag to the commit: > > >> Reported-by: syzbot+61cba5033e2072d61806@xxxxxxxxxxxxxxxxxxxxxxxxx > > >> > > >> ===================================================== > > >> WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected > > >> 5.5.0-syzkaller #0 Not tainted > > >> ----------------------------------------------------- > > >> syz-executor305/10624 [HC0[0]:SC0[2]:HE1:SE0] is trying to acquire: > > >> ffff888098c14098 (&(&s->cache_lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline] > > >> ffff888098c14098 (&(&s->cache_lock)->rlock){+.+.}, at: sidtab_sid2str_put.part.0+0x36/0x880 security/selinux/ss/sidtab.c:533 > > >> > > >> and this task is already holding: > > >> ffffffff89865770 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline] > > >> ffffffff89865770 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: nf_conntrack_lock+0x17/0x70 net/netfilter/nf_conntrack_core.c:91 > > >> which would create a new lock dependency: > > >> (&(&nf_conntrack_locks[i])->rlock){+.-.} -> (&(&s->cache_lock)->rlock){+.+.} > > >> > > >> but this new dependency connects a SOFTIRQ-irq-safe lock: > > >> (&(&nf_conntrack_locks[i])->rlock){+.-.} > > >> > > >> ... which became SOFTIRQ-irq-safe at: > > >> lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484 > > >> __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] > > >> _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151 > > >> spin_lock include/linux/spinlock.h:338 [inline] > > >> nf_conntrack_lock+0x17/0x70 net/netfilter/nf_conntrack_core.c:91 > > > > > > ... > > > > > >> to a SOFTIRQ-irq-unsafe lock: > > >> (&(&s->cache_lock)->rlock){+.+.} > > >> > > >> ... which became SOFTIRQ-irq-unsafe at: > > >> ... > > >> lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4484 > > >> __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] > > >> _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151 > > >> spin_lock include/linux/spinlock.h:338 [inline] > > >> sidtab_sid2str_put.part.0+0x36/0x880 security/selinux/ss/sidtab.c:533 > > >> sidtab_sid2str_put+0xa0/0xc0 security/selinux/ss/sidtab.c:566 > > >> sidtab_entry_to_string security/selinux/ss/services.c:1279 [inline] > > >> sidtab_entry_to_string+0xf2/0x110 security/selinux/ss/services.c:1266 > > >> security_sid_to_context_core+0x2c6/0x3c0 security/selinux/ss/services.c:1361 > > >> security_sid_to_context+0x34/0x40 security/selinux/ss/services.c:1384 > > >> avc_audit_post_callback+0x102/0x790 security/selinux/avc.c:709 > > >> common_lsm_audit+0x5ac/0x1e00 security/lsm_audit.c:466 > > >> slow_avc_audit+0x16a/0x1f0 security/selinux/avc.c:782 > > >> avc_audit security/selinux/include/avc.h:140 [inline] > > >> avc_has_perm+0x543/0x610 security/selinux/avc.c:1185 > > >> inode_has_perm+0x1a8/0x230 security/selinux/hooks.c:1631 > > >> selinux_mmap_file+0x10a/0x1d0 security/selinux/hooks.c:3701 > > >> security_mmap_file+0xa4/0x1e0 security/security.c:1482 > > >> vm_mmap_pgoff+0xf0/0x230 mm/util.c:502 > > > > > > ... > > > > > >> other info that might help us debug this: > > >> > > >> Possible interrupt unsafe locking scenario: > > >> > > >> CPU0 CPU1 > > >> ---- ---- > > >> lock(&(&s->cache_lock)->rlock); > > >> local_irq_disable(); > > >> lock(&(&nf_conntrack_locks[i])->rlock); > > >> lock(&(&s->cache_lock)->rlock); > > >> <Interrupt> > > >> lock(&(&nf_conntrack_locks[i])->rlock); > > >> > > >> *** DEADLOCK *** > > >> > > >> 4 locks held by syz-executor305/10624: > > >> #0: ffffffff8c1acc68 (&table[i].mutex){+.+.}, at: nfnl_lock net/netfilter/nfnetlink.c:62 [inline] > > >> #0: ffffffff8c1acc68 (&table[i].mutex){+.+.}, at: nfnetlink_rcv_msg+0x9ee/0xfb0 net/netfilter/nfnetlink.c:224 > > >> #1: ffff8880836415d8 (nlk_cb_mutex-NETFILTER){+.+.}, at: netlink_dump+0xe7/0xfb0 net/netlink/af_netlink.c:2199 > > >> #2: ffffffff89865770 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline] > > >> #2: ffffffff89865770 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: nf_conntrack_lock+0x17/0x70 net/netfilter/nf_conntrack_core.c:91 > > >> #3: ffffffff8b7df008 (&selinux_ss.policy_rwlock){.+.?}, at: security_sid_to_context_core+0x1ca/0x3c0 security/selinux/ss/services.c:1344 > > > > > > I think this is going to be tricky to fix due to the differing > > > contexts from which sidtab_sid2str_put() may be called. We already > > > have a check for !in_task() in sidtab_sid2str_put(), do we want to add > > > a check for !in_serving_softirq() too? > > > > No, we should just use spin_lock_irqsave/unlock_irqrestore() IMHO, but > > that then means we need to re-evaluate the performance gain of this change. > > I just tested a patch that switches to the IRQ-disabling locking under > the systemd-journald scenario from [1] and the impact seems not that > bad - the security_secid_to_secctx() function only takes up 3.18% of > total run time vs. ~2% I reported in the original patch > (d97bd23c2d7d). I'll post the patch shortly. > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1733259, i.e.: > cat /dev/urandom | base64 | logger & > timeout 30s perf record -p $(pidof systemd-journald) -a -g I'm skeptical of the real world impact of disabling IRQs here, although I'm not sure off the top of my head how we could manage a cache (I'm thinking more about adding to, and evicting entries, from the cache) without some form of locking. Perhaps we look into per-cpu caching? -- paul moore www.paul-moore.com