On 04/21/2017 10:40 AM, Bart Van Assche wrote:
> On Fri, 2017-04-21 at 10:33 -0600, Jens Axboe wrote:
>> On 04/21/2017 10:31 AM, Bart Van Assche wrote:
>>> On Fri, 2017-04-21 at 10:25 -0600, Jens Axboe wrote:
>>>> On 04/21/2017 09:32 AM, Bart Van Assche wrote:
>>>>> Hello Jens,
>>>>>
>>>>> Since yesterday the following complaint is reported frequently after having
>>>>> installed the for-4.12/block branch on my test setup. Unless someone has a
>>>>> better proposal, I will run a bisect.
>>>>>
>>>>> BUG: sleeping function called from invalid context at ./include/linux/buffer_head.h:349
>>>>> in_atomic(): 1, irqs_disabled(): 0, pid: 8019, name: find
>>>>> CPU: 10 PID: 8019 Comm: find Tainted: G W I 4.11.0-rc4-dbg+ #2
>>>>> Call Trace:
>>>>>  dump_stack+0x68/0x93
>>>>>  ___might_sleep+0x16e/0x230
>>>>>  __might_sleep+0x4a/0x80
>>>>>  __ext4_get_inode_loc+0x1e0/0x4e0
>>>>>  ext4_iget+0x70/0xbc0
>>>>>  ext4_iget_normal+0x2f/0x40
>>>>>  ext4_lookup+0xb6/0x1f0
>>>>>  lookup_slow+0x104/0x1e0
>>>>>  walk_component+0x19a/0x330
>>>>>  path_lookupat+0x4b/0x100
>>>>>  filename_lookup+0x9a/0x110
>>>>>  user_path_at_empty+0x36/0x40
>>>>>  vfs_statx+0x67/0xc0
>>>>>  SYSC_newfstatat+0x20/0x40
>>>>>  SyS_newfstatat+0xe/0x10
>>>>>  entry_SYSCALL_64_fastpath+0x18/0xad
>>>>
>>>> How are you reproducing this? I've been running testing on the test box
>>>> and I run it on my laptop as well, but I haven't seen anything odd.
>>>
>>> Hello Jens,
>>>
>>> All I have to do to reproduce this is to build, install and boot the kernel.
>>> Maybe we are using a different kernel config?
>>
>> I'd say odds are good we are not using an identical kernel config :-)
>> What is your root device? Is it using mq and scheduling, or what's
>> the config?
>
> Hello Jens,
>
> The boot device is a SATA disk:
> # lsscsi
> [0:0:0:0]    disk    ATA    ST1000NM0033-9ZM    GA67    /dev/sda
>
> SCSI-mq is enabled and the default I/O scheduler is the deadline scheduler.
> From the kernel .config:
> CONFIG_DEFAULT_IOSCHED="deadline"
> CONFIG_SCSI_MQ_DEFAULT=y

I wonder if it's an imbalance in the preempt count. Looking at it, it
looks like we're not clearing the alloc data. I would have expected that
to cause much worse problems, though; maybe we got lucky?

Let me generate a cleanup patch for that.

-- 
Jens Axboe
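
The hypothesis above is that a stack-allocated allocation-data struct in
blk-mq is only partially initialized, so stale bits left on the stack by a
previous caller can leak into the request allocation path and change its
behavior (and, if a get/put pairing depends on those bits, unbalance the
preempt count, which is what later trips the might_sleep() check in the
ext4 trace). Below is a minimal, runnable userspace sketch of that failure
pattern; the names alloc_data, ALLOC_NOWAIT, get_request, and the callers
are illustrative stand-ins, not the actual kernel identifiers or patch.

/*
 * Sketch of the suspected bug class: a stack-allocated "alloc data"
 * struct is only partially filled in, so garbage from a previous
 * stack frame can leak into the flags field and flip the allocator's
 * blocking/non-blocking decision.
 */
#include <stdio.h>
#include <string.h>

#define ALLOC_NOWAIT (1U << 0)	/* stand-in for a stale mode flag */

struct alloc_data {
	unsigned int flags;	/* the buggy caller never sets this */
	int ctx;		/* the only field the buggy caller fills in */
};

/* Allocation path: behavior depends on data->flags. */
static void get_request(struct alloc_data *data)
{
	if (data->flags & ALLOC_NOWAIT)
		printf("ctx %d: non-blocking path taken\n", data->ctx);
	else
		printf("ctx %d: may block\n", data->ctx);
}

/* First caller legitimately sets the flag, dirtying the stack slot. */
static void caller_nowait(void)
{
	struct alloc_data data;

	data.flags = ALLOC_NOWAIT;
	data.ctx = 1;
	get_request(&data);
}

/*
 * Buggy caller: flags is never initialized. If this frame reuses the
 * stack slot from caller_nowait(), the stale ALLOC_NOWAIT bit may
 * still be set (undefined behavior, so results vary by compiler).
 */
static void caller_buggy(void)
{
	struct alloc_data data;

	data.ctx = 2;
	get_request(&data);
}

/* Fixed caller: zero the whole struct first, as a cleanup patch would. */
static void caller_fixed(void)
{
	struct alloc_data data;

	memset(&data, 0, sizeof(data));
	data.ctx = 3;
	get_request(&data);
}

int main(void)
{
	caller_nowait();
	caller_buggy();		/* may report the non-blocking path */
	caller_fixed();		/* always reports "may block" */
	return 0;
}

The same reasoning explains why the symptom looked like a preempt-count
imbalance rather than an outright crash: a stale flag only changes which
branch the allocation path takes, so the damage stays latent until some
later code sleeps in what is now (wrongly) atomic context.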