> On 03 Jan 2018, at 10:06, Guoqing Jiang <gqjiang@xxxxxxxx> wrote:
>
>
> On 01/03/2018 03:44 PM, Paolo Valente wrote:
>>
>>> On 03 Jan 2018, at 04:58, Guoqing Jiang <gqjiang@xxxxxxxx> wrote:
>>>
>>> Hi,
>>>
>> Hi
>>
>>> In my test, I found some issues when trying bfq with xfs.
>>> The test basically just sets the disk's scheduler to bfq,
>>> creates xfs on top of it, mounts the fs and writes something,
>>> then umounts the fs, roughly as in the sketch below.
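>>> (A sketch of the loop; /dev/vdd comes from the logs below, while
>>> the mount point, write size, and iteration count are placeholders.)
>>>
>>> modprobe bfq                                # bfq is built as a module here
>>> echo bfq > /sys/block/vdd/queue/scheduler   # switch the disk to bfq
>>> for i in $(seq 1 50); do
>>>     mkfs.xfs -f /dev/vdd                    # fresh xfs each round
>>>     mount /dev/vdd /mnt
>>>     dd if=/dev/zero of=/mnt/testfile bs=1M count=64
>>>     umount /mnt
>>> done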
>>> After several rounds of iteration, I can see different calltraces
>>> appear.
>>>
>>> For example, the one which happened frequently:
>>>
>>> Jan 03 11:35:19 linux-mainline kernel: XFS (vdd): Mounting V5 Filesystem
>>> Jan 03 11:35:19 linux-mainline kernel: XFS (vdd): Ending clean mount
>>> Jan 03 11:35:19 linux-mainline kernel: XFS (vdd): Unmounting Filesystem
>>> Jan 03 11:35:19 linux-mainline kernel: XFS (vdd): Mounting V5 Filesystem
>>> Jan 03 11:35:19 linux-mainline kernel: XFS (vdd): Ending clean mount
>>> Jan 03 11:35:19 linux-mainline kernel: BUG: unable to handle kernel paging request at 0000000000029ec0
>>> Jan 03 11:35:19 linux-mainline kernel: IP: __mod_node_page_state+0x5/0x50
>>> Jan 03 11:35:19 linux-mainline kernel: PGD 0 P4D 0
>>> Jan 03 11:35:19 linux-mainline kernel: Oops: 0000 [#1] SMP KASAN
>>> Jan 03 11:35:19 linux-mainline kernel: Modules linked in: bfq(E) joydev(E) uinput(E) fuse(E) af_packet(E) iscsi_ibft(E) iscsi_boot_sysfs(E) snd_hda_codec_generic(E) crct10dif_pclmul(E) crc32_pclmul(E) xfs(E) ghash_clmulni_intel(E) libcrc32c(E) crc32c_intel(E) pcbc(E) snd_hda_intel(E) snd_hda_codec(E) snd_hda_core(E) snd_hwdep(E) snd_pcm(E) ppdev(E) aesni_intel(E) snd_timer(E) aes_x86_64(E) crypto_simd(E) snd(E) glue_helper(E) cryptd(E) pcspkr(E) virtio_balloon(E) virtio_net(E) parport_pc(E) parport(E) soundcore(E) i2c_piix4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) virtio_console(E) virtio_rng(E) virtio_blk(E) ata_generic(E) ata_piix(E) ahci(E) libahci(E) floppy(E) ehci_pci(E) qxl(E) serio_raw(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) sym53c8xx(E) scsi_transport_spi(E) button(E) libata(E)
>>> Jan 03 11:35:19 linux-mainline kernel:  ttm(E) drm(E) uhci_hcd(E) ehci_hcd(E) usbcore(E) virtio_pci(E) virtio_ring(E) virtio(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) autofs4(E)
>>> Jan 03 11:35:19 linux-mainline kernel: CPU: 0 PID: 3349 Comm: ps Tainted: G E 4.15.0-rc1-69-default #1
>>> Jan 03 11:35:19 linux-mainline kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
>>> Jan 03 11:35:19 linux-mainline kernel: task: ffff880061efce80 task.stack: ffff880058bd0000
>>> Jan 03 11:35:19 linux-mainline kernel: RIP: 0010:__mod_node_page_state+0x5/0x50
>>> Jan 03 11:35:19 linux-mainline kernel: RSP: 0018:ffff880058bd7ce8 EFLAGS: 00010a07
>>> Jan 03 11:35:19 linux-mainline kernel: RAX: 00000000000003ff RBX: ffffea00011a3d80 RCX: 00000000011a3d80
>>> Jan 03 11:35:19 linux-mainline kernel: RDX: ffffffffffffffff RSI: 000000000000000d RDI: 0000000000000000
>>> Jan 03 11:35:19 linux-mainline kernel: RBP: ffffffffffffffff R08: ffff88006378a630 R09: ffff880058bd7d98
>>> Jan 03 11:35:19 linux-mainline kernel: R10: 00007f7f4d806280 R11: 0000000000000000 R12: ffffea00011a3d80
>>> Jan 03 11:35:19 linux-mainline kernel: R13: 00007f7f4f318000 R14: 00007f7f4f31c000 R15: ffff880058bd7e18
>>> Jan 03 11:35:19 linux-mainline kernel: FS: 0000000000000000(0000) GS:ffff880066c00000(0000) knlGS:0000000000000000
>>> Jan 03 11:35:19 linux-mainline kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> Jan 03 11:35:19 linux-mainline kernel: CR2: 0000000000029ec0 CR3: 0000000001c0d006 CR4: 00000000001606f0
>>> Jan 03 11:35:19 linux-mainline kernel: Call Trace:
>>> Jan 03 11:35:19 linux-mainline kernel:  page_remove_rmap+0x11a/0x2b0
>>> Jan 03 11:35:19 linux-mainline kernel:  unmap_page_range+0x547/0xa30
>>> Jan 03 11:35:19 linux-mainline kernel:  unmap_vmas+0x42/0x90
>>> Jan 03 11:35:19 linux-mainline kernel:  exit_mmap+0x86/0x180
>>> Jan 03 11:35:19 linux-mainline kernel:  mmput+0x4a/0x110
>>> Jan 03 11:35:19 linux-mainline kernel:  do_exit+0x25d/0xae0
>>> Jan 03 11:35:19 linux-mainline kernel:  do_group_exit+0x39/0xa0
>>> Jan 03 11:35:19 linux-mainline kernel:  SyS_exit_group+0x10/0x10
>>> Jan 03 11:35:19 linux-mainline kernel:  entry_SYSCALL_64_fastpath+0x1a/0x7d
>>> Jan 03 11:35:19 linux-mainline kernel: RIP: 0033:0x7f7f4eb8c338
>>> Jan 03 11:35:19 linux-mainline kernel: RSP: 002b:00007ffca4400d48 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
>>> Jan 03 11:35:19 linux-mainline kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7f4eb8c338
>>> Jan 03 11:35:19 linux-mainline kernel: RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
>>> Jan 03 11:35:19 linux-mainline kernel: RBP: 00007ffca4400d40 R08: 00000000000000e7 R09: fffffffffffffef8
>>> Jan 03 11:35:19 linux-mainline kernel: R10: 00007f7f4d806280 R11: 0000000000000246 R12: 00007f7f4f756000
>>> Jan 03 11:35:19 linux-mainline kernel: R13: 00007ffca4400cc8 R14: 00007f7f4f732b20 R15: 00007f7f4d5ebc70
>>> Jan 03 11:35:19 linux-mainline kernel: Code: f7 d9 48 39 ca 7c 05 65 88 50 0f c3 f0 48 01 94 f7 00 05 00 00 f0 48 01 14 f5 c0 c4 c0 81 31 d2 eb e5 0f 1f 40 00 0f 1f 44 00 00 <48> 8b 8f c0 9e 02 00 89 f6 48 8d 04 31 65 44 8a 40 01 4d 0f be
>>> Jan 03 11:35:19 linux-mainline kernel: RIP: __mod_node_page_state+0x5/0x50 RSP: ffff880058bd7ce8
>>> Jan 03 11:35:19 linux-mainline kernel: CR2: 0000000000029ec0
>>> Jan 03 11:35:19 linux-mainline kernel: ---[ end trace b5314eeef943a473 ]---
>>> Jan 03 11:35:19 linux-mainline kernel: Fixing recursive fault but reboot is needed!
>>> Jan 03 11:35:19 linux-mainline kernel: XFS (vdd): Unmounting Filesystem
>>>
>> Yes, this call trace may be related to bfq, because it concerns
>> (non-bfq) code that may get executed only when bfq is set as the I/O
>> scheduler. In fact, the failure happens on an exit_group, and bfq
>> supports cgroups, while neither mq-deadline nor kyber does.
>>
>> Did you also check what __mod_node_page_state+0x5 corresponds to in
>> your sources? Maybe this piece of information could ring some bell, at
>> least for people more expert than me on the involved code.
>
> It seems to be a NULL dereference: RDI (the pgdat argument) is
> 0000000000000000, and the faulting instruction is the load of
> pgdat->per_cpu_nodestats at line 338 below, so the reported fault
> address 0000000000029ec0 is simply that field's offset within
> struct pglist_data.
>
> (gdb) l *__mod_node_page_state+0x5
> 0x385 is in __mod_node_page_state (/usr/src/kernels/4.15.0-rc1-69-default/mm/vmstat.c:338).
> 333     EXPORT_SYMBOL(__mod_zone_page_state);
> 334
> 335     void __mod_node_page_state(struct pglist_data *pgdat, enum node_stat_item item,
> 336                                long delta)
> 337     {
> 338             struct per_cpu_nodestat __percpu *pcp = pgdat->per_cpu_nodestats;
> 339             s8 __percpu *p = pcp->vm_node_stat_diff + item;
> 340             long x;
> 341             long t;
> 342
>
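> For reference, the kernel's own helper scripts give the same
> information without gdb (a sketch; it assumes you run it from the
> build tree, with the matching vmlinux at hand and the oops text
> saved to oops.txt):
>
>     # resolve the symbol+offset from the oops to file:line
>     ./scripts/faddr2line vmlinux __mod_node_page_state+0x5
>
>     # disassemble the "Code:" line of the oops, marking the faulting instruction
>     ./scripts/decodecode < oops.txt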
> Let me know if you need anything else.

This failure seems related to the mm data structures of the task.
Unfortunately, I have no idea how a mistake in bfq could corrupt such
unrelated structures, unless bfq contains some serious error that
corrupts memory areas it does not own.

I hope someone can provide better insights,
Paolo

>>> Occasionally the mount process hangs forever.
>>> linux-mainline:~ # cat /proc/19627/stack
>>> [<ffffffff810a65f2>] io_schedule+0x12/0x40
>>> [<ffffffff8119fbb7>] wait_on_page_bit+0xd7/0x100
>>> [<ffffffff811b3713>] truncate_inode_pages_range+0x423/0x7c0
>>> [<ffffffff81273768>] set_blocksize+0x98/0xb0
>>> [<ffffffff81273798>] sb_set_blocksize+0x18/0x40
>>> [<ffffffffa06a2e58>] xfs_fs_fill_super+0x1b8/0x590 [xfs]
>>> [<ffffffff8123bd4d>] mount_bdev+0x17d/0x1b0
>>> [<ffffffff8123c6d4>] mount_fs+0x34/0x150
>>> [<ffffffff81259702>] vfs_kern_mount+0x62/0x110
>>> [<ffffffff8125bd1a>] do_mount+0x1ca/0xc30
>>> [<ffffffff8125ca6e>] SyS_mount+0x7e/0xd0
>>> [<ffffffff8172fff3>] entry_SYSCALL_64_fastpath+0x1a/0x7d
>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>
>> Maybe this hang has to do with the one recently reported for USB
>> drives. We have already found the cause of that one, and we are
>> finalizing our fix.
>
> I am not sure it is the same one, since the test is run against a
> virtio disk, not USB, but I can try the fix anyway.
>
> Thanks,
> Guoqing