Re: [bug report] NBD: rbd-nbd + ext4 stuck after nbd resized

Hello Ming,

Thanks for following up on this issue. It can be reproduced on the v5.9 kernel.

I reproduced it just now. Here are the details.

ln@ubuntu:linux>$ git describe HEAD
v5.9-14722-gd76913908102

ln@ubuntu:linux>$ uname -a
Linux ubuntu 5.9.0+ #3 SMP Mon Oct 26 16:56:48 CST 2020 x86_64 x86_64 x86_64 GNU/Linux

ln@ubuntu:~>$ sudo bash -x repro.sh
+ umount /tmp/mntnbd
umount: /tmp/mntnbd: no mount point specified.
+ rbd-nbd unmap kcp/foo
rbd-nbd: kcp/foo is not mapped
+ rbd rm kcp/foo
Removing image: 100% complete...done.
+ rbd create -s 2G kcp/foo
+ rbd-nbd map kcp/foo
/dev/nbd0
+ mkfs.ext4 /dev/nbd0
mke2fs 1.45.5 (07-Jan-2020)
Discarding device blocks: done
Creating filesystem with 524288 4k blocks and 131072 inodes
Filesystem UUID: f4b9635c-152f-4042-b9ca-602428628cf0
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912

Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

+ mkdir -p /tmp/mntnbd
+ mount /dev/nbd0 /tmp/mntnbd
+ rbd resize kcp/foo --size 4G
Resizing image: 100% complete...done.
ln@ubuntu:~>$ ls /tmp/mntnbd/

^C^C
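
For reference, the sizes in the trace above can be checked with a little arithmetic. This is only a sketch: the block count and block size are taken from the mke2fs output, and it assumes that "4G" in the rbd resize command means 4 GiB.

```python
GIB = 1024 ** 3
BLOCK_SIZE = 4096          # "4k blocks" in the mke2fs output

fs_blocks = 524288         # block count reported by mke2fs
fs_bytes = fs_blocks * BLOCK_SIZE

# Assumption: rbd interprets "4G" as 4 GiB.
resized_bytes = 4 * GIB
resized_blocks = resized_bytes // BLOCK_SIZE

print(f"filesystem: {fs_blocks} blocks = {fs_bytes // GIB} GiB")
print(f"device after resize: {resized_blocks} blocks = {resized_bytes // GIB} GiB")
```

So the ext4 filesystem still describes 2 GiB, while the image behind /dev/nbd0 has grown to twice that size. The filesystem itself was never resized in this test.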


ln@ubuntu:~>$ top
top - 10:30:19 up 7 min,  2 users,  load average: 2.06, 1.63, 0.82
Tasks: 378 total,   2 running, 376 sleeping,   0 stopped,   0 zombie
%Cpu(s): 0.0 us, 8.3 sy, 0.0 ni, 91.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem :  15787.1 total,  13036.7 free,    970.5 used,   1779.8 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  14529.2 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
   3020 ln        20   0    6508    828    724 R 100.0   0.0   5:48.08 ls
    199 root      20   0       0      0      0 I   0.3   0.0   0:00.18 kworker/10:2-events
   1058 root      20   0       0      0      0 I   0.3   0.0   0:00.03 kworker/8:2-events


ln@ubuntu:~>$ dmesg
...
[   75.279029] EXT4-fs (nbd0): mounted filesystem with ordered data mode. Opts: (null)
[   78.490171] BUG: kernel NULL pointer dereference, address: 0000000000000010
[   78.490212] #PF: supervisor read access in kernel mode
[   78.490223] #PF: error_code(0x0000) - not-present page
[   78.490254] PGD 0 P4D 0
[   78.490262] Oops: 0000 [#1] SMP PTI
[   78.490271] CPU: 9 PID: 2972 Comm: ext4lazyinit Not tainted 5.9.0+ #3
[   78.490297] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/22/2020
[   78.490321] RIP: 0010:__ext4_journal_get_write_access+0x2c/0x120
[   78.490347] Code: 44 00 00 55 48 89 e5 41 57 49 89 cf 41 56 41 55 41 54 49 89 d4 53 48 83 ec 18 48 89 7d d0 89 75 cc e8 78 74 7b 00 49 8b 47 30 <4c> 8b 68 10 4d 85 ed 74 2f 49 8b 85 d8 00 00 00 49 8b 9d 80 03 00
[   78.490379] RSP: 0018:ffffb0f581793dd0 EFLAGS: 00010246
[   78.490389] RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff9167954c4000
[   78.490402] RDX: ffff91679550f690 RSI: 000000000000061f RDI: ffffffff84c4aa50
[   78.490416] RBP: ffffb0f581793e10 R08: 0000000000001ff5 R09: 0000000000000001
[   78.490428] R10: ffff916784cb0a00 R11: 000000000a002b8c R12: ffff91679550f690
[   78.490441] R13: ffff9167901ce000 R14: 0000000000000200 R15: ffff9167954c4000
[   78.490454] FS:  0000000000000000(0000) GS:ffff916aae040000(0000) knlGS:0000000000000000
[   78.490469] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   78.490479] CR2: 0000000000000010 CR3: 000000010a1d8004 CR4: 00000000003706e0
[   78.490537] Call Trace:
[   78.490547]  ? __ext4_journal_start_sb+0x106/0x120
[   78.490558]  ext4_init_inode_table+0x168/0x390
[   78.490976]  ext4_lazyinit_thread+0x38b/0x520
[   78.491359]  kthread+0x114/0x150
[   78.491603]  ? ext4_journalled_writepage_callback+0x60/0x60
[   78.491849]  ? kthread_park+0x90/0x90
[   78.492103]  ret_from_fork+0x22/0x30
[   78.492348] Modules linked in: nbd rfcomm xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bpfilter bnep bonding vsock_loopback vmw_vsock_virtio_transport_common vsock binfmt_misc intel_rapl_msr intel_rapl_common kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper rapl btusb btrtl btbcm btintel bluetooth joydev vmw_balloon psmouse input_leds ecdh_generic ecc e1000 vmw_vmci i2c_piix4 mac_hid sch_fq_codel btrfs blake2b_generic libcrc32c xor zstd_compress raid6_pq overlay iptable_filter ip6table_filter ip6_tables br_netfilter serio_raw bridge mptspi scsi_transport_spi ahci mptscsih libahci mptbase pata_acpi stp llc arp_tables vmwgfx hid_generic drm_kms_helper usbhid hid syscopyarea sysfillrect sysimgblt fb_sys_fops cec ttm drm parport_pc ppdev lp parport ip_tables x_tables autofs4
[   78.495718] CR2: 0000000000000010
[   78.496074] ---[ end trace d98825069bfe2e2a ]---
[   78.496425] RIP: 0010:__ext4_journal_get_write_access+0x2c/0x120
[   78.496775] Code: 44 00 00 55 48 89 e5 41 57 49 89 cf 41 56 41 55 41 54 49 89 d4 53 48 83 ec 18 48 89 7d d0 89 75 cc e8 78 74 7b 00 49 8b 47 30 <4c> 8b 68 10 4d 85 ed 74 2f 49 8b 85 d8 00 00 00 49 8b 9d 80 03 00
[   78.497834] RSP: 0018:ffffb0f581793dd0 EFLAGS: 00010246
[   78.498259] RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff9167954c4000
[   78.498606] RDX: ffff91679550f690 RSI: 000000000000061f RDI: ffffffff84c4aa50
[   78.498944] RBP: ffffb0f581793e10 R08: 0000000000001ff5 R09: 0000000000000001
[   78.499295] R10: ffff916784cb0a00 R11: 000000000a002b8c R12: ffff91679550f690
[   78.499630] R13: ffff9167901ce000 R14: 0000000000000200 R15: ffff9167954c4000
[   78.499964] FS:  0000000000000000(0000) GS:ffff916aae040000(0000) knlGS:0000000000000000
[   78.500316] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   78.500655] CR2: 0000000000000010 CR3: 000000010a1d8004 CR4: 00000000003706e0
...
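
As a rough cross-check of the oops, the faulting instruction can be decoded by hand from the "Code:" line. The sketch below is a minimal decoder written only for this one instruction form (a 64-bit MOV load with an 8-bit displacement); the helper name decode_mov_load is made up for illustration and is not part of any kernel tool. The byte values and register contents are copied from the dmesg output above.

```python
# Register names indexed by their 4-bit encoding (REX extension bit + 3 ModRM bits).
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

def decode_mov_load(code):
    """Decode 'REX.W 8B /r' with a disp8 operand: mov disp8(base), dst."""
    rex, opcode, modrm, disp = code
    assert rex & 0xF8 == 0x48, "expected a REX prefix with W=1"
    assert opcode == 0x8B, "expected MOV r64, r/m64"
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    assert mod == 1 and rm != 4, "expected [base + disp8] without a SIB byte"
    dst = REGS[reg | ((rex & 0x4) << 1)]    # REX.R extends the reg field
    base = REGS[rm | ((rex & 0x1) << 3)]    # REX.B extends the r/m field
    if disp >= 0x80:                        # disp8 is a signed byte
        disp -= 0x100
    return dst, base, disp

# Bytes from the "Code:" line; <4c> marks the start of the faulting instruction.
previous = decode_mov_load(bytes.fromhex("49 8b 47 30"))
faulting = decode_mov_load(bytes.fromhex("4c 8b 68 10"))

# Register values from the oops.
registers = {"rax": 0x0, "r15": 0xffff9167954c4000}
fault_address = registers[faulting[1]] + faulting[2]

print("previous: mov %#x(%%%s),%%%s" % (previous[2], previous[1], previous[0]))
print("faulting: mov %#x(%%%s),%%%s" % (faulting[2], faulting[1], faulting[0]))
print("fault address: %#018x" % fault_address)
```

Decoded this way, the instruction before the fault loads RAX from 0x30(%r15), and the faulting one reads 0x10(%rax). With RAX = 0 that is address 0x10, which matches CR2. R15 holds a copy of RCX (the bytes 49 89 cf earlier in the Code line are mov %rcx,%r15), so the NULL pointer seems to come from a field at offset 0x30 of the object passed in RCX. Which structure and field that is has not been checked against the source here.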


On 2020/10/27 9:18, Ming Lei wrote:
On Wed, Oct 21, 2020 at 09:08:10AM +0800, lining wrote:
(Sorry for sending this mail again; this one adds nbd@xxxxxxxxxxxxxxxx)

Hi kernel and ceph community:

We ran into an issue that is mainly related to the (kernel) nbd driver and (ceph) rbd-nbd.
After some investigation, I found that the root cause of the problem seems to be related to the change in the block size of nbd.

I am not sure whether this is a bug in the nbd driver or in rbd-nbd, but the problem does exist.


What happened:
Accessing the mount point of an nbd device formatted with ext4 always hangs after the nbd device has been resized.


Environment information:
- kernel:               v4.19.25 or master
- rbd-nbd(ceph):  v12.2.0 Luminous or master
- the fs of nbd:    ext4

Hello lining,

Can you reproduce this issue on v5.9 kernel?


Thanks,
Ming




