Re: [bug report] Format FS failed with ublk device

On Fri, Apr 26, 2024 at 09:09:57AM +0800, Ming Lei wrote:
> Hi Guangwu,
> 
> Thanks for the report!
> 
> On Thu, Apr 25, 2024 at 09:54:04AM +0800, Guangwu Zhang wrote:
> > Hi,
> > the format FS command hangs with a ublk device.
> > 
> > # ublk --version
> > ublksrv 1.1-7-gf01c509
> > 
> > kernel: 6.9.0-rc4.kasan
> > 
> > 
> > nvme0n1                     259:1    0   1.5T  0 disk
> > └─nvme0n1p1                 259:2    0     5G  0 part
> > # ublk add -t loop -f /dev/nvme0n1p1
> > dev id 0: nr_hw_queues 1 queue_depth 128 block size 4096 dev_capacity 10485760
> > max rq size 524288 daemon pid 3227 flags 0x42 state LIVE
> > ublkc: 245:0 ublkb: 259:3 owner: 0:0
> > queue 0: tid 3228 affinity(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
> > 18 19 20 21 22 23 24 25 26 27 28 29 30 31 )
> > target {"backing_file":"/dev/nvme0n1p1","dev_size":5368709120,"direct_io":1,"name":"loop","type":1}
> > 
> > # mkfs.xfs -f /dev/ublkb0    << can not finish,  pid 3239
> > meta-data=/dev/ublkb0            isize=512    agcount=4, agsize=327680 blks
> >          =                       sectsz=4096  attr=2, projid32bit=1
> >          =                       crc=1        finobt=1, sparse=1, rmapbt=0
> >          =                       reflink=1    bigtime=1 inobtcount=1 nrext64=0
> > data     =                       bsize=4096   blocks=1310720, imaxpct=25
> >          =                       sunit=0      swidth=0 blks
> > naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
> > log      =internal log           bsize=4096   blocks=16384, version=2
> >          =                       sectsz=4096  sunit=1 blks, lazy-count=1
> > realtime =none                   extsz=4096   blocks=0, rtextents=0
> > 
> > # cat /proc/3239/stack
> > [<0>] rq_qos_wait+0x12a/0x1f0
> > [<0>] wbt_wait+0x11a/0x240
> > [<0>] __rq_qos_throttle+0x49/0x90
> > [<0>] blk_mq_submit_bio+0x58c/0x19d0
> > [<0>] submit_bio_noacct_nocheck+0x40d/0x780
> > [<0>] blk_next_bio+0x41/0x50
> > [<0>] __blkdev_issue_zero_pages+0x1ba/0x370
> > [<0>] blkdev_issue_zeroout+0x1a7/0x390
> > [<0>] blkdev_fallocate+0x264/0x3d0
> > [<0>] vfs_fallocate+0x2b0/0xad0
> > [<0>] __x64_sys_fallocate+0xb4/0x100
> > [<0>] do_syscall_64+0x7b/0x160
> > [<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > 
> > [  862.171377] INFO: task mkfs.xfs:3239 blocked for more than 122 seconds.
> > [  862.178073]       Not tainted 6.9.0-rc4.kasan+ #1
> > [  862.182820] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > disables this message.
> 
> It looks like this might be a blk-wbt issue. ublk-loop doesn't set up
> write_zero_max_bytes, so __blkdev_issue_zero_pages may take a bit long
> to complete, but it shouldn't hang.
> 
> Can you collect the following bpftrace by starting it before running mkfs?
> And I can't reproduce it in my environment.
> 
> #!/usr/bin/bpftrace
> kretfunc:vfs_fallocate
> {
> 	printf("vfs_fallocate on %s ret %d (%x %lx %u)\n",
> 		str(args->file->f_path.dentry->d_name.name),
> 		retval, args->mode, args->offset, args->len);
> }

After working on this with Guangwu, the issue is now root-caused:

1) vfs_fallocate() can't translate a discard on a block device into a
real discard. The mode
'FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE | FALLOC_FL_NO_HIDE_STALE'
is supposed to be capable of doing that, but the VFS doesn't allow
FALLOC_FL_NO_HIDE_STALE.

2) So a ublk discard is actually converted to write-zeroes, because
ublksrv converts discard into
fallocate(FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE). That is why
mkfs.xfs takes so long. Guangwu confirmed that zeroing out is actually
in progress, so this is not a hang.

Does FALLOC_FL_PUNCH_HOLE have to imply zeroing out for a block device?

3) A fix has now been pushed to ublksrv: it translates ublk discard into
ioctl(DISCARD) when the backing file is a block device.
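
To illustrate the idea in 3), here is a minimal sketch and not the actual
ublksrv change. The helper name backing_discard() is hypothetical, and I
am assuming that "ioctl(DISCARD)" means the BLKDISCARD ioctl from
<linux/fs.h>. It picks the discard path from the type of the backing fd:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for fallocate() */
#endif
#include <errno.h>
#include <fcntl.h>              /* fallocate() */
#include <linux/falloc.h>       /* FALLOC_FL_* */
#include <linux/fs.h>           /* BLKDISCARD (assumed to be the ioctl meant) */
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: discard [offset, offset + len) on the backing fd.
 * Returns 0 on success or a negative errno value.
 */
static int backing_discard(int fd, uint64_t offset, uint64_t len)
{
	struct stat st;

	if (fstat(fd, &st) < 0)
		return -errno;

	if (S_ISBLK(st.st_mode)) {
		/* Block device: BLKDISCARD takes {start, length} in bytes. */
		uint64_t range[2] = { offset, len };

		if (ioctl(fd, BLKDISCARD, range) < 0)
			return -errno;
		return 0;
	}

	/* Regular file: punching a hole deallocates the range. */
	if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
		      (off_t)offset, (off_t)len) < 0)
		return -errno;
	return 0;
}
```

The point of the fstat() check is that the block device case no longer
goes through fallocate(), so it no longer ends up as zeroing out.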

The same issue exists in the kernel loop driver too.
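
For anyone who wants to check point 1) on their own kernel, a small
probe along these lines should do. The function name is hypothetical.
It only reports whether fallocate() accepts the mode with
FALLOC_FL_NO_HIDE_STALE set. On mainline I would expect it to come back
with an error (I believe -EOPNOTSUPP), but treat that as an assumption
and not a guarantee:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for fallocate() */
#endif
#include <errno.h>
#include <fcntl.h>              /* fallocate() */
#include <linux/falloc.h>       /* FALLOC_FL_* */
#include <sys/types.h>

/*
 * Hypothetical probe: try the "real discard" mode from point 1).
 * Returns 0 if the kernel accepted it, or a negative errno value.
 */
static int try_no_hide_stale(int fd, off_t offset, off_t len)
{
	int mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE |
		   FALLOC_FL_NO_HIDE_STALE;

	if (fallocate(fd, mode, offset, len) < 0)
		return -errno;
	return 0;
}
```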

Thanks,
Ming