Re: [PATCH 2/4] ublk: refactor recovery configuration flag helpers

Uday Shankar <ushankar@xxxxxxxxxxxxxxx> · Thu, 27 Jun 2024 11:09:15 -0600

When I say "behavior A + 2," I mean behavior A and behavior 2 at the
same time on the same ublk device. I still think this is not supported
with current ublk_drv, see below.

> > the ublk server can "handle" the I/O error because during this time,
> > there is no ublk server and all decisions on how to handle I/O are made
> > by ublk_drv directly (based on configuration flags specified when the
> > device was created).
> > 
> > If the ublk server created the device with UBLK_F_USER_RECOVERY, then
> > when the ublk server has crashed (and not restarted yet), I/Os issued by
> > the application will queue/hang until the ublk server comes back and
> > recovers the device, because the underlying request_queue is left in a
> > quiesced state. So in this case, behavior A is not possible.
> 
> When ublk server is crashed, ublk_abort_requests() will be called to fail
> queued inflight requests. Meantime ubq->canceling is set to requeue
> new request instead of forwarding it to ublk server.
> 
> So behavior A should be supported easily by failing request in
> ublk_queue_rq() if ubq->canceling is set.

This argument only works for devices created without
UBLK_F_USER_RECOVERY. If UBLK_F_USER_RECOVERY is set, then the
request_queue for the device is left in a quiesced state and so I/Os
will not even get to ublk_queue_rq. See the following as proof (using a
build of ublksrv master):

# ./ublk add -t loop -f file -r 1
dev id 0: nr_hw_queues 1 queue_depth 128 block size 4096 dev_capacity 2097152
        max rq size 524288 daemon pid 244608 flags 0x4a state LIVE
        ublkc: 240:0 ublkb: 259:0 owner: 0:0
        queue 0: tid 244610 affinity(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 )
        target {"backing_file":"file","dev_size":1073741824,"direct_io":1,"name":"loop","type":1}
# kill -9 244608
# dd if=/dev/urandom of=/dev/ublkb0 count=1 bs=4096 oflag=direct
(hung)

# ps aux | grep " D"
root      244626  0.0  0.0   5620  1880 pts/0    D+   16:57   0:00 dd if=/dev/urandom of=/dev/ublkb0 count=1 bs=4096 oflag=direct
root      244656  0.0  0.0   6408  2188 pts/1    S+   16:58   0:00 grep --color=auto  D
# cat /proc/244626/stack
[<0>] submit_bio_wait+0x63/0x90
[<0>] __blkdev_direct_IO_simple+0xd9/0x1e0
[<0>] blkdev_write_iter+0x1b4/0x230
[<0>] vfs_write+0x2ae/0x3d0
[<0>] ksys_write+0x4f/0xc0
[<0>] do_syscall_64+0x5d/0x160
[<0>] entry_SYSCALL_64_after_hwframe+0x4b/0x53
# cat /sys/kernel/debug/block/ublkb0/state
SAME_COMP|NONROT|IO_STAT|INIT_DONE|STATS|REGISTERED|QUIESCED|NOWAIT|SQ_SCHED

Therefore, in order to obtain behavior A with current ublk_drv, one must
not set UBLK_F_USER_RECOVERY.

> > 
> > If the ublk server created the device without UBLK_F_USER_RECOVERY, then
> > when the ublk server has crashed (and not restarted yet), I/Os issued by
> > the application will immediately error (since in this case, ublk will
> > call del_gendisk).  However, when the ublk server restarts, it cannot
> > recover the existing ublk device - the disk has been deleted and the
> > ublk device is in state UBLK_S_DEV_DEAD from which recovery is not
> > permitted. So in this case, behavior 2 is not possible.
> 
> UBLK_F_USER_RECOVERY is supposed for supporting to recover device, and
> if this flag isn't enabled, we don't support the feature simply, so
> looks behavior 2 isn't one valid case, is it?

Sure, so we're in agreement that recovery is impossible if
UBLK_F_USER_RECOVERY is not set.

So:
- To get behavior A, UBLK_F_USER_RECOVERY must be unset
- To get behavior 2, UBLK_F_USER_RECOVERY must be set

Hence, having behavior A and behavior 2, at the same time, on the same
device, would require UBLK_F_USER_RECOVERY to be both set and unset when
that device is created. Obviously that's impossible.