On Wed, 14 Apr 2021 13:21:37 -0400
Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:

> On 4/14/21 11:21 AM, xiaojun.zhao141@xxxxxxxxx wrote:
> > On Wed, 14 Apr 2021 13:27:43 +0200 (CEST)
> > Miroslav Benes <mbenes@xxxxxxx> wrote:
> >
> >> Hi,
> >>
> >> On Wed, 14 Apr 2021, xiaojun.zhao141@xxxxxxxxx wrote:
> >>
> >>> I found the qemu-nbd process(started with qemu-nbd -t -c /dev/nbd0
> >>> nbd.qcow2) will automatically exit when I patched for functions of
> >>> the nbd with livepatch.
> >>>
> >>> The nbd relative source:
> >>> static int nbd_start_device_ioctl(struct nbd_device *nbd,
> >>>                                   struct block_device *bdev)
> >>> {
> >>>         struct nbd_config *config = nbd->config;
> >>>         int ret;
> >>>
> >>>         ret = nbd_start_device(nbd);
> >>>         if (ret)
> >>>                 return ret;
> >>>
> >>>         if (max_part)
> >>>                 bdev->bd_invalidated = 1;
> >>>         mutex_unlock(&nbd->config_lock);
> >>>         ret = wait_event_interruptible(config->recv_wq,
> >>>                         atomic_read(&config->recv_threads) == 0);
> >>>         if (ret)
> >>>                 sock_shutdown(nbd);
> >>>         flush_workqueue(nbd->recv_workq);
> >>>
> >>>         mutex_lock(&nbd->config_lock);
> >>>         nbd_bdev_reset(bdev);
> >>>         /* user requested, ignore socket errors */
> >>>         if (test_bit(NBD_RT_DISCONNECT_REQUESTED,
> >>>                      &config->runtime_flags))
> >>>                 ret = 0;
> >>>         if (test_bit(NBD_RT_TIMEDOUT, &config->runtime_flags))
> >>>                 ret = -ETIMEDOUT;
> >>>         return ret;
> >>> }
> >>
> >> So my understanding is that ndb spawns a number
> >> (config->recv_threads) of workqueue jobs and then waits for them to
> >> finish. It waits interruptedly. Now, any signal would make
> >> wait_event_interruptible() to return -ERESTARTSYS. Livepatch fake
> >> signal is no exception there. The error is then propagated back to
> >> the userspace. Unless a user requested a disconnection or there is
> >> timeout set. How does the userspace then reacts to it? Is
> >> _interruptible there because the userspace sends a signal in case
> >> of NBD_RT_DISCONNECT_REQUESTED set? How does the userspace handles
> >> ordinary signals? This all sounds a bit strange, but I may be
> >> missing something easily.
> >>
> >>> When the nbd waits for atomic_read(&config->recv_threads) == 0,
> >>> the klp will send a fake signal to it then the qemu-nbd process
> >>> exits. And the signal of sysfs to control this action was removed
> >>> in the commit 10b3d52790e 'livepatch: Remove signal sysfs
> >>> attribute'. Are there other ways to control this action? How?
> >>
> >> No, there is no way currently. We send a fake signal automatically.
> >>
> >> Regards
> >> Miroslav
> >
> > It occurs IO error of the nbd device when I use livepatch of the
> > nbd, and I guess that any livepatch on other kernel source maybe
> > cause the IO error. Well, now I decide to workaround for this
> > problem by adding a livepatch for the klp to disable a automatic
> > fake signal.
>
> Would wait_event_killable() fix this problem? I'm not sure any
> client implementations depend on being able to send other signals to
> the client process, so it should be safe from that standpoint. Not
> sure if the livepatch thing would still get an error at that point
> tho. Thanks,
> Josef

Yes, I tested that wait_event_killable() can fix this problem. Thanks.
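
A minimal sketch of the change suggested above, assuming that only the
wait primitive in nbd_start_device_ioctl() is swapped and everything
else in the function stays as quoted. It is a fragment of
drivers/block/nbd.c, not a standalone compilation unit, and it is not
necessarily the exact change that was tested:

```c
	mutex_unlock(&nbd->config_lock);
	/*
	 * wait_event_killable() sleeps in TASK_KILLABLE, so only a fatal
	 * signal (e.g. SIGKILL) ends the wait early with -ERESTARTSYS.
	 * Assumption: the livepatch fake signal, which is not fatal, no
	 * longer wakes the task here, so the ioctl keeps waiting until
	 * recv_threads drops to 0.
	 */
	ret = wait_event_killable(config->recv_wq,
				  atomic_read(&config->recv_threads) == 0);
	if (ret)
		sock_shutdown(nbd);
	flush_workqueue(nbd->recv_workq);
```

The trade-off is the one raised in the quoted mail: with a killable
wait, signals that the client process catches with its own handler
would no longer interrupt the ioctl, so this is only safe if no client
relies on that behaviour.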