Hi Max, On 17.09.2024 00:06, Max Gurtovoy wrote: > > On 12/09/2024 9:46, Marek Szyprowski wrote: >> Dear All, >> >> On 08.08.2024 00:41, Max Gurtovoy wrote: >>> Set the driver data of the hardware context (hctx) to point directly to >>> the virtio block queue. This cleanup improves code readability and >>> reduces the number of dereferences in the fast path. >>> >>> Reviewed-by: Stefan Hajnoczi <stefanha@xxxxxxxxxx> >>> Signed-off-by: Max Gurtovoy <mgurtovoy@xxxxxxxxxx> >>> --- >>> drivers/block/virtio_blk.c | 42 >>> ++++++++++++++++++++------------------ >>> 1 file changed, 22 insertions(+), 20 deletions(-) >> This patch landed in recent linux-next as commit 8d04556131c1 >> ("virtio_blk: implement init_hctx MQ operation"). In my tests I found >> that it introduces a regression in system suspend/resume operation. From >> time to time system crashes during suspend/resume cycle. Reverting this >> patch on top of next-20240911 fixes this problem. > > Could you please provide a detailed explanation of the system > suspend/resume operation and the specific testing methodology employed? In my tests I just call the 'rtcwake -s10 -mmem' command many times in a loop. I use standard Debian image under QEMU/ARM64. Nothing really special. > > The occurrence of a kernel panic from this commit is unexpected, given > that it primarily involves pointer reassignment without altering the > lifecycle of vblk/vqs. > > In the virtqueue_add_split function, which pointer is becoming null > and causing the issue? A detailed analysis would be helpful. > > The report indicates that the crash occurs sporadically rather than > consistently. > > is it possible that this is a race condition introduced by a different > commit? How can we rule out this possibility? This is the commit pointed by bisecting between v6.11-rc1 and next-20240911. The problem is reproducible, it just need a few calls to the rtcwake command. > > Prior to applying this commit, what were the test results? > Specifically, out of 100 test runs, how many passed successfully? All 100 were successful, see https://pastebin.com/3yETvXK9 (kernel is compiled from 6d17035a7402, which is a parent of $subject in linux-next). > > After applying this commit, what are the updated test results? Again, > out of 100 test runs, how many passed successfully? Usually it freezes or panics after the second try, see https://pastebin.com/u5n9K1Dz (kernel compiled from 8d04556131c1, which is $subject in linux-next). > >> >> I've even managed to catch a kernel panic log of this problem on QEMU's >> ARM64 'virt' machine: >> >> root@target:~# time rtcwake -s10 -mmem >> rtcwake: wakeup from "mem" using /dev/rtc0 at Thu Sep 12 07:11:52 2024 >> Unable to handle kernel NULL pointer dereference at virtual address >> 0000000000000090 >> Mem abort info: >> ESR = 0x0000000096000046 >> EC = 0x25: DABT (current EL), IL = 32 bits >> SET = 0, FnV = 0 >> EA = 0, S1PTW = 0 >> FSC = 0x06: level 2 translation fault >> Data abort info: >> ISV = 0, ISS = 0x00000046, ISS2 = 0x00000000 >> CM = 0, WnR = 1, TnD = 0, TagAccess = 0 >> GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 >> user pgtable: 4k pages, 48-bit VAs, pgdp=0000000046bbb000 >> ... >> Internal error: Oops: 0000000096000046 [#1] PREEMPT SMP >> Modules linked in: bluetooth ecdh_generic ecc rfkill ipv6 >> CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0H Not tainted 6.11.0-rc6+ #9024 >> Hardware name: linux,dummy-virt (DT) >> Workqueue: kblockd blk_mq_requeue_work >> pstate: 800000c5 (Nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) >> pc : virtqueue_add_split+0x458/0x63c >> lr : virtqueue_add_split+0x1d0/0x63c >> ... >> Call trace: >> virtqueue_add_split+0x458/0x63c >> virtqueue_add_sgs+0xc4/0xec >> virtblk_add_req+0x8c/0xf4 >> virtio_queue_rq+0x6c/0x1bc >> blk_mq_dispatch_rq_list+0x21c/0x714 >> __blk_mq_sched_dispatch_requests+0xb4/0x58c >> blk_mq_sched_dispatch_requests+0x30/0x6c >> blk_mq_run_hw_queue+0x14c/0x40c >> blk_mq_run_hw_queues+0x64/0x124 >> blk_mq_requeue_work+0x188/0x1bc >> process_one_work+0x20c/0x608 >> worker_thread+0x238/0x370 >> kthread+0x124/0x128 >> ret_from_fork+0x10/0x20 >> Code: f9404282 79401c21 b9004a81 f94047e1 (f8206841) >> ---[ end trace 0000000000000000 ]--- >> note: kworker/0:0H[9] exited with irqs disabled >> note: kworker/0:0H[9] exited with preempt_count 1 >> >> >>> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c >>> index 2351f411fa46..35a7a586f6f5 100644 >>> --- a/drivers/block/virtio_blk.c >>> +++ b/drivers/block/virtio_blk.c >>> @@ -129,14 +129,6 @@ static inline blk_status_t virtblk_result(u8 >>> status) >>> } >>> } >>> -static inline struct virtio_blk_vq *get_virtio_blk_vq(struct >>> blk_mq_hw_ctx *hctx) >>> -{ >>> - struct virtio_blk *vblk = hctx->queue->queuedata; >>> - struct virtio_blk_vq *vq = &vblk->vqs[hctx->queue_num]; >>> - >>> - return vq; >>> -} >>> - >>> static int virtblk_add_req(struct virtqueue *vq, struct >>> virtblk_req *vbr) >>> { >>> struct scatterlist out_hdr, in_hdr, *sgs[3]; >>> @@ -377,8 +369,7 @@ static void virtblk_done(struct virtqueue *vq) >>> static void virtio_commit_rqs(struct blk_mq_hw_ctx *hctx) >>> { >>> - struct virtio_blk *vblk = hctx->queue->queuedata; >>> - struct virtio_blk_vq *vq = &vblk->vqs[hctx->queue_num]; >>> + struct virtio_blk_vq *vq = hctx->driver_data; >>> bool kick; >>> spin_lock_irq(&vq->lock); >>> @@ -428,10 +419,10 @@ static blk_status_t virtio_queue_rq(struct >>> blk_mq_hw_ctx *hctx, >>> const struct blk_mq_queue_data *bd) >>> { >>> struct virtio_blk *vblk = hctx->queue->queuedata; >>> + struct virtio_blk_vq *vq = hctx->driver_data; >>> struct request *req = bd->rq; >>> struct virtblk_req *vbr = blk_mq_rq_to_pdu(req); >>> unsigned long flags; >>> - int qid = hctx->queue_num; >>> bool notify = false; >>> blk_status_t status; >>> int err; >>> @@ -440,26 +431,26 @@ static blk_status_t virtio_queue_rq(struct >>> blk_mq_hw_ctx *hctx, >>> if (unlikely(status)) >>> return status; >>> - spin_lock_irqsave(&vblk->vqs[qid].lock, flags); >>> - err = virtblk_add_req(vblk->vqs[qid].vq, vbr); >>> + spin_lock_irqsave(&vq->lock, flags); >>> + err = virtblk_add_req(vq->vq, vbr); >>> if (err) { >>> - virtqueue_kick(vblk->vqs[qid].vq); >>> + virtqueue_kick(vq->vq); >>> /* Don't stop the queue if -ENOMEM: we may have failed to >>> * bounce the buffer due to global resource outage. >>> */ >>> if (err == -ENOSPC) >>> blk_mq_stop_hw_queue(hctx); >>> - spin_unlock_irqrestore(&vblk->vqs[qid].lock, flags); >>> + spin_unlock_irqrestore(&vq->lock, flags); >>> virtblk_unmap_data(req, vbr); >>> return virtblk_fail_to_queue(req, err); >>> } >>> - if (bd->last && virtqueue_kick_prepare(vblk->vqs[qid].vq)) >>> + if (bd->last && virtqueue_kick_prepare(vq->vq)) >>> notify = true; >>> - spin_unlock_irqrestore(&vblk->vqs[qid].lock, flags); >>> + spin_unlock_irqrestore(&vq->lock, flags); >>> if (notify) >>> - virtqueue_notify(vblk->vqs[qid].vq); >>> + virtqueue_notify(vq->vq); >>> return BLK_STS_OK; >>> } >>> @@ -504,7 +495,7 @@ static void virtio_queue_rqs(struct request >>> **rqlist) >>> struct request *requeue_list = NULL; >>> rq_list_for_each_safe(rqlist, req, next) { >>> - struct virtio_blk_vq *vq = get_virtio_blk_vq(req->mq_hctx); >>> + struct virtio_blk_vq *vq = req->mq_hctx->driver_data; >>> bool kick; >>> if (!virtblk_prep_rq_batch(req)) { >>> @@ -1164,6 +1155,16 @@ static const struct attribute_group >>> *virtblk_attr_groups[] = { >>> NULL, >>> }; >>> +static int virtblk_init_hctx(struct blk_mq_hw_ctx *hctx, void >>> *data, >>> + unsigned int hctx_idx) >>> +{ >>> + struct virtio_blk *vblk = data; >>> + struct virtio_blk_vq *vq = &vblk->vqs[hctx_idx]; >>> + >>> + hctx->driver_data = vq; >>> + return 0; >>> +} >>> + >>> static void virtblk_map_queues(struct blk_mq_tag_set *set) >>> { >>> struct virtio_blk *vblk = set->driver_data; >>> @@ -1205,7 +1206,7 @@ static void virtblk_complete_batch(struct >>> io_comp_batch *iob) >>> static int virtblk_poll(struct blk_mq_hw_ctx *hctx, struct >>> io_comp_batch *iob) >>> { >>> struct virtio_blk *vblk = hctx->queue->queuedata; >>> - struct virtio_blk_vq *vq = get_virtio_blk_vq(hctx); >>> + struct virtio_blk_vq *vq = hctx->driver_data; >>> struct virtblk_req *vbr; >>> unsigned long flags; >>> unsigned int len; >>> @@ -1236,6 +1237,7 @@ static const struct blk_mq_ops virtio_mq_ops = { >>> .queue_rqs = virtio_queue_rqs, >>> .commit_rqs = virtio_commit_rqs, >>> .complete = virtblk_request_done, >>> + .init_hctx = virtblk_init_hctx, >>> .map_queues = virtblk_map_queues, >>> .poll = virtblk_poll, >>> }; >> Best regards > Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland