On Thu, Nov 21, 2024 at 7:10 PM Nilay Shroff <nilay@xxxxxxxxxxxxx> wrote:
>
>
>
> On 11/21/24 08:28, Yi Zhang wrote:
> > On Wed, Nov 20, 2024 at 10:07 PM Nilay Shroff <nilay@xxxxxxxxxxxxx> wrote:
> >>
> >>
> >>
> >> On 11/19/24 16:34, Yi Zhang wrote:
> >>> Hello
> >>>
> >>> CKI recently reported the blktests nvme/029 failed[1] on the
> >>> linux-block/for-next, and bisect shows it was introduced from [2],
> >>> please help check it and let me know if you need any info/test for it, thanks.
> >>>
> >>> [1]
> >>> nvme/029 (tr=loop) (test userspace IO via nvme-cli read/write
> >>> interface) [failed]
> >>>     runtime    ...  1.568s
> >>>     --- tests/nvme/029.out 2024-11-19 08:13:41.379272231 +0000
> >>>     +++ /root/blktests/results/nodev_tr_loop/nvme/029.out.bad
> >>> 2024-11-19 10:55:13.615939542 +0000
> >>>     @@ -1,2 +1,8 @@
> >>>      Running nvme/029
> >>>     +FAIL
> >>>     +FAIL
> >>>     +FAIL
> >>>     +FAIL
> >>>     +FAIL
> >>>     +FAIL
> >>>     ...
> >>>     (Run 'diff -u tests/nvme/029.out
> >>> /root/blktests/results/nodev_tr_loop/nvme/029.out.bad' to see the
> >>> entire diff)
> >>> [2]
> >>> 64a51080eaba (HEAD) nvmet: implement id ns for nvm command set
> >>>
> >>>
> >>> --
> >>> Best Regards,
> >>>   Yi Zhang
> >>>
> >>>
> >> I couldn't reproduce it even after running nvme/029 in a loop
> >> for multiple times. Are you following any specific steps to
> >> recreate it?
> >
> > From the reproduced data[1], seems it only reproduced on x86_64 and
> > aarch64, and from the 029.full[2], we can see the failure comes from
> > the "nvme write" cmd.
> > [1]
> > https://datawarehouse.cki-project.org/issue/3263
> > [2]
> > # cat results/nodev_tr_loop/nvme/029.full
> > Reference tag larger than allowed by PIF
> > NQN:blktests-subsystem-1 disconnected 1 controller(s)
> > disconnected 1 controller(s)
> >
> > I also attached the kernel config file in case you want to try it, thanks.
> >
> Thanks for the additional information!
> Now I could understand the issue and have a probable fix. If possible, can you try
> the below patch and check if it help resolve this issue?

Yes, the issue was fixed now.

>
> diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
> index 934b401fbc2f..7a8256ae3085 100644
> --- a/drivers/nvme/target/admin-cmd.c
> +++ b/drivers/nvme/target/admin-cmd.c
> @@ -901,12 +901,14 @@ static void nvmet_execute_identify_ctrl_nvm(struct nvmet_req *req)
>  static void nvme_execute_identify_ns_nvm(struct nvmet_req *req)
>  {
>  	u16 status;
> +	void *zero_buf;
>
>  	status = nvmet_req_find_ns(req);
>  	if (status)
>  		goto out;
>
> -	status = nvmet_copy_to_sgl(req, 0, ZERO_PAGE(0),
> +	zero_buf = __va(page_to_pfn(ZERO_PAGE(0)) << PAGE_SHIFT);
> +	status = nvmet_copy_to_sgl(req, 0, zero_buf,
>  				   NVME_IDENTIFY_DATA_SIZE);
>  out:
>  	nvmet_req_complete(req, status);
>
> Thanks,
> --Nilay
>

--
Best Regards,
  Yi Zhang