> On Feb 8, 2021, at 11:50 PM, Ming Lei <ming.lei@xxxxxxxxxx> wrote:
>
> On Mon, Feb 08, 2021 at 11:21:53PM -0800, Sagi Grimberg wrote:
>>
>>
>>> On 2/8/21 8:21 PM, Ming Lei wrote:
>>> On Mon, Feb 08, 2021 at 10:42:28AM -0800, Sagi Grimberg wrote:
>>>>
>>>>>> Hi Sagi
>>>>>>
>>>>>> On 2/8/21 5:46 PM, Sagi Grimberg wrote:
>>>>>>>
>>>>>>>> Hello
>>>>>>>>
>>>>>>>> We found this kernel NULL pointer issue with latest
>>>>>>>> linux-block/for-next and it's 100% reproduced, let me know
>>>>>>>> if you need more info/testing, thanks
>>>>>>>>
>>>>>>>> Kernel repo:
>>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
>>>>>>>> Commit: 11f8b6fd0db9 - Merge branch 'for-5.12/io_uring' into for-next
>>>>>>>>
>>>>>>>> Reproducer: blktests nvme-tcp/012
>>>>>>>
>>>>>>> Thanks for reporting Ming, I've tried to reproduce this on my VM
>>>>>>> but did not succeed. Given that you have it 100% reproducible,
>>>>>>> can you try to revert commit:
>>>>>>>
>>>>>>> 0dc9edaf80ea nvme-tcp: pass multipage bvec to request iov_iter
>>>>>>>
>>>>>>
>>>>>> Revert this commit fixed the issue and I've attached the config. :)
>>>>>
>>>>> Good to know,
>>>>>
>>>>> I see some differences that I should probably change to hit this:
>>>>> --
>>>>> @@ -254,14 +256,15 @@ CONFIG_PERF_EVENTS=y
>>>>> # end of Kernel Performance Events And Counters
>>>>>
>>>>> CONFIG_VM_EVENT_COUNTERS=y
>>>>> +CONFIG_SLUB_DEBUG=y
>>>>> # CONFIG_COMPAT_BRK is not set
>>>>> -CONFIG_SLAB=y
>>>>> -# CONFIG_SLUB is not set
>>>>> -# CONFIG_SLOB is not set
>>>>> -CONFIG_SLAB_MERGE_DEFAULT=y
>>>>> -# CONFIG_SLAB_FREELIST_RANDOM is not set
>>>>> +# CONFIG_SLAB is not set
>>>>> +CONFIG_SLUB=y
>>>>> +# CONFIG_SLAB_MERGE_DEFAULT is not set
>>>>> +CONFIG_SLAB_FREELIST_RANDOM=y
>>>>> # CONFIG_SLAB_FREELIST_HARDENED is not set
>>>>> -# CONFIG_SHUFFLE_PAGE_ALLOCATOR is not set
>>>>> +CONFIG_SHUFFLE_PAGE_ALLOCATOR=y
>>>>> +CONFIG_SLUB_CPU_PARTIAL=y
>>>>> CONFIG_SYSTEM_DATA_VERIFICATION=y
>>>>> CONFIG_PROFILING=y
>>>>> CONFIG_TRACEPOINTS=y
>>>>> @@ -299,7 +302,8 @@ CONFIG_HAVE_INTEL_TXT=y
>>>>> CONFIG_X86_64_SMP=y
>>>>> CONFIG_ARCH_SUPPORTS_UPROBES=y
>>>>> CONFIG_FIX_EARLYCON_MEM=y
>>>>> -CONFIG_PGTABLE_LEVELS=4
>>>>> +CONFIG_DYNAMIC_PHYSICAL_MASK=y
>>>>> +CONFIG_PGTABLE_LEVELS=5
>>>>> CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
>>>>> --
>>>>>
>>>>> Probably CONFIG_SLUB and CONFIG_SLUB_DEBUG should be used.
>>>>
>>>> Used your profile and this still does not happen :(
>>>
>>> One obvious error is that nr_segments is computed wrong.
>>>
>>> Yi, can you try the following patch?
>>>
>>> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
>>> index 881d28eb15e9..a393d99b74e1 100644
>>> --- a/drivers/nvme/host/tcp.c
>>> +++ b/drivers/nvme/host/tcp.c
>>> @@ -239,9 +239,14 @@ static void nvme_tcp_init_iter(struct nvme_tcp_request *req,
>>>                 offset = 0;
>>>         } else {
>>>                 struct bio *bio = req->curr_bio;
>>> +               struct bio_vec bv;
>>> +               struct bvec_iter iter;
>>> +
>>> +               nsegs = 0;
>>> +               bio_for_each_bvec(bv, bio, iter)
>>> +                       nsegs++;
>>>                 vec = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
>>> -               nsegs = bio_segments(bio);
>>
>> This was exactly the patch that caused the issue.
>
> What was the issue you are talking about? Any link or commit hash?
>
> nvme-tcp builds iov_iter(BVEC) from __bvec_iter_bvec(), the segment
> number has to be the actual bvec number. But bio_segment() just returns
> number of the single-page segment, which is wrong for iov_iter.
>
> Please see the same usage in lo_rw_aio().
>

That is what I suggested as well, but I also suggested the memory
allocation part, which Sagi explained is better to avoid. In my opinion
we should at least try the bvec calculation used in lo_rw_aio() and see
whether it fixes the problem, unless reverting the commit is the right
approach for some reason.

> --
> Ming
>
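
For illustration only (this is not kernel code and not part of the patch
quoted above): a minimal userspace sketch of why the two counts can
differ. The struct and the sketch_* helpers are hypothetical stand-ins,
and a 4096-byte page size is assumed. A walk that splits every bvec at
page boundaries, in the spirit of bio_segments(), counts single-page
segments, while a walk over the entries themselves, in the spirit of
bio_for_each_bvec(), counts the bvecs that an iov_iter built from the
bvec array would actually contain.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size, for illustration only. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical, simplified stand-in for a multipage bvec: one
 * contiguous range that may span several pages.
 */
struct sketch_bvec {
	unsigned int offset;	/* byte offset into the first page */
	unsigned int len;	/* length in bytes, may exceed one page */
};

/* Number of pages touched by a single entry. */
unsigned int sketch_pages_in_bvec(const struct sketch_bvec *bv)
{
	unsigned int first, last;

	assert(bv->len > 0);
	first = bv->offset / SKETCH_PAGE_SIZE;
	last = (bv->offset + bv->len - 1) / SKETCH_PAGE_SIZE;
	return last - first + 1;
}

/* Per-page walk: every entry is split at page boundaries. */
unsigned int sketch_count_single_page_segments(const struct sketch_bvec *vec,
					       size_t n)
{
	unsigned int nsegs = 0;
	size_t i;

	for (i = 0; i < n; i++)
		nsegs += sketch_pages_in_bvec(&vec[i]);
	return nsegs;
}

/* Per-entry walk: each multipage entry counts once. */
unsigned int sketch_count_bvecs(const struct sketch_bvec *vec, size_t n)
{
	unsigned int nsegs = 0;
	size_t i;

	for (i = 0; i < n; i++)
		nsegs++;
	return nsegs;
}
```

With two entries that each cross one page boundary, the per-page walk
reports more segments than there are entries in the array. Passing that
larger number as the segment count for the array would describe more
entries than exist, which is the mismatch discussed above.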