On Mon, Feb 08, 2021 at 11:21:53PM -0800, Sagi Grimberg wrote:
>
>
> On 2/8/21 8:21 PM, Ming Lei wrote:
> > On Mon, Feb 08, 2021 at 10:42:28AM -0800, Sagi Grimberg wrote:
> > >
> > > > > Hi Sagi
> > > > >
> > > > > On 2/8/21 5:46 PM, Sagi Grimberg wrote:
> > > > > >
> > > > > > > Hello
> > > > > > >
> > > > > > > We found this kernel NULL pointer issue with latest
> > > > > > > linux-block/for-next and it's 100% reproduced, let me know
> > > > > > > if you need more info/testing, thanks
> > > > > > >
> > > > > > > Kernel repo:
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
> > > > > > > Commit: 11f8b6fd0db9 - Merge branch 'for-5.12/io_uring' into for-next
> > > > > > >
> > > > > > > Reproducer: blktests nvme-tcp/012
> > > > > >
> > > > > > Thanks for reporting Ming, I've tried to reproduce this on my VM
> > > > > > but did not succeed. Given that you have it 100% reproducible,
> > > > > > can you try to revert commit:
> > > > > >
> > > > > > 0dc9edaf80ea nvme-tcp: pass multipage bvec to request iov_iter
> > > > > >
> > > > >
> > > > > Revert this commit fixed the issue and I've attached the config.
> > > > > :)
> > > >
> > > > Good to know,
> > > >
> > > > I see some differences that I should probably change to hit this:
> > > > --
> > > > @@ -254,14 +256,15 @@ CONFIG_PERF_EVENTS=y
> > > >  # end of Kernel Performance Events And Counters
> > > >
> > > >  CONFIG_VM_EVENT_COUNTERS=y
> > > > +CONFIG_SLUB_DEBUG=y
> > > >  # CONFIG_COMPAT_BRK is not set
> > > > -CONFIG_SLAB=y
> > > > -# CONFIG_SLUB is not set
> > > > -# CONFIG_SLOB is not set
> > > > -CONFIG_SLAB_MERGE_DEFAULT=y
> > > > -# CONFIG_SLAB_FREELIST_RANDOM is not set
> > > > +# CONFIG_SLAB is not set
> > > > +CONFIG_SLUB=y
> > > > +# CONFIG_SLAB_MERGE_DEFAULT is not set
> > > > +CONFIG_SLAB_FREELIST_RANDOM=y
> > > >  # CONFIG_SLAB_FREELIST_HARDENED is not set
> > > > -# CONFIG_SHUFFLE_PAGE_ALLOCATOR is not set
> > > > +CONFIG_SHUFFLE_PAGE_ALLOCATOR=y
> > > > +CONFIG_SLUB_CPU_PARTIAL=y
> > > >  CONFIG_SYSTEM_DATA_VERIFICATION=y
> > > >  CONFIG_PROFILING=y
> > > >  CONFIG_TRACEPOINTS=y
> > > > @@ -299,7 +302,8 @@ CONFIG_HAVE_INTEL_TXT=y
> > > >  CONFIG_X86_64_SMP=y
> > > >  CONFIG_ARCH_SUPPORTS_UPROBES=y
> > > >  CONFIG_FIX_EARLYCON_MEM=y
> > > > -CONFIG_PGTABLE_LEVELS=4
> > > > +CONFIG_DYNAMIC_PHYSICAL_MASK=y
> > > > +CONFIG_PGTABLE_LEVELS=5
> > > >  CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
> > > > --
> > > >
> > > > Probably CONFIG_SLUB and CONFIG_SLUB_DEBUG should be used.
> > >
> > > Used your profile and this still does not happen :(
> >
> > One obvious error is that nr_segments is computed wrong.
> >
> > Yi, can you try the following patch?
> >
> > diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> > index 881d28eb15e9..a393d99b74e1 100644
> > --- a/drivers/nvme/host/tcp.c
> > +++ b/drivers/nvme/host/tcp.c
> > @@ -239,9 +239,14 @@ static void nvme_tcp_init_iter(struct nvme_tcp_request *req,
> >  		offset = 0;
> >  	} else {
> >  		struct bio *bio = req->curr_bio;
> > +		struct bio_vec bv;
> > +		struct bvec_iter iter;
> > +
> > +		nsegs = 0;
> > +		bio_for_each_bvec(bv, bio, iter)
> > +			nsegs++;
> >  		vec = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
> > -		nsegs = bio_segments(bio);
>
> This was exactly the patch that caused the issue.

What was the issue you are talking about? Any link or commit hash?

nvme-tcp builds iov_iter(BVEC) from __bvec_iter_bvec(), so the segment
count has to be the actual number of bvecs. But bio_segments() just
returns the number of single-page segments, which is wrong for iov_iter.

Please see the same usage in lo_rw_aio().

--
Ming
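
To make the counting difference concrete, here is a minimal userspace sketch. It is not kernel code: struct model_bvec, MODEL_PAGE_SIZE and both helper functions are hypothetical names invented for illustration, and the 4096-byte page size is an assumption. One helper counts entries the way the bio_for_each_bvec() loop in the patch does (one per bvec), the other counts one per page touched, which is the single-page segment count described above.

```c
/*
 * Userspace model only. struct model_bvec, MODEL_PAGE_SIZE and the two
 * helpers below are hypothetical; they are not kernel APIs.
 */
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for the model */

/* One multipage bvec: a contiguous byte range that may span several pages. */
struct model_bvec {
	unsigned int offset;	/* byte offset into the first page */
	unsigned int len;	/* length in bytes, must be non-zero */
};

/* One entry per bvec: the number of array elements a consumer would walk. */
static size_t count_bvecs(const struct model_bvec *vec, size_t n)
{
	size_t i, nsegs = 0;

	for (i = 0; i < n; i++) {
		assert(vec[i].len > 0 && vec[i].offset < MODEL_PAGE_SIZE);
		nsegs++;
	}
	return nsegs;
}

/* One entry per page touched: the single-page segment count. */
static size_t count_single_page_segments(const struct model_bvec *vec, size_t n)
{
	size_t i, nsegs = 0;

	for (i = 0; i < n; i++) {
		assert(vec[i].len > 0 && vec[i].offset < MODEL_PAGE_SIZE);
		/* Round the end of the range up to a whole number of pages. */
		nsegs += (vec[i].offset + vec[i].len + MODEL_PAGE_SIZE - 1) /
			 MODEL_PAGE_SIZE;
	}
	return nsegs;
}
```

For a two-element array holding an 8192-byte bvec at offset 0 and a 4096-byte bvec at offset 512, count_bvecs() gives 2 while count_single_page_segments() gives 4. Handing the larger number to a consumer that indexes the two-element array would describe more entries than the array holds.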