Re: [PATCH 01/10] block: don't decrement nr_phys_segments for physically contigous segments

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, May 13, 2019 at 02:03:44PM +0200, Christoph Hellwig wrote:
> On Mon, May 13, 2019 at 05:45:45PM +0800, Ming Lei wrote:
> > On Mon, May 13, 2019 at 08:37:45AM +0200, Christoph Hellwig wrote:
> > > Currently ll_merge_requests_fn, unlike all other merge functions,
> > > reduces nr_phys_segments by one if the last segment of the previous,
> > > and the first segment of the next segement are contigous.  While this
> > > seems like a nice solution to avoid building smaller than possible
> > 
> > Some workloads need this optimization, please see 729204ef49ec00b
> > ("block: relax check on sg gap"):
> 
> And we still allow to merge the segments with this patch.  The only
> difference is that these merges do get accounted as extra segments.

It is easy for .nr_phys_segments to reach the max segment limit by this
way, then no new bio can be merged any more.

We don't consider segment merge between two bios in ll_new_hw_segment(),
in my mkfs test over virtio-blk, request size can be increased to ~1M(several
segments) from 63k(126 bios/segments) easily if the segment merge between
two bios is considered.

> 
> > > requests it causes a mismatch between the segments actually present
> > > in the request and those iterated over by the bvec iterators, including
> > > __rq_for_each_bio.  This could cause overwrites of too small kmalloc
> > 
> > Request based drivers usually shouldn't iterate bio any more.
> 
> We do that in a couple of places.  For one the nvme single segment
> optimization that triggered this bug.  Also for range discard support
> in nvme and virtio.  Then we have loop that  iterate the segments, but
> doesn't use the nr_phys_segments count, and plenty of others that
> iterate over pages at the moment but should be iterating bvecs,
> e.g. ubd or aoe.

Seems discard segment doesn't consider bios merge for nvme and virtio,
so it should be fine in this way. Will take a close look at nvme/virtio
discard segment merge later.


Thanks,
Ming



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux