[PATCH RFC 0/7] dm: add support of iopoll

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This patch set adds support of iopoll for dm devices.

Several months ago, I also sent a patch set adding support of iopoll
for dm devices [1]. The old patch set implement this by polling all
polling mode hardware queues of all underlying target devices
unconditionally, no matter which hardware queue the bio is enqueued
into.

Ming Lei pointed out that this implementation may have performance
issue. Mike Snitzer also discussed and helpt a lot on the design
issue. At that time, we agreed that this feature should be
implemented on the basis of split-bio tracking, since the bio to the
dm device could be split into multiple split bios to the underlying
target devices.

This patch set actually implement the split-bio tracking part.
Regrettably this implementation is quite coarse and original. Quite
code refactoring is introduced to the block core, since device mapper
also calls methods provided by block core to split bios. The new
fields are directly added into 'struct bio' structure. So this is
just an RFC version and there may be quite a lot design issues be
fronted on. I just implement a version quickly and would like to share
with you my thoughts and problems at hand.

This implementation works but has poor performance. Awkwardly the
performance of direct 8k randread decreases ~20% on a 8k-striped
dm-stripe device.

The first 5 patches prepare the introduction of iopoll for dm device.
Patch 6 is the core part and implements a mechanism of tracking split
bios for bio-based device. Patch 7 just enables this feature.

[1] https://patchwork.kernel.org/project/linux-block/cover/20201020065420.124885-1-jefflexu@xxxxxxxxxxxxxxxxx/


[Design Notes]

- recursive way or non-recursive way?

The core of the split-bio tracking mechanism is that, a list is
maintained in the top bio (the original bio submitted to the dm
device), which is actually used to maintain all valid cookies of all
split bios.

This is actually a non-recursive way to implement the tracking
mechanism. On the contrary, the recursive way is that, maintain the
split bios at each dm layer. DM device can be build into a device
stack. For example, considering the following device stack,

```
            dm0(bio 0)
dm1(bio 1)             dm2(bio 2)
nvme0(bio 3)           nvme1(bio 4)

```

The non-recursive way is that bio 3/4 are directly maintained as a
list beneath bio 0. The recursive way is that, bio 3 is maintained
as a list beneath bio 1 and bio 4 is maintained as a list beneath
bio 2, while bio 1/2 are maintained as a list beneath bio 0.

The reason why I choose the non-recursive way is that, we need call
blk_bio_poll() or something recursively if it's implemented in the
recursive way, and I worry this would consume up the stack space if
the device stack is too deep. After all bio_list is used to prevent
the dm device using up stack space when submitting bio. 


- why embed new fields into struct bio directly?

Mike Snitzer had ever suggested that the newly added fields could be
integrated into the clone bio allocated by dm subsystem through
bio_set. There're 3 reasons why I directly embed these fields into
struct bio:

1. The implementation difference of DM and MD
DM subsystem indeed allocate clone bio itself, in which case we could
allocate extra per-bio space for specific usage. However MD subsystem
doesn't allocate extra clone bio, and just uses the bio structure
originally submitted to MD device as the bio structure submitted to
the underlying target device. (At least raid0 works in this way.)

2. In the previously mentioned non-recursive way of iopoll, a list
containing all valid cookies should be maintained. For convenience, I
just put this list in the top bio (the original bio structure
submitted to the dm device). This original bio structure is allocated
by the upper layer, e.g. filesystem, and is out of control of DM
subsystem. (Of course we could resolve this problem technically, e.g.,
put these newlly added fields into the corresponding dm_io structure
of the top bio.)

3. As a quick implementation, I just put these fields into struct bio
directly. Obviously more design issues need to be considered when it
comes into the formal version.


Jeffle Xu (7):
  block: move definition of blk_qc_t to types.h
  block: add helper function fetching gendisk from queue
  block: add iopoll method for non-mq device
  block: define blk_qc_t as uintptr_t
  dm: always return BLK_QC_T_NONE for bio-based device
  block: track cookies of split bios for bio-based device
  dm: add support for IO polling

 block/bio.c                  |   8 ++
 block/blk-core.c             | 163 ++++++++++++++++++++++++++++++++++-
 block/blk-mq.c               |  70 ++-------------
 drivers/md/dm-table.c        |  28 ++++++
 drivers/md/dm.c              |  27 +++---
 include/linux/blk-mq.h       |   3 +
 include/linux/blk_types.h    |  41 ++++++++-
 include/linux/blkdev.h       |   3 +
 include/linux/fs.h           |   2 +-
 include/linux/types.h        |   3 +
 include/trace/events/kyber.h |   6 +-
 11 files changed, 270 insertions(+), 84 deletions(-)

-- 
2.27.0




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux