вт, 10 дек. 2019 г. в 06:09, Ming Lei <tom.leiming@xxxxxxxxx>: > > On Mon, Dec 9, 2019 at 10:25 PM Aleksei Zakharov <zaharov@xxxxxxxxxxx> wrote: > > > > I/O request can be splitted for two or more requests, > > if it's size is greater than queue/max_sectors_kb > > setting. > > > > Knowledge of splits frequency helps to understand workload > > profile better and to make decision to tune queue/max_sectors_kb. > > IO split should be one static behavior, so if one bio's sector/size is provided, > it is easy to figure out the split pattern, not sure if it is worth of > adding statistics > in iostat. > > That said the IO split analysis can be done simply as post-process of IO trace, > or even it can be done runtime via bcc/bpftrace script without much difficulty. Yes, you're right, it can be done via bpftrace. But iostat is just well-known and overall used tool. It's better to have most of stats with default tool, than have a lot of tools for different parts of one subsystem layer, IMO. > > > > > This patch adds three counters to /sys/class/block/$dev/stat > > and /proc/diskstats: number of reads, writes and discards > > splitted. > > > > There's also counter for flush requests, but it is ignored, > > because flush cannot be splitted. > > --- > > Documentation/ABI/testing/procfs-diskstats | 3 +++ > > Documentation/ABI/testing/sysfs-block | 5 ++++- > > Documentation/admin-guide/iostats.rst | 10 ++++++++++ > > block/bio.c | 2 ++ > > block/blk-core.c | 2 ++ > > block/genhd.c | 8 ++++++-- > > block/partition-generic.c | 8 ++++++-- > > include/linux/genhd.h | 1 + > > 8 files changed, 34 insertions(+), 5 deletions(-) > > > > diff --git a/Documentation/ABI/testing/procfs-diskstats b/Documentation/ABI/testing/procfs-diskstats > > index 70dcaf2481f4..e1473e93d901 100644 > > --- a/Documentation/ABI/testing/procfs-diskstats > > +++ b/Documentation/ABI/testing/procfs-diskstats > > @@ -33,5 +33,8 @@ Description: > > > > 19 - flush requests completed successfully > > 20 - time spent flushing > > + 21 - reads splitted > > + 22 - writes splitted > > + 23 - discards splitted > > > > For more details refer to Documentation/admin-guide/iostats.rst > > diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block > > index ed8c14f161ee..ffbac0e72508 100644 > > --- a/Documentation/ABI/testing/sysfs-block > > +++ b/Documentation/ABI/testing/sysfs-block > > @@ -3,7 +3,7 @@ Date: February 2008 > > Contact: Jerome Marchand <jmarchan@xxxxxxxxxx> > > Description: > > The /sys/block/<disk>/stat files displays the I/O > > - statistics of disk <disk>. They contain 11 fields: > > + statistics of disk <disk>. They contain 20 fields: > > 1 - reads completed successfully > > 2 - reads merged > > 3 - sectors read > > @@ -21,6 +21,9 @@ Description: > > 15 - time spent discarding (ms) > > 16 - flush requests completed > > 17 - time spent flushing (ms) > > + 18 - reads splitted > > + 19 - writes splitted > > + 20 - discrads splitted > > For more details refer Documentation/admin-guide/iostats.rst > > > > > > diff --git a/Documentation/admin-guide/iostats.rst b/Documentation/admin-guide/iostats.rst > > index 4f0462af3ca7..7f3f374b82b7 100644 > > --- a/Documentation/admin-guide/iostats.rst > > +++ b/Documentation/admin-guide/iostats.rst > > @@ -130,6 +130,16 @@ Field 16 -- # of flush requests completed > > Field 17 -- # of milliseconds spent flushing > > This is the total number of milliseconds spent by all flush requests. > > > > +Field 18 -- # of reads splitted, field 19 -- # of writes splitted > > + This is the total number of requests splitted before queue. > > + > > + The maximum size of I/O is limited by queue/max_sectors_kb. > > + If size of I/O is greater than this limit, it will be splitted > > + as many times as needed to keep I/O size withing the limit. > > + > > +Field 20 -- # of discrads Splitted > > + See description of fileds 18 and 19. > > + > > To avoid introducing performance bottlenecks, no locks are held while > > modifying these counters. This implies that minor inaccuracies may be > > introduced when changes collide, so (for instance) adding up all the > > diff --git a/block/bio.c b/block/bio.c > > index 8f0ed6228fc5..c8a051e128f8 100644 > > --- a/block/bio.c > > +++ b/block/bio.c > > @@ -1855,6 +1855,8 @@ struct bio *bio_split(struct bio *bio, int sectors, > > if (bio_flagged(bio, BIO_TRACE_COMPLETION)) > > bio_set_flag(split, BIO_TRACE_COMPLETION); > > > > + bio_set_flag(split, BIO_SPLITTED); > > + > > return split; > > } > > EXPORT_SYMBOL(bio_split); > > diff --git a/block/blk-core.c b/block/blk-core.c > > index f0d82227a2fc..776d28b9a5bf 100644 > > --- a/block/blk-core.c > > +++ b/block/blk-core.c > > @@ -1379,6 +1379,8 @@ void blk_account_io_start(struct request *rq, bool new_io) > > } > > part_inc_in_flight(rq->q, part, rw); > > rq->part = part; > > + if (bio_flagged(rq->bio, BIO_SPLITTED)) > > + part_stat_inc(part, splits[rw]); > > It is the only use of BIO_SPLITTED, and it should have been done as one rq flag, > then extra change in bio struct can be avoided. If the bio is > splitted can be easily > figured out in blk_mq_make_request(), then you may pass that info to > blk_mq_bio_to_request(). Thanks for pointing that out! I'll rethink this patch. > > Thanks, > Ming -- Best Regards, Aleksei Zakharov System administrator www.selectel.com