To account io_ticks precisely, update_io_ticks() must know whether any
IO is already inflight, which adds a small amount of overhead. Precise
io accounting is therefore disabled by default, and users can enable it
through a sysfs entry.
Note that for rq-based devices, part_stat_local_inc/dec() and
part_in_flight() are used to track inflight IO instead of iterating
tags. Tag iteration is not supposed to be used in the fast path because
'tags->lock' is grabbed in blk_mq_find_and_get_req().
Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
---
Changes from RFC v1:
- remove the new parameter for update_io_ticks();
- simplify update_io_ticks();
- use switch in queue_iostats_store();
- add missing part_stat_local_dec() in blk_account_io_merge_request();
Changes from RFC v2:
- fix precise accounting being ignored for the first io in
  update_io_ticks();
Documentation/ABI/stable/sysfs-block | 8 ++++--
block/blk-core.c | 10 +++++--
block/blk-merge.c | 3 ++
block/blk-mq-debugfs.c | 2 ++
block/blk-mq.c | 11 +++++++-
block/blk-sysfs.c | 42 ++++++++++++++++++++++++++--
block/blk.h | 1 +
block/genhd.c | 2 +-
include/linux/blk-mq.h | 1 +
include/linux/blkdev.h | 3 ++
10 files changed, 74 insertions(+), 9 deletions(-)
diff --git a/Documentation/ABI/stable/sysfs-block b/Documentation/ABI/stable/sysfs-block
index 1fe9a553c37b..79027bf2661a 100644
--- a/Documentation/ABI/stable/sysfs-block
+++ b/Documentation/ABI/stable/sysfs-block
@@ -358,8 +358,12 @@ What: /sys/block/<disk>/queue/iostats
Date: January 2009
Contact: linux-block@xxxxxxxxxxxxxxx
Description:
- [RW] This file is used to control (on/off) the iostats
- accounting of the disk.
+ [RW] This file is used to control the iostats accounting of the
+ disk. If this value is 0, iostats accounting is disabled; If
+ this value is 1, iostats accounting is enabled, but io_ticks is
+ accounted by sampling and the result is not accurate; If this
+ value is 2, iostats accounting is enabled and io_ticks is
+ accounted precisely, but there will be slightly more overhead.
What: /sys/block/<disk>/queue/logical_block_size
diff --git a/block/blk-core.c b/block/blk-core.c
index 9520ccab3050..c70dc311e3b7 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -954,11 +954,15 @@ EXPORT_SYMBOL_GPL(iocb_bio_iopoll);
void update_io_ticks(struct block_device *part, unsigned long now, bool end)
{
unsigned long stamp;
+ bool precise = blk_queue_precise_io_stat(part->bd_queue);
again:
stamp = READ_ONCE(part->bd_stamp);
- if (unlikely(time_after(now, stamp))) {
- if (likely(try_cmpxchg(&part->bd_stamp, &stamp, now)))
- __part_stat_add(part, io_ticks, end ? now - stamp : 1);
+ if (unlikely(time_after(now, stamp)) &&
+ likely(try_cmpxchg(&part->bd_stamp, &stamp, now))) {
+ if (end || (precise && part_in_flight(part)))
+ __part_stat_add(part, io_ticks, now - stamp);
+ else if (!precise)
+ __part_stat_add(part, io_ticks, 1);