Hi,
在 2025/02/07 10:35, colyli@xxxxxxxxxx 写道:
From: Coly Li <colyli@xxxxxxxxxx>
which_dev() is a top hot function in md-linear.c, every I/O request will
call it to find out which component disk the bio should be issued to.
Current witch_dev() performs a standard binary search, indeed this is
not the fastest algorithm in practice. When the whole conf->disks array
can be stored within a single cache line, simple linear search is faster
than binary search for a small disks number.
From micro benchmark, around 20%~30% latency reduction can be observed.
Of course such huge optimization cannot be achieved in real workload, in
my benchmark with,
1) One md linear device assembled by 2 or 4 Intel Optane memory block
device on Lenovo ThinkSystem SR650 server.
2) Random write I/O issued by fio, with I/O depth 1 and 512 bytes block
size.
The percentage of I/O latencies completed with 750 nsec increases from
97.186% to 99.324% in average, in a rough estimation the write latency
improves (reduces) around 2.138%.
This is quite ideal result, I believe on slow hard drives such small
code-running optimization will be overwhelmed by hardware latency and
hard to be recognized.
This patch will go back to binary search when the linear device grows
and conf->disks array cannot be placed within a single cache line.
Although the optimization result is tiny in most of cases, it is good to
have it since we don't pay any other cost.
Signed-off-by: Coly Li <colyli@xxxxxxxxxx>
Cc: Song Liu <song@xxxxxxxxxx>
Cc: Yu Kuai <yukuai3@xxxxxxxxxx>
---
Changelog,
v3: fix typo and email address which are reported by raid kernel ci.
v2: return last item of conf->disks[] if fast search missed.
v1: initial version.
drivers/md/md-linear.c | 45 +++++++++++++++++++++++++++++++++++++++++-
1 file changed, 44 insertions(+), 1 deletion(-)
diff --git a/drivers/md/md-linear.c b/drivers/md/md-linear.c
index a382929ce7ba..cdb59f2b2a1c 100644
--- a/drivers/md/md-linear.c
+++ b/drivers/md/md-linear.c
@@ -25,10 +25,12 @@ struct linear_conf {
struct dev_info disks[] __counted_by(raid_disks);
};
+static int prefer_linear_dev_search;
Why using global variable instead of per linear? There are still hole in
the linear_conf() that you can use.
+
/*
* find which device holds a particular offset
*/
-static inline struct dev_info *which_dev(struct mddev *mddev, sector_t sector)
+static inline struct dev_info *__which_dev(struct mddev *mddev, sector_t sector)
{
int lo, mid, hi;
struct linear_conf *conf;
@@ -53,6 +55,34 @@ static inline struct dev_info *which_dev(struct mddev *mddev, sector_t sector)
return conf->disks + lo;
}
+/*
+ * If conf->disk[] can be hold within a L1 cache line,
+ * linear search is fater than binary search.
+ */
+static inline struct dev_info *which_dev(struct mddev *mddev, sector_t sector)
+{
+ int i;
+
+ if (prefer_linear_dev_search) {
+ struct linear_conf *conf;
+ struct dev_info *dev;
+ int max;
+
+ conf = mddev->private;
+ dev = conf->disks;
+ max = conf->raid_disks;
+ for (i = 0; i < max; i++, dev++) {
+ if (sector < dev->end_sector)
+ return dev;
+ }
+ return &conf->disks[max-1];
+ }
+
+ /* slow path */
+ return __which_dev(mddev, sector);
+}
+
+
static sector_t linear_size(struct mddev *mddev, sector_t sectors, int raid_disks)
{
struct linear_conf *conf;
@@ -222,6 +252,18 @@ static int linear_add(struct mddev *mddev, struct md_rdev *rdev)
md_set_array_sectors(mddev, linear_size(mddev, 0, 0));
set_capacity_and_notify(mddev->gendisk, mddev->array_sectors);
kfree_rcu(oldconf, rcu);
+
+ /*
+ * When elements in linear_conf->disks[] becomes large enough,
+ * set prefer_linear_dev_search as 0 to indicate linear search
+ * in which_dev() is not optimized. Slow path in __which_dev()
+ * might be faster.
+ */
+ if ((mddev->raid_disks * sizeof(struct dev_info)) >
+ cache_line_size() &&
BTW, you said in the case *a single cache line*, however, the field
disks is 32 bytes offset in linear_conf, you might want to align it to
cacheline, or add sizeof(struct linear_conf) on the left and align
linear_conf to the cacheline.
Thanks,
Kuai
+ prefer_linear_dev_search == 1)
+ prefer_linear_dev_search = 0;
+
return 0;
}
@@ -337,6 +379,7 @@ static struct md_personality linear_personality = {
static int __init linear_init(void)
{
+ prefer_linear_dev_search = 1;
return register_md_personality(&linear_personality);
}