Re: [PATCH v2 01/15] block: Make bio_set_ioprio() modify fewer bio->bi_ioprio bits

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/9/23 22:22, Kanchan Joshi wrote:
On 10/6/2023 1:10 AM, Bart Van Assche wrote:
A later patch will store the data lifetime in the bio->bi_ioprio member
before bio_set_ioprio() is called. Make sure that bio_set_ioprio()
doesn't clear more bits than necessary.

Cc: Damien Le Moal <dlemoal@xxxxxxxxxx>
Cc: Niklas Cassel <niklas.cassel@xxxxxxx>
Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
---
   block/blk-mq.c         |  3 ++-
   include/linux/ioprio.h | 10 ++++++++++
   2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index e2d11183f62e..bc086660ffd3 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2926,7 +2926,8 @@ static void bio_set_ioprio(struct bio *bio)
   {
   	/* Nobody set ioprio so far? Initialize it based on task's nice value */
   	if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE)
-		bio->bi_ioprio = get_current_ioprio();
+		ioprio_set_class_and_level(&bio->bi_ioprio,
+					   get_current_ioprio());
   	blkcg_set_ioprio(bio);
   }
diff --git a/include/linux/ioprio.h b/include/linux/ioprio.h
index 7578d4f6a969..f2e768ab4b35 100644
--- a/include/linux/ioprio.h
+++ b/include/linux/ioprio.h
@@ -71,4 +71,14 @@ static inline int ioprio_check_cap(int ioprio)
   }
   #endif /* CONFIG_BLOCK */
+#define IOPRIO_CLASS_LEVEL_MASK ((IOPRIO_CLASS_MASK << IOPRIO_CLASS_SHIFT) | \
+				 (IOPRIO_LEVEL_MASK << 0))
+
+static inline void ioprio_set_class_and_level(u16 *prio, u16 class_level)
+{
+	WARN_ON_ONCE(class_level & ~IOPRIO_CLASS_LEVEL_MASK);
+	*prio &= ~IOPRIO_CLASS_LEVEL_MASK;
+	*prio |= class_level;

Return of get_current_ioprio() will touch all 16 bits here. So
user-defined value can alter whatever was set in bio by F2FS (patch 4 in
this series). Is that not an issue?

The above is incomprehensible to me. Anyway, I will try to answer.

It is not clear to me why it is claimed that "get_current_ioprio() will
touch all 16 bits here"? The return value of get_current_ioprio() is
passed to ioprio_set_class_and_level() and that function clears the hint
bits from the get_current_ioprio() return value.

ioprio_set_class_and_level() preserves the hint bits set by F2FS.

And what is the user interface you have in mind. Is it ioprio based, or
write-hint based or mix of both?

Since the data lifetime is encoded in the hint bits, the hint bits need
to be set by user space to set a data lifetime. In case you would help,
the blktest test that I wrote to test this functionality is available
below.

Thanks,

Bart.


diff --git a/tests/scsi/097 b/tests/scsi/097
new file mode 100755
index 000000000000..01d280021653
--- /dev/null
+++ b/tests/scsi/097
@@ -0,0 +1,62 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-3.0+
+# Copyright (C) 2022 Google LLC
+
+. tests/zbd/rc
+. common/null_blk
+. common/scsi_debug
+
+DESCRIPTION="test block data lifetime support"
+QUICK=1
+
+requires() {
+	_have_fio
+	_have_module scsi_debug
+}
+
+test() {
+	echo "Running ${TEST_NAME}"
+
+	local scsi_debug_params=(
+		delay=0
+		dev_size_mb=1024
+		sector_size=4096
+	)
+	_init_scsi_debug "${scsi_debug_params[@]}" &&
+	local dev="/dev/${SCSI_DEBUG_DEVICES[0]}" fail &&
+	ls -ld "${dev}" >>"${FULL}" &&
+	local i fio_args=(
+		--group_reporting=1
+		--gtod_reduce=1
+		--ioscheduler=none
+		--norandommap
+		--direct=1
+		--rw=randwrite
+		--ioengine=io_uring
+		--time_based
+	) &&
+	for ((i=1; i<=3; i++)); do
+		fio_args+=(
+			--name=whint"$i"
+			--filename="${dev}"
+			--prio=$((i<<6))
+			--time_based
+			--runtime=10
+		)
+	done &&
+	_run_fio "${fio_args[@]}" >>"${FULL}" 2>&1 ||
+	fail=true
+
+ head -n 999 /sys/bus/pseudo/drivers/scsi_debug/group_number_stats >> "${FULL}"
+	while read -r group count; do
+		if [ "$count" -gt 0 ]; then echo "$group"; fi
+	done < /sys/bus/pseudo/drivers/scsi_debug/group_number_stats
+	_exit_scsi_debug
+
+	if [ -z "$fail" ]; then
+		echo "Test complete"
+	else
+		echo "Test failed"
+		return 1
+	fi
+}
diff --git a/tests/scsi/097.out b/tests/scsi/097.out
new file mode 100644
index 000000000000..f6ed3d6de861
--- /dev/null
+++ b/tests/scsi/097.out
@@ -0,0 +1,5 @@
+Running scsi/097
+1
+2
+3
+Test complete




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux