On 10/11/2023 10:22 PM, Bart Van Assche wrote: >>> @@ -2926,7 +2926,8 @@ static void bio_set_ioprio(struct bio *bio) >>> { >>> /* Nobody set ioprio so far? Initialize it based on task's >>> nice value */ >>> if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) == IOPRIO_CLASS_NONE) >>> - bio->bi_ioprio = get_current_ioprio(); >>> + ioprio_set_class_and_level(&bio->bi_ioprio, >>> + get_current_ioprio()); >>> blkcg_set_ioprio(bio); >>> } >>> diff --git a/include/linux/ioprio.h b/include/linux/ioprio.h >>> index 7578d4f6a969..f2e768ab4b35 100644 >>> --- a/include/linux/ioprio.h >>> +++ b/include/linux/ioprio.h >>> @@ -71,4 +71,14 @@ static inline int ioprio_check_cap(int ioprio) >>> } >>> #endif /* CONFIG_BLOCK */ >>> +#define IOPRIO_CLASS_LEVEL_MASK ((IOPRIO_CLASS_MASK << >>> IOPRIO_CLASS_SHIFT) | \ >>> + (IOPRIO_LEVEL_MASK << 0)) >>> + >>> +static inline void ioprio_set_class_and_level(u16 *prio, u16 >>> class_level) >>> +{ >>> + WARN_ON_ONCE(class_level & ~IOPRIO_CLASS_LEVEL_MASK); >>> + *prio &= ~IOPRIO_CLASS_LEVEL_MASK; >>> + *prio |= class_level; >> >> Return of get_current_ioprio() will touch all 16 bits here. So >> user-defined value can alter whatever was set in bio by F2FS (patch 4 in >> this series). Is that not an issue? > > The above is incomprehensible to me. Anyway, I will try to answer. > > It is not clear to me why it is claimed that "get_current_ioprio() will > touch all 16 bits here"? The return value of get_current_ioprio() is > passed to ioprio_set_class_and_level() and that function clears the hint > bits from the get_current_ioprio() return value. Function does OR bio->bi_ioprio with whatever is the return of get_current_ioprio(). So if lifetime bits were set in get_current_ioprio(), you will end up setting that in bio->bi_ioprio too. > ioprio_set_class_and_level() preserves the hint bits set by F2FS. > >> And what is the user interface you have in mind. Is it ioprio based, or >> write-hint based or mix of both? > > Since the data lifetime is encoded in the hint bits, the hint bits need > to be set by user space to set a data lifetime. I asked because more than one way seems to emerge here. Parts of this series (Patch 4) are taking inode->i_write_hint (and not ioprio value) and putting that into bio. I wonder what to expect if application get to send one lifetime with fcntl (write-hints) and different one with ioprio. Is that not racy? > In case you would help, > the blktest test that I wrote to test this functionality is available > below. > > Thanks, > > Bart. > > > diff --git a/tests/scsi/097 b/tests/scsi/097 > new file mode 100755 > index 000000000000..01d280021653 > --- /dev/null > +++ b/tests/scsi/097 > @@ -0,0 +1,62 @@ > +#!/bin/bash > +# SPDX-License-Identifier: GPL-3.0+ > +# Copyright (C) 2022 Google LLC > + > +. tests/zbd/rc > +. common/null_blk > +. common/scsi_debug > + > +DESCRIPTION="test block data lifetime support" > +QUICK=1 > + > +requires() { > + _have_fio > + _have_module scsi_debug > +} > + > +test() { > + echo "Running ${TEST_NAME}" > + > + local scsi_debug_params=( > + delay=0 > + dev_size_mb=1024 > + sector_size=4096 > + ) > + _init_scsi_debug "${scsi_debug_params[@]}" && > + local dev="/dev/${SCSI_DEBUG_DEVICES[0]}" fail && > + ls -ld "${dev}" >>"${FULL}" && > + local i fio_args=( > + --group_reporting=1 > + --gtod_reduce=1 > + --ioscheduler=none > + --norandommap > + --direct=1 > + --rw=randwrite > + --ioengine=io_uring > + --time_based > + ) && > + for ((i=1; i<=3; i++)); do > + fio_args+=( > + --name=whint"$i" > + --filename="${dev}" > + --prio=$((i<<6)) This will not work as prio can only take values between 0-7. Perhaps you want to use "priohint" to send lifetime.