On 10/20/23 07:40, Damien Le Moal wrote: > On 10/20/23 01:48, Bart Van Assche wrote: >> On 10/18/23 17:33, Damien Le Moal wrote: >>> On 10/19/23 04:34, Bart Van Assche wrote: >> >> On 10/18/23 12:09, Jens Axboe wrote: >>>>> I'm also really against growing struct bio just for this. Why is patch 2 >>>>> not just using the ioprio field at least? >>>> >>>> Hmm ... shouldn't the bits in the ioprio field in struct bio have the >>>> same meaning as in the ioprio fields used in interfaces between user >>>> space and the kernel? Damien Le Moal asked me not to use any of the >>>> ioprio bits passing data lifetime information from user space to the kernel. >>> >>> I said so in the context that if lifetime is a per-inode property, then ioprio >>> is the wrong interface since the ioprio API is per process or per IO. There is a >>> mismatch. >>> >>> One version of your patch series used fnctl() to set the lifetime per inode, >>> which is fine, and then used the BIO ioprio to pass the lifetime down to the >>> device driver. That is in theory a nice trick, but that creates conflicts with >>> the userspace ioprio API if the user uses that at the same time. >>> >>> So may be we should change bio ioprio from int to u16 and use the freedup u16 >>> for lifetime. With that, things are cleanly separated without growing struct bio. >> >> Hmm ... I think that bi_ioprio has been 16 bits wide since the >> introduction of that data structure member in 2016? > > My bad. struct bio->bi_ioprio is an unsigned short. I got confused with the user > API and kernel functions using an int in many places. We really should change > the kernel functions to use unsigned short for ioprio everywhere. > >>>> Is it clear that the size of struct bio has not been changed because the >>>> new bi_lifetime member fills a hole in struct bio? >>> >>> When the struct is randomized, holes move or disappear. Don't count on that... >> >> We should aim to maximize performance for users who do not use data >> structure layout randomization. >> >> Additionally, I doubt that anyone is using full structure layout >> randomization for SCSI devices. No SCSI driver has any >> __no_randomize_layout / __randomize_layout annotations although I'm sure >> there are plenty of data structures in SCSI drivers for which the layout >> matters. > > Well, if Jens is OK with adding another "unsigned short bi_lifetime" in a hole > in struct bio, that's fine with me. Otherwise, we are back to discussing how to > pack bi_ioprio in a sensible manner so that we do not create a mess between the > use cases and APIs: > 1) inode based lifetime with FS setting up the bi_ioprio field > 2) Direct IOs to files of an FS with lifetime set by user per IO (e.g. > aio/io_uring/ioprio_set()) and/or fcntl() > 3) Direct IOs to raw block devices with lifetime set by user per IO (e.g. > aio/io_uring/ioprio_set()) > > Any of the above case should also allow using ioprio class/level and CDL hint. > > I think the most problematic part is (2) when lifetime are set with both fcntl() > and per IO: which lifetime is the valid one ? The one set with fcntl() or the > one specified for the IO ? I think the former is the one we want here. > > If we can clarify that, then I guess using 3 or 4 bits from the 10 bits ioprio > hint should be OK. That would give you 7 or 15 lifetime values. Enough no ? To be clear, we have to deal with these cases: 1) File IOs - User uses fcntl() only for lifetime - User uses per direct IO ioprio with lifetime (and maybe class/level/cdl) - User uses all of the above 2) Raw block device direct IOs - Per IO ioprio with lifetime (and maybe class/level/cdl) (2) is easy. No real change needed beside the UFS driver bits. But the cases for (1) need clarification about how things should work. -- Damien Le Moal Western Digital Research