On 10/20/23 01:48, Bart Van Assche wrote: > On 10/18/23 17:33, Damien Le Moal wrote: >> On 10/19/23 04:34, Bart Van Assche wrote: > >> On 10/18/23 12:09, Jens Axboe wrote: >>>> I'm also really against growing struct bio just for this. Why is patch 2 >>>> not just using the ioprio field at least? >>> >>> Hmm ... shouldn't the bits in the ioprio field in struct bio have the >>> same meaning as in the ioprio fields used in interfaces between user >>> space and the kernel? Damien Le Moal asked me not to use any of the >>> ioprio bits passing data lifetime information from user space to the kernel. >> >> I said so in the context that if lifetime is a per-inode property, then ioprio >> is the wrong interface since the ioprio API is per process or per IO. There is a >> mismatch. >> >> One version of your patch series used fnctl() to set the lifetime per inode, >> which is fine, and then used the BIO ioprio to pass the lifetime down to the >> device driver. That is in theory a nice trick, but that creates conflicts with >> the userspace ioprio API if the user uses that at the same time. >> >> So may be we should change bio ioprio from int to u16 and use the freedup u16 >> for lifetime. With that, things are cleanly separated without growing struct bio. > > Hmm ... I think that bi_ioprio has been 16 bits wide since the > introduction of that data structure member in 2016? My bad. struct bio->bi_ioprio is an unsigned short. I got confused with the user API and kernel functions using an int in many places. We really should change the kernel functions to use unsigned short for ioprio everywhere. >>> Is it clear that the size of struct bio has not been changed because the >>> new bi_lifetime member fills a hole in struct bio? >> >> When the struct is randomized, holes move or disappear. Don't count on that... > > We should aim to maximize performance for users who do not use data > structure layout randomization. > > Additionally, I doubt that anyone is using full structure layout > randomization for SCSI devices. No SCSI driver has any > __no_randomize_layout / __randomize_layout annotations although I'm sure > there are plenty of data structures in SCSI drivers for which the layout > matters. Well, if Jens is OK with adding another "unsigned short bi_lifetime" in a hole in struct bio, that's fine with me. Otherwise, we are back to discussing how to pack bi_ioprio in a sensible manner so that we do not create a mess between the use cases and APIs: 1) inode based lifetime with FS setting up the bi_ioprio field 2) Direct IOs to files of an FS with lifetime set by user per IO (e.g. aio/io_uring/ioprio_set()) and/or fcntl() 3) Direct IOs to raw block devices with lifetime set by user per IO (e.g. aio/io_uring/ioprio_set()) Any of the above case should also allow using ioprio class/level and CDL hint. I think the most problematic part is (2) when lifetime are set with both fcntl() and per IO: which lifetime is the valid one ? The one set with fcntl() or the one specified for the IO ? I think the former is the one we want here. If we can clarify that, then I guess using 3 or 4 bits from the 10 bits ioprio hint should be OK. That would give you 7 or 15 lifetime values. Enough no ? -- Damien Le Moal Western Digital Research