On 12/13/23 10:02, Bart Van Assche wrote:
> On 12/12/23 13:52, Damien Le Moal wrote:
>> Trying to solve this issue in mq-deadline would require keeping track of the io
>> priority used for a write request that is issued to a zone and use that same
>> priority for all following write requests for the same zone until there are no
>> writes pending for that zone. Otherwise, you will get the priority inversion
>> causing the reordering.
>>
>> But I think that doing all this without also causing priority inversion for the
>> user, i.e. a high priority write request ends up waiting for a low priority one,
>> will be challenging, to say the least.
>
> Hi Damien,
>
> How about the following algorithm?
> - If a zoned write refers to the start of a zone or no other writes for
>   the same zone occur in the RB-tree, use the I/O priority of the zoned
>   write.
> - If another write for the same zone occurs in the RB-tree, use the I/O
>   priority from that other write.
>
> While this algorithm does not guarantee that all zoned writes for a
> single zone have the same I/O priority, it guarantees that the
> mq-deadline I/O scheduler won't submit zoned writes in the wrong order
> because of their I/O priority.

I guess this should work.

>
> Thanks,
>
> Bart.
>

-- 
Damien Le Moal
Western Digital Research
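
For illustration only, the two rules Bart proposes could be modelled roughly as below. This is a user-space Python sketch, not mq-deadline code: the `ZonePrioTracker` class, its method names, the `ZONE_SECTORS` value and the use of a plain dict in place of the scheduler's RB-tree are all assumptions made for the example.

```python
# Hypothetical model of the proposed rule; not taken from the kernel sources.
ZONE_SECTORS = 1 << 19  # assumed zone size in 512-byte sectors


class ZonePrioTracker:
    """Tracks, per zone, the priority shared by writes still queued."""

    def __init__(self):
        # zone number -> (effective priority, number of queued writes)
        self.pending = {}

    def insert_write(self, sector, prio):
        """Return the priority class the write should be queued under."""
        zone = sector // ZONE_SECTORS
        queued_prio, count = self.pending.get(zone, (None, 0))
        if sector % ZONE_SECTORS == 0 or count == 0:
            # Rule 1: write at the start of a zone, or no other write
            # queued for this zone: keep the write's own priority.
            effective = prio
        else:
            # Rule 2: another write for the zone is queued: reuse its priority.
            effective = queued_prio
        self.pending[zone] = (effective, count + 1)
        return effective

    def remove_write(self, sector):
        """Call when a write for the zone leaves the scheduler queue."""
        zone = sector // ZONE_SECTORS
        prio, count = self.pending[zone]
        if count == 1:
            del self.pending[zone]
        else:
            self.pending[zone] = (prio, count - 1)


tracker = ZonePrioTracker()
first = tracker.insert_write(0, "RT")                 # start of zone 0
second = tracker.insert_write(8, "BE")                # zone 0 still has a queued write
other = tracker.insert_write(ZONE_SECTORS, "IDLE")    # start of zone 1
```

In this model the second write to zone 0 is queued with the priority of the first, so the two cannot be reordered by priority class. Once every write for a zone has left the queue, the next write to that zone uses its own priority again.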