Re: [PATCH 3/3] block/mq-deadline: Disable I/O prioritization in certain cases

Damien Le Moal <dlemoal@xxxxxxxxxx> · Tue, 12 Dec 2023 07:40:02 +0900

On 12/12/23 01:57, Christoph Hellwig wrote:
> On Mon, Dec 04, 2023 at 09:32:13PM -0800, Bart Van Assche wrote:
>> Fix the following two issues:
>> - Even with prio_aging_expire set to zero, I/O priorities still affect the
>>   request order.
>> - Assigning I/O priorities with the ioprio cgroup policy breaks zoned
>>   storage support in the mq-deadline scheduler.
> 
> No it doesn't, how would it?  Or do you mean your f2fs hacks where you
> assume there is some order kept?  You really need to get rid of them
> and make sure f2fs doesn't care about reordering by writing the
> metadata that records the data location only at I/O completion time.
> Not only does that make zoned I/O trivially right, it also fixes the
> stale data exposures you are almost guaranteed to have even on
> conventional devices without that.

Priority cgroups can mess things up, I think. Consider 2 processes belonging
to different CGs with different priorities, in either of these cases:
1) The processes do raw block device accesses and write to the same zone,
synchronizing with each other to write at the correct write pointer (WP)
2) The processes are writing different files and the FS decides to place the
blocks for the files in the same zone

Case (1) is clearly "the user is doing very stupid things" and for that case,
the user definitely deserves to see their writes fail. But case (2) is perfectly
legit I think. That is the one that needs to be addressed. The choices I see
are: either every file system supporting zone writes needs to be priority CG
aware when writing files, or we ignore priority CGs when writing.

The latter is, I think, better than the former: CGs can change without the FS
being aware (as far as I know), and priority CG awareness would need to be
implemented in every FS that supports zone writing using regular writes (f2fs
and zonefs).
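
To make case (2) concrete, here is a toy model of a single sequential zone.
It is only a sketch, not the mq-deadline code: the Write class, the "lower
number means higher priority" convention, and the two helper functions are all
made up for illustration, and it ignores zone write locking entirely. It
assumes the FS hands out blocks in submission order, alternating between a
file written from a low-priority CG and one written from a high-priority CG.

```python
from dataclasses import dataclass


@dataclass
class Write:
    prio: int    # hypothetical convention: lower value = higher priority
    offset: int  # start sector, relative to the start of the zone
    length: int  # number of sectors


def dispatch_by_priority(queue):
    """Higher-priority writes first; FIFO within one priority (stable sort)."""
    return sorted(queue, key=lambda w: w.prio)


def replay(writes, write_pointer=0):
    """Apply writes to one sequential zone.

    Returns (final write pointer, number of rejected writes). A write is
    accepted only if it starts exactly at the current write pointer.
    """
    failed = 0
    for w in writes:
        if w.offset == write_pointer:
            write_pointer += w.length
        else:
            failed += 1  # unaligned write: a zoned device rejects it
    return write_pointer, failed


# Blocks allocated in submission order, alternating between the two files.
submitted = [
    Write(prio=2, offset=0, length=8),   # file A, low-priority CG
    Write(prio=1, offset=8, length=8),   # file B, high-priority CG
    Write(prio=2, offset=16, length=8),  # file A
    Write(prio=1, offset=24, length=8),  # file B
]

print(replay(submitted))                        # (32, 0): submission order
print(replay(dispatch_by_priority(submitted)))  # (8, 3): priority order
```

In submission order all four writes land on the write pointer. Once the
high-priority writes are dispatched first, three of the four miss the write
pointer, even though neither process did anything wrong.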

-- 
Damien Le Moal
Western Digital Research