Re: [PATCH 8/8] block: Always initialize bio IO priority on submit

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 6/17/22 20:49, Jan Kara wrote:
> On Fri 17-06-22 09:04:34, Damien Le Moal wrote:
>> On 6/16/22 21:24, Jan Kara wrote:
>>> On Thu 16-06-22 13:23:03, Jan Kara wrote:
>>>> On Thu 16-06-22 12:15:25, Damien Le Moal wrote:
>>>>> On 6/16/22 01:16, Jan Kara wrote:
>>>>>> +	if (ioprio_class == IOPRIO_CLASS_NONE)
>>>>>> +		bio->bi_ioprio = get_current_ioprio();
>>>>>>   }
>>>>>>   /**
>>>>>
>>>>> Beside this comment, I am still scratching my head regarding what the user
>>>>> gets with ioprio_get(). If I understood your patches correctly, the user may
>>>>> still see IOPRIO_CLASS_NONE ?
>>>>> For that case, to be in sync with the man page, I thought the returned
>>>>> ioprio should be the effective one based on the task io nice value, that is,
>>>>> the value returned by get_current_ioprio(). Am I missing something... ?
>>>>
>>>> The trouble with returning "effective ioprio" is that with IOPRIO_WHO_PGRP
>>>> or IOPRIO_WHO_USER the effective IO priority may be different for different
>>>> processes considered and it can be also further influenced by blk-ioprio
>>>> settings. But thinking about it now after things have settled I agree that
>>>> what you suggests makes more sense. I'll fix that. Thanks for suggestion!
>>>
>>> Oh, now I've remembered why I've done it that way. With IOPRIO_WHO_PROCESS
>>> (which is probably the most used and the best defined variant), we were
>>> returning IOPRIO_CLASS_NONE if the task didn't have set IO priority until
>>> commit e70344c05995 ("block: fix default IO priority handling"). So my
>>> patch was just making behavior of IOPRIO_WHO_PGRP & IOPRIO_WHO_USER
>>> consistent with the behavior of IOPRIO_WHO_PROCESS. I'd be reluctant to
>>> change the behavior of IOPRIO_WHO_PROCESS because that has the biggest
>>> chances for userspace regressions. But perhaps it makes sense to keep
>>> IOPRIO_WHO_PGRP & IOPRIO_WHO_USER inconsistent with IOPRIO_WHO_PROCESS and
>>> just use effective IO priority in those two variants. That looks like the
>>> smallest API change to make things at least somewhat sensible...
>>
>> Still bit lost. Let me try to summarize your goal:
>>
>> 1) If IOPRIO_WHO_PGRP is not set, ioprio_get(IOPRIO_WHO_PGRP) will return
>> the effective priority
> 
> You make it sound here like IOPRIO_WHO_PGRP would be some different type of
> IO priority. For record it is not, there's just one IO priority per task,
> if you set ioprio with IOPRIO_WHO_PGRP, it will just iterate all the tasks
> in PGRP and set IO priority for each task. After my patches,
> ioprio_get(IOPRIO_WHO_PGRPIO) will return the best of the effective IO
> priorities of tasks within PGRP. Before my patch it was doing the same but
> if IO priority was unset for some task it considered it to be CLASS_BE,4.

OK. Got it. Thanks for clarifying.

> 
>> 2) If IOPRIO_WHO_USER is not set, ioprio_get(IOPRIO_WHO_USER) will also
>> return the effective priority.
> 
> This is the same as above. Just the calls iterate over all tasks of the
> given user...
> 
>> 3) if IOPRIO_WHO_PROCESS is not set, return ? I am lost for this one. Do
>> you want to go back to IOPRIO_CLASS_NONE ? Keep default (IOPRIO_CLASS_BE)
>> ? Or switch to using the effective IO priority ? Not that the last 2
>> choices are actually equivalent if the user did not IO nice the process
>> (the default for the effective IO prio is class BE)
>  
> I want to go back to returning IOPRIO_CLASS_NONE for tasks with unset IO
> priority.

And that would be to retain the older (broken) behavior. Because if we
consider the man page, tasks with an unset IO prio should be reported as
having the effective IO nice based priority, which is class BE if IO nice
is not set. Right ? I am OK with that, but I think we should add this
explanation as a comment somewhere in the prio code. No ?

> 
>> For (1) and (2), I am not sure. Given that my last changes to the ioprio
>> default did not seem to have bothered anyone (nobody screamed at me :)) I
>> am tempted to say: any choice is OK. So we should try to get as close as
>> the man page defined behavior as possible.
> 
> I also don't find (1) and (2) too important. (3) is IMHO somewhat important
> and I think that the reason why nobody complained about the change there is
> because your change is relatively new so it didn't propagate yet to any
> widely used distro kernel...

Indeed.

> 
> 								Honza


-- 
Damien Le Moal
Western Digital Research



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux