On 4/10/24 22:17, Jens Axboe wrote:
> On 4/10/24 4:18 AM, Dongliang Cui wrote:
>> The default configuration in the current code is that when the device
>> is not busy, a single dispatch will attempt to pull 'nr_requests'
>> requests out of the schedule queue.
>>
>> I tried to track the dispatch process:
>>
>> COMM         TYPE  SEC_START  IOPRIO  INDEX
>> fio-17304    R     196798040  0x2005  0
>> fio-17306    R     197060504  0x2005  1
>> fio-17307    R     197346904  0x2005  2
>> fio-17308    R     197609400  0x2005  3
>> fio-17309    R     197873048  0x2005  4
>> fio-17310    R     198134936  0x2005  5
>> ...
>> fio-17237    R     197122936  0x0     57
>> fio-17238    R     197384984  0x0     58
>> <...>-17239  R     197647128  0x0     59
>> fio-17240    R     197909208  0x0     60
>> fio-17241    R     198171320  0x0     61
>> fio-17242    R     198433432  0x0     62
>> fio-17300    R     195744088  0x2005  0
>> fio-17301    R     196008504  0x2005  0
>>
>> The above data is calculated based on the block event trace, with each
>> column containing: process name, request type, sector start address,
>> IO priority.
>>
>> The INDEX represents the order in which the requests are extracted from
>> the scheduler queue during a single dispatch process.
>>
>> Some low-speed devices cannot process these requests at once, and they will
>> be requeued to hctx->dispatch and wait for the next issuance.
>>
>> There will be a problem here, when the IO priority is enabled, if you try
>> to dispatch "nr_request" requests at once, the IO priority will be ignored
>> from the scheduler queue and all requests will be extracted.
>>
>> In this scenario, if a high priority request is inserted into the scheduler
>> queue, it needs to wait for the low priority request in the hctx->dispatch
>> to be processed first.
>>
>> --------------------dispatch 1st----------------------
>> fio-17241    R     198171320  0x0     61
>> fio-17242    R     198433432  0x0     62
>> --------------------dispatch 2nd----------------------
>> fio-17300    R     195744088  0x2005  0
>>
>> In certain scenarios, we hope that requests can be processed in order of io
>> priority as much as possible.
>>
>> Maybe max_dispatch should not be a fixed value, but can be adjusted
>> according to device conditions.
>>
>> So we give a interface to control the maximum value of single dispatch
>> so that users can configure it according to devices characteristics.
>
> I agree that pulling 'nr_requests' out of the scheduler will kind of
> defeat the purpose of the scheduler to some extent. But rather than add
> another knob that nobody knows about or ever will touch (and extra queue
> variables that just take up space), why not just default to something a
> bit saner? Eg we could default to 1/8 or 1/4 of the scheduler depth
> instead.

Why not default to pulling what can actually be executed, that is, up to
the number of free hw tags / budget? Anything more than that will be
requeued anyway.

-- 
Damien Le Moal
Western Digital Research
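
To make the comparison between the three defaults discussed above concrete, here is a minimal userspace sketch. It is not kernel code and not a proposed patch: the enum, the function names, and the one-dispatch model (pull up to a limit, issue what the device has tags for, requeue the rest) are all assumptions made for illustration only.

```c
#include <assert.h>

/* Hypothetical policies for sizing one dispatch batch (illustrative names). */
enum batch_policy {
    BATCH_NR_REQUESTS,   /* current default described in the thread */
    BATCH_QUARTER_DEPTH, /* "1/4 of the scheduler depth" */
    BATCH_FREE_TAGS      /* "up to the number of free hw tags / budget" */
};

static inline unsigned int min_u(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/* Upper bound on requests pulled from the scheduler in one dispatch. */
static unsigned int batch_limit(enum batch_policy policy,
                                unsigned int nr_requests,
                                unsigned int free_tags)
{
    assert(nr_requests > 0);

    switch (policy) {
    case BATCH_QUARTER_DEPTH:
        /* Never let the limit drop to zero for tiny queue depths. */
        return nr_requests / 4 ? nr_requests / 4 : 1;
    case BATCH_FREE_TAGS:
        return min_u(nr_requests, free_tags);
    case BATCH_NR_REQUESTS:
    default:
        return nr_requests;
    }
}

/*
 * Simplified model of one dispatch: pull up to 'limit' requests, issue as
 * many as the device has free tags for, and requeue the remainder. The
 * return value is the number of requests left on the requeue list, i.e. the
 * requests a later high-priority arrival would have to wait behind.
 */
static unsigned int requeued_after_dispatch(unsigned int queued,
                                            unsigned int limit,
                                            unsigned int free_tags)
{
    unsigned int pulled = min_u(queued, limit);
    unsigned int issued = min_u(pulled, free_tags);

    return pulled - issued;
}
```

With 63 queued requests (as in the trace, INDEX 0 to 62) and assumed values of nr_requests = 64 and 2 free tags, this model leaves 61 requests requeued under the current default, 14 with a 1/4-depth limit, and none when the limit is the number of free tags.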