On 2018/10/03 16:18, Linus Walleij wrote:
> On Wed, Oct 3, 2018 at 9:05 AM Artem Bityutskiy <dedekind1@xxxxxxxxx> wrote:
>> On Wed, 2018-10-03 at 08:29 +0200, Paolo Valente wrote:
>>> So, I do understand your need for conservativeness, but, after so much
>>> evidence on single-queue devices, and so many years! :), what's the
>>> point in keeping Linux worse for virtually everybody, by default?
>>
>> Sounds like what we just need a mechanism for the device (ubi block in
>> this case) to select the I/O scheduler. I doubt enhancing the default
>> scheduler selection logic in 'elevator.c' is the right answer. Just
>> give the driver authority to override the defaults.
>
> This might be true in the wider sense (like for what scheduler to
> select for an NVME device with N channels) but $SUBJECT is just
> trying to select BFQ (if available) for devices with one and only one
> hardware queue.
>
> That is AFAICT the only reasonable choice for anything with just
> one hardware queue as things stand right now.
>
> I have a slight reservation for the weird outliers like loopdev, which
> has "one hardware queue" (.nr_hw_queues == 1) though this
> makes no sense at all. So I would like to know what people think
> about that. Maybe we should have .nr_queues and .nr_hw_queues
> where the former is the number of logical queues and the latter
> the actual number of hardware queues.

There is another class of outliers: host-managed SMR disks (SATA and
SCSI, definitely single hw queue). For these, using mq-deadline is
mandatory in many cases in order to guarantee in-order delivery of
sequential write commands to the device driver. Changing the default to
bfq, which as far as I know is not SMR friendly (can sequential writes
within a single zone be reordered?), is asking for trouble (unaligned
write errors showing up).
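To make the concern concrete, here is a minimal Python sketch of a
selection policy that looks at the zoned model before the queue count.
It is only an illustration, not what elevator.c or any existing udev
rule does: pick_scheduler() and parse_schedulers() are hypothetical
helper names, and the inputs are assumed to come from the sysfs
attributes /sys/block/<dev>/queue/zoned and
/sys/block/<dev>/queue/scheduler. How the hardware queue count is
obtained is left to the caller.

```python
def parse_schedulers(text):
    """Parse the content of queue/scheduler, e.g. 'mq-deadline kyber [bfq] none'.

    The active scheduler is shown in brackets; the brackets are stripped
    so that the result is a plain list of available scheduler names.
    """
    return [tok.strip("[]") for tok in text.split()]


def pick_scheduler(zoned_model, nr_hw_queues, available):
    """Return the preferred scheduler name, or None to keep the current one.

    zoned_model  -- content of queue/zoned: 'none', 'host-aware' or 'host-managed'
    nr_hw_queues -- number of hardware queues (assumed to be supplied by the caller)
    available    -- list of scheduler names, as returned by parse_schedulers()
    """
    # Host-managed zoned devices need in-order delivery of sequential
    # writes per zone, so the zoned model is checked before anything else.
    if zoned_model == "host-managed":
        return "mq-deadline" if "mq-deadline" in available else None
    # Assumed policy from $SUBJECT: bfq for single hardware queue devices.
    if nr_hw_queues == 1 and "bfq" in available:
        return "bfq"
    return None
```

With this ordering, a single-queue host-managed SMR disk gets
mq-deadline even when bfq is available, while an ordinary single-queue
disk gets bfq.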

A while back, we already had this discussion with Jens and Christoph on
the list to allow device drivers to set a sensible default I/O scheduler
for devices with "special needs" (e.g. host-managed SMR). At the time,
the conclusion was that udev (or something similar in userland) is
better suited to set a correct scheduler.

Of note also is that host-managed-like sequential zone devices are
likely to show up soon with the work being done in the NVMe standard on
the new "Zoned namespace" feature proposal. These devices will also
require a scheduler like mq-deadline that guarantees per-zone in-order
delivery of sequential write requests.

Looking only at the number of queues of the device is not enough to
choose the best (most reasonable/appropriate) scheduler.

-- 
Damien Le Moal
Western Digital Research