On 12/23/22 00:21, Paolo Valente wrote: > Single-LUN multi-actuator SCSI drives, as well as all multi-actuator > SATA drives appear as a single device to the I/O subsystem [1]. Yet > they address commands to different actuators internally, as a function > of Logical Block Addressing (LBAs). A given sector is reachable by > only one of the actuators. For example, Seagate’s Serial Advanced > Technology Attachment (SATA) version contains two actuators and maps > the lower half of the SATA LBA space to the lower actuator and the > upper half to the upper actuator. > > Evidently, to fully utilize actuators, no actuator must be left idle > or underutilized while there is pending I/O for it. The block layer > must somehow control the load of each actuator individually. This > commit lays the ground for allowing BFQ to provide such a per-actuator > control. > > BFQ associates an I/O-request sync bfq_queue with each process doing > synchronous I/O, or with a group of processes, in case of queue > merging. Then BFQ serves one bfq_queue at a time. While in service, a > bfq_queue is emptied in request-position order. Yet the same process, > or group of processes, may generate I/O for different actuators. In > this case, different streams of I/O (each for a different actuator) > get all inserted into the same sync bfq_queue. So there is basically > no individual control on when each stream is served, i.e., on when the > I/O requests of the stream are picked from the bfq_queue and > dispatched to the drive. > > This commit enables BFQ to control the service of each actuator > individually for synchronous I/O, by simply splitting each sync > bfq_queue into N queues, one for each actuator. In other words, a sync > bfq_queue is now associated to a pair (process, actuator). As a > consequence of this split, the per-queue proportional-share policy > implemented by BFQ will guarantee that the sync I/O generated for each > actuator, by each process, receives its fair share of service. > > This is just a preparatory patch. If the I/O of the same process > happens to be sent to different queues, then each of these queues may > undergo queue merging. To handle this event, the bfq_io_cq data > structure must be properly extended. In addition, stable merging must > be disabled to avoid loss of control on individual actuators. Finally, > also async queues must be split. These issues are described in detail > and addressed in next commits. As for this commit, although multiple > per-process bfq_queues are provided, the I/O of each process or group > of processes is still sent to only one queue, regardless of the > actuator the I/O is for. The forwarding to distinct bfq_queues will be > enabled after addressing the above issues. > > [1] https://www.linaro.org/blog/budget-fair-queueing-bfq-linux-io-scheduler-optimizations-for-multi-actuator-sata-hard-drives/ > > Signed-off-by: Gabriele Felici <felicigb@xxxxxxxxx> > Signed-off-by: Carmine Zaccagnino <carmine@xxxxxxxxxxxxxxx> > Signed-off-by: Paolo Valente <paolo.valente@xxxxxxxxxx> One styles nit below. Reviewed-by: Damien Le Moal <damien.lemoal@xxxxxxxxxxxxxxxxxx> > @@ -690,14 +700,25 @@ static void bfq_limit_depth(blk_opf_t opf, struct blk_mq_alloc_data *data) > limit = (limit * depth) >> bfqd->full_depth_shift; > } > > - /* > - * Does queue (or any parent entity) exceed number of requests that > - * should be available to it? Heavily limit depth so that it cannot > - * consume more available requests and thus starve other entities. > - */ > - if (bfqq && bfqq_request_over_limit(bfqq, limit)) > - depth = 1; > + for (act_idx = 0; act_idx < bfqd->num_actuators; act_idx++) { > + struct bfq_queue *bfqq; > + > + if (bic) > + bfqq = bic_to_bfqq(bic, op_is_sync(opf), act_idx); > + else > + break; > > + /* > + * Does queue (or any parent entity) exceed number of > + * requests that should be available to it? Heavily > + * limit depth so that it cannot consume more > + * available requests and thus starve other entities. > + */ > + if (bfqq && bfqq_request_over_limit(bfqq, limit)) { > + depth = 1; > + break; > + } You could reverse the if condition to make this cleaner, or even better, include the bic test in the for loop: for (act_idx = 0; bic && act_idx < bfqd->num_actuators; act_idx++) { struct bfq_queue *bfqq; /* * Does queue (or any parent entity) exceed number of * requests that should be available to it? Heavily * limit depth so that it cannot consume more * available requests and thus starve other entities. */ bfqq = bic_to_bfqq(bic, op_is_sync(opf), act_idx); if (bfqq && bfqq_request_over_limit(bfqq, limit)) { depth = 1; break; } } -- Damien Le Moal Western Digital Research