On 10/5/18 11:46 PM, Paolo Valente wrote:
Il giorno 06 ott 2018, alle ore 05:12, Bart Van Assche <bvanassche@xxxxxxx> ha scritto:
On 10/5/18 2:16 AM, Jan Kara wrote:
On Thu 04-10-18 15:42:52, Bart Van Assche wrote:
What I think is missing is measurement results for BFQ on a system with
multiple CPU sockets and against a fast storage medium. Eliminating
the host lock from the SCSI core yielded a significant performance
improvement for such storage devices. Since the BFQ scheduler locks and
unlocks bfqd->lock for every dispatch operation it is very likely that BFQ
will slow down I/O for fast storage devices, even if their driver only
creates a single hardware queue.
Well, I'm not sure why that is missing. I don't think anyone proposed to
default to BFQ for such setup? Neither was anyone claiming that BFQ is
better in such situation... The proposal has been: Default to BFQ for slow
storage, leave it to deadline-mq otherwise.
How do you define slow storage? The proposal at the start of this thread
was to make BFQ the default for all block devices that create a single
hardware queue. That includes all SATA storage since scsi-mq only creates
a single hardware queue when using the SATA protocol. The proposal to make >> BFQ the default for systems with a single hard disk probably makes sense
but I am not sure that making BFQ the default for systems equipped with
one or more (SATA) SSDs is also a good idea. Especially for multi-socket
systems since BFQ reintroduces a queue-wide lock.
No, BFQ has no queue-wide lock. The very first change made to BFQ for
porting it to blk-mq was to remove the queue lock. Guided by Jens, I
replaced that lock with the exact, same scheduler lock used in
mq-deadline.
It's easy to see that both mq-deadline and BFQ define a queue-wide lock.
For mq-deadline its deadline_data.lock. For BFQ it's bfq_data.lock. That
last lock serializes all bfq_dispatch_request() calls and hence reduces
concurrency while processing I/O requests. From bfq_dispatch_request():
static struct request *bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
{
struct bfq_data *bfqd = hctx->queue->elevator->elevator_data;
[ ... ]
spin_lock_irq(&bfqd->lock);
[ ... ]
}
I think the above makes it very clear that bfqd->lock is queue-wide.
It is easy to understand why both I/O schedulers need a queue-wide lock:
the only way to avoid race conditions when considering all pending I/O
requests for scheduling decisions is to use a lock that covers all
pending requests and hence that is queue-wide.
Bart.