On Thu, May 8, 2014 at 8:17 PM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
> On 08/05/2014 12:44, Ming Lei wrote:
>>
>> On Wed, 07 May 2014 18:43:45 +0200
>> Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>>
>>>
>>> Per-CPU spinlocks have bad scalability problems, especially if you're
>>> overcommitting. Writing req_vq is not at all rare.
>>
>> OK, I thought about it further, and I believe a seqcount may be a good
>> match for this case; could you take a look at the patch below?
>>
>> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
>> index 13dd500..1adbad7 100644
>> --- a/drivers/scsi/virtio_scsi.c
>> +++ b/drivers/scsi/virtio_scsi.c
>> @@ -26,6 +26,7 @@
>>  #include <scsi/scsi_host.h>
>>  #include <scsi/scsi_device.h>
>>  #include <scsi/scsi_cmnd.h>
>> +#include <linux/seqlock.h>
>>
>>  #define VIRTIO_SCSI_MEMPOOL_SZ 64
>>  #define VIRTIO_SCSI_EVENT_LEN 8
>> @@ -73,18 +74,16 @@ struct virtio_scsi_vq {
>>   * queue, and also lets the driver optimize the IRQ affinity for the virtqueues
>>   * (each virtqueue's affinity is set to the CPU that "owns" the queue).
>>   *
>> - * tgt_lock is held to serialize reading and writing req_vq. Reading req_vq
>> - * could be done locklessly, but we do not do it yet.
>> + * tgt_seq is held to serialize reading and writing req_vq.
>>   *
>>   * Decrements of reqs are never concurrent with writes of req_vq: before the
>>   * decrement reqs will be != 0; after the decrement the virtqueue completion
>>   * routine will not use the req_vq so it can be changed by a new request.
>> - * Thus they can happen outside the tgt_lock, provided of course we make reqs
>> + * Thus they can happen outside the tgt_seq, provided of course we make reqs
>>   * an atomic_t.
>>   */
>>  struct virtio_scsi_target_state {
>> -        /* This spinlock never held at the same time as vq_lock. */
>> -        spinlock_t tgt_lock;
>> +        seqcount_t tgt_seq;
>>
>>          /* Count of outstanding requests. */
>>          atomic_t reqs;
>> @@ -521,19 +520,33 @@ static struct virtio_scsi_vq *virtscsi_pick_vq(struct virtio_scsi *vscsi,
>>          unsigned long flags;
>>          u32 queue_num;
>>
>> -        spin_lock_irqsave(&tgt->tgt_lock, flags);
>> +        local_irq_save(flags);
>> +        if (atomic_inc_return(&tgt->reqs) > 1) {
>> +                unsigned long seq;
>> +
>> +                do {
>> +                        seq = read_seqcount_begin(&tgt->tgt_seq);
>> +                        vq = tgt->req_vq;
>> +                } while (read_seqcount_retry(&tgt->tgt_seq, seq));
>> +        } else {
>> +                /* no writes can be concurrent because of atomic_t */
>> +                write_seqcount_begin(&tgt->tgt_seq);
>> +
>> +                /* keep the previous req_vq if a reader was found */
>> +                if (unlikely(atomic_read(&tgt->reqs) > 1)) {
>> +                        vq = tgt->req_vq;
>> +                        goto unlock;
>> +                }
>>
>>                  queue_num = smp_processor_id();
>>                  while (unlikely(queue_num >= vscsi->num_queues))
>>                          queue_num -= vscsi->num_queues;
>>                  tgt->req_vq = vq = &vscsi->req_vqs[queue_num];
>> + unlock:
>> +                write_seqcount_end(&tgt->tgt_seq);
>>          }
>> +        local_irq_restore(flags);
>
> I find this harder to think about than the double-check with a
> spin_lock_irqsave in the middle,

Sorry, could you explain that a bit more? With a seqcount, the spin_lock
that was previously needed to serialize readers and writers can be
dropped entirely.

> and the read side is not lock free anymore.

It is still lock-free: readers never block other readers, and both
read_seqcount_begin and read_seqcount_retry merely check whether a
writer is in progress or has completed in the meantime; the two
helpers are very cheap.
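To make the lock-free argument concrete, here is a rough user-space
sketch of what the two helpers boil down to, using C11 atomics. The
seq_* names are invented for illustration only, and the memory ordering
is simplified compared to the kernel's <linux/seqlock.h> implementation:

/*
 * Sketch of the seqcount protocol: the counter is odd while a writer
 * is active; a reader validates its snapshot by checking the counter
 * did not change across the read.
 */
#include <stdatomic.h>
#include <stdbool.h>

struct seqcount { atomic_uint sequence; };

/* Reader: snapshot the counter; an odd value means a writer is active. */
static unsigned seq_begin(struct seqcount *s)
{
        unsigned seq;

        while ((seq = atomic_load_explicit(&s->sequence,
                                           memory_order_acquire)) & 1)
                ;       /* writer in progress, spin until it finishes */
        return seq;
}

/* Reader: true if a writer started or completed since seq_begin(). */
static bool seq_retry(struct seqcount *s, unsigned seq)
{
        atomic_thread_fence(memory_order_acquire);
        return atomic_load_explicit(&s->sequence,
                                    memory_order_relaxed) != seq;
}

/* Writer (a single writer is assumed, as the atomic_t guarantees in
 * the patch above): counter becomes odd on entry, even again on exit. */
static void seq_write_begin(struct seqcount *s)
{
        atomic_fetch_add_explicit(&s->sequence, 1, memory_order_relaxed);
        atomic_thread_fence(memory_order_release);
}

static void seq_write_end(struct seqcount *s)
{
        atomic_thread_fence(memory_order_release);
        atomic_fetch_add_explicit(&s->sequence, 1, memory_order_relaxed);
}

/* Reader-side pattern, analogous to the tgt->req_vq loop above. */
static int read_shared(struct seqcount *s, const atomic_int *shared)
{
        unsigned seq;
        int v;

        do {
                seq = seq_begin(s);
                v = atomic_load_explicit(shared, memory_order_relaxed);
        } while (seq_retry(s, seq));
        return v;
}

On the common path with no concurrent writer, a reader costs two reads
of the same counter plus a barrier; the price is that a reader must
loop if it races with the (rare) rewrite of req_vq.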
Thanks,
--
Ming Lei