Re: [PATCH 0/6] tcm_vhost/virtio-scsi WIP code for-3.6

Hi folks,

On Wed, 2012-07-04 at 18:52 -0700, Nicholas A. Bellinger wrote:
> 
> To give an idea of how things are looking on the performance side, here
> some initial numbers for small block (4k) mixed random IOPs using the
> following fio test setup:

<SNIP>

> fio randrw workload | virtio-scsi-raw | virtio-scsi+tcm_vhost | bare-metal raw block
> ------------------------------------------------------------------------------------
> 25 Write / 75 Read  |      ~15K       |         ~45K          |         ~70K
> 75 Write / 25 Read  |      ~20K       |         ~55K          |         ~60K
> 
> 

After checking the original benchmarks here again, I realized that for
virtio-scsi+tcm_vhost the results were actually switched.

So this should have been: heavier READ case (25 / 75) == 55K, and
heavier WRITE case (75 / 25) == 45K.
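
As a point of reference, the fio setup snipped above was a small-block
mixed random workload.  A minimal job file expressing the 25 Write /
75 Read mix would look something like the following -- the filename,
ioengine, iodepth, and numjobs values here are illustrative guesses,
not the original settings:

  ; hypothetical fio job approximating the 4k randrw 25/75 mix above;
  ; filename, ioengine, iodepth, and numjobs are guesses
  [randrw-4k]
  filename=/dev/sdb
  rw=randrw
  rwmixread=75
  bs=4k
  direct=1
  ioengine=libaio
  iodepth=32
  numjobs=4
  runtime=60
  time_based
  group_reporting

Flipping rwmixread=75 to rwmixread=25 gives the heavier WRITE mix from
the second row.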

> In the first case, virtio-scsi+tcm_vhost is outperforming
> virtio-scsi-raw (QEMU SCSI emulation with the same raw flash backend
> device) by 3x.  For the second, heavier WRITE case, tcm_vhost is
> nearing full bare-metal utilization (~55K vs. ~60K).
> 
> Also, converting tcm_vhost to use proper cmwq process-context I/O
> submission will help get even closer to bare-metal speeds for both
> workloads.
> 

Here are initial follow-up virtio-scsi randrw 4k benchmarks, with
tcm_vhost now converted to run backend I/O dispatch via modern cmwq
primitives (kworker threads).

fio randrw 4k workload | virtio-scsi+tcm_vhost+cmwq
---------------------------------------------------
  25 Write / 75 Read   |          ~60K
  75 Write / 25 Read   |          ~45K
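
For anyone curious what the conversion looks like in practice, the
pattern is simply to hand each submission to a work item instead of
dispatching it inline from the vhost thread.  A minimal sketch of the
cmwq pattern follows -- the structure, field, and workqueue names here
are hypothetical, not the actual tcm_vhost code:

#include <linux/errno.h>
#include <linux/workqueue.h>

/* Hypothetical per-command context; the real tcm_vhost structure differs. */
struct vhost_scsi_cmd {
        struct work_struct work;
        /* ... descriptor state, scatterlists, etc. ... */
};

static struct workqueue_struct *vhost_scsi_wq;

/* Runs in kworker process context, so the backend submission can sleep. */
static void vhost_scsi_submission_work(struct work_struct *work)
{
        struct vhost_scsi_cmd *cmd =
                container_of(work, struct vhost_scsi_cmd, work);

        /* ... submit cmd to the target core backend here ... */
}

static int vhost_scsi_wq_init(void)
{
        /* Default (bound, per-cpu) workqueue: each work item runs on
         * the CPU that queued it, which is where the cache locality
         * discussed below comes from. */
        vhost_scsi_wq = alloc_workqueue("vhost_scsi_wq", 0, 0);
        return vhost_scsi_wq ? 0 : -ENOMEM;
}

/* Called from the vhost thread after pulling a request off the vq. */
static void vhost_scsi_queue_cmd(struct vhost_scsi_cmd *cmd)
{
        INIT_WORK(&cmd->work, vhost_scsi_submission_work);
        queue_work(vhost_scsi_wq, &cmd->work);
}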

So aside from the minor performance improvement for the 25 / 75
workload, the other main improvement is lower host CPU usage with the
iomemory_vsl backends.  This is attributed to cmwq providing process
context on the same core as the vhost thread pulling items off the vq,
which works out to roughly 1/3 less host CPU usage (for both
workloads), primarily from positive cache effects.
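
Note that the same-core behavior is just the cmwq default for a bound
workqueue.  As a point of contrast (again a sketch, not tcm_vhost
code), creating the workqueue with WQ_UNBOUND would let the scheduler
place the worker on any CPU, giving up exactly that locality:

/* Hypothetical contrast: an unbound workqueue trades the same-CPU
 * execution of the default case for scheduler placement freedom,
 * so the producer/consumer cache sharing above would be lost. */
vhost_scsi_wq = alloc_workqueue("vhost_scsi_wq", WQ_UNBOUND, 0);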

This patch is now available in target-pending/tcm_vhost, and I'll be
respinning the initial merge series into for-next-merge over the next
few days, along with another round of list review.

Please let us know if you have any concerns.

Thanks!

--nab
