Hi Ming & Co,

On Thu, 2015-09-10 at 10:28 -0700, Ming Lin wrote:
> On Thu, 2015-09-10 at 15:38 +0100, Stefan Hajnoczi wrote:
> > On Thu, Sep 10, 2015 at 6:48 AM, Ming Lin <mlin@xxxxxxxxxx> wrote:
> > > These 2 patches added virtio-nvme to kernel and qemu,
> > > basically modified from virtio-blk and nvme code.
> > >
> > > As title said, request for your comments.

<SNIP>

> > At first glance it seems like the virtio_nvme guest driver is just
> > another block driver like virtio_blk, so I'm not clear why a
> > virtio-nvme device makes sense.
>
> I think the future "LIO NVMe target" only speaks NVMe protocol.
>
> Nick (CCed), could you correct me if I'm wrong?
>
> For SCSI stack, we have:
> virtio-scsi (guest)
> tcm_vhost (or vhost_scsi, host)
> LIO-scsi-target
>
> For NVMe stack, we'll have similar components:
> virtio-nvme (guest)
> vhost_nvme (host)
> LIO-NVMe-target
>

I think it's more interesting to consider a 'vhost style' driver that
can be used with unmodified nvme host OS drivers.

Dr. Hannes (CC'ed) did something like this for megasas a few years back
using specialized QEMU emulation + an eventfd-based LIO fabric driver,
and got it working with Linux + MSFT guests.

Doing something similar for nvme would (potentially) be on par with
current virtio-scsi + vhost-scsi small-block performance for scsi-mq
guests, without the extra burden of a new command-set-specific virtio
driver.

> > > Now there is a lot of duplicated code between linux/nvme-core.c
> > > and qemu/nvme.c. The ideal result is to have a multi-level NVMe
> > > stack (similar to SCSI), so we can re-use the nvme code, for
> > > example:
> > >
> > >               .-------------------------.
> > >               | NVMe device register    |
> > >   Upper level | NVMe protocol process   |
> > >               |                         |
> > >               '-------------------------'
> > >
> > >
> > >               .-----------.  .-----------.  .------------------.
> > >   Lower level |   PCIe    |  |  VIRTIO   |  |NVMe over Fabrics |
> > >               |           |  |           |  |initiator         |
> > >               '-----------'  '-----------'  '------------------'
> >
> > You mentioned LIO and SCSI. How will NVMe over Fabrics be integrated
> > into LIO? If it is mapped to SCSI then using virtio_scsi in the guest
> > and tcm_vhost should work.
>
> I think it's not mapped to SCSI.
>
> Nick, would you share more here?
>

(Adding Dave M. CC')

So NVMe target code needs to function in at least two different modes:

- Direct mapping of nvme backend driver provided hw queues to nvme
  fabric driver provided hw queues.

- Decoding of the NVMe command set for basic Read/Write/Flush I/O, for
  submission to existing backend drivers (eg: iblock, fileio, rd_mcp).

In the former case, it's safe to assume anywhere from a very small
amount of code to no code at all is involved in fast-path operation.

For more involved logic like PR, ALUA, and EXTENDED_COPY, I think both
modes will still most likely handle some aspects of this in software,
and not entirely behind a backend nvme host hw interface.

--nab
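
To make the "multi-level NVMe stack" idea in the quoted diagram a bit
more concrete, here is a rough header-style sketch of what a split
between a shared upper level and pluggable lower-level transports could
look like. Every name below (nvme_transport_ops, nvme_register_ctrl,
and so on) is invented for illustration only; this is not the interface
of any existing driver.

/*
 * Hypothetical sketch only: a shared upper level that owns NVMe device
 * registration and protocol processing, parameterized by a lower-level
 * transport (PCIe, virtio, or an NVMe-over-Fabrics initiator).  None
 * of these names come from actual kernel code.
 */
struct nvme_queue;
struct nvme_command;
struct nvme_completion;

struct nvme_transport_ops {
	const char *name;	/* "pcie", "virtio", "fabrics", ... */

	/* Allocate/teardown the transport-specific hw queues. */
	int  (*setup_queues)(void *transport_priv, unsigned int nr_queues);
	void (*teardown_queues)(void *transport_priv);

	/* Post one NVMe command; completion is reported via the callback. */
	int  (*submit_cmd)(struct nvme_queue *q, struct nvme_command *cmd,
			   void (*complete)(struct nvme_completion *cqe,
					    void *ctx),
			   void *ctx);
};

/*
 * The shared upper level registers a controller against a transport and
 * then drives admin/IO command processing through ops->submit_cmd(),
 * without caring whether the queues live behind PCIe BARs, a virtqueue,
 * or a fabric connection.
 */
int nvme_register_ctrl(const struct nvme_transport_ops *ops,
		       void *transport_priv);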
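
Likewise, the second target mode above (decoding the NVMe command set
for basic Read/Write/Flush I/O) essentially reduces to a small opcode
switch in front of whatever backend is configured. The sketch below
uses hypothetical lio_backend_ops callbacks to stand in for
iblock/fileio/rd_mcp; only the NVM opcode values and the 0's-based NLB
field come from the NVMe spec, everything else is made up for
illustration.

#include <stdint.h>

enum nvme_io_opcode {
	NVME_CMD_FLUSH = 0x00,
	NVME_CMD_WRITE = 0x01,
	NVME_CMD_READ  = 0x02,
};

struct nvme_rw_cmd {
	uint8_t  opcode;
	uint32_t nsid;		/* namespace ID */
	uint64_t slba;		/* starting LBA */
	uint16_t nlb;		/* number of logical blocks, 0's based */
	void	*data;		/* payload buffer (already mapped) */
};

/* Stand-in for an existing backend such as iblock, fileio or rd_mcp. */
struct lio_backend_ops {
	int (*read)(uint32_t nsid, uint64_t lba, uint32_t nblocks, void *buf);
	int (*write)(uint32_t nsid, uint64_t lba, uint32_t nblocks, void *buf);
	int (*flush)(uint32_t nsid);
};

static int nvme_decode_and_submit(const struct nvme_rw_cmd *cmd,
				  const struct lio_backend_ops *be)
{
	uint32_t nblocks = (uint32_t)cmd->nlb + 1;	/* NLB is 0's based */

	switch (cmd->opcode) {
	case NVME_CMD_READ:
		return be->read(cmd->nsid, cmd->slba, nblocks, cmd->data);
	case NVME_CMD_WRITE:
		return be->write(cmd->nsid, cmd->slba, nblocks, cmd->data);
	case NVME_CMD_FLUSH:
		return be->flush(cmd->nsid);
	default:
		return -1;	/* opcodes beyond R/W/Flush not handled here */
	}
}

Anything beyond that fast path (reservations, namespace management,
EXTENDED_COPY-style offload) would sit above this switch in software,
as noted at the end of the mail.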