Re: status of spdk

Haomai,
   Thanks a lot.

   Regards,
   James

Hi Changpeng,
   Would you mind updating us on the status of multi-process support in SPDK?

   Regards,
   James 

On 11/8/16, 8:59 PM, "Haomai Wang" <haomaiwang@xxxxxxxxx> wrote:

    On Wed, Nov 9, 2016 at 8:21 AM, LIU, Fei <james.liu@xxxxxxxxxxxxxxx> wrote:
    > Hi Yehuda and Haomai,
    >    The issue is that drives driven by SPDK cannot be shared by multiple OSDs the way a kernel NVMe drive can, since SPDK so far cannot be shared across multiple processes such as OSDs, right?
    
    Multi-process support for the SPDK NVMe driver is an ongoing SPDK feature;
    it will be implemented via shared memory among the processes.
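    At a high level it will look like DPDK's primary/secondary process model,
    where every process attaches to the same hugepage-backed shared memory.
    A rough, untested sketch of the underlying mechanism (the SPDK-level API
    for this is still being worked out, so only the DPDK side is shown, and
    the option values are just examples):

    #include <rte_eal.h>
    #include <stdio.h>

    int main(void)
    {
        /* The first process owning the device comes up as the primary; every
         * additional process passes "--proc-type=secondary" and maps the same
         * shared memory, selected by a common --file-prefix. */
        char *eal_args[] = {
            "osd-nvme",
            "--proc-type=secondary",
            "--file-prefix=spdk_shared",
        };
        int eal_argc = sizeof(eal_args) / sizeof(eal_args[0]);

        if (rte_eal_init(eal_argc, eal_args) < 0) {
            fprintf(stderr, "failed to attach to shared DPDK memory\n");
            return 1;
        }
        /* ...the NVMe driver state would then be shared via this memory... */
        return 0;
    }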
    
    >
    >    Regards,
    >    James
    >
    >
    >
    > On 11/8/16, 4:06 PM, "Yehuda Sadeh-Weinraub" <ceph-devel-owner@xxxxxxxxxxxxxxx on behalf of yehuda@xxxxxxxxxx> wrote:
    >
    >     On Tue, Nov 8, 2016 at 3:40 PM, Sage Weil <sweil@xxxxxxxxxx> wrote:
    >     > On Tue, 8 Nov 2016, Yehuda Sadeh-Weinraub wrote:
    >     >> I just started looking at spdk, and have a few comments and questions.
    >     >>
    >     >> First, it's not clear to me how we should handle the build. At the moment
    >     >> the spdk code resides as a submodule in the ceph tree, but it depends
    >     >> on dpdk, which currently needs to be downloaded separately. We can add
    >     >> it as a submodule (upstream is here: git://dpdk.org/dpdk). That being
    >     >> said, getting it to build was a bit tricky and I think it might be
    >     >> broken with cmake. In order to get it working I resorted to building a
    >     >> system library and using that.
    >     >
    >     > Note that this PR is about to merge
    >     >
    >     >         https://github.com/ceph/ceph/pull/10748
    >     >
    >     > which adds the DPDK submodule, so hopefully this issue will go away when
    >     > that merges, or with a follow-on cleanup.
    >     >
    >     >> The way to currently configure an osd to use bluestore with spdk is by
    >     >> creating a symbolic link that replaces the bluestore 'block' device to
    >     >> point to a file that has a name that is prefixed with 'spdk:'.
    >     >> Originally I assumed that the suffix would be the nvme device id, but
    >     >> it seems that it's not really needed; however, the file itself needs
    >     >> to contain the device id (see
    >     >> https://github.com/yehudasa/ceph/tree/wip-yehuda-spdk for a couple of
    >     >> minor fixes).
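    >     >> Concretely, the setup looks roughly like this (untested sketch; the
    >     >> paths, the file name and the device-id string are just placeholders
    >     >> for whatever the NVMe backend actually expects):
    >     >>
    >     >> #include <stdio.h>
    >     >> #include <unistd.h>
    >     >>
    >     >> int main(void)
    >     >> {
    >     >>     const char *osd_dir = "/var/lib/ceph/osd/ceph-0"; /* example osd data dir */
    >     >>     char target[256], link_path[256];
    >     >>
    >     >>     /* A file whose *name* carries the 'spdk:' prefix is what tells
    >     >>      * bluestore to use the SPDK backend instead of a kernel device. */
    >     >>     snprintf(target, sizeof(target), "%s/spdk:nvme0", osd_dir);
    >     >>     snprintf(link_path, sizeof(link_path), "%s/block", osd_dir);
    >     >>
    >     >>     /* The file itself has to contain the device id. */
    >     >>     FILE *f = fopen(target, "w");
    >     >>     if (!f)
    >     >>         return 1;
    >     >>     fprintf(f, "0000:01:00.0\n");   /* placeholder device id */
    >     >>     fclose(f);
    >     >>
    >     >>     /* 'block' now points at the spdk:-prefixed file. */
    >     >>     return symlink("spdk:nvme0", link_path) ? 1 : 0;
    >     >> }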
    >     >
    >     > Open a PR for those?
    >
    >     Sure
    >
    >     >
    >     >> As I understand it, in order to support multiple osds on the same NVMe
    >     >> device we have a few options. We can leverage NVMe namespaces, but
    >     >> that's not supported on all devices. We can configure bluestore to
    >     >> only use part of the device (device sharding? not sure if it supports
    >     >> it). I think it's best if we could keep bluestore out of the loop
    >     >> there and have the NVMe driver abstract multiple partitions of the
    >     >> NVMe device. The idea is to be able to define multiple partitions on
    >     >> the device (e.g., each partition will be defined by the offset, size,
    >     >> and namespace), and have the osd set to use a specific partition.
    >     >> We'll probably need a special tool to manage it, and potentially keep
    >     >> the partition table information on the device itself. The tool could
    >     >> also manage the creation of the block link. We should probably rethink
    >     >> how the link is structured and what it points at.
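    >     >> To make that concrete, each entry in such a partition table might look
    >     >> something like the following (purely illustrative; nothing of this
    >     >> exists yet, and field names/sizes are made up):
    >     >>
    >     >> #include <stdint.h>
    >     >>
    >     >> /* Hypothetical on-device partition table for carving one NVMe device
    >     >>  * into per-OSD partitions. */
    >     >> struct spdk_part_entry {
    >     >>     uint32_t nsid;          /* NVMe namespace the partition lives in */
    >     >>     uint64_t offset_bytes;  /* start of the partition in that namespace */
    >     >>     uint64_t size_bytes;    /* length of the partition */
    >     >>     char     owner[64];     /* e.g. "osd.3", so the tool and the block
    >     >>                              * link can find the right partition */
    >     >> };
    >     >>
    >     >> struct spdk_part_table {
    >     >>     uint32_t magic;                     /* identifies the table on disk */
    >     >>     uint32_t num_entries;
    >     >>     struct spdk_part_entry entries[16]; /* small fixed-size table kept
    >     >>                                          * on the device itself */
    >     >> };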
    >     >
    >     > I agree that bluestore shouldn't get involved.
    >     >
    >     > Are NVMe namespaces meant to support multiple processes sharing the
    >     > same hardware device?
    >
    >     More of a partitioning solution, but yes (as far as I understand).
    >
    >     >
    >     > Also, if you do that, is it possible to give one of the namespaces to the
    >     > kernel?  That might solve the bootstrapping problem we currently have
    >
    >     Theoretically, but not right now (or ever?). See here:
    >
    >     https://lists.01.org/pipermail/spdk/2016-July/000073.html
    >
    >     > where we have nowhere to put the $osd_data filesystem with the device
    >     > metadata.  (This is admittedly not necessarily a blocking issue.  Putting
    >     > those dirs on / wouldn't be the end of the world; it just means cards
    >     > can't be easily moved between boxes.)
    >     >
    >
    >     Maybe we can use bluestore for these too ;) That being said, there
    >     might be some kind of loopback solution that could work, but I'm not
    >     sure it wouldn't create major bottlenecks that we'd want to avoid.
    >
    >     Yehuda
    >
    >
    >
    
    
    
    -- 
    Best Regards,
    
    Wheat
    




