RE: status of spdk

Hi James,

Yes, multi-process support for SPDK is under development; Gang is the SPDK developer working on this feature.
We are targeting the SPDK 16.12 release (WW50) for it.


> -----Original Message-----
> From: LIU, Fei [mailto:james.liu@xxxxxxxxxxxxxxx]
> Sent: Wednesday, November 9, 2016 1:03 PM
> To: Haomai Wang <haomaiwang@xxxxxxxxx>; Liu, Changpeng
> <changpeng.liu@xxxxxxxxx>
> Cc: Yehuda Sadeh-Weinraub <yehuda@xxxxxxxxxx>; Sage Weil
> <sweil@xxxxxxxxxx>; ceph-devel <ceph-devel@xxxxxxxxxxxxxxx>
> Subject: Re: status of spdk
> 
> Haomai,
>    Thanks a lot.
> 
>    Regards,
>    James
> 
> Hi Changpeng,
>    Would you mind updating us on the status of multi-process support in
>    spdk?
> 
>    Regards,
>    James
> 
> On 11/8/16, 8:59 PM, "Haomai Wang" <haomaiwang@xxxxxxxxx> wrote:
> 
>     On Wed, Nov 9, 2016 at 8:21 AM, LIU, Fei <james.liu@xxxxxxxxxxxxxxx> wrote:
>     > Hi Yehuda and Haomai,
>     >    The issue is that a drive driven by SPDK cannot be shared by multiple
>     >    OSDs the way a kernel NVMe drive can, since SPDK so far runs as a single
>     >    process and cannot be shared across multiple processes such as OSDs, right?
> 
>     Multi-process support for spdk nvme is an ongoing spdk feature; it
>     will be implemented via shared memory among the participating processes.
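[Note: for illustration, the shared-memory approach builds on DPDK's multi-process EAL, where a secondary process attaches to hugepage memory that a primary process has already set up. The sketch below only shows that underlying DPDK pattern; the memzone name is a placeholder and the final SPDK multi-process API may well look different.]

    /* Rough sketch of the DPDK multi-process pattern the SPDK feature
     * builds on; not the actual SPDK implementation, which is still
     * under development. */
    #include <rte_eal.h>
    #include <rte_memzone.h>

    int main(void)
    {
        /* "--proc-type=secondary" makes the EAL map the shared memory
         * segments that a primary process already created. */
        char *eal_argv[] = { "osd", "--proc-type=secondary" };
        if (rte_eal_init(2, eal_argv) < 0)
            return 1;

        /* Look up a memzone created by the primary; the name here is
         * purely a placeholder for illustration. */
        const struct rte_memzone *mz = rte_memzone_lookup("spdk_nvme_shared");
        if (mz == NULL)
            return 1;

        /* mz->addr now points at memory shared with the primary process. */
        return 0;
    }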
> 
>     >
>     >    Regards,
>     >    James
>     >
>     >
>     >
>     > On 11/8/16, 4:06 PM, "Yehuda Sadeh-Weinraub" <ceph-devel-owner@xxxxxxxxxxxxxxx on behalf of yehuda@xxxxxxxxxx> wrote:
>     >
>     >     On Tue, Nov 8, 2016 at 3:40 PM, Sage Weil <sweil@xxxxxxxxxx> wrote:
>     >     > On Tue, 8 Nov 2016, Yehuda Sadeh-Weinraub wrote:
>     >     >> I just started looking at spdk, and have a few comments and questions.
>     >     >>
>     >     >> First, it's not clear to me how we should handle the build. At the moment
>     >     >> the spdk code resides as a submodule in the ceph tree, but it depends
>     >     >> on dpdk, which currently needs to be downloaded separately. We can add
>     >     >> it as a submodule (upstream is here: git://dpdk.org/dpdk). That being
>     >     >> said, getting it to build was a bit tricky and I think it might be
>     >     >> broken with cmake. In order to get it working I resorted to building a
>     >     >> system library and using that.
>     >     >
>     >     > Note that this PR is about to merge
>     >     >
>     >     >         https://github.com/ceph/ceph/pull/10748
>     >     >
>     >     > which adds the DPDK submodule, so hopefully this issue will go away when
>     >     > that merges, or with a follow-on cleanup.
>     >     >
>     >     >> The current way to configure an osd to use bluestore with spdk is to
>     >     >> replace the bluestore 'block' device with a symbolic link pointing to a
>     >     >> file whose name is prefixed with 'spdk:'. Originally I assumed that the
>     >     >> suffix would be the nvme device id, but it seems that it's not really
>     >     >> needed; however, the file itself needs to contain the device id (see
>     >     >> https://github.com/yehudasa/ceph/tree/wip-yehuda-spdk for a couple of
>     >     >> minor fixes).
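[Note: to make the 'spdk:' convention concrete, the sketch below shows roughly how such a link target could be recognized and the device id read out of the file. This is illustrative only, not the actual bluestore code; the function name and error handling are made up.]

    /* Illustrative only: detect an 'spdk:'-prefixed block link target and
     * read the NVMe device id stored in the file it points to. */
    #include <stdio.h>
    #include <string.h>

    #define SPDK_PREFIX "spdk:"

    /* 'target' is the path the 'block' symlink points at; on success the
     * device id is copied into 'dev_id'. */
    static int parse_spdk_block_link(const char *target, char *dev_id, size_t len)
    {
        /* the file *name* carries the prefix, so look at the basename */
        const char *base = strrchr(target, '/');
        base = base ? base + 1 : target;
        if (strncmp(base, SPDK_PREFIX, strlen(SPDK_PREFIX)) != 0)
            return -1;                        /* not an SPDK-backed device */

        FILE *f = fopen(target, "r");         /* the file body holds the id */
        if (!f || !fgets(dev_id, (int)len, f)) {
            if (f)
                fclose(f);
            return -1;
        }
        fclose(f);
        dev_id[strcspn(dev_id, "\n")] = '\0'; /* strip trailing newline */
        return 0;
    }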
>     >     >
>     >     > Open a PR for those?
>     >
>     >     Sure
>     >
>     >     >
>     >     >> As I understand it, in order to support multiple osds on the same NVMe
>     >     >> device we have a few options. We can leverage NVMe namespaces, but
>     >     >> that's not supported on all devices. We can configure bluestore to
>     >     >> only use part of the device (device sharding? not sure if it supports
>     >     >> it). I think it's best if we could keep bluestore out of the loop
>     >     >> there and have the NVMe driver abstract multiple partitions of the
>     >     >> NVMe device. The idea is to be able to define multiple partitions on
>     >     >> the device (e.g., each partition will be defined by the offset, size,
>     >     >> and namespace), and have the osd set to use a specific partition.
>     >     >> We'll probably need a special tool to manage it, and potentially keep
>     >     >> the partition table information on the device itself. The tool could
>     >     >> also manage the creation of the block link. We should probably rethink
>     >     >> how the link is structured and what it points at.
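[Note: to make the partitioning proposal concrete, one possible shape for the on-device table such a tool would manage is sketched below. Nothing like this exists in SPDK or bluestore today; the struct layout, field names, and sizes are all hypothetical.]

    /* Hypothetical on-device partition table for sharing one NVMe device
     * among several OSDs; purely a sketch of the proposal above. */
    #include <stdint.h>

    #define SPDK_PT_MAX_PARTS 16

    struct spdk_partition {
        uint32_t nsid;        /* NVMe namespace the partition lives in */
        uint64_t offset_lba;  /* first LBA of the partition            */
        uint64_t num_lbas;    /* length of the partition in LBAs       */
        char     owner[64];   /* e.g. the osd fsid using this slice    */
    };

    struct spdk_partition_table {
        uint64_t magic;       /* identifies the table on the device    */
        uint32_t version;
        uint32_t num_partitions;
        struct spdk_partition parts[SPDK_PT_MAX_PARTS];
    };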
>     >     >
>     >     > I agree that bluestore shouldn't get involved.
>     >     >
>     >     > Are NVMe namespaces meant to support multiple processes sharing the
>     >     > same hardware device?
>     >
>     >     More of a partitioning solution, but yes (as far as I understand).
>     >
>     >     >
>     >     > Also, if you do that, is it possible to give one of the namespaces to the
>     >     > kernel?  That might solve the bootstrapping problem we currently have
>     >
>     >     Theoretically, but not right now (or ever?). See here:
>     >
>     >     https://lists.01.org/pipermail/spdk/2016-July/000073.html
>     >
>     >     > where we have nowhere to put the $osd_data filesystem with the device
>     >     > metadata.  (This is admittedly not necessarily a blocking issue.  Putting
>     >     > those dirs on / wouldn't be the end of the world; it just means cards
>     >     > can't be easily moved between boxes.)
>     >     >
>     >
>     >     Maybe we can use bluestore for these too ;) That being said, there
>     >     might be some kind of loopback solution that could work, but I'm not
>     >     sure it wouldn't create major bottlenecks that we'd want to avoid.
>     >
>     >     Yehuda
>     >
>     >
>     >
> 
> 
> 
>     --
>     Best Regards,
> 
>     Wheat
> 
> 
