On Sat, Jan 5, 2019 at 7:42 PM Sage Weil <sweil@xxxxxxxxxx> wrote:
>
> On Sat, 5 Jan 2019, kefu chai wrote:
> > - unable to share the osdmap cache. considering a high-density storage
> > deployment, where more than 40 disks are squeezed into a host, if we
> > are not able to reuse the osdmap cache. that's a shame..
>
> I think this is a bit of a red herring. We could confine each OSD to a
> single core, each with its own distinct messenger, with no cross-core
> context switches in the IO path, but still put all (or many) OSDs inside
> the same process with a shared OSDMap cache. The semantics of sharing the
> cache are comparatively simple, with a immutable maps that are only
> occasionally added.

Agreed, resource sharing and engine isolation are somewhat related but
definitely not the same issue. From reduced sharing we expect better
performance and simplicity; from stronger isolation -- fewer bugs and
further cost reduction.

To exemplify: a perfect shared-nothing design allows switching all
std::shared_ptrs to non-atomic seastar::shared_ptrs (and the same goes
for the ceph::atomic wrapper in general ;-), while perfect engine
isolation (a single thread) would let us merge the patches without
turning over every rock for correctness validation -- correctness would
follow just from the definition of a data race. No bug caused by sharing
could slip through review unnoticed.

Without judging its reasonableness for now, I would just like to point
out that shared-something is theoretically possible even in the
1 OSD / 1 process approach, via shared memory. Before going further, let
me play the accountant's advocate and ask: is *spending* the complexity
on the shared cache really worth the benefits we could get? How much
memory can we actually save?

From yet another angle: multiple OSDs in the same process but with
*almost* no sharing would *still* allow for the user-space IO scheduler
Kefu pointed out some time ago.

Regards,
Radek
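
P.S. To make the shared_ptr point above concrete, a minimal sketch
(assuming Seastar's <seastar/core/shared_ptr.hh>; OSDMapStub,
ShardLocalMapCache and the member names are made-up placeholders, not
real crimson code) of what a strictly per-shard map cache could look
like:

// A minimal sketch, NOT actual crimson/Ceph code. It only illustrates
// trading the atomic refcount of std::shared_ptr for Seastar's
// non-atomic one once the cache is strictly per-shard.

#include <map>
#include <memory>                      // std::shared_ptr (thread-shared design)
#include <seastar/core/shared_ptr.hh>  // seastar::lw_shared_ptr (shared-nothing)

using epoch_t = unsigned;
struct OSDMapStub { epoch_t epoch = 0; /* encoded map, etc. */ };

// Thread-shared design: every copy/drop of a reference is an atomic
// inc/dec, paid even if most accesses happen on a single core.
using SharedMapRef = std::shared_ptr<OSDMapStub>;

// Shared-nothing design: the cache never leaves its reactor shard, so a
// plain, non-atomic refcount is enough.
using ShardMapRef = seastar::lw_shared_ptr<OSDMapStub>;

class ShardLocalMapCache {
  std::map<epoch_t, ShardMapRef> cache;  // owned by exactly one shard
public:
  ShardMapRef get(epoch_t e) {
    auto i = cache.find(e);
    if (i == cache.end()) {
      return {};         // map not cached on this shard
    }
    return i->second;    // non-atomic refcount bump
  }
  void put(epoch_t e, ShardMapRef m) {
    cache.emplace(e, std::move(m));
  }
};

Creating an entry becomes seastar::make_lw_shared<OSDMapStub>() instead
of std::make_shared, but callers otherwise look the same; the whole
difference is what each refcount bump costs.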
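
P.P.S. And just to show the shared-memory idea is mechanically possible
(not that it is worth the complexity!), a hypothetical sketch using
boost::interprocess -- nothing like this exists in Ceph, and the segment
name, sizes and FrozenMapBlob are invented for illustration:

#include <cassert>
#include <cstring>
#include <string>
#include <boost/interprocess/managed_shared_memory.hpp>

namespace bip = boost::interprocess;

struct FrozenMapBlob {
  unsigned epoch = 0;
  std::size_t length = 0;
  char data[1u << 20];   // encoded, immutable OSDMap payload
};

// Publisher: the OSD process that obtained the map from the monitors.
void publish(unsigned epoch, const char* buf, std::size_t len) {
  assert(len <= sizeof(FrozenMapBlob::data));
  bip::managed_shared_memory seg(bip::open_or_create, "osdmap-cache", 8u << 20);
  auto* blob = seg.construct<FrozenMapBlob>(std::to_string(epoch).c_str())();
  blob->epoch = epoch;
  blob->length = len;
  std::memcpy(blob->data, buf, len);
}

// Consumer: any other OSD process on the same host. The mapping has to
// outlive the returned pointer, hence the function-local static.
const FrozenMapBlob* lookup(unsigned epoch) {
  static bip::managed_shared_memory seg(bip::open_only, "osdmap-cache");
  return seg.find<FrozenMapBlob>(std::to_string(epoch).c_str()).first;
}

Real OSDMaps are variable-size and would need the segment's allocators,
lifetime/cleanup handling and versioned naming -- which is exactly the
complexity I am asking whether we want to pay for.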