Re: single-threaded seastar-osd

On 2019-01-09 5:30 p.m., Matt Benjamin wrote:
On Tue, Jan 8, 2019 at 8:32 PM Radoslaw Zarzynski <rzarzyns@xxxxxxxxxx> wrote:


<snipped makes-sense-to-me-stuff>

Personally I perceive the OSD *concept* as a networked ObjectStore instance
exposed over the RADOS protocol.


I remain concerned that this framing is too strong.  Recall that
well before the seastar-osd concept, several teams (Mellanox, folks on
my team, Fujitsu/Piotr, and I think Sam) had asked to flex in the
other direction--mapping a reduced number of network connections to
OSDs.

That's still the case. I have a test cluster consisting of 6 hosts, 66 OSDs in total and 1536 PGs. How many connections are maintained by the ceph-osd processes on a single host? Almost 1700:

# netstat -ntp | grep -c ceph-osd
1650

How many connections are maintained by a randomly chosen ceph-osd process?

# netstat -ntp | grep -c 354420/ceph-osd
158

The problem remains, even if it's reduced by the transition to the async messenger (fewer threads and less CPU time wasted on context switches) and by the transition to all-NVMe clusters, which by definition pack fewer OSDs per host (usually 2-3 per NVMe).
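As a rough back-of-the-envelope check (my own estimate, not part of the measurement above): with 66 OSDs each OSD has up to 65 peers, and the cluster messenger plus the two heartbeat messengers (hb_front and hb_back) each keep a TCP connection per peer, so on the order of 65 x 3 ~= 195 connections per OSD before counting any client or monitor sessions -- the same order of magnitude as the ~158 observed above, and it grows linearly with cluster size.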

When InfiniBand RC is the transport with Mellanox ConnectX-3 or
-X4, each reliable connection consumes one queue pair, and there
are 64K QPs available -in total- on the HCA.  Solutions are in the
direction of UD (unreliable datagram) or hybridizing with a shared
receive queue (SRQ).  I'm not arguing that a message-passing/datagram
orientation should somehow take precedence, but I think we need to
make space for those setups in what we design now.  Treating any
incidence of cross-core communication as an intolerable event feels
problematic for that?
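To make the SRQ point concrete, here is a minimal libibverbs sketch (illustrative only, not Ceph messenger code) of hanging RC queue pairs off one shared receive queue, assuming a protection domain pd and completion queue cq already exist:

#include <infiniband/verbs.h>

/* Illustrative only: attach many RC QPs to a single shared receive
 * queue so receive buffers are pooled instead of per-connection. */
struct ibv_srq *make_srq(struct ibv_pd *pd)
{
    struct ibv_srq_init_attr attr = {
        .attr = { .max_wr = 4096, .max_sge = 1 },
    };
    return ibv_create_srq(pd, &attr);
}

struct ibv_qp *make_rc_qp_on_srq(struct ibv_pd *pd, struct ibv_cq *cq,
                                 struct ibv_srq *srq)
{
    struct ibv_qp_init_attr attr = {
        .send_cq = cq,
        .recv_cq = cq,
        .srq     = srq,   /* all QPs post receives from the shared queue */
        .cap     = { .max_send_wr = 256, .max_send_sge = 1 },
        .qp_type = IBV_QPT_RC,
    };
    return ibv_create_qp(pd, &attr);
}

Each connection still costs one QP out of the 64K; what the SRQ shares is the receive-side buffering, which is why going further toward UD is the other half of the picture.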

+1 for going with datagram/connectionless. Heartbeats alone could be moved to connectionless communication and that would already help a lot.
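For heartbeats, a plain-UDP sketch (my own illustration, not how the Ceph heartbeat messenger actually works) shows how little state a connectionless probe needs -- one socket can ping any number of peers with no per-peer connection or queue pair:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sketch: send one heartbeat datagram; no per-peer state is kept. */
static int send_heartbeat(int fd, const char *peer_ip, uint16_t peer_port,
                          uint32_t osd_id, uint64_t seq)
{
    struct sockaddr_in peer;
    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(peer_port);
    if (inet_pton(AF_INET, peer_ip, &peer.sin_addr) != 1)
        return -1;

    char msg[64];
    int len = snprintf(msg, sizeof(msg), "osd.%u hb_ping seq=%llu",
                       (unsigned)osd_id, (unsigned long long)seq);

    return sendto(fd, msg, len, 0,
                  (struct sockaddr *)&peer, sizeof(peer)) < 0 ? -1 : 0;
}

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) {
        perror("socket");
        return 1;
    }
    /* Peer address and port are hypothetical, for illustration only. */
    send_heartbeat(fd, "127.0.0.1", 6820, 1, 42);
    close(fd);
    return 0;
}

Failure detection would still need timeouts and sequence tracking on top of this, but none of it requires a reliable connection per peer.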

--
Piotr Dałek
piotr.dalek@xxxxxxxxxxxx
https://www.ovhcloud.com


