On Thu, Jan 26, 2023 at 02:48:27PM +0100, Ilya Dryomov wrote:
> On Wed, Jan 25, 2023 at 5:57 PM Stefan Hajnoczi <stefanha@xxxxxxxxxx> wrote:
> >
> > Hi,
> > What sort of memory usage is expected under heavy I/O to an rbd block
> > device with O_DIRECT?
> >
> > For example:
> > - Page cache: none (O_DIRECT)
> > - Socket snd/rcv buffers: yes
>
> Hi Stefan,
>
> There is a socket open to each OSD (object storage daemon). A Ceph
> cluster may have tens, hundreds or even thousands of OSDs (although the
> latter is rare -- usually folks end up with several smaller clusters
> instead of a single large cluster). Under heavy random I/O and given
> a big enough RBD image, it's reasonable to assume that most if not all
> OSDs would be involved and therefore their sessions would be active.
>
> A thing to note is that, by default, OSD sessions are shared between
> RBD devices. So as long as all RBD images that are mapped on a node
> belong to the same cluster, the same set of sockets would be used.
>
> Idle OSD sockets get closed after 60 seconds of inactivity.
>
> > - Internal rbd buffers?
> >
> > I am trying to understand how similar Linux rbd block devices behave
> > compared to local block device memory consumption (like NVMe PCI).
>
> RBD doesn't do any internal buffering. Data is read from/written to
> the wire directly to/from BIO pages. The only exception to that is the
> "secure" mode -- built-in encryption for the Ceph on-the-wire protocol.
> In that case the data is buffered, partly because RBD obviously can't
> mess with plaintext data in the BIO and partly because the Linux kernel
> crypto API isn't flexible enough.
>
> There is some memory overhead associated with each I/O (OSD request
> metadata encoding, mostly). It's surely larger than in the NVMe PCI
> case. I don't have the exact number but it should be less than 4K per
> I/O in almost all cases. This memory comes out of private SLAB caches
> and could be reclaimable had we set SLAB_RECLAIM_ACCOUNT on them.

Thanks, this information is very useful. I was trying to get a sense of
whether to look deeper into the rbd driver in an OOM kill scenario.

Stefan
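For readers who want to reproduce the workload in question, here is a
minimal userspace sketch of the O_DIRECT access pattern being discussed.
The device path /dev/rbd0 and the 4 KiB logical block size are
assumptions; O_DIRECT requires the buffer, offset and length to be
aligned to the device's logical block size, so the page cache is
bypassed entirely.

	/* direct_write.c: one aligned O_DIRECT write to a mapped RBD device */
	#define _GNU_SOURCE
	#include <fcntl.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <unistd.h>

	int main(void)
	{
		const size_t len = 4096;	/* one block, assumed 4 KiB */
		void *buf;
		int fd;

		/* O_DIRECT demands an aligned user buffer. */
		if (posix_memalign(&buf, 4096, len))
			return 1;
		memset(buf, 0xab, len);

		fd = open("/dev/rbd0", O_WRONLY | O_DIRECT);
		if (fd < 0) {
			perror("open");
			return 1;
		}

		/* Data goes straight from buf into BIO pages -- no page cache. */
		if (pwrite(fd, buf, len, 0) != (ssize_t)len)
			perror("pwrite");

		close(fd);
		free(buf);
		return 0;
	}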
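On the per-I/O overhead point: taking the stated bound of under 4K of
metadata per request and a hypothetical queue depth of 128, the
in-flight encoding overhead would stay below 128 * 4 KiB = 512 KiB.
What follows is a sketch of the SLAB_RECLAIM_ACCOUNT change alluded to
above, with a made-up cache name and object size (not the actual
rbd/libceph caches): passing that flag at cache creation makes the
cache's pages count as reclaimable slab, so the VM treats the memory as
recoverable under pressure.

	#include <linux/module.h>
	#include <linux/slab.h>

	static struct kmem_cache *example_req_cache;

	static int __init example_init(void)
	{
		/*
		 * SLAB_RECLAIM_ACCOUNT accounts this cache's pages as
		 * reclaimable slab rather than unreclaimable slab.
		 */
		example_req_cache = kmem_cache_create("example_osd_request",
						      512,	/* hypothetical object size */
						      0,	/* default alignment */
						      SLAB_RECLAIM_ACCOUNT,
						      NULL);	/* no constructor */
		return example_req_cache ? 0 : -ENOMEM;
	}

	static void __exit example_exit(void)
	{
		kmem_cache_destroy(example_req_cache);
	}

	module_init(example_init);
	module_exit(example_exit);
	MODULE_LICENSE("GPL");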