Re: rbd kernel block driver memory usage

On Thu, Jan 26, 2023 at 02:48:27PM +0100, Ilya Dryomov wrote:
> On Wed, Jan 25, 2023 at 5:57 PM Stefan Hajnoczi <stefanha@xxxxxxxxxx> wrote:
> >
> > Hi,
> > What sort of memory usage is expected under heavy I/O to an rbd block
> > device with O_DIRECT?
> >
> > For example:
> > - Page cache: none (O_DIRECT)
> > - Socket snd/rcv buffers: yes
> 
> Hi Stefan,
> 
> There is a socket open to each OSD (object storage daemon).  A Ceph
> cluster may have tens, hundreds or even thousands of OSDs (although the
> latter is rare -- usually folks end up with several smaller clusters
> instead of a single large cluster).  Under heavy random I/O and given
> a big enough RBD image, it's reasonable to assume that most if not all
> OSDs would be involved and therefore their sessions would be active.
> 
> A thing to note is that, by default, OSD sessions are shared between
> RBD devices.  So as long as all RBD images that are mapped on a node
> belong to the same cluster, the same set of sockets would be used.
> 
> Idle OSD sockets get closed after 60 seconds of inactivity.
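
That makes the socket buffer side easy to bound. Below is a rough
worst-case estimate, assuming one TCP socket per OSD and the kernel's
tcp_rmem/tcp_wmem maximums; the OSD count is just an illustrative
parameter and real per-socket usage is auto-tuned well below the max:

/* rough upper bound on TCP socket buffer memory for N OSD sessions */
#include <stdio.h>
#include <stdlib.h>

/* read the third ("max") value from a tcp_rmem/tcp_wmem sysctl file */
static long read_max(const char *path)
{
    long min, def, max;
    FILE *f = fopen(path, "r");

    if (!f || fscanf(f, "%ld %ld %ld", &min, &def, &max) != 3) {
        perror(path);
        exit(1);
    }
    fclose(f);
    return max;
}

int main(int argc, char **argv)
{
    long osds = argc > 1 ? atol(argv[1]) : 100;  /* hypothetical count */
    long rmax = read_max("/proc/sys/net/ipv4/tcp_rmem");
    long wmax = read_max("/proc/sys/net/ipv4/tcp_wmem");

    /* one socket per OSD; actual usage is normally far lower */
    printf("worst case: %ld OSDs * (%ld + %ld) bytes = %ld MiB\n",
           osds, rmax, wmax, osds * (rmax + wmax) >> 20);
    return 0;
}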
> 
> 
> > - Internal rbd buffers?
> >
> > I am trying to understand how the memory consumption of Linux rbd block
> > devices compares to that of local block devices (like NVMe PCI).
> 
> RBD doesn't do any internal buffering.  Data is read from/written to
> the wire directly to/from BIO pages.  The only exception to that is the
> "secure" mode -- built-in encryption for Ceph on-the-wire protocol.  In
> that case the data is buffered, partly because RBD obviously can't mess
> with plaintext data in the BIO and partly because the Linux kernel
> crypto API isn't flexible enough.
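
For the non-secure case that direct path is visible from userspace too:
with O_DIRECT the application's buffer pages become the BIO pages handed
to rbd, so nothing is staged in between. A minimal sketch -- the
/dev/rbd0 path and the 4 KiB alignment are assumptions for illustration:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    const size_t len = 1 << 20;          /* 1 MiB request */
    void *buf;
    int fd;
    ssize_t n;

    /* O_DIRECT requires buffer, offset and length aligned to the
     * logical block size; 4096 covers the common case */
    if (posix_memalign(&buf, 4096, len)) {
        fprintf(stderr, "posix_memalign failed\n");
        return 1;
    }

    fd = open("/dev/rbd0", O_RDONLY | O_DIRECT);
    if (fd < 0) {
        perror("open /dev/rbd0");
        return 1;
    }

    n = pread(fd, buf, len, 0);
    if (n < 0)
        perror("pread");
    else
        printf("read %zd bytes bypassing the page cache\n", n);

    close(fd);
    free(buf);
    return 0;
}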
> 
> There is some memory overhead associated with each I/O (OSD request
> metadata encoding, mostly).  It's surely larger than in the NVMe PCI
> case.  I don't have the exact number but it should be less than 4K per
> I/O in almost all cases.  This memory is coming out of private SLAB
> caches and could be reclaimable had we set SLAB_RECLAIM_ACCOUNT on
> them.
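
For reference, a sketch of what marking such a cache reclaimable would
look like -- the cache and struct names below are hypothetical, not the
actual rbd/libceph code:

#include <linux/module.h>
#include <linux/slab.h>

struct my_osd_req {
    /* per-I/O request metadata would live here */
    char encoded[512];
};

static struct kmem_cache *req_cache;

static int __init example_init(void)
{
    /* SLAB_RECLAIM_ACCOUNT groups the cache's pages as reclaimable for
     * accounting, so they show up under SReclaimable in /proc/meminfo */
    req_cache = kmem_cache_create("rbd_req_cache",
                                  sizeof(struct my_osd_req), 0,
                                  SLAB_RECLAIM_ACCOUNT, NULL);
    return req_cache ? 0 : -ENOMEM;
}

static void __exit example_exit(void)
{
    kmem_cache_destroy(req_cache);
}

module_init(example_init);
module_exit(example_exit);
MODULE_LICENSE("GPL");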

Thanks, this information is very useful. I was trying to get a sense of
whether to look deeper into the rbd driver in an OOM kill scenario.

Stefan


