Re: Revisiting MDS memory footprint

John Spray <john.spray@xxxxxxxxxx> · Mon, 1 Dec 2014 16:18:04 +0000

On Fri, Nov 28, 2014 at 1:48 PM, Florian Haas <florian@xxxxxxxxxxx> wrote:
> Out of curiosity: would it matter at all whether or not a significant
> fraction of the files in CephFS were hard links? Clearly the only
> thing that differs in metadata between individual hard-linked files is
> the file name, but I wonder if the Ceph MDS actually takes this into
> consideration. In other words, I'm not sure whether the MDS simply
> adds another pointer to the same set of metadata, or whether that set
> of metadata is actually duplicated in MDS memory. I am guessing the
> latter, but it would be nice to be sure.

When we load a hard link dentry (in CDir::_omap_fetched), if we
already have the inode in cache then we just refer to that copy -- we
never have two of the same inode (CInode object) in memory.  If we
don't have the inode in cache, then the inode isn't loaded until
someone tries to traverse the dentry (i.e. touch the file in any way),
at which point we go to fetch the backtrace from the RADOS object for
that file.
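
As a rough illustration of the behaviour described above, here is a toy
model in Python. It is only a sketch, not Ceph code: ToyMDS, Dentry,
Inode, load_dentry, traverse and backtrace_reads are all hypothetical
names, and it assumes that a primary (non-hard-link) dentry brings its
inode into cache as soon as it is loaded.

```python
# Toy model only: every name here is hypothetical, not Ceph's real API.
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Inode:
    ino: int


@dataclass
class Dentry:
    name: str
    ino: int
    remote: bool                   # True for a hard link dentry
    inode: Optional[Inode] = None  # stays None until linked to a cached inode


@dataclass
class ToyMDS:
    cache: Dict[int, Inode] = field(default_factory=dict)  # one Inode per ino
    backtrace_reads: int = 0       # counts simulated RADOS object reads

    def load_dentry(self, name: str, ino: int, remote: bool) -> Dentry:
        """Simulate loading one dentry while fetching a directory fragment."""
        d = Dentry(name, ino, remote)
        if not remote:
            # Assumption: a primary dentry caches its inode when loaded.
            d.inode = self.cache.setdefault(ino, Inode(ino))
        elif ino in self.cache:
            # Hard link to an inode already in cache: share that one object.
            d.inode = self.cache[ino]
        # Otherwise: hard link to an uncached inode, left unresolved for now.
        return d

    def traverse(self, d: Dentry) -> Inode:
        """Simulate touching the file behind a dentry."""
        if d.inode is None:
            if d.ino not in self.cache:
                # Simulated backtrace fetch from the file's RADOS object.
                self.backtrace_reads += 1
                self.cache[d.ino] = Inode(d.ino)
            d.inode = self.cache[d.ino]
        return d.inode
```

In this model, loading a thousand hard link dentries whose inodes are
not cached adds nothing to the cache and costs no reads, while
traversing them costs one simulated read per distinct uncached inode.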

So hard links may incur less memory overhead when loading a directory
fragment, but you will take an I/O hammering when dereferencing them
if the linked inode is not already in cache, as each such hard link
has to be followed via a read of a separate RADOS object.

In general I would be very cautious about workloads that do a lot of
reads of cold hard-linked files. For example, if you are benchmarking
this case for backups, create the hard links, let the files fall out
of cache, and then observe the performance of a restore where many
hard links are being dereferenced via backtraces.
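
A minimal sketch of the client-side half of such a benchmark, in
Python, might look like the following. The helper names
(make_hardlink_tree, time_stats) and the data/links directory layout
are made up for illustration, and the script does nothing to evict
inodes from the MDS cache; that step is left to whoever runs it.

```python
import os
import tempfile
import time


def make_hardlink_tree(root: str, n_files: int, links_per_file: int) -> list:
    """Create n_files small files in root/data and hard links in root/links."""
    data_dir = os.path.join(root, "data")
    link_dir = os.path.join(root, "links")
    os.makedirs(data_dir)
    os.makedirs(link_dir)
    links = []
    for i in range(n_files):
        target = os.path.join(data_dir, f"file{i}")
        with open(target, "wb") as f:
            f.write(b"x")
        for j in range(links_per_file):
            link = os.path.join(link_dir, f"file{i}.link{j}")
            os.link(target, link)  # hard link, not a symlink
            links.append(link)
    return links


def time_stats(paths) -> float:
    """Return the seconds taken to stat every path once."""
    start = time.perf_counter()
    for p in paths:
        os.stat(p)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Demo on a local temp dir; point root at a CephFS mount for a real run.
    with tempfile.TemporaryDirectory() as root:
        links = make_hardlink_tree(root, n_files=100, links_per_file=2)
        print(f"stat of {len(links)} links took {time_stats(links):.4f}s")
```

For a meaningful number you would create the tree on the CephFS mount,
wait until the inodes are no longer cached, and only then call
time_stats on the link paths, comparing against a warm run.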

I'm mostly reading this from the code rather than from memory, so I'm
sure Greg or Sage will jump in if I'm getting any of these cases
wrong.

Cheers,
John