Hi Frank,

On Tue, Nov 29, 2022 at 5:38 PM Frank Schilder <frans@xxxxxx> wrote:
>
> Hi Venky,
>
> maybe you can help me clarify the situation a bit. I don't understand the difference between the two pinning implementations you describe in your reply, and I also don't see any difference in meaning in the documentation between octopus and quincy; the difference is just in wording. Both texts state that "all of a directory’s immediate children should be ephemerally pinned" (octopus) and "This has the effect of distributing immediate children across a range of MDS ranks" (quincy).
>
> To me, both mean that, if I enable distributed ephemeral pinning on /home, then for every child /home/X of /home it follows that /home/X and any directory under /home/X/ are pinned to the same MDS rank. Meaning their information in cache exists on this rank only and no other MDS is serving requests for any of these directories.
>
> Is there something wrong with this interpretation?

Distributed ephemeral pins will distribute immediate children across a range of MDS ranks - /home/X might be on rank 1, /home/Y on rank 2, /home/Z on rank 0, and so on.

>
> I tried it with octopus and the cache for directories under /home/X/ was all over the place. Nothing was pinned to a single rank, and on top of that the number of sub-trees was extremely unevenly assigned and excessively large. After I set an explicit pin on every child /home/X of /home, only then was all cache information about all subdirs of /home/X/ handled by the MDS I pinned it to.

The directories (children) are spread across MDSs based on the (consistent) hash of their inode numbers. The distribution should be uniform across ranks.

>
> What should the result of distributed ephemeral pinning actually be when set on /home?
> What would be different between octopus and quincy?

It's an implementation difference. In octopus, each child dir (direct descendant of the ephemerally pinned directory) is pinned to a target MDS based on the hash of its (child dir) inode number. From pacific onwards, the dirfrags are distributed across ranks. This limits the number of subtrees.

> Is the documentation (for octopus) misleading or does the implementation not match documentation?

I think the docs are fine - the quincy docs do mention that the directory fragments are distributed while the octopus docs do not. I agree, the wording is a bit subtle.
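To make that concrete, here is a rough sketch of the two ways to get the per-child spread from the client side - enabling the distributed ephemeral pin on /home, or explicitly export-pinning each child the way you ended up doing. ceph.dir.pin.distributed and ceph.dir.pin are the documented virtual xattrs; the mount point, the rank count and the modulo "hash" below are just placeholders for illustration, not what the MDS does internally:

#!/usr/bin/env python3
# Rough sketch only - not the MDS implementation. The mount point, the
# number of active ranks and the modulo "hash" are placeholders; the MDS
# uses its own consistent hash of the inode number internally.
import os

MOUNT = "/mnt/cephfs/home"    # hypothetical CephFS mount point
NUM_RANKS = 3                 # hypothetical number of active MDS ranks
USE_DISTRIBUTED_PIN = True    # flip to False for the explicit-pin fallback

if USE_DISTRIBUTED_PIN:
    # Let the MDS spread the immediate children of /home across ranks,
    # equivalent to: setfattr -n ceph.dir.pin.distributed -v 1 /mnt/cephfs/home
    os.setxattr(MOUNT, "ceph.dir.pin.distributed", b"1")
else:
    # Manual workaround: export-pin each immediate child to a rank derived
    # from its inode number (ceph.dir.pin is the export-pin xattr).
    for child in sorted(os.listdir(MOUNT)):
        path = os.path.join(MOUNT, child)
        if not os.path.isdir(path):
            continue
        rank = os.stat(path).st_ino % NUM_RANKS   # stand-in for the real hash
        os.setxattr(path, "ceph.dir.pin", str(rank).encode())
        print(f"{child}: pinned to mds rank {rank}")

The explicit ceph.dir.pin loop is essentially what pinning every child by hand does; the distributed pin is meant to give the same spread without having to touch each child.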
>
> Thanks for any insight!
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Venky Shankar <vshankar@xxxxxxxxxx>
> Sent: 29 November 2022 10:09:21
> To: Frank Schilder
> Cc: Reed Dier; ceph-users
> Subject: Re: Re: MDS stuck ops
>
> On Tue, Nov 29, 2022 at 1:42 PM Frank Schilder <frans@xxxxxx> wrote:
> >
> > Hi Venky.
> >
> > > You most likely ran into performance issues with distributed ephemeral
> > > pins with octopus. It'd be nice to try out one of the latest releases
> > > for this.
> >
> > I ran into the problem that distributed ephemeral pinning seems not actually implemented in octopus. This mode didn't pin anything; see also the recent conversation with Patrick:
>
> Distributed ephemeral pins used to distribute inodes under a directory
> amongst MDSs, which had scalability issues due to the sheer number of
> subtrees. This was changed to distribute dirfrags and I think those
> changes were not in octopus.
>
> >
> > https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/YEB34F5SREAOOMATOKC6NO3G2GVCSOOZ
> >
> > I sent him a couple of dumps, but am not sure if he is doing anything with them. I wrote a small script to do the distributed pinning by hand and it solved all sorts of problems.
>
> Distributing dirfrags solved a lot of scalability issues and those
> changes are available in pacific and beyond. We aren't backporting to
> octopus anymore, so the options are limited.
>
> >
> > Best regards,
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> >
>
> --
> Cheers,
> Venky
>

--
Cheers,
Venky
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx