Hi Patrick,

sorry for the mail flood. The reason I'm asking is that I always see these pairs of warnings:

slow request 34.592600 seconds old, received at 2022-11-17T10:44:39.650761+0100: internal op exportdir:mds.3:15730122 currently failed to wrlock, waiting
slow request 41.092127 seconds old, received at 2022-11-17T10:44:39.651173+0100: rejoin:mds.3:15730122 currently dispatched

The rejoin is worrying me, because it indicates that an active directory fragment has been migrated (a client connection has been moved from one to another MDS). However, active fragments can only be deeper in the directory tree, which in turn should be pinned to a rank and not move. That's why I would really like to know what directories are moved around.

Thanks and best regards!
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Frank Schilder <frans@xxxxxx>
Sent: 17 November 2022 10:45:20
To: Patrick Donnelly
Cc: ceph-users@xxxxxxx
Subject: Re: MDS internal op exportdir despite ephemeral pinning

Hi Patrick,

thanks for your explanation. Is there a way to check which directory is exported? For example, is the inode contained in the messages somewhere? A readdir would usually happen on log-in and the number of slow exports seems much higher than the number of people logging in (I would assume there are a lot more that go without logging).

Also, does an export happen for every client connection? For example, we have a 500+ node HPC cluster with kernel mounts. If a job starts on a dir that needs to be loaded to cache, would such an export happen for every client node (we do dropcaches on client nodes after job completion, so there is potential for reloading data)?

Thanks a lot!
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Patrick Donnelly <pdonnell@xxxxxxxxxx>
Sent: 16 November 2022 22:50:22
To: Frank Schilder
Cc: ceph-users@xxxxxxx
Subject: Re: MDS internal op exportdir despite ephemeral pinning

Hello Frank,

On Wed, Nov 16, 2022 at 5:38 AM Frank Schilder <frans@xxxxxx> wrote:
>
> Hi all,
>
> I have a question about ephemeral pinning on octopus latest. We have ephemeral pinning set on all directories that are mounted (well on all their parents), like /home etc. Every mount point of a ceph file system should, therefore, be pinned to a specific and fixed MDS rank. However, in the log I see a lot of slow ops warnings like this one:
>
> slow request 33.765074 seconds old, received at 2022-11-16T11:30:28.340294+0100: internal op exportdir:mds.0:34770855 currently failed to wrlock, waiting
>
> I don't understand why MDSes still export directories between each other. Am I misunderstanding the warning? What is happening here and why are these ops there? Does this point to a config problem?

It may be that whatever /home/X directory was pruned from the cache, someone did a readdir on that directory thereby loading it into cache, then the MDS authoritative for /home (probably 0?) exported that directory to wherever it should go.

--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
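
Note on the paired warnings quoted in this thread: the exportdir and rejoin lines carry the same `mds.3:15730122` token, so one way to see how often such pairs occur is to group slow request lines by that token. Below is a minimal Python sketch of that idea. It assumes the log lines have exactly the shape quoted above and that an identical `mds.<rank>:<number>` token means the same request; neither assumption is confirmed by the thread. The function names `parse_slow_request` and `group_by_request` are hypothetical helpers, not part of Ceph, and the sketch does not reveal which directory was exported.

```python
import re
from collections import defaultdict

# Pattern modelled only on the warning lines quoted in this thread;
# other Ceph releases may format these messages differently.
SLOW_REQUEST = re.compile(
    r"slow request (?P<age>\d+\.\d+) seconds old, "
    r"received at (?P<received>\S+): "
    r"(?P<op>.+?):mds\.(?P<rank>\d+):(?P<reqid>\d+) "
    r"currently (?P<state>.+)"
)


def parse_slow_request(line):
    """Return a dict of fields for one slow request line, or None if no match."""
    match = SLOW_REQUEST.search(line)
    if match is None:
        return None
    record = match.groupdict()
    record["age"] = float(record["age"])
    record["rank"] = int(record["rank"])
    record["reqid"] = int(record["reqid"])
    record["state"] = record["state"].strip()
    return record


def group_by_request(lines):
    """Group parsed records by (rank, reqid); assumes equal tokens mean one request."""
    groups = defaultdict(list)
    for line in lines:
        record = parse_slow_request(line)
        if record is not None:
            groups[(record["rank"], record["reqid"])].append(record)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        "slow request 34.592600 seconds old, received at "
        "2022-11-17T10:44:39.650761+0100: internal op "
        "exportdir:mds.3:15730122 currently failed to wrlock, waiting",
        "slow request 41.092127 seconds old, received at "
        "2022-11-17T10:44:39.651173+0100: rejoin:mds.3:15730122 "
        "currently dispatched",
    ]
    for key, records in group_by_request(sample).items():
        print(key, [(r["op"], r["state"]) for r in records])
```

Fed with the slow request lines from an MDS log, this gives a per-request view of which operation types appear together and how long they have been waiting, which could help quantify how many exports are actually slow compared with the number of log-ins.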