On 04/10/2018 09:22 PM, Gregory Farnum wrote:
> On Tue, Apr 10, 2018 at 6:32 AM Wido den Hollander <wido@xxxxxxxx> wrote:
>
>     Hi,
>
>     There have been numerous threads about this in the past, but I wanted
>     to bring this up again in a new situation.
>
>     Running with Luminous v12.2.4 I'm seeing some odd memory and CPU usage
>     when using the ceph-fuse client to mount a multi-MDS CephFS filesystem.
>
>       health: HEALTH_OK
>
>       services:
>         mon: 3 daemons, quorum luvil,sanomat,tide
>         mgr: luvil(active), standbys: tide, sanomat
>         mds: svw-2/2/2 up {0=luvil=up:active,1=tide=up:active}, 1 up:standby
>         osd: 112 osds: 111 up, 111 in
>
>       data:
>         pools:   2 pools, 4352 pgs
>         objects: 85549k objects, 4415 GB
>         usage:   50348 GB used, 772 TB / 821 TB avail
>         pgs:     4352 active+clean
>
>     After running an rsync with millions of files (and some directories
>     having 1M files) a ceph-fuse process was using 44GB RSS and between
>     100% and 200% CPU.
>
>     Looking at this FUSE client through the admin socket, the Objecter was
>     one of my first suspects, but it claimed to only use ~300M of data in
>     its cache, spread out over tens of thousands of files.
>
>     After unmounting and mounting again the memory usage was gone, and
>     when we tried the rsync again it wasn't reproducible.
>
>     The CPU usage however is reproducible: a "simple" rsync will cause
>     ceph-fuse to use up to 100% CPU.
>
>     Switching to the kernel client (4.16 kernel) seems to solve this, but
>     the reasons for using ceph-fuse in this case are the lack of a recent
>     kernel in Debian 9 and the ease of upgrading the FUSE client.
>
>     I've tried to disable all logging inside the FUSE client, but that
>     didn't help.
>
>     When checking on the FUSE client's socket I saw that rename()
>     operations were hanging, and that's something rsync does a lot.
>
>     At the same time I saw a getfattr() being done on the same inode by
>     the FUSE client, but to a different MDS:
>
>     rename(): mds rank 0
>     getfattr: mds rank 1
>
>     Although the kernel client seems to perform better, it shows the same
>     behavior when looking at the mdsc file in /sys:
>
>     216729  mds0  create (unsafe)
>         #100021abbd9/.ddd.010236269.mpeg21.a0065.folia.xml.gz.AuxBQj
>         (reddata2/.ddd.010236269.mpeg21.a0065.folia.xml.gz.AuxBQj)
>
>     216731  mds1  rename
>         #100021abbd9/ddd.010236269.mpeg21.a0065.folia.xml.gz
>         (reddata2/ddd.010236269.mpeg21.a0065.folia.xml.gz)
>         #100021abbd9/.ddd.010236269.mpeg21.a0065.folia.xml.gz.AuxBQj
>         (reddata2/.ddd.010236269.mpeg21.a0065.folia.xml.gz.AuxBQj)
>
>     So this is rsync talking to two MDSes, one for a create and one for a
>     rename.
>
>     Is this normal? Is this expected behavior?
>
>
> If the directory got large enough to be sharded across MDSes, yes, it's
> expected behavior. There are filesystems that attempt to recognize rsync
> and change their normal behavior specifically to deal with this case,
> but CephFS isn't one of them (yet, anyway).

Yes, that directory is rather large. I've set max_mds to 1 for now and
suddenly both FUSE and the kclient are a lot faster, not 10% but
something like 80 to 100% faster.

It seems like that directory was being balanced between two MDSes and
that caused a 'massive' slowdown.

This can probably be influenced by tuning the MDS balancer settings, but
I am not sure yet where to start, any suggestions?
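For reference, going back to a single active MDS was roughly the
following ("svw" is the filesystem name from the status output above;
commands reproduced from memory, so treat this as a sketch rather than a
recipe):

  # reduce the number of active ranks to 1, then stop rank 1
  ceph fs set svw max_mds 1
  ceph mds deactivate svw:1

If we move back to two active MDSes, I'm considering checking which rank
owns that big directory and pinning it to a single rank with the export
pin xattr, so the balancer leaves it alone (the mount point below is
only an example path):

  # show which rank currently owns which subtree
  ceph daemon mds.luvil get subtrees

  # pin the directory (and everything under it) to mds rank 0
  setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/reddata2
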
Wido

> Not sure about the specifics of the client memory or CPU usage; I think
> you'd have to profile. rsync is a pretty pessimal CephFS workload though
> and I think I've heard about this before...
> -Greg
>
>
>     To me it seems like the Subtree Partitioning might be interfering
>     here, but I wanted to double check.
>
>     Apart from that, the CPU and memory usage of the FUSE client seem
>     very high and that might be related to this.
>
>     Any ideas?
>
>     Thanks,
>
>     Wido

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com