Yeah, CephFS is much closer to POSIX semantics for a filesystem than NFS.
There's an experimental relaxed mode called LazyIO, but I'm not sure if it's
applicable here.

You can debug this by dumping slow requests from the MDS daemons via the
admin socket.

Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Thu, Sep 12, 2019 at 5:07 PM Stefan Kooman <stefan@xxxxxx> wrote:
>
> Dear list,
>
> We recently switched the shared storage for our Linux shared hosting
> platforms from "nfs" to "cephfs". Performance improvements are
> noticeable. It all works fine; however, there is one peculiar thing:
> when Apache reloads after a logrotate of the "error" logs, all but one
> node will hang for ~ 15 minutes. The log rotations are scheduled with
> cron, and the nodes themselves are synced with NTP. The first node that
> reloads Apache will keep on working, all the others will hang, and after
> a period of ~ 15 minutes they will all recover almost simultaneously.
>
> Our setup looks like this: 10 webservers all sharing the same CephFS
> filesystem. Each webserver, with around 100 Apache threads, has around
> 10,000 open file handles to "error" logs on CephFS. To be clear, all
> webservers have a file handle on _the same_ "error" logs. The logrotate
> takes around two seconds on the "surviving" node.
>
> What could be the reason for this? Does it have something to do with
> file locking, i.e. that it behaves differently on CephFS compared to NFS
> (more strict)? What would be a good way to find out what the root
> cause is? We have sysdig traces of different nodes, but on the nodes where
> Apache hangs not a lot is going on ... until it all recovers.
>
> We remediated this by delaying the Apache reloads on all but one node.
> Then there is no issue at all, even though all the other web servers still
> reload at almost the same time.
>
> Any info / hints on how to investigate this issue further are highly
> appreciated.
>
> Gr. Stefan
>
> --
> | BIT BV  https://www.bit.nl/        Kamer van Koophandel 09090351
> | GPG: 0xD14839C6                   +31 318 648 688 / info@xxxxxx
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
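
For anyone following up on the admin-socket suggestion above, here is a rough sketch of one way to pull in-flight requests from an MDS and keep only the old ones. It assumes it is run on the host where the MDS admin socket lives and uses `ceph daemon mds.<name> dump_ops_in_flight`. The JSON field names it reads (`ops`, `age`, `description`, `type_data.flag_point`) are an assumption based on typical output and may differ between releases. The helper names and the 30-second threshold are made up for illustration.

```python
import json
import subprocess


def dump_ops_in_flight(mds_name):
    """Ask the local MDS admin socket for its in-flight operations.

    Must run on the MDS host; mds_name is the daemon name without "mds.".
    """
    out = subprocess.run(
        ["ceph", "daemon", f"mds.{mds_name}", "dump_ops_in_flight"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def slow_ops(dump, min_age_s=30.0):
    """Return (age, flag_point, description) tuples for ops at least
    min_age_s seconds old, oldest first.

    Field names are assumed, not guaranteed; missing fields fall back
    to placeholder values instead of raising.
    """
    rows = []
    for op in dump.get("ops", []):
        age = float(op.get("age", 0.0))
        if age >= min_age_s:
            flag = op.get("type_data", {}).get("flag_point", "?")
            rows.append((age, flag, op.get("description", "")))
    return sorted(rows, reverse=True)
```

Calling `slow_ops(dump_ops_in_flight("a"))` during the hang (with `a` replaced by the real MDS name) should show what the stuck client requests are waiting on, if the flag point is present in the output.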