On Mon, Oct 23, 2017 at 12:46 AM, Daniel Pryor <dpryor@xxxxxxxxxxxxx> wrote:
> I've completely upgraded my cluster and made sure my clients were luminous
> too. Our cluster creates lots of directories really fast, and because of the
> layering it takes >1 second to create those directories. I would really like
> to be able to diagnose exactly where the slowness is. I'm thinking the MDS,
> but I'm not 100% sure. We have benchmarked all pools and they are really
> fast. We have also removed the directory structure in our app, and our
> filesystem writes 2KB to 80KB files in 4-10ms.
>

What's your workload? How many clients concurrently operate on the file
system? Could you set debug_mds=10 during the directory creation and send
the MDS log to us?

> example structure:
>
> Mount location: /mnt/fsstore
> Directory Structure: /mnt/fsstore/PDFDOCUMENT/2f/00d/f28/49a/74d/2e8/
> File:
> /mnt/fsstore/PDFDOCUMENT/2f/00d/f28/49a/74d/2e8/2f00df28-49a7-4d2e-85c5-20217bafbf6c
>
> Daniel Pryor | Sr. DevOps Engineer
> dpryor@xxxxxxxxxxxxx
> direct 480.719.1646 ext. 1318 | mobile 208.757.2680
> 6263 North Scottsdale Road, Suite 330, Scottsdale, AZ 85250
> Parchment | Turn Credentials into Opportunities
> www.parchment.com
>
>
> On Thu, Oct 19, 2017 at 8:22 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>>
>> On Thu, 19 Oct 2017, Daniel Pryor wrote:
>> > Hello Everyone,
>> >
>> > We are currently running into two issues.
>> >
>> > 1) We are noticing huge pauses during directory creation, but our file
>> > write times are super fast. The metadata and data pools are on the same
>> > infrastructure.
>> > * https://gist.github.com/pryorda/a0d5c37f119c4a320fa4ca9d48c8752b
>> > * https://gist.github.com/pryorda/ba6e5c2f94f67ca72a744b90cc58024e
>>
>> Separating metadata onto different (ideally faster) devices is usually a
>> good idea if you want to protect metadata performance. The stalls you're
>> seeing could either be MDS requests getting slowed down by the OSDs, or it
>> might be the MDS missing something in its cache and having to go
>> fetch or flush something to RADOS. You might see if increasing the MDS
>> cache size helps.
>>
>> > 2) Since we were having the issue above, we wanted to possibly move to a
>> > larger top-level directory, stuff everything in there, and later move
>> > everything out via a batch job. To do this we need to increase the
>> > directory limit from 100,000 to 300,000. How do we increase this limit?
>>
>> I would recommend upgrading to luminous and enabling directory
>> fragmentation instead of increasing the per-fragment limit on Jewel. Big
>> fragments have a negative impact on MDS performance (leading to spikes
>> like you see above) and can also make life harder for the OSDs.
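
Since you've already upgraded the cluster and clients to luminous, Sage's
suggestions can be applied with commands roughly like the ones below. This
is only a sketch -- "cephfs" and "mds.0" are placeholders, substitute the
names from "ceph fs ls" and "ceph mds stat", and size the cache value to
your MDS host:

  # enable directory fragmentation (file systems created before luminous
  # have it off by default; new luminous file systems already have it on)
  ceph fs set cephfs allow_dirfrags true

  # raise the MDS cache limit at runtime; 4294967296 (4 GB) is only an
  # example value
  ceph tell mds.0 injectargs '--mds_cache_memory_limit=4294967296'

The per-fragment entry limit Sage mentions is mds_bal_fragment_size_max
(default 100000). With fragmentation enabled, a large directory is split
across several smaller fragments/objects, so that limit normally does not
need to be raised.
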
>>
>> sage
>>
>>
>> > dpryor@beta-ceph-node1:~$ dpkg -l |grep ceph
>> > ii  ceph-base      10.2.10-1xenial  amd64  common ceph daemon libraries and management tools
>> > ii  ceph-common    10.2.10-1xenial  amd64  common utilities to mount and interact with a ceph storage cluster
>> > ii  ceph-deploy    1.5.38           all    Ceph-deploy is an easy to use configuration tool
>> > ii  ceph-mds       10.2.10-1xenial  amd64  metadata server for the ceph distributed file system
>> > ii  ceph-mon       10.2.10-1xenial  amd64  monitor server for the ceph storage system
>> > ii  ceph-osd       10.2.10-1xenial  amd64  OSD server for the ceph storage system
>> > ii  libcephfs1     10.2.10-1xenial  amd64  Ceph distributed file system client library
>> > ii  python-cephfs  10.2.10-1xenial  amd64  Python libraries for the Ceph libcephfs library
>> > dpryor@beta-ceph-node1:~$
>> >
>> > Any direction would be appreciated!?
>> >
>> > Thanks,
>> > Daniel
>> >

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com