Let me know if this works and/or you need anything else:
https://www.dropbox.com/s/fq47w6jebnyluu0/lookup-logs.tar.gz?dl=0

Beware - the clients were on debug=10. Also, I tried this with the kernel
client and it is more consistent; it does the 2 lookups per create on 1
client every single time.
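
If you just want to spot the pattern without wading through all of the
debug=10 output, something like the following grep of the request sends
should do it (the log file name here is from my setup, so adjust it to
whatever the tarball extracts to):

$ tar xzf lookup-logs.tar.gz
$ grep -c "send_request client" client0.log                # total requests sent by this client
$ grep "send_request client" client0.log | grep -c lookup  # how many of those were lookups

On the runs with the extra lookups, the second number comes out to roughly
2x the number of creates; on the fast runs it is basically zero.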

On Thu, Jan 15, 2015 at 11:28 AM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
> Can you post the full logs somewhere to look at? These bits aren't
> very helpful on their own (except to say that, yes, the client cleared
> its I_COMPLETE for some reason).
>
> On Tue, Jan 13, 2015 at 3:45 PM, Michael Sevilla <mikesevilla3@xxxxxxxxx> wrote:
>> On Tue, Jan 13, 2015 at 11:13 AM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>> On Mon, Jan 12, 2015 at 10:17 PM, Michael Sevilla
>>> <mikesevilla3@xxxxxxxxx> wrote:
>>>> I can't get consistent performance with 1 MDS. I have 2 clients create
>>>> 100,000 files (in separate directories) in a CephFS mount. I ran the
>>>> experiment 5 times (deleting the pools/fs and restarting the MDS in
>>>> between each run). I graphed the metadata throughput (requests per
>>>> second):
>>>> https://github.com/michaelsevilla/mds/blob/master/graphs/thruput.png
>>>
>>> So that top line is ~20,000 processed requests/second, as measured at
>>> the MDS? (Looking at perfcounters?) And the fast run is doing 10k
>>> create requests/second? (This number is much higher than I expected!)
>>
>> Yes - the top line was 20K req/s from a perf counter dump, and the fast
>> run does about 13K creates/s. We were surprised, too. In fact, 1 client
>> per MDS gives us performance similar to IndexFS - a system that came out
>> in a paper at Supercomputing this year. Here is a throughput graph,
>> normalized to the number of clients, that shows how powerful one MDS can
>> actually be:
>> https://github.com/michaelsevilla/mds/blob/master/graphs/thruput-norm.png
>>
>> Keep in mind that the runs with more than 1 client aren't measured in
>> creates/s, but in ops/s. ;)
>>
>>>
>>>> Sometimes (run0, run3), both clients issue 2 lookups per create to the
>>>> MDS - this makes throughput high but the runtime long, since the MDS
>>>> processes many more requests.
>>>> Sometimes (run2, run4), 1 client does 2 lookups per create and the
>>>> other doesn't do any lookups.
>>>> Sometimes (run1), neither client does any lookups - this has the
>>>> fastest runtime.
>>>>
>>>> Does anyone know why the client behaves differently for the same exact
>>>> experiment? Reading the client logs, it looks like sometimes the
>>>> client enters add_update_cap() and clears inode->flags in
>>>> check_cap_issue(); then, when a lookup occurs (in _lookup()), the
>>>> client can't return ENOENT locally -- forcing it to ask the MDS to do
>>>> the lookup. But this only happens sometimes (e.g., run0 and run3).
>>>
>>> If you provide the logs I can check more carefully, but my guess is
>>> that you've got another client mounting it, or you are looking at both
>>> directories from one of the clients, and this is inadvertently causing
>>> them to go into shared rather than exclusive mode.
>>
>> I think you are right! Here is a subset of the client log:
>> https://github.com/michaelsevilla/mds/blob/master/scratch/client0.log
>>
>> These snippets are zoomed in on the point where the client stops sending
>> "create, create, create, create..." and starts sending "lookup, lookup,
>> create, lookup, lookup, create...":
>>
>> $ cat client0.log | grep "send_request client"
>> create ...file.2098
>> create ...file.2099
>> create ...file.2100
>> create ...file.2101
>> lookup ...file.2102
>> lookup ...file.2102
>> create ...file.2102
>> lookup ...file.2103
>> lookup ...file.2103
>> create ...file.2103
>> lookup ...file.2104
>> lookup ...file.2104
>> create ...file.2104
>>
>> I think what you are looking for is on line 687:
>> ... clearing (I_COMPLETE|I_DIR_ORDERED)
>> ... add_update_cap issued pAsLsXs -> pAsLsXsFsx
>>
>> It looks like we lose the exclusive mode on the directory... but I don't
>> understand why the MDS revokes it for one client but not the other. The
>> MDS log is here:
>> https://raw.githubusercontent.com/michaelsevilla/mds/master/scratch/mds.log
>>
>>>
>>> How are you trying to keep the directories private during the
>>> workload? Some of the more naive solutions won't stand up to
>>> repetitive testing, given how various components of the system
>>> currently behave.
>>
>> Is there a way to keep the directories private (i.e., keep them always
>> in exclusive mode)? That'd be perfect... In my runs, one client does
>> mkdir /mnt/cephfs/dir0 and the other does mkdir /mnt/cephfs/dir1...
>>
>>>
>>>>
>>>> Details of the experiment:
>>>> Workload: 2 clients, 100,000 creates in separate directories, using
>>>> the FUSE client
>>>> MDS config: client_cache_size = 100000000, mds_cache_size = 16384000
>>>
>>> That client_cache_size only has any effect if it's applied to the
>>> client-side config. ;)
>>
>> Yes - I copy the ceph.conf to the clients, too. I think it works because
>> the 1-client, 1-MDS test caches all of the inodes, according to the perf
>> counters.
>>
>> Thanks so much, Greg!
>>
>> Mike
>>
>>> -Greg