On Fri, Jan 16, 2015 at 10:34 AM, Michael Sevilla <mikesevilla3@xxxxxxxxx> wrote:
> On Thu, Jan 15, 2015 at 10:37 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>> On Thu, Jan 15, 2015 at 2:44 PM, Michael Sevilla <mikesevilla3@xxxxxxxxx> wrote:
>>> Let me know if this works and/or you need anything else:
>>>
>>> https://www.dropbox.com/s/fq47w6jebnyluu0/lookup-logs.tar.gz?dl=0
>>>
>>> Beware - the clients were on debug=10. Also, I tried this with the
>>> kernel client and it is more consistent; it does the 2 lookups per
>>> create on 1 client every single time.
>>
>> Mmmm, there are no mds logs of note here. :(
>>
>
> Meaning you couldn't find mds.issdm-15.log? Or that that log didn't
> show anything interesting...

It's not interesting. Caps are not logged at a very high level so I
think we'd actually want debug 20 on the mds, the messenger, and the
client subsystems.

>
>> I did look enough to see that:
>> 1) The MDS is for some reason revoking caps on the file create
>> prompting the switch to double-lookups, which it was not before. The
>> client doesn't really have any visibility into why that would be the
>> case; the best guess I can come up with is that maybe the MDS split up
>> the directory into multiple frags at this point -- do you have that
>> enabled?
>
> Nope, unless any of these make a difference:
> $ ceph --admin-daemon... config show | grep frag
>     "mds_bal_frag": "false",
>     "mds_bal_fragment_interval": "5",
>     "mds_thrash_fragments": "0",
>     "mds_debug_frag": "false",
>
>> 2) The only way we set the I_COMPLETE flag is when we create an empty
>> directory, or when we do a complete listdir on one. That makes it
>> pretty difficult to get the flag back (and so do the optimal create
>> path) once you lose it. :( I'd love a better way to do so, but we'll
>> have to look at what's involved in a bit of depth.
>
> No need - with that reasoning it looks more like this is part of the
> design rather than a bug. I'll just have to accept the fact that the
> system is very complicated and clients touching stuff at certain times
> can make things less predictable... I just wanted to make sure I
> wasn't doing anything wrong. :) I'll stick with the kernel client
> (it's almost twice as fast, anyways!)

Well, sort of -- an isolated client with their own directory is
something we definitely want to have exclusive caps, but our
heuristics aren't sophisticated enough yet.

>
>> I'm not sure why the kernel client is so much more cautious, but I
>> think there were a number of troubles with the directory listing
>> orders and things which were harder to solve there - I don't remember
>> if we introduced the I_DIR_ORDERED flag in it or not. Zheng can talk
>> more about that. What kernel client version are you using?
>>
>> And for a vanity data point, what kind of hardware is your MDS running on? :)
>
> Really, really old hardware from 2006: 2 dual-core CPUs, 8GB RAM,
> connected with 1Gbit. Kernel 3.4. We actually just installed beefier
> nodes so I'll keep you posted if we get other cool results.

Awesome! That's much faster than previously, although Zheng did some
work recently to split the journaling code into a separate thread
which I guess must have made a big difference.
-Greg
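
For reference, the "debug 20 on the mds, the messenger, and the client
subsystems" suggestion could look roughly like the ceph.conf fragment
below. This is only a sketch: it assumes the usual "debug mds",
"debug ms" and "debug client" option names, and the choice of the [mds]
and [client] sections is an assumption rather than something spelled
out in this thread. Level 20 on the messenger is very verbose, so it is
probably worth reverting once the logs are captured.

    # Sketch only; section placement is an assumption, adjust to your setup.
    [mds]
        debug mds = 20      # MDS subsystem, where cap issue/revoke decisions are made
        debug ms = 20       # messenger, to see the cap messages on the wire

    [client]
        debug client = 20   # userspace client subsystem
        debug ms = 20       # messenger on the client side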