If you feel like perusing... log=20 on client, mds messenger, and mds:
https://www.dropbox.com/s/uvmexh9impd3f3c/forgreg.tar.gz?dl=0

In this run, only client 1 starts doing the extra lookups.

On Fri, Jan 16, 2015 at 10:43 AM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
> On Fri, Jan 16, 2015 at 10:34 AM, Michael Sevilla
> <mikesevilla3@xxxxxxxxx> wrote:
>> On Thu, Jan 15, 2015 at 10:37 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>> On Thu, Jan 15, 2015 at 2:44 PM, Michael Sevilla <mikesevilla3@xxxxxxxxx> wrote:
>>>> Let me know if this works and/or you need anything else:
>>>>
>>>> https://www.dropbox.com/s/fq47w6jebnyluu0/lookup-logs.tar.gz?dl=0
>>>>
>>>> Beware - the clients were on debug=10. Also, I tried this with the
>>>> kernel client and it is more consistent; it does the 2 lookups per
>>>> create on 1 client every single time.
>>>
>>> Mmmm, there are no mds logs of note here. :(
>>>
>>
>> Meaning you couldn't find mds.issdm-15.log? Or that that log didn't
>> show anything interesting...
>
> It's not interesting. Caps are not logged at a very high level, so I
> think we'd actually want debug 20 on the mds, the messenger, and the
> client subsystems.
>
>>
>>> I did look enough to see that:
>>> 1) The MDS is for some reason revoking caps on the file create,
>>> prompting the switch to double lookups, which it was not doing
>>> before. The client doesn't really have any visibility into why that
>>> would be the case; the best guess I can come up with is that maybe
>>> the MDS split the directory into multiple frags at this point — do
>>> you have that enabled?
>>
>> Nope, unless any of these make a difference:
>> $ ceph --admin-daemon... config show | grep frag
>> "mds_bal_frag": "false",
>> "mds_bal_fragment_interval": "5",
>> "mds_thrash_fragments": "0",
>> "mds_debug_frag": "false",
>>
>>> 2) The only way we set the I_COMPLETE flag is when we create an empty
>>> directory, or when we do a complete listdir on one.
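[For reference, the debug levels suggested above can be expressed in ceph.conf along these lines — a sketch only; exact section placement and whether you set them live via the admin socket will vary by setup:]

```ini
[mds]
    ; mds, messenger, and client subsystems at debug 20,
    ; as suggested in the thread
    debug mds = 20
    debug ms = 20

[client]
    debug client = 20
    debug ms = 20
```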
>>> That makes it pretty difficult to get the flag back (and so take the
>>> optimal create path) once you lose it. :( I'd love a better way to do
>>> so, but we'll have to look at what's involved in a bit of depth.
>>
>> No need - with that reasoning it looks more like this is part of the
>> design rather than a bug. I'll just have to accept the fact that the
>> system is very complicated and that clients touching things at certain
>> times can make it less predictable... I just wanted to make sure I
>> wasn't doing anything wrong. :) I'll stick with the kernel client
>> (it's almost twice as fast, anyway!)
>
> Well, sort of — an isolated client with its own directory is something
> we definitely want to have exclusive caps, but our heuristics aren't
> sophisticated enough yet.
>
>>
>>> I'm not sure why the kernel client is so much more cautious, but I
>>> think there were a number of troubles with the directory listing
>>> orders and things which were harder to solve there – I don't remember
>>> if we introduced the I_DIR_ORDERED flag in it or not. Zheng can talk
>>> more about that. What kernel client version are you using?
>>>
>>> And for a vanity data point, what kind of hardware is your MDS running on? :)
>>
>> Really, really old hardware from 2006: 2 dual-core CPUs, 8 GB RAM,
>> connected with 1 Gbit. Kernel 3.4. We actually just installed beefier
>> nodes, so I'll keep you posted if we get other cool results.
>
> Awesome! That's much faster than previously, although Zheng did some
> work recently to split the journaling code into a separate thread,
> which I guess must have made a big difference.
> -Greg
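[The caps behavior the thread describes can be summarized with a small model — illustrative Python only, not actual Ceph client code, and every name below is invented for the sketch: while a directory carries I_COMPLETE, the client's cached dentry set is authoritative and a create needs no pre-lookup; once caps are revoked, each create pays an extra MDS lookup until an empty mkdir or a full readdir restores the flag.]

```python
class Dir:
    """Toy model of a client's view of one directory."""
    def __init__(self):
        self.complete = True   # I_COMPLETE: set on a newly created (empty) dir
        self.entries = set()   # locally cached dentries


def create(d, name, lookups):
    """Create `name` in `d`, recording each MDS lookup the client must issue."""
    if d.complete:
        # Cache is authoritative: a local miss proves the name is free.
        if name in d.entries:
            raise FileExistsError(name)
    else:
        # Must ask the MDS whether the name already exists.
        lookups.append(name)
    d.entries.add(name)


def revoke_caps(d):
    """E.g. another client touches the directory: I_COMPLETE is lost."""
    d.complete = False


def full_readdir(d, listing):
    """A complete listdir is one of the two ways to regain the flag."""
    d.entries = set(listing)
    d.complete = True
```

In this model, creates in a freshly made directory cost no lookups; after a revocation, every create hits the MDS until a full readdir, matching the "only on empty mkdir or complete listdir" rule described above.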