Re: CephFS Space Accounting and Quotas

"Jim Schutt" <jaschut@xxxxxxxxxx> · Mon, 11 Mar 2013 14:40:04 -0600

On 03/11/2013 10:57 AM, Greg Farnum wrote:
> On Monday, March 11, 2013 at 9:48 AM, Jim Schutt wrote:
>> On 03/11/2013 09:48 AM, Greg Farnum wrote:
>>> On Monday, March 11, 2013 at 7:47 AM, Jim Schutt wrote:
>>>>
>>>> For this run, the MDS logging slowed it down enough to cause the
>>>> client caps to occasionally go stale. I don't think it's the cause
>>>> of the issue, because I was having it before I turned MDS debugging
>>>> up. My client caps never go stale at, e.g., debug mds 5.
>>>
>>>
>>>
>>> Oh, so this might be behaviorally different than you were seeing before? Drat.
>>>
>>> You had said before that each newfstatat was taking tens of seconds,
>>> whereas in the strace log you sent along most of the individual calls
>>> were taking a bit less than 20 milliseconds. Do you have an strace of
>>> them individually taking much more than that, or were you just
>>> noticing that they took a long time in aggregate?
>>
>>
>>
>> When I did the first strace, I didn't turn on timestamps, and I was
>> watching it scroll by. I saw several stats in a row take ~30 secs,
>> at which point I got bored, and took a look at the strace man page to
>> figure out how to get timestamps ;)
>>
>> Also, another difference is for that test, I was looking at files
>> I had written the day before, whereas for the strace log I sent,
>> there was only several minutes between writing and the strace of find.
>>
>> I thought I had eliminated the page cache issue by using fdatasync
>> when writing the files. Perhaps the real issue is affected by that
>> delay?
> 
> I'm not sure. I can't think of any mechanism by which waiting longer
> would increase the time lags, though, so I doubt it.
> 
>>> I suppose if you were going to run it again then just the message
>>> logging could also be helpful. That way we could at least check and
>>> see the message delays and if the MDS is doing other work in the
>>> course of answering a request.
>>
>>
>>
>> I can do as many trials as needed to isolate the issue.
>>
>> What message debugging level is sufficient on the MDS; 1?
> Yep, that will capture all incoming and outgoing messages. :) 
>>
>> If you want I can attempt to duplicate my memory of the first
>> test I reported, writing the files today and doing the strace
>> tomorrow (with timestamps, this time).
>>
>> Also, would it be helpful to write the files with minimal logging, in
>> hopes of inducing minimal timing changes, then upping the logging
>> for the stat phase?
> 
> Well that would give us better odds of not introducing failures of
> any kind during the write phase, and then getting accurate
> information on what's happening during the stats, so it probably
> would. Basically I'd like as much logging as possible without
> changing the states they system goes through. ;)

Hmmm, this is getting more interesting...

I just did two complete trials where I built a file system,
did two sets of writes with minimal MDS logging, then 
turned MDS logging up to 20 with MDS ms logging at 1 for
the stat phase.

In each trial my strace'd find finished in < 10 seconds,
and there were no slow stat calls (they were taking ~19 ms
each).

I'm going to do a third trial where I let things rest
overnight, after I write the files.  That delay is the
only thing I haven't reproduced from my first trial....

-- Jim

> -Greg
> 
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html