Re: Consistency vs efficiency

I did some basic debugging, and that's another thing I wanted to add to
the ceph kernel code: I started off with a simple message print and am
working on it now. At some point I want to be able to use this method at
various places for debugging outgoing and incoming messages.
But yes, going back to your question: I do see repeated lookups. I
still need to look more carefully at whether the individual lookups
differ in any way. Also interesting is that after the first lookup
(followed by a mkdir), a couple of subsequent lookups still fail.

thanks again
Jojy

On Thu, Jul 21, 2011 at 10:55 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Thu, 21 Jul 2011, Jojy Varghese wrote:
>> Thanks for the response, Sage. We are using the 2.6.39 kernel, and in
>> the "ceph_lookup" method I see that there is a shortcut for deciding
>> ENOENT, but after the MDS lookup I don't see a d_add. I am sure I am
>> missing something here.
>
>                        dout(" dir %p complete, -ENOENT\n", dir);
>                        d_add(dentry, NULL);
>
> ...but that is only for the negative lookup in a directory with the
> 'complete' flag set.  And it's never set currently because we don't have
> d_prune yet (and the old use of d_release was racy).  So ignore this part
> for now!
>
> You have an existing, unchanging, directory that you're seeing repeated
> lookups on, right?  Like the top-level directory in the hierarchy you're
> copying?  And the client is doing repeated lookups on the same name?
>
> The way to debug this is probably to start with the messages passing to
> the MDS and verifying that lookups are duplicated.  Then enable the
> logging on the kernel client and see why the client isn't using leases or
> the FILE_SHARED cap to avoid them.  We can help you through that on #ceph
> if you like.
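For the kernel-client logging step, the dout() output can typically be enabled through dynamic debug (a sketch assuming a kernel built with CONFIG_DYNAMic_DEBUG and the standard debugfs path; verify both on your system):

```shell
# Mount debugfs if it is not already mounted.
mount -t debugfs none /sys/kernel/debug 2>/dev/null

# Turn on pr_debug()/dout() output for the ceph (and libceph) modules.
echo 'module ceph +p'    > /sys/kernel/debug/dynamic_debug/control
echo 'module libceph +p' > /sys/kernel/debug/dynamic_debug/control

# Re-run the copy, then look for the duplicated lookups in the kernel log.
dmesg | grep -i lookup | tail
```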
>
> sage
>
>
>>
>> thanks again
>> Jojy
>>
>> On Thu, Jul 21, 2011 at 9:49 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>> > On Thu, 21 Jul 2011, Jojy Varghese wrote:
>> >> Hi
>> >>   I just started looking at the ceph code in the kernel and had a
>> >> question about performance considerations for lookup operations. I
>> >> noticed that for every operation (say copying a directory), the root
>> >> dentry is looked up multiple times, and since these lookups all go
>> >> to the MDS, performance suffers. I am sure consistency is the winner
>> >> here. Is there any plan to improve this, maybe by having the MDS
>> >> push the capability down to the clients when the dentry is updated,
>> >> say from CAP_EXCL to CAP_SHARED when the dentry is modified? This
>> >> way the client node can cache the lookup result and does not have to
>> >> make a round trip to the MDS.
>> >
>> > In general, the MDS has two ways of keeping a client's cached dentry
>> > consistent:
>> >
>> >  - it can issue the FILE_SHARED capability bit on the parent directory,
>> > which means the entire directory is static and the client can cache
>> > dentries.
>> >  - if it can't do that, it will issue a per-dentry lease
>> >
>> > There is an additional 'complete' bit that is used to indicate on the
>> > client that it has the _entire_ directory in cache.  If set, it can do
>> > negative lookups and readdir without hitting the MDS.  That's currently
>> > broken, pending the addition of a d_prune dentry_operation (see
>> > linux-fsdevel email from July 8).
>> >
>> > Anyway, long story short, if you're seeing repeated lookups on a dentry
>> > that isn't changing, something is broken.  Can you describe the workload
>> > in more detail?  Which versions of the client and mds are you running?
>> >
>> > sage
>> >
>> >
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>

