Re: CephFs kernel client metadata caching


 



Hi,


Thank you for your fast response!


Is there a way, that you know of, to list these locks?

I write to the file with echo "foo" >> /mnt/ceph/...something... so if there is any locking, shouldn't it be released once the append is done?


The strange thing is that this increased-traffic stage went on for hours, and I reproduced it many times. After I stop the watch for ~5s (I have not tried other intervals) and restart it, the traffic is gone, and there is only normal communication between the MDS and the client (some keepalive, I think): two packets every ~5s (request, response).


It is as if the metadata cache were only populated by a timer (somewhere between 1s and 5s) that never fires because of the repeated watch ls queries... just a shot in the dark.


Thanks:

Denes.


On 10/13/2017 01:32 PM, Burkhard Linke wrote:
Hi,


On 10/13/2017 12:36 PM, Denes Dolhay wrote:
Dear All,


First of all, this is my first post, so please be lenient :)


For the last few days I have been testing ceph, and cephfs, deploying a PoC cluster.

I have been testing the cephfs kernel client caching when I came across something strange, and I cannot decide whether it is a bug or I just messed something up.


Steps, given that client1 and client2 have both mounted the same cephfs with the extra mount option noatime:


Client 1: watch -n 1 ls -lah /mnt/cephfs

-in tcpdump I can see that the directory is being listed once and only once, all the following ls requests are served from the client cache


Client 2: make any modification, for example append to a file or delete a file directly under /mnt/cephfs

-The operation is done, and client1 is informed about the change OK.

-Client1 does not seem to cache the new metadata received from the metadata server; it now communicates with the MDS every second.


Client 1: stop watch ls... command, wait a few sec and restart it

-The communication stops, client1 serves ls data from cache
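
The client1 side of these steps could also be approximated without watch, as a rough cross-check next to tcpdump. This is only a sketch: the function names are made up for illustration, and listing latency is at best an indirect hint of whether the MDS was contacted, so the packet capture remains the authoritative signal.

```python
import os
import time


def timed_listing(path):
    """List `path` roughly like `ls -l` (name plus lstat) and time it."""
    start = time.perf_counter()
    entries = []
    with os.scandir(path) as it:
        for entry in it:
            # stat each entry without following symlinks, as ls -l does
            size = entry.stat(follow_symlinks=False).st_size
            entries.append((entry.name, size))
    elapsed = time.perf_counter() - start
    return elapsed, sorted(entries)


def probe(path, rounds=5, interval=1.0):
    """Repeat the listing `rounds` times, `interval` seconds apart.

    Returns the elapsed time of each round in seconds.
    """
    samples = []
    for i in range(rounds):
        elapsed, _ = timed_listing(path)
        samples.append(elapsed)
        if i + 1 < rounds:
            time.sleep(interval)
    return samples


if __name__ == "__main__":
    # /mnt/cephfs is the mount point used in the steps above
    for n, seconds in enumerate(probe("/mnt/cephfs"), start=1):
        print(f"round {n}: {seconds * 1000:.2f} ms")
```

Running it before and after the modification on client2 would show whether the per-listing time changes along with the traffic seen in tcpdump.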


Please help: if it is intentional, then why? If not, how can I debug it?

This is probably the intended behaviour. CephFS is a POSIX-compliant filesystem, and uses capabilities (similar to locks) to control concurrent access to directories and files.

In your first step, a capability for directory access is granted to client1. As soon as client2 wants to access the directory (probably read-only first for listing, write access later), the MDS has to check the capability requests with client1. I'm not sure about the details, but something similar to a "write lock" should be granted to client2, and client1 is granted a read lock or an "I have this entry in cache and need the MDS to know it" lock. That's also the reason why client1 has to ask the MDS every second whether its cache content is still valid. client2 probably still holds the necessary capabilities, so you might also see some traffic between MDS and client2.
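
To make the explanation above concrete, here is a toy model of it. It is not Ceph's actual capability protocol: the class and method names (ToyMDS, ToyClient, modify, release) are invented for illustration, and the rule "no cache capability while another client holds the write-like capability" is only an assumption that matches the behaviour described in this thread.

```python
class ToyMDS:
    """Toy metadata server: tracks who may cache a directory listing."""

    def __init__(self):
        self.entries = {"a.txt"}
        self.cachers = set()  # clients allowed to serve listings from cache
        self.writer = None    # client holding the write-like capability
        self.requests = 0     # listing requests that reached the MDS

    def list_dir(self, client):
        self.requests += 1
        # Assumption: a cache capability is only handed out while no
        # other client holds the write-like capability.
        if self.writer is None or self.writer is client:
            self.cachers.add(client)
        return set(self.entries)

    def modify(self, client, name):
        # Revoke cache capabilities from all other clients first.
        for other in list(self.cachers):
            if other is not client:
                other.cache = None
                self.cachers.discard(other)
        self.writer = client
        self.entries.add(name)

    def release(self, client):
        # The writer gives up (or loses) its capability.
        if self.writer is client:
            self.writer = None


class ToyClient:
    def __init__(self, mds):
        self.mds = mds
        self.cache = None

    def ls(self):
        if self.cache is not None and self in self.mds.cachers:
            return self.cache  # served locally, no MDS traffic
        listing = self.mds.list_dir(self)
        if self in self.mds.cachers:
            self.cache = listing
        return listing


def demo():
    """Return the MDS request count after each of the three phases."""
    mds = ToyMDS()
    client1, client2 = ToyClient(mds), ToyClient(mds)
    counts = []
    for _ in range(3):          # phase 1: repeated ls on client1
        client1.ls()
    counts.append(mds.requests)
    mds.modify(client2, "b.txt")
    for _ in range(3):          # phase 2: ls after client2 modified the dir
        client1.ls()
    counts.append(mds.requests)
    mds.release(client2)
    for _ in range(3):          # phase 3: ls after client2's cap is gone
        client1.ls()
    counts.append(mds.requests)
    return counts
```

In this model the first phase costs a single MDS request, the second costs one request per ls, and the third goes quiet again after one more request, which mirrors the three stages reported in the original message.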

I'm not sure why client1 does not continue to ask the MDS in the last step. Maybe the capability in client2 has expired and it was granted to client1. Others with more insight into the details of capabilities might be able to give you more details.

Short version: CephFS implements strict POSIX locking semantics via capabilities, and you need to be aware of this fact (especially if you are used to NFS...)

Regards,
Burkhard
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
