Re: md-cache improvements

On Thu, Aug 11, 2016 at 9:31 AM, Raghavendra G <raghavendra@xxxxxxxxxxx> wrote:
A couple more areas to explore:
1. Purging the kernel dentry cache and/or page cache too. Because of patch [1], an upcall notification can result in a call to inode_invalidate(), which sends an "invalidate" notification to the fuse kernel module. While I am sure that this notification purges the kernel page cache, I am not sure about dentries. I assume that if an inode is invalidated, the next access should result in a lookup (from the kernel to glusterfs). Nevertheless, we should look into the differences between entry invalidation and inode invalidation and harness each appropriately (see the sketch after the reference list below).

2. Granularity of invalidation. For example, we should not purge the kernel page cache because of a change in an xattr used by an xlator (e.g., the dht layout xattr). We have to make sure that [1] handles this. We need to add more granularity to invalidation (internal xattr invalidation, user xattr invalidation, entry invalidation in the kernel, page-cache invalidation in the kernel, attribute/stat invalidation in the kernel, etc.) and use these judiciously, while making sure other cached data remains present (a hypothetical flag layout is sketched below).

To stress the importance of this point, note that with tier there can be constant migration of files, which can result in spurious (from the application's perspective) invalidations even though the application is not writing to the files [2][3][4]. Also, even if the application is writing to a file, there is no point in invalidating the dentry cache. We should explore more ways to solve [2][3][4].

3. We have a long-standing issue of spurious termination of the fuse invalidation thread. Since the thread is not re-spawned after termination, we would no longer be able to purge the kernel entry/attribute/page cache. This issue was touched upon during a discussion [5], though we did not solve the problem then for lack of bandwidth. Csaba has agreed to work on this issue.

[1] http://review.gluster.org/12951
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1293967#c7
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1293967#c8
[4] https://bugzilla.redhat.com/show_bug.cgi?id=1293967#c9
[5] http://review.gluster.org/#/c/13274/1/xlators/mount/fuse/src/fuse-bridge.c
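
To illustrate the difference in point 1, here is a minimal sketch using libfuse 3's lowlevel notification API. (glusterfs's fuse-bridge writes the corresponding FUSE_NOTIFY_INVAL_* messages to /dev/fuse itself, so the calls below are only meant to show the semantics, not our implementation.) Inode invalidation drops the kernel's attribute and page cache for one inode; entry invalidation drops a single dentry under a parent so the next access forces a fresh LOOKUP:

    #include <fuse_lowlevel.h>
    #include <string.h>

    /* Drop the kernel's attribute cache and cached pages for an inode.
     * off = 0, len = 0 asks the kernel to invalidate all cached pages. */
    static int
    purge_inode (struct fuse_session *se, fuse_ino_t ino)
    {
            return fuse_lowlevel_notify_inval_inode (se, ino, 0, 0);
    }

    /* Drop one dentry (parent, name) so that the next access triggers
     * a fresh LOOKUP from the kernel to glusterfs. */
    static int
    purge_entry (struct fuse_session *se, fuse_ino_t parent, const char *name)
    {
            return fuse_lowlevel_notify_inval_entry (se, parent, name,
                                                     strlen (name));
    }

For point 2, here is a purely hypothetical sketch of how invalidation granularity could be expressed as a bitmask carried with the upcall; none of these flag names exist today, they just restate the list above:

    /* Hypothetical invalidation flags; NOT existing gluster symbols. */
    enum mdc_inval_flags {
            MDC_INVAL_INTERNAL_XATTR = 1 << 0,  /* xlator metadata, e.g. dht layout */
            MDC_INVAL_USER_XATTR     = 1 << 1,  /* user xattrs                      */
            MDC_INVAL_KERNEL_ENTRY   = 1 << 2,  /* dentry in the kernel             */
            MDC_INVAL_KERNEL_PAGES   = 1 << 3,  /* page cache in the kernel         */
            MDC_INVAL_KERNEL_STAT    = 1 << 4,  /* attributes/stat in the kernel    */
    };

    /* A dht layout change would then carry only MDC_INVAL_INTERNAL_XATTR,
     * leaving kernel dentries and page cache untouched, while a writev on
     * the brick would carry MDC_INVAL_KERNEL_PAGES | MDC_INVAL_KERNEL_STAT. */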


On Wed, Aug 10, 2016 at 10:35 PM, Dan Lambright <dlambrig@xxxxxxxxxx> wrote:

There have been recurring discussions within the gluster community about building on the existing md-cache and upcall support to improve performance for small-file workloads. In certain cases, "lookup amplification" dominates data transfers, i.e. the cumulative round-trip time of multiple LOOKUPs from the client offsets the benefits of faster backend storage.

To tackle this problem, one suggestion is to use md-cache more aggressively than is currently done to cache inodes on the client. The inodes would remain cached until they are invalidated by the server.

Several gluster development engineers within the DHT, NFS, and Samba teams have been involved with related efforts, which have been underway for some time now. At this juncture, comments are requested from gluster developers.

(1) .. help call out where additional upcalls would be needed to invalidate stale client cache entries (in particular, need feedback from DHT/AFR areas),

(2) .. identify failure cases, when we cannot trust the contents of md-cache, e.g. when an upcall may have been dropped by the network

(3) .. point out additional improvements that md-cache needs; for example, it cannot be allowed to grow unbounded (a minimal bounded-cache sketch follows below).
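
On (3), here is a minimal sketch of what a bound could look like, assuming nothing about md-cache's current internals (the structure and function names below are invented for illustration): keep cached entries on an LRU list under the cache lock and evict from the tail once a configurable limit is crossed.

    #include <pthread.h>
    #include <stdlib.h>

    /* Illustrative only; not md-cache's actual structures. */
    struct mdc_entry {
            struct mdc_entry *prev;
            struct mdc_entry *next;
            /* cached iatt/xattr data would hang off here */
    };

    struct mdc_lru {
            pthread_mutex_t   lock;
            struct mdc_entry  head;    /* sentinel: head.next = most recent */
            size_t            count;
            size_t            limit;   /* a configurable cache-size option  */
    };

    static void
    mdc_lru_init (struct mdc_lru *lru, size_t limit)
    {
            pthread_mutex_init (&lru->lock, NULL);
            lru->head.next = lru->head.prev = &lru->head;
            lru->count = 0;
            lru->limit = limit;
    }

    static void
    lru_unlink (struct mdc_entry *e)
    {
            e->prev->next = e->next;
            e->next->prev = e->prev;
    }

    static void
    lru_push_front (struct mdc_lru *lru, struct mdc_entry *e)
    {
            e->next = lru->head.next;
            e->prev = &lru->head;
            lru->head.next->prev = e;
            lru->head.next = e;
    }

    /* Move an entry to the front of the list on every cache hit. */
    static void
    mdc_lru_touch (struct mdc_lru *lru, struct mdc_entry *e)
    {
            pthread_mutex_lock (&lru->lock);
            lru_unlink (e);
            lru_push_front (lru, e);
            pthread_mutex_unlock (&lru->lock);
    }

    /* Insert a new entry; evict the least recently used one (the tail)
     * once the configured limit is crossed. */
    static void
    mdc_lru_insert (struct mdc_lru *lru, struct mdc_entry *e)
    {
            pthread_mutex_lock (&lru->lock);
            lru_push_front (lru, e);
            if (++lru->count > lru->limit) {
                    struct mdc_entry *victim = lru->head.prev;
                    lru_unlink (victim);
                    lru->count--;
                    free (victim);  /* md-cache would drop the inode ctx instead */
            }
            pthread_mutex_unlock (&lru->lock);
    }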

Dan

----- Original Message -----
> From: "Raghavendra Gowdappa" <rgowdapp@xxxxxxxxxx>
>
> List of areas where we need invalidation notification:
> 1. Any changes to xattrs used by xlators to store metadata (like the dht
> layout xattr, afr xattrs, etc.).
> 2. Scenarios where an individual xlator feels it needs a lookup. For
> example, a failed directory creation on the non-hashed subvol in dht during
> mkdir. Though dht reports the mkdir as successful, it would be better not to
> cache this inode, as a subsequent lookup will heal the directory and make
> things better.
> 3. Removal of files.
> 4. writev on the brick (to invalidate the read cache on the client).
>
> Other questions:
> 5. Does md-cache have cache management, like LRU or an upper limit on the cache?
> 6. Network disconnects and invalidating the cache. When a network disconnect
> happens, we need to invalidate the cache for inodes present on that brick, as
> we might have missed some notifications. The current approach of purging the
> cache of all inodes might not be optimal, as it might roll back the benefits
> of caching. Also, please note that network disconnects are not rare events.
>
> regards,
> Raghavendra
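
Regarding point 6 in the quoted list: rather than purging every cached inode on a disconnect, one option is to remember which subvolume each cached entry was populated from and drop only that subvolume's entries when the disconnect (e.g. a CHILD_DOWN notification) arrives. A purely illustrative sketch, with invented names rather than actual gluster structures:

    /* Illustrative only; not md-cache internals. */
    struct mdc_cached {
            struct mdc_cached *next;
            int                subvol_id;  /* which child served this entry  */
            int                valid;      /* cleared = treat entry as stale */
    };

    /* On losing one child, mark only the entries served by that child as
     * stale instead of purging the whole cache, so everything cached from
     * the still-connected bricks keeps its benefit. */
    static void
    mdc_invalidate_subvol (struct mdc_cached *list, int down_subvol)
    {
            struct mdc_cached *e;

            for (e = list; e != NULL; e = e->next) {
                    if (e->subvol_id == down_subvol)
                            e->valid = 0;
            }
    }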



--
Raghavendra G



_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel
