Re: Upcalls Infrastructure

Hi,

The patches below include preliminary upcall framework support, but only the 'cache_invalidation' use case is handled so far.

http://review.gluster.org/#/c/9534/
http://review.gluster.org/#/c/9535/
http://review.gluster.org/#/c/9536/

Kindly review the changes.

Lease_Lock support changes will be submitted in the new patches after addressing the proposed changes discussed in the earlier mail.

Thanks,
Soumya


On 02/19/2015 12:30 PM, Soumya Koduri wrote:
Hi,

We have recently uncovered a few issues with respect to lease_locks
support and have had discussions around them. Thanks to everyone involved.

So the new changes proposed in the design (in addition to the ones
discussed in the earlier mail) are -

* Earlier, in case a client takes a lease-lock and a conflicting fop
is requested by another client, a RECALL_LEASE CBK event is sent to
the first client, and till the first client unlocks the LEASE_LOCK, we
send an EDELAY/ERETRY error to the conflicting fops. This works for
protocol clients (like NFS/SMB), which keep retrying on receiving that
error, but not for FUSE clients or any of the other auxiliary services
(like rebalance/self-heal/quota), which error out immediately.

So to resolve that, we choose to block the fops based on the flags
passed (by default 'BLOCK', or 'NON_BLOCK' in case of protocol clients).
The blocking will be done in the same way the current locks xlator
blocks lock requests (maintain a queue of call stubs and wake them up
once the LEASE_LOCK is released/recalled), roughly as in the sketch below.
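
Roughly, the blocking could look like this (illustrative sketch only, not
the actual locks/upcall xlator code; all names here are made up):

#include <pthread.h>
#include <stdlib.h>

typedef void (*fop_resume_fn) (void *fop_ctx);

/* One parked (blocked) fop, kept as a call-stub-like continuation. */
struct blocked_stub {
        struct blocked_stub *next;
        fop_resume_fn        resume;
        void                *fop_ctx;
};

struct lease_state {
        pthread_mutex_t      mutex;
        int                  lease_held;    /* non-zero while LEASE_LOCK is held */
        struct blocked_stub *blocked;       /* parked conflicting fops */
};

enum { LEASE_FLAG_BLOCK = 0, LEASE_FLAG_NONBLOCK = 1 };

/* Returns 0 if the fop may proceed, -1 if the caller should reply with
 * EDELAY/ERETRY (NON_BLOCK), and 1 if the fop was parked (BLOCK). */
static int
lease_check_conflict (struct lease_state *ls, int flags,
                      fop_resume_fn resume, void *fop_ctx)
{
        int ret = 0;

        pthread_mutex_lock (&ls->mutex);
        if (ls->lease_held) {
                if (flags == LEASE_FLAG_NONBLOCK) {
                        ret = -1;
                } else {
                        struct blocked_stub *stub = calloc (1, sizeof (*stub));
                        if (!stub) {
                                ret = -1;              /* fall back to EDELAY/ERETRY */
                        } else {
                                stub->resume  = resume;
                                stub->fop_ctx = fop_ctx;
                                stub->next    = ls->blocked;  /* ordering/fairness elided */
                                ls->blocked   = stub;
                                ret = 1;
                        }
                }
        }
        pthread_mutex_unlock (&ls->mutex);
        return ret;
}

/* Called when the LEASE_LOCK is released or forcefully recalled:
 * resume every parked fop. */
static void
lease_release_and_wake (struct lease_state *ls)
{
        struct blocked_stub *stub;

        pthread_mutex_lock (&ls->mutex);
        ls->lease_held = 0;
        stub = ls->blocked;
        ls->blocked = NULL;
        pthread_mutex_unlock (&ls->mutex);

        while (stub) {
                struct blocked_stub *next = stub->next;
                stub->resume (stub->fop_ctx);
                free (stub);
                stub = next;
        }
}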

* Earlier, when a lease_lk request comes in, the upcall xlator maps it to
a POSIX lock on the entire file before granting it. In case the same
client then takes an fcntl lock, it gets merged with the earlier lock
taken, and unlocking either of the locks results in the loss of lock
state.

To avoid that, we plan to define a new lk_entry type (LEASE_LOCK) in the
'locks' xlator to store lease locks, and add support to not merge it with
locks of any other type (see the sketch below).
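
For instance (sketch only; the actual structure names in the locks xlator
will differ), the merge check could be gated on a lock-entry type:

/* Sketch: tag each lock entry with its type so that a LEASE_LOCK entry
 * is never merged with regular POSIX/fcntl locks on the same file. */
typedef enum {
        LK_TYPE_POSIX = 0,   /* regular fcntl-style byte-range lock */
        LK_TYPE_LEASE = 1,   /* new LEASE_LOCK entry */
} lk_entry_type_t;

struct lk_entry {
        lk_entry_type_t type;
        /* owner, range, fd, ... elided */
};

/* Merge two entries only when neither is a lease lock, so that
 * unlocking one can never wipe out the other's state. */
static int
lk_entries_mergeable (struct lk_entry *a, struct lk_entry *b)
{
        if (a->type == LK_TYPE_LEASE || b->type == LK_TYPE_LEASE)
                return 0;
        return 1;   /* plus the usual owner/range checks, elided here */
}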

* In addition, before granting a lease_lock, we now check if there are
existing open fds on the file with conflicting access. If yes, the
lease_lock will not be granted (sketched below).
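
The open-fd conflict check could look roughly like this (illustrative
only; the per-inode fd list and field names are assumptions):

#include <fcntl.h>

/* Hypothetical per-inode list of open fds and the flags used at open(). */
struct open_fd {
        struct open_fd *next;
        int             flags;    /* flags passed to open() */
        void           *owner;    /* client that opened this fd */
};

enum lease_type { LEASE_READ, LEASE_WRITE };

/* Returns 1 if an existing open fd conflicts with the requested lease:
 * any foreign open fd blocks a write lease, a foreign writer blocks a
 * read lease. */
static int
lease_conflicts_with_open_fds (struct open_fd *fds, enum lease_type lease,
                               void *requesting_client)
{
        for (; fds; fds = fds->next) {
                if (fds->owner == requesting_client)
                        continue;
                if (lease == LEASE_WRITE)
                        return 1;
                if ((fds->flags & O_ACCMODE) != O_RDONLY)
                        return 1;
        }
        return 0;
}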

* While sending the RECALL_LEASE CBK event, a new timer event will be
registered to notify us in case of a recall timeout, so that we can purge
the lease locks forcefully and wake up the blocked fops (see the sketch
below).
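
Roughly (sketch only; the real code would hook into the gluster timer
infrastructure rather than a polling thread like this one):

#include <unistd.h>

/* Arguments for a recall-timeout watcher started when RECALL_LEASE is sent. */
struct recall_timer_args {
        unsigned int timeout_sec;                /* recall timeout           */
        void        *lease;                      /* lease being recalled     */
        int        (*lease_released) (void *);   /* has the client complied? */
        void       (*purge_and_wake) (void *);   /* forced cleanup + wakeup  */
};

static void *
recall_timer_thread (void *arg)
{
        struct recall_timer_args *a = arg;
        unsigned int waited = 0;

        while (waited < a->timeout_sec) {
                if (a->lease_released (a->lease))
                        return NULL;             /* released in time, nothing to do */
                sleep (1);
                waited++;
        }
        /* Recall timed out: purge the lease forcefully and wake blocked fops. */
        a->purge_and_wake (a->lease);
        return NULL;
}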

* A few enhancements which may be considered:
     *   To start with, upcall entries are maintained in a linked list.
We may change it to an RB tree for better performance.
     *   Store upcall entries in the inode/fd ctx for faster lookup.

Thanks,
Soumya

On 01/22/2015 02:31 PM, Soumya Koduri wrote:
Hi,

I have updated the feature page with more design details and the
dependencies/limitations this support has.

http://www.gluster.org/community/documentation/index.php/Features/Upcall-infrastructure#Dependencies



Kindly check the same and provide your inputs.

A few of them which may be addressed for the 3.7 release are -

*AFR/EC*
     - In case of replica bricks maintained by AFR, the upcall state is
maintained and processed on all the replica bricks. This will result in
duplicate notifications being sent by all those bricks in case of
non-idempotent fops.
     - Hence we need support in AFR to filter out such duplicate
callback notifications. Similar support is needed for EC as well.
     - One of the approaches suggested by the AFR team is to cache the
upcall notifications received for around 1 min (their current lifetime)
to detect and filter out the duplicate notifications sent by the replica
bricks, roughly as sketched below.
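
The caching approach could be sketched like this (illustrative only; the
key, cache size and lifetime handling are assumptions, and locking is
elided):

#include <string.h>
#include <time.h>

#define DEDUP_CACHE_SIZE  256
#define DEDUP_LIFETIME    60      /* seconds, matching the upcall lifetime */

struct seen_notification {
        char   gfid[16];          /* file the event refers to */
        int    event_type;        /* e.g. cache invalidation, recall lease */
        time_t received_at;
};

static struct seen_notification cache[DEDUP_CACHE_SIZE];

/* Returns 1 if this notification was already seen recently (a duplicate
 * from another replica brick), 0 if it is new and should be processed. */
static int
notification_is_duplicate (const char gfid[16], int event_type)
{
        time_t now = time (NULL);
        int    free_slot = -1;

        for (int i = 0; i < DEDUP_CACHE_SIZE; i++) {
                if (now - cache[i].received_at > DEDUP_LIFETIME) {
                        if (free_slot < 0)
                                free_slot = i;        /* expired slot, reusable */
                        continue;
                }
                if (cache[i].event_type == event_type &&
                    memcmp (cache[i].gfid, gfid, 16) == 0)
                        return 1;                     /* duplicate within lifetime */
        }
        if (free_slot >= 0) {
                memcpy (cache[free_slot].gfid, gfid, 16);
                cache[free_slot].event_type  = event_type;
                cache[free_slot].received_at = now;
        }
        return 0;
}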


*Cleanup during network disconnect - protocol/server*
    - At present, in case of a network disconnect between the glusterfs
server and the client, protocol/server looks up the fd table associated
with that client and sends a 'flush' op for each of those fds to clean up
the locks associated with them.

    - We need similar support to flush the lease locks taken. Hence,
while granting the lease-lock, we plan to associate that upcall_entry
with the corresponding fd_ctx or inode_ctx so that it can be easily
tracked if it needs to be cleaned up. This will also help in faster
lookup of the upcall entries while processing fops on the same fd/inode.
A rough sketch of the disconnect cleanup follows.
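
Sketch only; all structures and names here are illustrative, not the
actual protocol/server or upcall code:

struct upcall_entry;                     /* lease/notification state, elided */

/* Hypothetical per-fd context that remembers the owning client and any
 * lease granted on this fd. */
struct fd_ctx_entry {
        struct fd_ctx_entry *next;
        void                *client;        /* owner of this fd */
        struct upcall_entry *lease_entry;   /* set while a lease is held */
};

static void
lease_purge (struct upcall_entry *entry)
{
        (void) entry;   /* release the lease and wake blocked fops; elided */
}

/* Analogous to the 'flush' sent for POSIX locks: walk the disconnected
 * client's fds and drop any lease state tracked in their contexts. */
static void
cleanup_client_leases (struct fd_ctx_entry *fd_table, void *disconnected_client)
{
        for (struct fd_ctx_entry *e = fd_table; e; e = e->next) {
                if (e->client != disconnected_client)
                        continue;
                if (e->lease_entry) {
                        lease_purge (e->lease_entry);
                        e->lease_entry = NULL;
                }
        }
}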

Note: The above cleanup is done only for the upcall state associated with
lease-locks. The other entries maintained (e.g., for cache-invalidations)
will be cleaned up by the reaper thread (which will be used to clean up
the expired entries in this xlator) once they expire; a rough sketch of
that thread is below.
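
Illustrative only; the list layout and lifetime handling are assumptions:

#include <pthread.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

struct upcall_entry {
        struct upcall_entry *next;
        time_t               access_time;   /* last time this entry was refreshed */
};

struct upcall_list {
        pthread_mutex_t      mutex;
        struct upcall_entry *head;
        unsigned int         lifetime;      /* seconds before an entry expires */
};

/* Periodically scan the upcall entries and free the expired ones. */
static void *
upcall_reaper_thread (void *arg)
{
        struct upcall_list *list = arg;

        for (;;) {
                time_t now = time (NULL);

                pthread_mutex_lock (&list->mutex);
                struct upcall_entry **pp = &list->head;
                while (*pp) {
                        struct upcall_entry *e = *pp;
                        if (now - e->access_time > list->lifetime) {
                                *pp = e->next;        /* unlink and free expired entry */
                                free (e);
                        } else {
                                pp = &e->next;
                        }
                }
                pthread_mutex_unlock (&list->mutex);

                sleep (list->lifetime);               /* rescan once per lifetime */
        }
        return NULL;
}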

*Replay of the lease-locks state*
   - At present, replay of locks by the client xlator (after a network
disconnect and reconnect) seems to have been disabled.
   - But when it is re-enabled, we need to add support to replay the
lease-locks taken as well.
   - Till then, this will be considered a limitation and will be
documented, as suggested by KP.

Thanks,
Soumya


On 12/16/2014 09:36 AM, Krishnan Parthasarathi wrote:

- Is there a new connection from glusterfsd (upcall xlator) to
    a client accessing a file? If so, how does the upcall xlator reuse
    connections when the same client accesses multiple files, or does it?

No. We are using the same connection which the client initiates to send
in fops. Thanks for pointing me initially to the 'client_t' structure.
As these connection details are available only in the server xlator, I
am passing them to the upcall xlator by storing them in
'frame->root->client'. A self-contained sketch of the idea is below.
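
Not gluster code, just a model of the idea: the server layer below plays
the role of protocol/server filling in frame->root->client, and the
upcall layer reads it back instead of opening a new connection:

/* The server layer attaches the originating client's handle to the call
 * frame; the upcall layer reads it back to know where to notify. */
struct client_handle {
        const char *client_uid;   /* identifies the client connection */
        void       *transport;    /* rpc transport to notify over */
};

struct call_root {
        struct client_handle *client;   /* filled in by the server layer */
};

struct call_frame {
        struct call_root *root;
};

/* Upcall layer: find which connection to send the notification on. */
static struct client_handle *
upcall_client_of (struct call_frame *frame)
{
        return (frame && frame->root) ? frame->root->client : 0;
}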

- In the event of a network separation (i.e., a partition) between a
    client and a server, how does the client discover or detect that the
    server has 'freed' up its previously registered upcall notification?

The rpc connection details of each client are stored based on its
client-uid. So in case of a network partition, when the client comes back
online, IMO it re-initiates the connection (along with a new client-uid).

How would a client discover that a server has purged its upcall entries?
For instance, a client could assume that the server would notify it about
changes as before (while the server has purged the client's upcall
entries) and assume that it still holds the lease/lock. How would you
avoid that?

Please correct me if that's not the case. So there will be new entries
created/added in this xlator. However, we still need to decide on how to
clean up the old, timed-out and stale entries:
    * either clean up the entries as and when we find an expired or
stale entry (in case a notification fails),
    * or spawn a new thread which periodically scans through this
list and cleans up those entries.

There are a couple of aspects to resource cleanup in this context.
1) Time of cleanup; for e.g., on expiry of a timer.
2) Order of cleanup; this involves clearly establishing the relationships
    among inode, upcall entry and client_t(s). We should document this.

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel



