Re: Upcall state + Data Tiering


Please see this thread on the same/similar problem,
http://www.gluster.org/pipermail/gluster-devel/2014-December/043284.html

This was discussed particularly for lease locks (when that feature was being discussed), and for tier, as tiering moves files more frequently.

The solution outline is for the rebalance process to migrate the locks, with some additional coordination with the locks/lease/upcall xlators.

The problem, however, is _mapping_ all of the lock information across the 2 different storage-node brick processes (i.e., the client_t information).
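
To make that mapping problem concrete, here is a minimal, purely illustrative C sketch (none of these names come from the actual locks/upcall xlators): whatever record gets migrated has to carry a brick-independent identifier such as the client UID string, because a client_t pointer held on the source brick means nothing to the destination brick process.

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, brick-independent lock record. The source brick would fill
 * it in from its client_t/lock tables; the destination brick can interpret
 * it without ever seeing the source brick's client_t pointers. */
struct portable_lock_entry {
        char     client_uid[256];  /* stable client identifier */
        uint64_t lk_owner;         /* lock-owner token supplied by the client */
        uint32_t type;             /* read or write lock */
        uint64_t start;            /* byte-range start */
        uint64_t len;              /* byte-range length, 0 = to EOF */
};

/* Flatten one entry into a text record that could travel inside an xattr. */
static int
pack_lock_entry(const struct portable_lock_entry *e, char *buf, size_t size)
{
        return snprintf(buf, size, "%s:%llu:%u:%llu:%llu",
                        e->client_uid,
                        (unsigned long long)e->lk_owner,
                        (unsigned)e->type,
                        (unsigned long long)e->start,
                        (unsigned long long)e->len);
}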

Shyam

On 04/20/2015 02:48 PM, Soumya Koduri wrote:
Thanks for your inputs. Replies inline.

-Soumya

On 04/20/2015 07:48 AM, Joseph Fernandes wrote:
Adding more to Dan's Reply,

In tiering we lose the heat of the file (collected on the source brick) when the file gets migrated by the DHT rebalancer within the tier.
We would like to leverage the common solution infrastructure for passing this extra metadata to the destination.
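
As a hedged illustration of what passing this extra metadata could look like (the xattr name and counter layout below are made up, not the actual tier/ctr schema), the heat counters could simply ride along as an extended attribute that the migrator copies from the source file to the destination before the move is declared complete:

#include <stdint.h>
#include <stdio.h>
#include <sys/xattr.h>

/* Illustrative heat counters; the real tiering database tracks more. */
struct heat_record {
        uint64_t read_hits;
        uint64_t write_hits;
};

#define HEAT_XATTR "trusted.tier.heat-sketch"   /* illustrative key */

/* Read the counters off the source file and stamp them on the destination
 * file, so the heat survives the migration. */
static int
carry_heat(const char *src_path, const char *dst_path)
{
        struct heat_record heat = {0};

        if (getxattr(src_path, HEAT_XATTR, &heat, sizeof(heat)) < 0) {
                perror("getxattr");
                return -1;
        }
        if (setxattr(dst_path, HEAT_XATTR, &heat, sizeof(heat), 0) < 0) {
                perror("setxattr");
                return -1;
        }
        return 0;
}

Riding on an xattr keeps this on the same path the rebalancer already uses for file metadata, which is the appeal of a common solution.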


----- Original Message -----
From: "Dan Lambright" <dlambrig@xxxxxxxxxx>
To: "Niels de Vos" <ndevos@xxxxxxxxxx>
Cc: "Joseph Fernandes" <josferna@xxxxxxxxxx>, "gluster Devel"
<gluster-devel@xxxxxxxxxxx>, "Soumya Koduri" <skoduri@xxxxxxxxxx>
Sent: Monday, April 20, 2015 4:01:12 AM
Subject: Re:  Upcall state + Data Tiering



----- Original Message -----
From: "Niels de Vos" <ndevos@xxxxxxxxxx>
To: "Dan Lambright" <dlambrig@xxxxxxxxxx>, "Joseph Fernandes"
<josferna@xxxxxxxxxx>
Cc: "gluster Devel" <gluster-devel@xxxxxxxxxxx>, "Soumya Koduri"
<skoduri@xxxxxxxxxx>
Sent: Sunday, April 19, 2015 9:01:56 AM
Subject: Re:  Upcall state + Data Tiering

On Thu, Apr 16, 2015 at 04:58:29PM +0530, Soumya Koduri wrote:
Hi Dan/Joseph,

As part of upcall support on the server side, we maintain certain state to notify clients of the cache-invalidation and recall-leaselk events.

We have certain known limitations with Rebalance and Self-Heal. Details are in the link below -
http://www.gluster.org/community/documentation/index.php/Features/Upcall-infrastructure#Limitations


In case of Cache-invalidation,
the upcall state is not migrated, and once the rebalance is finished the source file is deleted; we may then falsely notify the client that the file has been deleted when in reality it isn't.

In case of Lease-locks,
as with posix locks, we do not migrate lease-locks either, but will end up recalling the lease-lock.

Here rebalance is an admin-driven job, but that is not the case with Data tiering.

We would like to know when files are moved from the hot to the cold tier or vice-versa, or rather when a file is considered to be migrated from the cold to the hot tier, as that is where we see potential issues.
Is it the first fop which triggers the migration? And where are the subsequent fops processed - on the hot tier or the cold tier?

Data tiering's basic design has been to reuse DHT's data-migration algorithms. The same problem therefore exists with DHT, but there it is a known limitation controlled by management operations. So (if I follow) the DHT developers may not tackle this problem right away, and hence tiering may not be able to leverage their solution. It is of course desirable for data tiering to solve the problem in order to use the new upcall mechanisms.

Migration of a file is a multi-state process, and I/O is accepted while the migration is underway. I believe the upcall manager and the migration manager (for lack of better words) would have to coordinate: the former subsystem understands locks, and the latter how to move files.

With that "coordination" in mind, a basic strategy might be something
like this:

On the source, when a file is ready to be moved, the migration manager
informs the upcall manager.

The upcall manager packages relevant lock information and returns it
to the migrator.  The information reflects the state of posix or lease
locks.

The migration manager moves the file.

The migration manager then sends the lock information to the destination as a virtual extended attribute.

On the destination server, the upcall manager is invoked. It is passed
the contents of the virtual attributes. The upcall manager rebuilds
the lock state and puts the file into proper order.
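
A minimal sketch of that flow, assuming a hypothetical virtual xattr ("glusterfs.lock-state-sketch") through which the locks/upcall xlators export and import the packaged state. A real implementation would sit in dht-rebalance.c and use internal syncop calls rather than path-based syscalls; this is only to fix ideas:

#include <stdio.h>
#include <sys/xattr.h>

#define LOCK_STATE_XATTR "glusterfs.lock-state-sketch"  /* illustrative */

static int
migrate_file_with_locks(const char *src, const char *dst)
{
        char    package[64 * 1024];
        ssize_t len;

        /* 1. Source upcall/locks xlators package the lock state. */
        len = getxattr(src, LOCK_STATE_XATTR, package, sizeof(package));
        if (len < 0) {
                perror("package lock state");
                return -1;
        }

        /* 2. Move the file data (left out here; dht-rebalance does this). */

        /* 3. Hand the package to the destination; the destination upcall
         *    manager rebuilds the lock state from it. Only after this
         *    setxattr returns is the file declared "migrated". */
        if (setxattr(dst, LOCK_STATE_XATTR, package, (size_t)len, 0) < 0) {
                perror("rebuild lock state");
                return -1;
        }
        return 0;
}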

This approach sounds good. The same has been followed to migrate locks in case of a graph switch on the client side ("glfs_migrate_fd_locks_safe"). The rebalance process (and maybe the self-heal daemon) could likewise transfer the state using similar xattrs.

Only at that point does the setxattr RPC return, and only then is the file declared "migrated".

We would have to handle any changes to the lock state that occur while the file is in the middle of being migrated. Probably, the upcall manager would update the contents of the "package".
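
One hedged way to do that (again purely illustrative; the generation xattr below does not exist) would be for the source to keep a generation counter on its lock table, and for the migrator to re-package and re-send until the counter is unchanged across a transfer, just before declaring the file migrated:

#include <stdint.h>
#include <stdio.h>
#include <sys/xattr.h>

#define LOCK_STATE_XATTR "glusterfs.lock-state-sketch"   /* as in the earlier sketch */
#define LOCK_GEN_XATTR   "glusterfs.lock-gen-sketch"     /* illustrative */

/* Repeat package/transfer until no new locks were taken on the source
 * while the previous package was in flight. */
static int
finalize_lock_transfer(const char *src, const char *dst)
{
        char     package[64 * 1024];
        uint64_t gen_before = 0, gen_after = 1;
        ssize_t  len;

        while (gen_before != gen_after) {
                if (getxattr(src, LOCK_GEN_XATTR, &gen_before,
                             sizeof(gen_before)) < 0)
                        return -1;

                len = getxattr(src, LOCK_STATE_XATTR, package,
                               sizeof(package));
                if (len < 0)
                        return -1;

                if (setxattr(dst, LOCK_STATE_XATTR, package,
                             (size_t)len, 0) < 0)
                        return -1;

                if (getxattr(src, LOCK_GEN_XATTR, &gen_after,
                             sizeof(gen_after)) < 0)
                        return -1;
        }
        return 0;
}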

This sounds tricky. I do not know at the moment how data changes to files that are being rebalanced are handled, but maybe we could follow along the same lines.

It is desirable to invent something that would work with both DHT and tiering (i.e. implemented at the core dht-rebalance.c layer). In fact, the mechanism I describe could be useful for other metadata-transfer applications.

Agree. Thanks again for your inputs. I shall open a BZ capturing all
these details.

This is just a high-level sketch, meant to provoke discussion and to check whether this is the right direction. It would take time to sort through the details. Other ideas are welcome.



My understanding is the following:

- when a file is "cold" and gets accessed, the 1st FOP will mark the
   file for migration to the "hot" tier
- migration is async, so the initial responses on FOPs would come from
   the "cold" tier
- upon migration (similar to rebalance), locking state and upcall
   tracking are lost

I think this is a problem. There seems to be a window where a client can get (posix) locks while the file is on the "cold" tier. After migrating the file from "cold" to "hot", these locks would get lost. The same holds for the access tracking in the upcall xlator.

Please provide your inputs on this. We may need to document it, or provide guidance to customers when deploying this solution.

Some ideas on how this can get solved would be most welcome.

Thanks,
Niels

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel



