On 04/01/2015 02:50 PM, Sahina Bose wrote:
On 04/01/2015 02:30 PM, Aravinda wrote:
Hi,
Each node of the Master Cluster runs one Monitor process and one or more
worker processes, one per brick in that node.
The Monitor has a status file, which is updated by glusterd.
Possible status values in the monitor_status file are Created, Started,
Paused, and Stopped.
Geo-rep cannot be paused unless the monitor status is "Started".
Based on monitor_status, we need to hide some fields of the brick
status file from the user. For example, if the monitor status
is "Stopped", it makes no sense to show "Crawl Status" in the
Geo-rep status output. Below is a matrix of the possible values
for each Monitor status; VALUE represents the actual, unchanged value
from the brick status file.
Monitor Status --->          Created    Started    Paused     Stopped
----------------------------------------------------------------------
session                      VALUE      VALUE      VALUE      VALUE
brick                        VALUE      VALUE      VALUE      VALUE
node                         VALUE      VALUE      VALUE      VALUE
node_uuid                    VALUE      VALUE      VALUE      VALUE
volume                       VALUE      VALUE      VALUE      VALUE
slave_user                   VALUE      VALUE      VALUE      VALUE
slave_node                   N/A        VALUE      VALUE      N/A
status                       Created    VALUE      Paused     Stopped
last_synced                  N/A        VALUE      VALUE      VALUE
crawl_status                 N/A        VALUE      N/A        N/A
entry                        N/A        VALUE      N/A        N/A
data                         N/A        VALUE      N/A        N/A
meta                         N/A        VALUE      N/A        N/A
failures                     N/A        VALUE      VALUE      VALUE
checkpoint_completed         N/A        VALUE      VALUE      VALUE
checkpoint_time              N/A        VALUE      VALUE      VALUE
checkpoint_completed_time    N/A        VALUE      VALUE      VALUE
Where:
session - Only in XML output; the complete session URL as used in the
create command
brick - Master brick path
node - Master node hostname
node_uuid - Master node UUID; only in XML output
volume - Master volume name
slave_user - Slave user
slave_node - Slave node to which the respective master worker is connected
status - Created/Initializing../Active/Passive/Faulty/Paused/Stopped
last_synced - Last synced time
crawl_status - Hybrid/History/Changelog
entry - Number of entry ops pending (per session; counter resets if the
worker restarts)
data - Number of data ops pending (per session; counter resets if the
worker restarts)
meta - Number of meta ops pending (per session; counter resets if the
worker restarts)
failures - Number of failures (if the count is more than 0, the admin
should look in the log files)
checkpoint_completed - Checkpoint status: Yes/No/N/A
checkpoint_time - Checkpoint set time, or N/A
checkpoint_completed_time - Checkpoint completed time, or N/A
In addition to the monitor_status handling, if the brick status is
Faulty, the following fields will be displayed as N/A:
active, paused, slave_node, crawl_status, entry, data, meta
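For illustration, here is a minimal Python sketch of this masking (the
MASKED table is read directly off the matrix above; the field and
function names are hypothetical, not the actual implementation):

    # Hypothetical sketch: mask brick-status fields per the matrix above.
    # MASKED maps each Monitor status to the fields shown as "N/A".
    MASKED = {
        "Created": {"slave_node", "last_synced", "crawl_status", "entry",
                    "data", "meta", "failures", "checkpoint_completed",
                    "checkpoint_time", "checkpoint_completed_time"},
        "Started": set(),
        "Paused": {"crawl_status", "entry", "data", "meta"},
        "Stopped": {"slave_node", "crawl_status", "entry", "data", "meta"},
    }

    # Fields additionally hidden when the brick itself is Faulty
    FAULTY_MASKED = {"active", "paused", "slave_node", "crawl_status",
                     "entry", "data", "meta"}

    def display_status(monitor_status, brick_status):
        """Return a copy of brick_status with fields masked per the matrix."""
        out = dict(brick_status)
        hidden = set(MASKED.get(monitor_status, set()))
        if out.get("status") == "Faulty":
            hidden |= FAULTY_MASKED
        for field in hidden:
            out[field] = "N/A"
        # Per the "status" row: show the Monitor status itself unless Started
        if monitor_status != "Started":
            out["status"] = monitor_status
        return out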
Some questions -
* Would monitor status also have "Initializing" state?
Monitor status is internal to Geo-replication; the status output will
not include monitor_status.
* What's the difference between brick and master node above?
Brick - brick path as shown in volume info; Node - hostname as shown in
volume info.
* Is the last_synced time returned in UTC?
TBD.
* active and paused - are these fields or status values?
Status will be shown as Paused irrespective of Active/Passive.
Let me know your thoughts.
--
regards
Aravinda
On 02/03/2015 11:00 PM, Aravinda wrote:
Today we discussed the Geo-rep status design; a summary of the
discussion:
- There is no use case for the "Deletes pending" column; should we retain it?
- No separate column for Active/Passive. A worker can be
Active/Passive only when it is Stable (it can't be both Faulty and
Active).
- Rename the "Not Started" status to "Created".
- Checkpoint columns will be retained in the status output until we
support multiple checkpoints: three columns (Completed, Checkpoint
time, and Completion time) instead of a single column.
- We are still unsure what numbers "Files Pending" and "Files Synced"
should show; Geo-rep can't map them to an exact count on disk.
Venky suggested showing Entry, Data, and Metadata pending as three
columns (removing "Files Pending" and "Files Synced").
- Rename "Files Skipped" to "Failures"
Status output proposed:
-----------------------
MASTER NODE - Master node hostname/IP
MASTER VOL - Master volume name
MASTER BRICK - Master brick path
SLAVE USER - Slave user as which the geo-rep session is established.
SLAVE - Slave host and Volume name(HOST::VOL format)
STATUS - Created/Initializing../Started/Active/Passive/Stopped/Faulty
LAST SYNCED - Last synced time(Based on stime xattr)
CRAWL STATUS - Hybrid/History/Changelog
CHECKPOINT STATUS - Yes/No/N/A
CHECKPOINT TIME - Checkpoint Set Time
CHECKPOINT COMPLETED - Checkpoint Completion Time
Not yet decided
---------------
FILES SYNCD - Number of Files Synced
FILES PENDING - Number of Files Pending
DELETES PENDING- Number of Deletes Pending
FILES SKIPPED - Number of Files skipped
ENTRIES - Number of entry ops pending (CREATE/DELETE/MKDIR/RENAME, etc.)
DATA - Number of data ops pending
METADATA - Number of metadata ops pending (SETATTR, SETXATTR, etc.)
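For illustration, a minimal Python sketch of this three-way split (the
grouping of ops is an assumption based on the column descriptions
above; the op names follow GlusterFS changelog conventions):

    # Hypothetical grouping of ops into the three proposed counters.
    ENTRY_OPS = {"CREATE", "MKNOD", "MKDIR", "RENAME", "UNLINK",
                 "RMDIR", "LINK", "SYMLINK"}
    META_OPS = {"SETATTR", "SETXATTR"}

    def classify(op):
        """Map an op to the pending counter it increments."""
        if op in ENTRY_OPS:
            return "entry"
        if op in META_OPS:
            return "meta"
        return "data"  # WRITE/TRUNCATE and other data ops

    counters = {"entry": 0, "data": 0, "meta": 0}
    for op in ("CREATE", "WRITE", "SETATTR", "RENAME"):
        counters[classify(op)] += 1
    print(counters)  # {'entry': 2, 'data': 1, 'meta': 1}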
Let me know your suggestions.
--
regards
Aravinda
On 02/02/2015 04:51 PM, Aravinda wrote:
Thanks Sahina, replied inline.
--
regards
Aravinda
On 02/02/2015 12:55 PM, Sahina Bose wrote:
On 01/28/2015 04:07 PM, Aravinda wrote:
Background
----------
We have `status` and `status detail` commands for GlusterFS
geo-replication. This mail is to fix the existing issues in these
commands' outputs. Let us know if we need any other columns that
would help users get a meaningful status.
Existing output
---------------
Status command output
MASTER NODE - Master node hostname/IP
MASTER VOL - Master volume name
MASTER BRICK - Master brick path
SLAVE - Slave host and Volume name(HOST::VOL format)
STATUS - Stable/Faulty/Active/Passive/Stopped/Not Started
CHECKPOINT STATUS - Details about Checkpoint completion
CRAWL STATUS - Hybrid/History/Changelog
Status detail command output
MASTER NODE - Master node hostname/IP
MASTER VOL - Master volume name
MASTER BRICK - Master brick path
SLAVE - Slave host and Volume name(HOST::VOL format)
STATUS - Stable/Faulty/Active/Passive/Stopped/Not Started
CHECKPOINT STATUS - Details about Checkpoint completion
CRAWL STATUS - Hybrid/History/Changelog
FILES SYNCD - Number of Files Synced
FILES PENDING - Number of Files Pending
BYTES PENDING - Bytes pending
DELETES PENDING - Number of Deletes Pending
FILES SKIPPED - Number of Files skipped
Issues with existing status and status detail:
----------------------------------------------
1. Active/Passive and Stable/Faulty status are mixed up - the same
column is used to show both the Active/Passive state and the
Stable/Faulty state. If an Active node goes faulty, it is difficult
to tell from the status whether the faulty node is the Active one
or a Passive one.
2. No information about the last synced time - unless we set a
checkpoint, it is difficult to know up to what time data has been
synced to the slave. For example, if an admin wants to know whether
all files created 15 minutes ago have been synced, that is not
possible without setting a checkpoint.
3. Wrong values in metrics.
4. When multiple bricks are present in the same node, the status
shows Faulty if any one of the workers in that node is faulty.
Changes:
--------
1. Active nodes will be prefixed with * to identify them as
active. (In the XML output an "active" tag will be introduced with
values 0 or 1.)
2. A new column will show the last synced time, which minimizes the
need for the checkpoint feature. Checkpoint status will be shown only
in status detail.
3. Checkpoint Status is removed; a separate checkpoint command will
be added to the gluster CLI. (We can introduce a multiple-checkpoints
feature with this change.)
4. Status values will be "Not
Started/Initializing/Started/Faulty/Stopped". "Stable" is renamed
to "Started".
5. A Slave User column will be introduced to show the user as which
the geo-rep session is established (useful for non-root geo-rep).
6. The Bytes Pending column will be removed. It is not possible to
identify the delta without simulating a sync. For example, since we
use rsync to sync data from master to slave, knowing how much data
remains to be transferred would require running rsync with the
--dry-run flag before the actual command; with tar-ssh we would have
to stat every file identified for sync to calculate the total bytes
to be synced. Both are costly operations that degrade geo-rep
performance. (We can add these columns in the future.)
7. Files Pending, Files Synced, and Deletes Pending are only
per-session information for a worker; these numbers will not match
the number of files present in the filesystem. If a worker restarts,
the counters reset to zero; the worker logs the previous session's
stats before resetting them. (See the sketch after this list.)
8. Files Skipped is persistent across sessions and shows the exact
count of files skipped. (The list of skipped GFIDs can be obtained
from the log file.)
9. Can the "Deletes Pending" column be removed?
Is there any way to know if there are errors syncing any of the
files? Which column would that reflect in?
"Skipped" Column shows number of files failed to sync to Slave.
Is the last synced time - the least of the synced time across the
nodes?
Status output will have one entry for each brick, so we are
planning to display last synced time from that brick.
Example output
MASTER NODE   MASTER VOL   MASTER BRICK   SLAVE USER   SLAVE            STATUS    LAST SYNCED           CRAWL
----------------------------------------------------------------------------------------------------------------
* fedoravm1   gvm          /gfs/b1        root         fedoravm3::gvs   Started   2014-05-10 03:07 pm   Changelog
  fedoravm2   gvm          /gfs/b2        root         fedoravm4::gvs   Started   2014-05-10 03:07 pm   Changelog
New Status columns
ACTIVE_PASSIVE - "*" if Active, else blank.
MASTER NODE - Master node hostname/IP
MASTER VOL - Master volume name
MASTER BRICK - Master brick path
SLAVE USER - Slave user as which the geo-rep session is established.
SLAVE - Slave host and Volume name(HOST::VOL format)
STATUS - Stable/Faulty/Active/Passive/Stopped/Not Started
LAST SYNCED - Last synced time(Based on stime xattr)
CHECKPOINT STATUS - Details about Checkpoint completion
CRAWL STATUS - Hybrid/History/Changelog
FILES SYNCD - Number of Files Synced
FILES PENDING - Number of Files Pending
DELETES PENDING- Number of Deletes Pending
FILES SKIPPED - Number of Files skipped
XML output
active
master_node
master_node_uuid
master_brick
slave_user
slave
status
last_synced
crawl_status
files_syncd
files_pending
deletes_pending
files_skipped
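For illustration, one brick entry in the XML output might look like
the following (a sketch only; the element nesting and sample values
are assumptions based on the field list above and the example output):

    <brick>
      <active>1</active>
      <master_node>fedoravm1</master_node>
      <master_node_uuid>...</master_node_uuid>
      <master_brick>/gfs/b1</master_brick>
      <slave_user>root</slave_user>
      <slave>fedoravm3::gvs</slave>
      <status>Started</status>
      <last_synced>2014-05-10 03:07 pm</last_synced>
      <crawl_status>Changelog</crawl_status>
      <files_syncd>0</files_syncd>
      <files_pending>0</files_pending>
      <deletes_pending>0</deletes_pending>
      <files_skipped>0</files_skipped>
    </brick>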
Checkpoints
===========
A new set of gluster CLI commands will be introduced for checkpoints:

    gluster volume geo-replication <VOLNAME> <SLAVEHOST>::<SLAVEVOL> checkpoint create <NAME> <DATE>
    gluster volume geo-replication <VOLNAME> <SLAVEHOST>::<SLAVEVOL> checkpoint delete <NAME>
    gluster volume geo-replication <VOLNAME> <SLAVEHOST>::<SLAVEVOL> checkpoint delete all
    gluster volume geo-replication <VOLNAME> <SLAVEHOST>::<SLAVEVOL> checkpoint status [<NAME>]
    gluster volume geo-replication <VOLNAME> checkpoint status    # for all geo-rep sessions of that volume
    gluster volume geo-replication checkpoint status              # for all geo-rep sessions of all volumes
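For example, with the session from the status output above (values are
illustrative, and the exact <DATE> format is not decided here):

    gluster volume geo-replication gvm fedoravm3::gvs checkpoint create Chk1 "2014-11-30 23:30"
    gluster volume geo-replication gvm fedoravm3::gvs checkpoint status Chk1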
Checkpoint Status:

    SESSION                    NAME   COMPLETED   CHECKPOINT TIME       COMPLETION TIME
    -----------------------------------------------------------------------------------
    gvm->root@fedoravm3::gvs   Chk1   Yes         2014-11-30 11:30 pm   2014-12-01 02:30 pm
    gvm->root@fedoravm3::gvs   Chk2   No          2014-12-01 10:00 pm   N/A
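A minimal Python sketch of how completion could be computed (an
assumption, not confirmed here: a checkpoint is marked complete once
the brick's last synced time, derived from the stime xattr, reaches
the checkpoint time):

    from datetime import datetime

    def checkpoint_completed(last_synced, checkpoint_time):
        """Assumed rule: complete once last_synced >= checkpoint_time."""
        if last_synced is None:
            return "No"
        return "Yes" if last_synced >= checkpoint_time else "No"

    chk = datetime(2014, 11, 30, 23, 30)
    print(checkpoint_completed(datetime(2014, 12, 1, 14, 30), chk))  # Yes
    print(checkpoint_completed(datetime(2014, 11, 30, 22, 0), chk))  # No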
Can the time information include the timezone as well? Or is this UTC
time?
(Same comment for the last synced time.)
Sure. We will use UTC time in the status output.
XML output:
session
master_uuid
name
completed
checkpoint_time
completion_time
--
regards
Aravinda
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel