Hey Ravi,

I'll ping Shreyas about this today. There's also a patch we'll need for multi-threaded SHD to fix the least-pri queuing. The PID of the process wasn't tagged correctly via the call frame in my original patch. The patch below fixes this (for 3.6.3). I didn't see multi-threaded self-heal on github/master yet, so let me know which branch you need this patch on and I can come up with a clean patch.

Richard

=====

diff --git a/xlators/cluster/afr/src/afr-self-heald.c b/xlators/cluster/afr/src/afr-self-heald.c
index 028010d..b0f6248 100644
--- a/xlators/cluster/afr/src/afr-self-heald.c
+++ b/xlators/cluster/afr/src/afr-self-heald.c
@@ -532,6 +532,9 @@ afr_mt_process_entries_done (int ret, call_frame_t *sync_frame,
                 pthread_cond_signal (&mt_data->task_done);
         }
         pthread_mutex_unlock (&mt_data->lock);
+
+        if (task_ctx->frame)
+                AFR_STACK_DESTROY (task_ctx->frame);
         GF_FREE (task_ctx);
         return 0;
 }
@@ -787,6 +790,7 @@ _afr_mt_create_process_entries_task (xlator_t *this,
         int                                ret = -1;
         afr_mt_process_entries_task_ctx_t *task_ctx;
         afr_mt_data_t                      *mt_data;
+        call_frame_t                       *frame = NULL;
 
         mt_data = &healer->mt_data;
 
@@ -799,6 +803,8 @@ _afr_mt_create_process_entries_task (xlator_t *this,
         if (!task_ctx)
                 goto err;
 
+        task_ctx->frame = afr_frame_create (this);
+
         INIT_LIST_HEAD (&task_ctx->list);
         task_ctx->readdir_xl = this;
         task_ctx->healer = healer;
@@ -812,7 +818,7 @@ _afr_mt_create_process_entries_task (xlator_t *this,
         // This returns immediately, and afr_mt_process_entries_done will
         // be called when the task is completed e.g. our queue is empty
         ret = synctask_new (this->ctx->env, afr_mt_process_entries_task,
-                            afr_mt_process_entries_done, NULL,
+                            afr_mt_process_entries_done, task_ctx->frame,
                             (void *)task_ctx);
 
         if (!ret) {
diff --git a/xlators/cluster/afr/src/afr-self-heald.h b/xlators/cluster/afr/src/afr-self-heald.h
index 817e712..1588fc8 100644
--- a/xlators/cluster/afr/src/afr-self-heald.h
+++ b/xlators/cluster/afr/src/afr-self-heald.h
@@ -74,6 +74,7 @@ typedef struct afr_mt_process_entries_task_ctx_ {
         subvol_healer_t *healer;
         xlator_t        *readdir_xl;
         inode_t         *idx_inode;  /* inode ref for xattrop dir */
+        call_frame_t    *frame;
         unsigned int     entries_healed;
         unsigned int     entries_processed;
         unsigned int     already_healed;

Richard

________________________________________
From: Ravishankar N [ravishankar@xxxxxxxxxx]
Sent: Sunday, February 07, 2016 11:15 PM
To: Shreyas Siravara
Cc: Richard Wareing; Vijay Bellur; Gluster Devel
Subject: Re: Throttling xlator on the bricks

Hello,

On 01/29/2016 06:51 AM, Shreyas Siravara wrote:
> So the way our throttling works is (intentionally) very simplistic.
>
> (1) When someone mounts an NFS share, we tag the frame with a 32 bit hash of the export name they were authorized to mount.
> (2) io-stats keeps track of the "current rate" of fops we're seeing for that particular mount, using a sampling of fops and a moving average over a short period of time.
> (3) Based on whether the share violated its allowed rate (which is defined in a config file), we tag the FOP as "least-pri". Of course this makes the assumption that all NFS endpoints are receiving roughly the same # of FOPs. The rate defined in the config file is a *per* NFS endpoint number. So if your cluster has 10 NFS endpoints, and you've pre-computed that it can do roughly 1000 FOPs per second, the rate in the config file would be 100.
> (4) IO-Threads then shoves the FOP into the least-pri queue, rather than its default. The value is honored all the way down to the bricks.
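To make the rate check concrete, here is a minimal, self-contained sketch of the per-export moving-average decision described in (2)-(4). The struct, field names, smoothing factor, and limit below are illustrative assumptions, not the actual io-stats code:

#include <stdint.h>

/* Per-export bookkeeping; one of these would exist per mounted share. */
struct export_rate {
        uint32_t export_hash;   /* 32-bit hash of the authorized export name */
        double   ewma_fops;     /* smoothed fops/sec observed for this mount */
        double   allowed_fops;  /* per-NFS-endpoint limit from the config file */
};

/* Fold the latest sampled rate into the moving average and report whether
 * the next FOP from this mount should be tagged least-pri. */
static int
should_demote_to_least_pri (struct export_rate *er, double sampled_fops)
{
        const double alpha = 0.2;    /* smoothing weight, illustrative only */

        er->ewma_fops = alpha * sampled_fops + (1.0 - alpha) * er->ewma_fops;

        return er->ewma_fops > er->allowed_fops;
}

A FOP flagged this way would then be routed by io-threads to its least-pri queue instead of the default, as in step (4).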
>
> The code is actually complete, and I'll put it up for review after we iron out a few minor issues.

Did you get a chance to send the patch? Just wanted to run some tests and see if this is all we need at the moment to regulate shd traffic, especially with Richard's multi-threaded heal patch https://urldefense.proofpoint.com/v2/url?u=http-3A__review.gluster.org_-23_c_13329_&d=CwIC-g&c=5VD0RTtNlTh3ycd41b3MUw&r=qJ8Lp7ySfpQklq3QZr44Iw&m=B873EiTlTeUXIjEcoutZ6Py5KL0bwXIVroPbpwaKD8s&s=fo86UTOQWXf0nQZvvauqIIhlwoZHpRlQMNfQd7Ubu7g&e= being revived and made ready for 3.8.

-Ravi

>
>> On Jan 27, 2016, at 9:48 PM, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:
>>
>> On 01/26/2016 08:41 AM, Richard Wareing wrote:
>>> In any event, it might be worth having Shreyas detail his throttling feature (that can throttle any directory hierarchy no less) to illustrate how a simpler design can achieve similar results to these more complicated (and it follows....bug prone) approaches.
>>>
>>> Richard
>> Hi Shreyas,
>>
>> Wondering if you can share the details of the throttling feature you're working on. Even if there's no code, a description of what it is trying to achieve and how will be great.
>>
>> Thanks,
>> Ravi

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel