Hi Shyam,
I've been discussing this stuff with Krishnan and we ended up with a
proposal (see attached file).
There many comments to explain how it works.
What do you think about this proposal ?
Xavi
On 11/11/2014 07:42 PM, Shyam wrote:
On 11/10/2014 09:59 AM, Xavier Hernandez wrote:
Hi Shyam,
On 11/10/2014 03:36 PM, Shyam wrote:
Hi Xavi,
I think you are referring to the problem as I see it, and have posted
here, http://review.gluster.org/#/c/6459/1/libglusterfs/src/timer.c
If so, there is nothing clean to handle the case in the code. Something
akin to returning a non-zero from the cancel and handling it by the
consumers is a possibility.
Yes, something similar to this is what I was thinking. The solution in
the patch is not enough to handle this case.
But there are other cases where some more control will be needed:
* Suppose a fini() function releasing resources may need to cancel a
pending timer. To avoid crashes, the ideal solution would be to be sure
that we don't release anything the callback might need before the
callback has completely been executed or correctly canceled. This means
that we would need some kind of synchronization support for timers.
Yes, this is needed. I need to give this more thought though, my memory
on this problem is a bit scratchy at present.
* Suppose that you write a callback to release some resource after a
certain amount of time and this callback is handled to a subsystem that
does not know exactly what the callback does. If an external condition
happens, and it's needed to execute the callback before the timeout
expires, you need a way to let the callback be executed, with an
indication that it haven0t been executed by a normal timeout.
I presume this is an _simpler_ extension if we solve the one above.
These are only 2 possible cases where having a better timer api would be
helpful. In fact the first case would be interesting for ec, that
currently needs something like that. It also needs a way to detect if a
callback has really been canceled or not to avoid inconsistencies.
Would be acceptable to improve the current timer api ?
Yes it would :)
I would like to have the concurrent cancel and event having been (going
to) fired handled first, if possible with little the no changes in the
API, and the second case later. Your priorities seem to suggest the same.
Shyam
/* Timers have two references:
* - One for the creator thread that can be used to cancel the timer, force
* its execution or synchronize with it.
* - One for the worker thread that will be used the thread that implements
* the timer.
*
* Timer resources are released when both references are released.
*/
/* Timer callback signature.
*
* This function will be executed when the timer expires or gf_timer_now() is
* called.
*
* op_ret will be ETIMEDOUT if called when the timer has expired, or the
* value specified in gf_timer_now() otherwise.
*/
typedef (* gf_timer_cbk_t)(void * data, int32_t op_ret);
/* Checks if the creator thread reference has been released or not.
*
* Field 'ctx' of the timer is used to track the reference of the creator
* thread. It's set to != NULL on creation, and = NULL when it's not needed
* anymore by the creation thread.
*
* The timer thread never uses this field.
*/
gf_boolean_t gf_timer_has_creator_ref(gf_timer_t * timer)
{
return (timer->ctx != NULL);
}
/* Checks if the worker thread reference has been released or not.
*
* Field 'callback' of the timer is used to track the reference of the worker
* thread. It's set to != NULL on creation, and = NULL when it's not needed
* anymore by the worker thread.
*
* The creator thread never uses this field (it only initialized it on
* creation).
*/
gf_boolean_t gf_timer_has_worker_ref(gf_timer_t * timer)
{
return (timer->callback != NULL);
}
/* Releases the creator reference.
*
* Must be called inside a mutex region.
*/
void gf_timer_release_creator_ref(gf_timer_t * timer)
{
GF_ASSERT(gf_timer_has_creator_ref(timer));
timer->ctx = NULL;
}
/* Releases the worker reference.
*
* Must be called inside a mutex region.
*/
void gf_timer_release_worker_ref(gf_timer_t * timer)
{
GF_ASSERT(gf_timer_has_worker_ref(timer));
timer->callback = NULL;
}
/* Create and schedule a new timer.
*
* The caller will receive an implicit reference to the timer. This reference
* can be used to call gf_timer_cancel() or gf_timer_now().
*
* The returned reference must be released by calling gf_timer_wait() or
* gf_timer_detach().
*/
gf_timer_t * gf_timer_schedule(gf_timer_ctx_t * ctx, struct timespec delay,
gf_timer_cbk_t callback, void * data)
{
gf_timer_t * timer;
GF_ASSERT(ctx != NULL);
GF_ASSERT(callback != NULL);
timer = malloc(sizeof(gf_timer_t);
timer->ctx = ctx;
timer->callback = callback;
timer->data = data;
timer->time = convert_to_absolute_time(delay);
timer->op_ret = ETIMEDOUT;
pthread_mutex_lock(&timer->ctx->lock);
heap_insert(&timer->ctx->heap, timer);
if (heap_is_root(&timer->ctx->heap, timer))
{
/* If the new timer expires before anyone else, wake up the
* timer thread to adjust. */
pthread_cond_signal(&timer->ctx->cond);
}
pthread_mutex_unlock(&timer->ctx->lock);
return timer;
}
/* Cancel a pending timer.
*
* It tries to cancel a pending timer. If the timer has not expired yet, it's
* cancelled and the callback will never be executed. In this case the function
* returns gf_true. If the callback has already been executed or it's currently
* running, gf_false is returned.
*
* If this function returns gf_false, gf_timer_wait() can be used to synchronize
* execution.
*/
gf_boolean_t gf_timer_cancel(gf_timer_t * timer)
{
gf_boolean_t ret = gf_false;
GF_ASSERT(gf_timer_has_creator_ref(timer));
pthread_mutex_lock(&timer->ctx->lock);
if (heap_contains(&timer->ctx->heap, timer))
{
heap_delete(&timer->ctx->heap, timer);
gf_timer_release_worker_ref(timer);
timer->op_ret = -1;
ret = gf_true;
}
pthread_mutex_unlock(&timer->ctx->lock);
return ret;
}
/* Force an immediate execution of a timer's callback.
*
* It forces a callback to be executed as soon as possible. If the callback
* has already been executed or is being executed, gf_false will be returned.
* Otherwise the callback will be called using the specified op_ret and gf_true
* will be returned.
*
* gf_timer_wait() can be used to synchronize execution.
*/
gf_boolean_t gf_timer_now(gf_timer_t * timer, int32_t op_ret)
{
gf_boolean_t ret = gf_false;
GF_ASSERT(op_ret != ETIMEDOUT);
GF_ASSERT(gf_timer_has_creator_ref(timer));
pthread_mutex_lock(&timer->ctx->lock);
if (heap_contains(&timer->ctx->heap, timer))
{
heap_delete(&timer->ctx->heap, timer);
/* Add the timer to a list of immediate execution timers */
list_insert(&timer->ctx->now, timer);
/* Save the value to pass to the callback */
timer->op_ret = op_ret;
pthread_cond_signal(&timer->ctx->cond);
ret = gf_true;
}
pthread_mutex_unlock(&timer->ctx->lock);
return ret;
}
/* Checks if the callback has been fully executed or cancelled.
*/
gf_boolean_t gf_timer_completed(gf_timer_t * timer)
{
return !gf_timer_has_worker_ref(timer);
}
/* Synchronize the creator thread with the callback execution.
*
* This function blocks until the callback has been completely executed or
* cancelled.
*
* It returns the value returned by the callback or -1 if the callback has
* been cancelled. The callback should always return positive error codes.
*/
int32_t gf_timer_wait(gf_timer_t * timer)
{
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
int32_t ret;
GF_ASSERT(gf_timer_has_creator_ref(timer));
pthread_mutex_lock(&timer->ctx->lock);
if (gf_timer_has_worker_ref(timer))
{
timer->cond = &cond;
/* Wait for callback completion */
pthread_cond_wait(&timer->cond, &ctx->lock);
}
pthread_mutex_unlock(&ctx->lock);
/* Take the return value from the callback. If the callback has not
* been executed, it will be -1. */
ret = timer->op_ret;
free(timer);
return ret;
}
/* Release the creator thread reference.
*
* This function releases the creator thread reference when it's not needed
* any more. If the user needs to track if/when the callback has been executed,
* gf_timer_completed() of gf_timer_wait() must be used.
*/
void gf_timer_detach(gf_timer_t * timer)
{
gf_timer_ctx_t * ctx;
int32_t done;
GF_ASSERT(gf_timer_has_creator_ref(timer));
ctx = timer->ctx;
pthread_mutex_lock(&ctx->lock);
done = !gf_timer_has_worker_ref(timer);
gf_timer_release_creator_ref(timer);
pthread_mutex_lock(&ctx->lock);
if (done)
{
free(timer);
}
}
/* The thread worker that calls callbacks.
*
*/
void * gf_timer_worker(void * arg)
{
struct list_head list;
gf_timer_ctx_t * ctx = arg;
INIT_LIST_HEAD(&list);
do
{
pthread_mutex_lock(&ctx->lock);
while (list_is_empty(&list))
{
timer = heap_top(&ctx->heap);
if (timer != NULL)
{
delay = time_sub(now(), timer->time);
}
else
{
delay = INFINITE;
}
pthread_cond_timedwait(&ctx->cond, &ctx->lock, delay);
if (!list_is_empty(&ctx->now))
{
list_splice_init(&ctx->now, &list);
}
timer = heap_top(&ctx->heap);
while ((timer != NULL) && (timer->time <= now()))
{
heap_delete(&ctx->heap, timer);
list_insert(&list, timer);
timer = heap_top(&ctx->heap);
}
}
pthread_mutex_unlock(&ctx->lock);
while (!list_is_empty(&list))
{
timer = list_delete_first(&list);
timer->op_ret = timer->callback(timer->data,
timer->op_ret);
pthread_mutex_lock(&ctx->lock);
done = !gf_timer_has_creator_ref(timer);
gf_timer_release_worker_ref(timer);
if (timer->cond != NULL)
{
pthread_cond_signal(timer->cond, &ctx->lock);
}
pthread_mutex_unlock(&ctx->lock);
if (done)
{
free(timer);
}
}
} while (!ctx->stop);
return NULL;
}
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-devel