Ben, Yes, that is the race I found thru debugging. Chandra On Thu, 2011-12-01 at 12:32 -0600, Ben Myers wrote: > Hi, > > On Mon, Nov 28, 2011 at 03:17:36AM -0500, Christoph Hellwig wrote: > > Apply the scheme used in log_regrant_write_log_space to wake up any other > > threads waiting for log space before the newly added one to > > log_regrant_write_log_space as well, and factor the code into readable > > helpers. For each of the queues we have add two helpers: > > > > - one to try to wake up all waiting threads. This helper will also be > > usable by xfs_log_move_tail once we remove the current opportunistic > > wakeups in it. > > - one to sleep on t_wait until enough log space is available, loosely > > modelled after Linux waitqueues. > > > > And use them to reimplement the guts of log_regrant_write_log_space and > > log_regrant_write_log_space. These two function now use one and the same > > algorithm for waiting on log space instead of subtly different ones before, > > with an option to completely unify them in the near future. > > > > Also move the filesystem shutdown handling to the common caller given > > that we had to touch it anyway. > > > > Based on hard debugging and an earlier patch from > > Chandra Seetharaman <sekharan@xxxxxxxxxx>. > > > > Signed-off-by: Christoph Hellwig <hch@xxxxxx> > > I'd like to make sure that I understand the race that Chandra > debugged and reported. > > 2499 STATIC int > 2500 xlog_grant_log_space(xlog_t *log, > 2501 xlog_ticket_t *tic) > 2502 { > 2503 int free_bytes; > 2504 int need_bytes; > ... > 2517 /* something is already sleeping; insert new transaction at end */ > 2518 if (!list_empty_careful(&log->l_reserveq)) { > 2519 spin_lock(&log->l_grant_reserve_lock); > 2520 /* recheck the queue now we are locked */ > 2521 if (list_empty(&log->l_reserveq)) { > 2522 spin_unlock(&log->l_grant_reserve_lock); > 2523 goto redo; > 2524 } > 2525 list_add_tail(&tic->t_queue, &log->l_reserveq); > 2526 > 2527 trace_xfs_log_grant_sleep1(log, tic); > 2528 > 2529 /* > 2530 * Gotta check this before going to sleep, while we're > 2531 * holding the grant lock. > 2532 */ > 2533 if (XLOG_FORCED_SHUTDOWN(log)) > 2534 goto error_return; > 2535 > 2536 XFS_STATS_INC(xs_sleep_logspace); > 2537 xlog_wait(&tic->t_wait, &log->l_grant_reserve_lock); > 2538 > 2539 /* > 2540 * If we got an error, and the filesystem is shutting down, > 2541 * we'll catch it down below. So just continue... > 2542 */ > 2543 trace_xfs_log_grant_wake1(log, tic); > 2544 } > 2545 > 2546 redo: > 2547 if (XLOG_FORCED_SHUTDOWN(log)) > 2548 goto error_return_unlocked; > 2549 > 2550 free_bytes = xlog_space_left(log, &log->l_grant_reserve_head); > 2551 if (free_bytes < need_bytes) { > 2552 spin_lock(&log->l_grant_reserve_lock); > 2553 if (list_empty(&tic->t_queue)) > 2554 list_add_tail(&tic->t_queue, &log->l_reserveq); > 2555 > 2556 trace_xfs_log_grant_sleep2(log, tic); > 2557 > 2558 if (XLOG_FORCED_SHUTDOWN(log)) > 2559 goto error_return; > 2560 > 2561 xlog_grant_push_ail(log, need_bytes); > 2562 > 2563 XFS_STATS_INC(xs_sleep_logspace); > 2564 xlog_wait(&tic->t_wait, &log->l_grant_reserve_lock); > 2565 > 2566 trace_xfs_log_grant_wake2(log, tic); > 2567 goto redo; > 2568 } > 2569 > 2570 if (!list_empty(&tic->t_queue)) { > 2571 spin_lock(&log->l_grant_reserve_lock); > 2572 list_del_init(&tic->t_queue); > 2573 spin_unlock(&log->l_grant_reserve_lock); > 2574 } > > So the race that we're looking at here is: > > process A was added to the reserve queue at either 2525 or 2554 and, pushes the AIL at 2561, > xfsaild frees up enough log space for process A (possibly B?), eventually xfs_log_move_tail is called to wake process A, > process A wakes at line 2564, and he is on the reserveq already, > process B sees that there are tickets on the queue at 2518 and gets the grant reserve lock at 2519, > process A spins at 2571 waiting for the grant reserve lock, > process B adds itself to the queue at 2525, > process B drops the grant reserve lock and goes to sleep at 2537 > process A takes the grant reserve lock at 2571 and removes it's ticket from the list. > > ...and there is nothing to wake process B until the ail is pushed by > some other process. > > Is that about right? > > Thanks, > Ben > _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs