This series addresses the log grant lock contention seen by 8-way
fs_mark workloads. The log grant lock protects:

  - reserve grant head and wait queue
  - write grant head and wait queue
  - tail lsn
  - last sync lsn

While they are all currently protected by a single lock, there is no
reason that they all need to be under the same lock. As a result, one
option for scaling was simply to split the grant lock into three locks -
one for each of the above groups. However, this would mean that we'd
need to nest locks inside each other, and it ignores the fact that we
really only use the lock on the tail and last sync lsn variables to
protect against concurrent updates.

Hence we can make the tail and last sync LSN variables independent of
the grant lock by making them atomic variables. This means that when we
are moving the tail of the log, we can avoid all locking except when
there are waiters queued on the grant heads.

Making the grant heads scale better is a bit more of a challenge. Just
replacing the grant lock with reserve and write grant locks doesn't
really help improve scalability because we'd still need to take both
locks in the hot xfs_log_reserve() path. To improve scalability, we
really need to make this path lock free.

The first step is to clean up some of the code. We convert the ticket
queues to use the common list_head infrastructure, factor out some
common debug code, refactor and rearrange the grant head calculation
code and convert all the users of the sv_t wait mechanisms to use wait
queues directly.

The second step towards achieving this is to encode the grant heads as a
64 bit variable and then convert it to an atomic variable. The tail/last
sync LSNs also get converted to atomic variables, and this means we can
read the grant heads without holding locks and that allows tail pushing
calculations and available log space calculations to operate lock free.
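
As an illustration of the encoding step above, here is a minimal
userspace sketch using C11 atomics rather than the kernel's atomic64_t.
All names (grant_head_combine, grant_head_crack, demo_log,
demo_space_left) are hypothetical and are not the identifiers used in
the patches. It also assumes, purely for simplicity, that the tail is
stored in the same cycle/bytes form as the grant head.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helpers: pack a grant head (cycle, byte offset) into 64 bits. */
static inline int64_t grant_head_combine(uint32_t cycle, uint32_t bytes)
{
	return (int64_t)(((uint64_t)cycle << 32) | bytes);
}

/* Split a packed grant head back into its cycle and byte offset. */
static inline void grant_head_crack(int64_t val, uint32_t *cycle,
				    uint32_t *bytes)
{
	*cycle = (uint32_t)((uint64_t)val >> 32);
	*bytes = (uint32_t)((uint64_t)val & 0xffffffffu);
}

/* Simplified stand-in for the log; not the real struct log layout. */
struct demo_log {
	uint32_t	size;		/* log size in bytes */
	_Atomic int64_t	reserve_head;	/* packed cycle/bytes */
	_Atomic int64_t	tail;		/* assumption: same packing as head */
};

/*
 * Compute free log space without taking a lock. Each load is atomic on
 * its own, but the two loads together are not a single snapshot, so the
 * result can be slightly stale by the time it is used.
 */
static uint32_t demo_space_left(struct demo_log *log)
{
	uint32_t head_cycle, head_bytes, tail_cycle, tail_bytes;

	grant_head_crack(atomic_load(&log->reserve_head),
			 &head_cycle, &head_bytes);
	grant_head_crack(atomic_load(&log->tail), &tail_cycle, &tail_bytes);

	/* Head and tail in the same cycle: used space lies between them. */
	if (tail_cycle == head_cycle && head_bytes >= tail_bytes)
		return log->size - (head_bytes - tail_bytes);

	/* Head has wrapped once past the end: free space is the gap. */
	if (tail_cycle + 1 == head_cycle && head_bytes <= tail_bytes)
		return tail_bytes - head_bytes;

	/* Inconsistent sample: this sketch simply reports the full size. */
	return log->size;
}
```

Because a whole grant head fits in one 64 bit word, a reader always sees
a matching cycle/bytes pair for that head, which is what makes the
unlocked read safe in this sketch.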

The next step is to introduce a lock per grant queue that is used
exclusively to protect queue manipulations. With the use of
list_empty_careful() we can check whether the queue has waiters without
holding the queue lock. Hence in the case where the queues are empty we
do not need to take the queue locks in the fast path.

Finally, we need to make the grant head space calculations lockless.
With the grant heads already being atomic variables, we can change the
calculation algorithm to a lockless cmpxchg algorithm. This means we no
longer need any spinlocks in the transaction reserve fast path and hence
the scalability of this path should be significantly improved.

There is one downside to this change - the xlog_verify_head() debug code
can no longer be reliably used to detect accounting problems in the
grant space allocations as it requires an atomic sample of both grant
heads. However, the tail verification and the xlog_space_left()
verification still work without problems, so we still have some debug
checking on the grant head locations.

Version 3:
- dropped cleanup of xlog_grant_log_space() and
  xlog_regrant_log_write_space().
- split grant head aggregation into multiple patches
- split out xlog_verify_tail() function
- factor grant head calculations and drop wrappers
- combine grant heads and add wrappers to crack/combine grant heads.
- removed intermediate grant head "_lsn" suffix name.
- folded all sv_t removal patches into one.
- don't pass tail and last sync lsn into xlog_grant_push_ail().
- ensure that shutdown checks in ticket queue processing are done
  consistently before sleeping.
- removed xlog_grant_verify_head()
- folded grant lock removal into patch that converts grant head
  manipulations to lockless algorithms.
- added a couple of tracepoints for when the log tail is moved and
  queued tickets are woken to aid debugging.

Version 2:
- split into lots of patches
- clean up the code and comments
- add patches to clean up sv_t usage at the end of the series