On Sat, Jan 05, 2013 at 02:34:15PM -0600, Mark Tinguely wrote: > The back-end of xlog_cil_push() allows multiple push sequences > to write to the xlog at the same time. It does this by design, and has since day zero. > This will cause problems > for recovery and also could cause the xlog_cil_committed() callback > to be called out of sequence. Log recovery is supposed to be able to handle it just fine in that recovery only replays up to the last checkpoint with a valid commit record. Checkpoints that don't have valid commit records - no matter the order they are written - will terminate recovery at the LSN of the lowest entire commit. > This was discovered with an EFI/EFD misorder. There are several (5) active > sequence pushes and 3 completed pushes. The callback for the sequence (2) > holding the EFD is being processed but the callback for the sequence (1) > holding the EFI has not yet been processed. What are the symptoms shown when this problem is hit? AFAICT, there is no problem here - this comment above __xfs_efi_release() explains that EFI/EFD misordering is expected: /* * Freeing the efi requires that we remove it from the AIL if it has already * been placed there. However, the EFI may not yet have been placed in the AIL * when called by xfs_efi_release() from EFD processing due to the ordering of * committed vs unpin operations in bulk insert operations. Hence the * test_and_clear_bit(XFS_EFI_COMMITTED) to ensure only the last caller frees * the EFI. */ This was introduced in this commit: $ gl -n 1 b199c8a4 commit b199c8a4ba11879df87daad496ceee41fdc6aa82 Author: Dave Chinner <dchinner@xxxxxxxxxx> Date: Mon Dec 20 11:59:49 2010 +1100 xfs: Pull EFI/EFD handling out from under the AIL lock EFI/EFD interactions are protected from races by the AIL lock. They are the only type of log items that require the the AIL lock to serialise internal state, so they need to be separated from the AIL lock before we can do bulk insert operations on the AIL. To acheive this, convert the counter of the number of extents in the EFI to an atomic so it can be safely manipulated by EFD processing without locks. Also, convert the EFI state flag manipulations to use atomic bit operations so no locks are needed to record state changes. Finally, use the state bits to determine when it is safe to free the EFI and clean up the code to do this neatly. Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> Reviewed-by: Christoph Hellwig <hch@xxxxxx> Hence if there is some problem being seen as a result of allowing this behaviour, I'd like to know what that problem actually is.... > The xlog_cil_committed() callback misorder happens because the buffer that > contains the sequence ticket is filled by another sequence push and the > callback for the buffer write happens before the callback is placed onto > that buffer. I'm not sure I follow you here. xfs_log_done() takes a reference to the iclog that the commit record is added to, and I/O cannot be issued on that iclog until the reference count drops to zero. Hence the sequence of writing the commit record, obtaining the commit_lsn, adding the callbacks to the iclog and releasing the iclog are atomic from an I/O perspective, and IO is only issued when the reference count falls to zero. And given that xlog_write() uses the same reference counting to provide the same guarantees, I cannot see how concurrent in-memory writes to the same iclog could cause IO completion callbacks to be issued out of order. > This patch serializes the xlog_write() so that only one sequence (the lowest) > is written at a time. This will also stop the race between xlog_commit_record() > and the adding of the callback onto the buffer containing the sequence commit > record. The point of the separation of the commit record from the commit "data" is to allow interleaving of multiple transaction commits without serialising the transaction commit process. It's actually a significant performance win when dealing with synchronous transaction heavy workloads as it allows multiple concurrent synchronous transactions to aggregate into a single log buffer. From that perspective alone, that's a NACK from me to to this patch... Without knowing what the problem you are actually seeing is, I can't make any further suggestions or comments about whether there is a real issue here. Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs