On Sat 13-02-10 14:13:17, Kailas Joshi wrote:
> On 13 February 2010 01:37, <tytso@xxxxxxx> wrote:
> > On Fri, Feb 12, 2010 at 08:52:15AM +0530, Kailas Joshi wrote:
> >> Sorry, I didn't understand why processes need to be suspended.
> >> In my scheme, I am issuing the magic handle only after locking the current
> >> transaction. AFAIK after the transaction is locked, it can receive the
> >> block journaling requests for already created handles (in our case, for
> >> already reserved journal space), and the new concurrent requests for
> >> journal_start() will go to the new current transaction. Since the
> >> credits for the locked transaction are fixed (by means of early
> >> reservations), we can know whether the journal has enough space for the new
> >> journal_start(). So, as long as the journal has enough space available,
> >> new processes need not be stalled.
> >
> > But while you are modifying blocks that need to go into the journal
> > via the locked (old) transaction, it's not safe to start a new
> > transaction and start issuing handles against the new transaction.
> >
> > Just to give one example, suppose we need to update the extent
> > allocation tree for an inode in the locked/committing transaction as
> > the delayed allocation blocks are being resolved --- and in another
> > process, that inode is getting truncated or unlinked, which also needs
> > to modify the extent allocation tree? Hilarity ensues, unless you
> > block all attempts to create a new handle (practically speaking, by
> > blocking all attempts to start a new transaction), until this new
> > delayed allocation resolution phase which you have proposed is
> > complete.
> Okay. So, basically process stalling is unavoidable as we cannot
> modify a buffer's data in the past transaction after it has been modified in
> the current transaction.
> Can we restrict the scope for this blocking?
> Blocking on journal_start() will block all processes even though they are
> operating on mutually exclusive sets of metadata buffers. Can we
> restrict this blocking to allocation/deallocation paths by blocking in
> get_write_access() in specific cases (some condition on the buffer)? This
> way, since all files will use commit-time allocation, very few (sync
> and direct-io mode) file operations will be stalled.

I doubt blocking at the buffer level would be enough. I think the
journalling layer just does not have enough information for such
decisions. It could be feasible to block on a per-inode basis, but you'd
still have to give careful thought to modifications of filesystem-global
structures like bitmaps, the superblock, or inode blocks.

								Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
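
The ordering constraint discussed in this thread (a buffer must not be modified in the older, locked transaction once the newer transaction has already modified it) can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyJournal`, `Transaction`, `lock_running`, `demo` and the buffer names are hypothetical and do not correspond to the real journalling code; `get_write_access` here only borrows the name used in the thread.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    tid: int
    buffers: set = field(default_factory=set)  # names of buffers modified so far


class ToyJournal:
    """Toy model of a journal with one committing and one running transaction."""

    def __init__(self):
        self.committing = None
        self.running = Transaction(tid=1)

    def lock_running(self):
        # Lock the running transaction and open a new one behind it, so new
        # handles attach to the new transaction (the scheme proposed above).
        self.committing = self.running
        self.running = Transaction(tid=self.committing.tid + 1)

    def get_write_access(self, txn, buf):
        # The hazard: the older transaction wants a buffer that the newer
        # transaction has already modified.
        if txn is self.committing and buf in self.running.buffers:
            raise RuntimeError(
                f"{buf} already modified in newer transaction {self.running.tid}"
            )
        txn.buffers.add(buf)


def demo():
    journal = ToyJournal()
    journal.lock_running()

    # A concurrent truncate of (hypothetical) inode 12 runs against the new
    # transaction and touches its extent tree and a shared block bitmap.
    journal.get_write_access(journal.running, "extent_tree:inode12")
    journal.get_write_access(journal.running, "block_bitmap:group0")

    # Delayed-allocation resolution now runs in the locked transaction.
    conflicts = []
    for buf in ("extent_tree:inode12", "extent_tree:inode99", "block_bitmap:group0"):
        try:
            journal.get_write_access(journal.committing, buf)
        except RuntimeError:
            conflicts.append(buf)
    return conflicts
```

In this model `demo()` reports a conflict for inode 12's extent tree and for the shared bitmap, but not for inode 99's extent tree. That is the concern with per-inode blocking: two operations on different inodes can still collide on a filesystem-global structure.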