Re: [PATCH] ext3: wait on all pending commits in ext3_sync_fs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jan Kara wrote:

> From 0a578ba1b56fe655570ee6dad41748863a120dbc Mon Sep 17 00:00:00 2001
> From: Jan Kara <jack@xxxxxxx>
> Date: Fri, 19 Dec 2008 00:05:34 +0100
> Subject: [PATCH] jbd: Fix return value of journal_start_commit()
> 
> journal_start_commit() returns 1 if either a transaction is committing or the
> function has queued a transaction commit. But it returns 0 if we raced with
> somebody queueing the transaction commit as well. This resulted in
> ext3_sync_fs() not functioning correctly (description from Arthur Jones):
> In the case of a data=ordered umount with pending long symlinks which are
> delayed due to a long list of other I/O on the backing block device, this
> causes the buffer associated with the long symlinks to not be moved to the
> inode dirty list in the second phase of fsync_super.  Then, before they can be
> dirtied again, kjournald exits, seeing the UMOUNT flag and the dirty pages are
> never written to the backing block device, causing long symlink corruption and
> exposing new or previously freed block data to userspace.

This looks sane to me, and it does fix the below testcase.

Care to formally propose it?

Thanks,
-Eric

> This can be reproduced with a script created by Eric Sandeen
> <sandeen@xxxxxxxxxx>:
> 
>         #!/bin/bash
> 
>         umount /mnt/test2
>         mount /dev/sdb4 /mnt/test2
>         rm -f /mnt/test2/*
>         dd if=/dev/zero of=/mnt/test2/bigfile bs=1M count=512
>         touch /mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename
>         ln -s /mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename
>         /mnt/test2/link
>         umount /mnt/test2
>         mount /dev/sdb4 /mnt/test2
>         ls /mnt/test2/
> 
> This patch fixes journal_start_commit() to always return 1 when there's
> a transaction committing or queued for commit.
> 
> Signed-off-by: Jan Kara <jack@xxxxxxx>
> ---
>  fs/jbd/journal.c |   17 +++++++++++------
>  1 files changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/jbd/journal.c b/fs/jbd/journal.c
> index 9e4fa52..e79c078 100644
> --- a/fs/jbd/journal.c
> +++ b/fs/jbd/journal.c
> @@ -427,7 +427,7 @@ int __log_space_left(journal_t *journal)
>  }
>  
>  /*
> - * Called under j_state_lock.  Returns true if a transaction was started.
> + * Called under j_state_lock.  Returns true if a transaction commit was started.
>   */
>  int __log_start_commit(journal_t *journal, tid_t target)
>  {
> @@ -495,7 +495,8 @@ int journal_force_commit_nested(journal_t *journal)
>  
>  /*
>   * Start a commit of the current running transaction (if any).  Returns true
> - * if a transaction was started, and fills its tid in at *ptid
> + * if a transaction is going to be committed (or is currently already
> + * committing), and fills its tid in at *ptid
>   */
>  int journal_start_commit(journal_t *journal, tid_t *ptid)
>  {
> @@ -505,15 +506,19 @@ int journal_start_commit(journal_t *journal, tid_t *ptid)
>  	if (journal->j_running_transaction) {
>  		tid_t tid = journal->j_running_transaction->t_tid;
>  
> -		ret = __log_start_commit(journal, tid);
> -		if (ret && ptid)
> +		__log_start_commit(journal, tid);
> +		/* There's a running transaction and we've just made sure
> +		 * it's commit has been scheduled. */
> +		if (ptid)
>  			*ptid = tid;
> -	} else if (journal->j_committing_transaction && ptid) {
> +		ret = 1;
> +	} else if (journal->j_committing_transaction) {
>  		/*
>  		 * If ext3_write_super() recently started a commit, then we
>  		 * have to wait for completion of that transaction
>  		 */
> -		*ptid = journal->j_committing_transaction->t_tid;
> +		if (ptid)
> +			*ptid = journal->j_committing_transaction->t_tid;
>  		ret = 1;
>  	}
>  	spin_unlock(&journal->j_state_lock);

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux