Re: [PATCH 4/5] jbd2: Speedup jbd2_journal_get_[write|undo]_access()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu 18-06-15 15:59:33, Jerry Lee wrote:
> Hi:
> 
> I found that there may be a typo in the attached patch 5/5:
> 
> +        /*
> +         * If it's in our transaction it must be in BJ_Metadata list.
> +         * The assertion is unreliable since we may see jh in
> +         * inconsistent state unless we grab bh_state lock. But this
> +         * is crutial to catch bugs so let's do a reliable check until
> +         * the lockless handling is fully proven.
> +         */
> +        if (jh->b_transaction == transaction &&
> +            jh->b_jlist != BJ_Metadata) {
> +            jbd_lock_bh_state(bh);
> +            J_ASSERT_JH(jh, jh->b_transaction != transaction ||
> +                    jh->b_jlist == BJ_Metadata);
> +            jbd_lock_bh_state(bh);
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +        }
> +        goto out;
> 
> Does the highlight part should be "jbd_unlock_bh_state(bh)"?
  Yeah, thanks for catching this. I was apparently pretty lucky in this
passing all the tests... Attached is the new version of the patch.

								Honza

> On Thu, Jun 18, 2015 at 12:56 AM, Jan Kara <jack@xxxxxxx> wrote:
> 
> > On Mon 08-06-15 12:47:26, Ted Tso wrote:
> > > On Thu, Apr 02, 2015 at 03:58:19PM +0200, Jan Kara wrote:
> > > > jbd2_journal_get_write_access() and jbd2_journal_get_create_access()
> > are
> > > > frequently called for buffers that are already part of the running
> > > > transaction - most frequently it is the case for bitmaps, inode table
> > > > blocks, and superblock. Since in such cases we have nothing to do, it
> > is
> > > > unfortunate we still grab reference to journal head, lock the bh, lock
> > > > bh_state only to find out there's nothing to do.
> > > >
> > > > Improving this is a bit subtle though since until we find out journal
> > > > head is attached to the running transaction, it can disappear from
> > under
> > > > us because checkpointing / commit decided it's no longer needed. We
> > deal
> > > > with this by protecting journal_head slab with RCU. We still have to be
> > > > careful about journal head being freed & reallocated within slab and
> > > > about exposing journal head in consistent state (in particular
> > > > b_modified and b_frozen_data must be in correct state before we allow
> > > > user to touch the buffer).
> > > >
> > > > FIXME: Performance data.
> > > >
> > > > Signed-off-by: Jan Kara <jack@xxxxxxx>
> > >
> > > Applied, so we can start getting some testing on this patch.  Did you
> > > ever get performance data?
> >
> > Yes. Here are results for reaim fserver workload for 32 core machine with
> > 128 GB of ram with ext4 on ramdisk:
> > Procs Vanilla    Patched
> > 1      20420.688  21155.556
> > 21     49684.704 178934.074
> > 41     84630.364 196647.482
> > 61    106344.284 204831.652
> > 81    120751.370 214842.428
> > 101   131585.450 208761.832
> > 121   138092.078 212741.648
> > 141   142271.578 212118.502
> > 161   146008.364 213731.388
> > 181   149569.494 216121.444
> >
> > Numbers are operations per second so larger is better. You can see that
> > for 21 processes we get increase by 260% in the number operations. Also the
> > total maximum of operations the machine is able to achieve increases by
> > 44% because of overall lower CPU overhead.
> >
> >                                                                 Honza
> > --
> > Jan Kara <jack@xxxxxxx>
> > SUSE Labs, CR
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
>From a19a4452879743eb7c8db593adbf60fc9ff0427c Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@xxxxxxx>
Date: Wed, 18 Mar 2015 14:29:13 +0100
Subject: [PATCH 5/5] jbd2: Speedup jbd2_journal_dirty_metadata()

It is often the case that we mark buffer as having dirty metadata when
the buffer is already in that state (frequent for bitmaps, inode table
blocks, superblock). Thus it is unnecessary to contend on grabbing
journal head reference and bh_state lock. Avoid that by checking whether
any modification to the buffer is needed before grabbing any locks or
references.

Signed-off-by: Jan Kara <jack@xxxxxxx>
---
 fs/jbd2/transaction.c | 33 +++++++++++++++++++++++++++------
 1 file changed, 27 insertions(+), 6 deletions(-)

diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index a91f639af6c3..5887c880422b 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -1290,8 +1290,6 @@ void jbd2_buffer_abort_trigger(struct journal_head *jh,
 	triggers->t_abort(triggers, jh2bh(jh));
 }
 
-
-
 /**
  * int jbd2_journal_dirty_metadata() -  mark a buffer as containing dirty metadata
  * @handle: transaction to add buffer to.
@@ -1325,12 +1323,36 @@ int jbd2_journal_dirty_metadata(handle_t *handle, struct buffer_head *bh)
 	WARN_ON(!transaction);
 	if (is_handle_aborted(handle))
 		return -EROFS;
-	journal = transaction->t_journal;
-	jh = jbd2_journal_grab_journal_head(bh);
-	if (!jh) {
+	if (!buffer_jbd(bh)) {
 		ret = -EUCLEAN;
 		goto out;
 	}
+	/*
+	 * We don't grab jh reference here since the buffer must be part
+	 * of the running transaction.
+	 */
+	jh = bh2jh(bh);
+	J_ASSERT_JH(jh, jh->b_transaction == transaction ||
+			jh->b_next_transaction == transaction);
+	if (jh->b_modified == 1) {
+		/*
+		 * If it's in our transaction it must be in BJ_Metadata list.
+		 * The assertion is unreliable since we may see jh in
+		 * inconsistent state unless we grab bh_state lock. But this
+		 * is crutial to catch bugs so let's do a reliable check until
+		 * the lockless handling is fully proven.
+		 */
+		if (jh->b_transaction == transaction &&
+		    jh->b_jlist != BJ_Metadata) {
+			jbd_lock_bh_state(bh);
+			J_ASSERT_JH(jh, jh->b_transaction != transaction ||
+					jh->b_jlist == BJ_Metadata);
+			jbd_unlock_bh_state(bh);
+		}
+		goto out;
+	}
+
+	journal = transaction->t_journal;
 	jbd_debug(5, "journal_head %p\n", jh);
 	JBUFFER_TRACE(jh, "entry");
 
@@ -1421,7 +1443,6 @@ int jbd2_journal_dirty_metadata(handle_t *handle, struct buffer_head *bh)
 	spin_unlock(&journal->j_list_lock);
 out_unlock_bh:
 	jbd_unlock_bh_state(bh);
-	jbd2_journal_put_journal_head(jh);
 out:
 	JBUFFER_TRACE(jh, "exit");
 	return ret;
-- 
2.1.4


[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux