Re: some hard numbers on ext3 & batching performance issue

On Friday 07 March 2008 3:08:32 pm Ric Wheeler wrote:
> Josef Bacik wrote:
> > On Wednesday 05 March 2008 2:19:48 pm Ric Wheeler wrote:
> >> After the IO/FS workshop last week, I posted some details on the slow
> >> down we see with ext3 when we have a low latency back end instead of a
> >> normal local disk (SCSI/S-ATA/etc).
>
> ...
> ...
> ...
>
> >> It would be really interesting to rerun some of these tests on xfs which
> >> Dave explained in the thread last week has a more self tuning way to
> >> batch up transactions....
> >>
> >> Note that all of those poor users who have a synchronous write workload
> >> today are in the "1" row for each of the above tables.
> >
> > Mind giving this a whirl?  The fastest thing I've got here is an Apple X
> > RAID and it's being used for something else atm, so I've only tested this
> > on local disk to make sure it didn't make local performance suck (which
> > it doesn't btw). This should be equivalent to what David says XFS does.
> > Thanks much,
> >
> > Josef
> >
> > diff --git a/fs/jbd/transaction.c b/fs/jbd/transaction.c
> > index c6cbb6c..4596e1c 100644
> > --- a/fs/jbd/transaction.c
> > +++ b/fs/jbd/transaction.c
> > @@ -1333,8 +1333,7 @@ int journal_stop(handle_t *handle)
> >  {
> >  	transaction_t *transaction = handle->h_transaction;
> >  	journal_t *journal = transaction->t_journal;
> > -	int old_handle_count, err;
> > -	pid_t pid;
> > +	int err;
> >
> >  	J_ASSERT(journal_current_handle() == handle);
> >
> > @@ -1353,32 +1352,22 @@ int journal_stop(handle_t *handle)
> >
> >  	jbd_debug(4, "Handle %p going down\n", handle);
> >
> > -	/*
> > -	 * Implement synchronous transaction batching.  If the handle
> > -	 * was synchronous, don't force a commit immediately.  Let's
> > -	 * yield and let another thread piggyback onto this transaction.
> > -	 * Keep doing that while new threads continue to arrive.
> > -	 * It doesn't cost much - we're about to run a commit and sleep
> > -	 * on IO anyway.  Speeds up many-threaded, many-dir operations
> > -	 * by 30x or more...
> > -	 *
> > -	 * But don't do this if this process was the most recent one to
> > -	 * perform a synchronous write.  We do this to detect the case where a
> > -	 * single process is doing a stream of sync writes.  No point in
> > -	 * waiting for joiners in that case.
> > -	 */
> > -	pid = current->pid;
> > -	if (handle->h_sync && journal->j_last_sync_writer != pid) {
> > -		journal->j_last_sync_writer = pid;
> > -		do {
> > -			old_handle_count = transaction->t_handle_count;
> > -			schedule_timeout_uninterruptible(1);
> > -		} while (old_handle_count != transaction->t_handle_count);
> > -	}
> > -
> >  	current->journal_info = NULL;
> >  	spin_lock(&journal->j_state_lock);
> >  	spin_lock(&transaction->t_handle_lock);
> > +
> > +	if (journal->j_committing_transaction && handle->h_sync) {
> > +		tid_t tid = journal->j_committing_transaction->t_tid;
> > +
> > +		spin_unlock(&transaction->t_handle_lock);
> > +		spin_unlock(&journal->j_state_lock);
> > +
> > +		err = log_wait_commit(journal, tid);
> > +
> > +		spin_lock(&journal->j_state_lock);
> > +		spin_lock(&transaction->t_handle_lock);
> > +	}
> > +
> >  	transaction->t_outstanding_credits -= handle->h_buffer_credits;
> >  	transaction->t_updates--;
> >  	if (!transaction->t_updates) {
>
> Running with Josef's patch, I was able to see a clear improvement for
> batching these synchronous operations on ext3 with the RAM disk and
> array. It is not too often that you get to do a simple change and see a
> 27 times improvement ;-)
>
> On the bad side, the local disk case took as much as a 30% drop in
> performance.  The specific disk is not one that I have a lot of
> experience with; I would like to retry on a disk that has been qualified
> by our group (i.e., we have reasonable confidence that there are no
> firmware issues, etc).
>
> Now for the actual results.
>
> The results are the average value of 5 runs for each number of threads.
>
> Type        Threads   Baseline      Josef    Speedup (Josef/Baseline)
> array          1         320.5      325.4      1.01
> array          2         174.9      351.9      2.01
> array          4         382.7      593.5      1.55
> array          8         644.1      963.0      1.49
> array         10         842.9     1038.7      1.23
> array         20        1319.6     1432.3      1.08
>
> RAM disk       1        5621.4     5595.1      0.99
> RAM disk       2         281.5     7613.3     27.04
> RAM disk       4         579.9     9111.5     15.71
> RAM disk       8         891.1     9357.3     10.50
> RAM disk      10        1116.3     9873.6      8.84
> RAM disk      20        1952.0    10703.6      5.48
>
> S-ATA disk     1          19.0       15.1      0.79
> S-ATA disk     2          19.9       14.4      0.72
> S-ATA disk     4          41.0       27.9      0.68
> S-ATA disk     8          60.4       43.2      0.71
> S-ATA disk    10          67.1       48.7      0.72
> S-ATA disk    20         102.7       74.0      0.72
>
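
For anyone rechecking the Speedup column in the table above, it is simply
the Josef number divided by the Baseline number. A minimal sketch of that
arithmetic follows; the function names speedup() and percent_change() are
made up for illustration and are not part of fs_mark or the kernel. The
table's last digit looks truncated rather than rounded, so a recomputed
value can differ by 0.01.

```c
#include <assert.h>

/*
 * Hypothetical helper: ratio of patched to baseline throughput.
 * A value above 1.0 means the patched kernel was faster.
 * Assumes baseline is a positive measurement.
 */
double speedup(double baseline, double patched)
{
	assert(baseline > 0.0);
	return patched / baseline;
}

/*
 * Hypothetical helper: the same comparison as a percentage change.
 * A negative result is a regression relative to baseline.
 */
double percent_change(double baseline, double patched)
{
	return (speedup(baseline, patched) - 1.0) * 100.0;
}
```

With the table's numbers, speedup(281.5, 7613.3) gives the roughly 27x
RAM disk result at 2 threads, and percent_change(41.0, 27.9) gives the
roughly 30% drop seen on the S-ATA disk at 4 threads.
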
> Background on the tests:
>
> All of this is measured on three devices - a relatively old & slow
> array, the local (slow!) 2.5" S-ATA disk in the box and a RAM disk.
>
> These numbers were generated using fs_mark to write 4096 byte files with
> the following commands:
>
> fs_mark  -d  /home/test/t  -s  4096  -n  40000  -N  50  -D  64  -t  1
> ...
> fs_mark  -d  /home/test/t  -s  4096  -n  20000  -N  50  -D  64  -t  2
> ...
> fs_mark  -d  /home/test/t  -s  4096  -n  10000  -N  50  -D  64  -t  4
> ...
> fs_mark  -d  /home/test/t  -s  4096  -n  5000  -N  50  -D  64  -t  8
> ...
> fs_mark  -d  /home/test/t  -s  4096  -n  4000  -N  50  -D  64  -t  10
> ...
> fs_mark  -d  /home/test/t  -s  4096  -n  2000  -N  50  -D  64  -t  20
> ...
>
> Note that this spreads the files across 64 subdirectories; each thread
> writes 50 files and then moves on to the next in a round robin.
>
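
Worth noting about the commands above: -n times -t is 40000 in every run,
so each thread count writes the same total number of files. A rough sketch
that rebuilds the six command lines from that observation is below. The
TOTAL_FILES constant and both helper names are assumptions for
illustration only; the fs_mark options are copied verbatim from the
commands quoted above.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed total per run, inferred from the quoted commands (n * t). */
#define TOTAL_FILES 40000

/*
 * Hypothetical helper: files each thread must write so that the total
 * stays fixed. Requires a thread count that divides the total evenly.
 */
int files_per_thread(int total_files, int threads)
{
	assert(threads > 0 && total_files % threads == 0);
	return total_files / threads;
}

/*
 * Hypothetical helper: format one fs_mark command line into buf.
 * Returns whatever snprintf() returns (the length it wanted to write).
 */
int format_fs_mark_cmd(char *buf, size_t len, int threads)
{
	return snprintf(buf, len,
			"fs_mark -d /home/test/t -s 4096 -n %d -N 50 -D 64 -t %d",
			files_per_thread(TOTAL_FILES, threads), threads);
}
```

Calling format_fs_mark_cmd() for 1, 2, 4, 8, 10 and 20 threads yields the
same -n values as the quoted runs (40000 down to 2000).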

I'm starting to wonder about the disks I have, because my files/second is
spanking yours, and it's just a local Samsung 3Gb/s SATA drive.  With those
commands I'm consistently getting over 700 files/sec.  I'm seeing about a 1-5%
increase in speed locally with my patch.  I guess I'll start looking around for
some other hardware and check on there in case this box is more badass than I
think it is.  Thanks much,

Josef
