Re: background on the ext3 batching performance issue

Josef Bacik <jbacik@xxxxxxxxxx> · Thu, 28 Feb 2008 10:05:11 -0500

On Thursday 28 February 2008 7:09:17 am Ric Wheeler wrote:
> At the LSF workshop, I mentioned that we have tripped across an
> embarrassing performance issue in the jbd transaction code which is
> clearly not tuned for low latency devices.
>
> The short summary is that we can do, say, 800 10k files/sec in a
> write/fsync/close loop with a single thread, but drop to under 250
> files/sec with 2 or more threads.
>
> This is pretty easy to reproduce with any small file write synchronous
> workload (i.e., fsync() each file before close).  We used my fs_mark
> tool to reproduce.
>
> The core of the issue is the call in the jbd transaction code to
> schedule_timeout_uninterruptible(1), which causes us to sleep for 4ms:
>
>         pid = current->pid;
>         if (handle->h_sync && journal->j_last_sync_writer != pid) {
>                 journal->j_last_sync_writer = pid;
>                 do {
>                         old_handle_count = transaction->t_handle_count;
>                         schedule_timeout_uninterruptible(1);
>                 } while (old_handle_count != transaction->t_handle_count);
>         }
>
> This is quite topical to the concern we had with low latency devices in
> general, and specifically with things like SSDs.
>

Your testcase does in fact show a weakness in this optimization, but look at the
more likely case, where you have multiple writers on the same filesystem rather
than one guy doing write/fsync.  If we wait, we could potentially add quite a
few more buffers to this transaction before flushing it, rather than flushing a
buffer or two at a time.  What would you propose as a solution?
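
For reference, the ceiling in the numbers quoted above seems to fall straight
out of the one-jiffy sleep.  Here is a rough user-space sketch of the
arithmetic, assuming HZ=250 (which is what a 4ms jiffy implies) and assuming
every sync handle pays at least one sleep.  The helper names are made up for
illustration; this is not jbd code.

```c
#include <assert.h>

/*
 * Hypothetical helpers (not from jbd): back-of-the-envelope numbers for
 * the batching sleep in the quoted loop.
 */

/* Length of one jiffy in milliseconds for a given HZ. */
static double jiffy_ms(int hz)
{
        assert(hz > 0);
        return 1000.0 / hz;
}

/*
 * Upper bound on sync commits per second when each sync handle sleeps
 * at least min_sleep_jiffies before its transaction can be flushed.
 * Assumption: the sleep dominates and device latency is negligible,
 * which is the low latency device case under discussion.
 */
static double max_sync_commits_per_sec(int hz, int min_sleep_jiffies)
{
        assert(hz > 0 && min_sleep_jiffies > 0);
        return (double)hz / min_sleep_jiffies;
}
```

With hz = 250, jiffy_ms() gives 4.0 and max_sync_commits_per_sec(250, 1) gives
250, which lines up with the "under 250 files/sec" figure once two threads
alternate and j_last_sync_writer never matches the current pid.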

Josef