Re: Help: ext4 jbd2 IO requests slow down fsync

Thanks for the information and analysis. I have since run more tests. My
application runs a random 4KB write workload on an SSD device, with each
write followed by an fsync. Below is an FIO job that simulates this
workload, along with the results on both kernels. Please take a look and
let me know what you think.
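
For reference, here is a minimal C sketch of the I/O pattern being
simulated (the path, file size, and iteration count are illustrative;
note that with --fsync=3 the FIO job below actually issues one fsync for
every three writes):

    /* Minimal sketch: buffered random 4KB writes, each followed by
     * fsync(), matching --rw=randwrite --bs=4k --direct=0. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define BLOCK_SIZE 4096
    #define FILE_SIZE  (1024L * 1024 * 1024)  /* 1 GiB, as --filesize=1G */

    int main(void)
    {
        char buf[BLOCK_SIZE];
        memset(buf, 0xab, sizeof(buf));

        int fd = open("/db/test.benchmark", O_WRONLY | O_CREAT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        for (int i = 0; i < 100000; i++) {
            /* Pick a random 4KB-aligned offset within the file. */
            off_t off = (off_t)(rand() % (FILE_SIZE / BLOCK_SIZE)) * BLOCK_SIZE;

            if (pwrite(fd, buf, BLOCK_SIZE, off) != BLOCK_SIZE) {
                perror("pwrite");
                return 1;
            }
            /* fsync() flushes the dirty data and, on ext3/ext4, also
             * waits for a jbd2 journal commit when metadata changed. */
            if (fsync(fd) < 0) { perror("fsync"); return 1; }
        }
        close(fd);
        return 0;
    }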

=================
Ext3 (Kernel 3.2)
=================
# ./fio --rw=randwrite --bs=4k --direct=0 --filesize=1G --numjobs=32 \
    --runtime=240 --group_reporting --filename=/db/test.benchmark \
    --name=randomwrite --fsync=3 --invalidate=1

randomwrite: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
...
fio-2.16-5-g915ca
Starting 32 processes
randomwrite: Laying out IO file(s) (1 file(s) / 1024MB)
Jobs: 31 (f=31): [w(9),_(1),w(22)] [100.0% done] [0KB/294.4MB/0KB /s] [0/75.4K/0 iops] [eta 00m:00s]
randomwrite: (groupid=0, jobs=32): err= 0: pid=9050: Tue Jan  7 18:00:16 2020
  write: io=32768MB, bw=272130KB/s, iops=68032, runt=123303msec
     <<<<<<<<<<<  IOPS here is ~19% higher than in the 4.4 kernel test below (68032 vs 57074).
    clat (usec): min=1, max=21783, avg=35.40, stdev=137.63
     lat (usec): min=1, max=21783, avg=35.51, stdev=137.74
    clat percentiles (usec):
     |  1.00th=[    1],  5.00th=[    2], 10.00th=[    2], 20.00th=[    3],
     | 30.00th=[    4], 40.00th=[    6], 50.00th=[    7], 60.00th=[   10],
     | 70.00th=[   14], 80.00th=[   23], 90.00th=[   94], 95.00th=[  211],
     | 99.00th=[  354], 99.50th=[  580], 99.90th=[ 1144], 99.95th=[ 1336],
     | 99.99th=[ 1816]
    lat (usec) : 2=2.18%, 4=19.43%, 10=36.36%, 20=19.23%, 50=10.43%
    lat (usec) : 100=2.60%, 250=6.43%, 500=2.75%, 750=0.23%, 1000=0.19%
    lat (msec) : 2=0.16%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
  cpu          : usr=0.35%, sys=8.89%, ctx=27987466, majf=0, minf=1120
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=8388608/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: io=32768MB, aggrb=272129KB/s, minb=272129KB/s, maxb=272129KB/s, mint=123303msec, maxt=123303msec

Disk stats (read/write):
    dm-4: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=1/8378314, aggrmerge=0/10665, aggrticks=0/3661470, aggrin_queue=3661970, aggrutil=97.88%
  sdb: ios=1/8378314, merge=0/10665, ticks=0/3661470, in_queue=3661970, util=97.88%


=================
Ext4 (Kernel 4.4)
=================
# ./fio --rw=randwrite --bs=4k --direct=0 --filesize=1G --numjobs=32 \
    --runtime=240 --group_reporting --filename=/db/test.benchmark \
    --name=randomwrite --fsync=3 --invalidate=1

randomwrite: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
...
fio-2.16-5-g915ca
Starting 32 processes
Jobs: 32 (f=32): [w(32)] [100.0% done] [0KB/219.3MB/0KB /s] [0/56.2K/0 iops] [eta 00m:00s]
randomwrite: (groupid=0, jobs=32): err= 0: pid=27703: Tue Jan  7 17:42:03 2020
  write: io=32768MB, bw=228297KB/s, iops=57074, runt=146977msec
    clat (usec): min=1, max=1647, avg=22.88, stdev=27.19
     lat (usec): min=1, max=1647, avg=22.96, stdev=27.22
    clat percentiles (usec):
     |  1.00th=[    2],  5.00th=[    3], 10.00th=[    5], 20.00th=[    8],
     | 30.00th=[   11], 40.00th=[   13], 50.00th=[   15], 60.00th=[   17],
     | 70.00th=[   22], 80.00th=[   33], 90.00th=[   52], 95.00th=[   72],
     | 99.00th=[  103], 99.50th=[  131], 99.90th=[  241], 99.95th=[  326],
     | 99.99th=[  852]
    lat (usec) : 2=0.68%, 4=5.79%, 10=18.95%, 20=40.42%, 50=22.92%
    lat (usec) : 100=9.96%, 250=1.19%, 500=0.06%, 750=0.01%, 1000=0.01%
    lat (msec) : 2=0.01%
  cpu          : usr=0.30%, sys=10.26%, ctx=24443801, majf=0, minf=295
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=8388608/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: io=32768MB, aggrb=228297KB/s, minb=228297KB/s, maxb=228297KB/s, mint=146977msec, maxt=146977msec

Disk stats (read/write):
    dm-3: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=0/8555222, aggrmerge=0/88454, aggrticks=0/4851095, aggrin_queue=4858514, aggrutil=91.66%
  sdb: ios=0/8555222, merge=0/88454, ticks=0/4851095, in_queue=4858514, util=91.66%
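
As a side note on how the journal traffic can be separated from the data
writes: iosnoop (from Brendan Gregg's perf-tools) prints the issuing task
for each block I/O, so the jbd2 commit writes show up under the jbd2
kernel thread. A minimal sketch (the 30-second duration is illustrative):

# ./iosnoop 30 | grep jbd2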



On Fri, Jan 24, 2020 at 5:57 PM Theodore Y. Ts'o <tytso@xxxxxxx> wrote:
>
> On Thu, Jan 23, 2020 at 10:28:47PM -0800, Colin Zou wrote:
> >
> > I used to run my application on ext3 on an SSD and recently switched
> > to ext4. However, my application is seeing a performance regression.
> > The root cause is that, as iosnoop shows, the workload includes a lot
> > of fsyncs, and every fsync does data IO as well as jbd2 IO, whereas
> > on ext3 it seldom did journal IO. Is there a way to tune ext4 to
> > improve fsync performance, say, by reducing jbd2 IO requests?
>
> If you're not seeing journal I/O from ext3 after an fsync, you're not
> looking at things correctly.  At the very *least* there will be
> journal I/O for the commit block, unless all of the work was done
> earlier in a previous journal commit.
>
> In general, ext4 and ext3 will be doing roughly the same amount of I/O
> to the journal.  In some cases, depending on the workload, ext4
> *might* need to do more data I/O for the file being synced.  That's
> because with ext3, if there is an intervening periodic 5 second
> journal commit, some or all of the data I/O may have been forced out
> to disk earlier due to said 5 second sync.
>
> What sort of workload does your application do?  How many data blocks
> are you writing before each fsync(), and how often are the fsync()
> operations?
>
>                                                 - Ted


