Re: [PATCH v2 0/9] re-enable DAX PMD support

On Tue, 2016-08-30 at 17:01 -0600, Ross Zwisler wrote:
> On Tue, Aug 23, 2016 at 04:04:10PM -0600, Ross Zwisler wrote:
> > 
> > DAX PMDs have been disabled since Jan Kara introduced DAX radix
> > tree based locking.  This series allows DAX PMDs to participate in
> > the DAX radix tree based locking scheme so that they can be re-
> > enabled.
> > 
> > Changes since v1:
> >  - PMD entry locking is now done based on the starting offset of
> > the PMD entry, rather than on the radix tree slot which was
> > unreliable. (Jan)
> >  - Fixed the one issue I could find with hole punch.  As far as I
> > can tell hole punch now works correctly for both PMD and PTE DAX
> > entries, 4k zero pages and huge zero pages.
> >  - Fixed the way that ext2 returns the size of holes in
> > ext2_get_block(). (Jan)
> >  - Made the 'wait_table' global variable static in response to a
> > sparse warning.
> >  - Fixed some more inconsistent usage between the names 'ret' and
> > 'entry' for radix tree entry variables.
> > 
> > Ross Zwisler (9):
> >   ext4: allow DAX writeback for hole punch
> >   ext2: tell DAX the size of allocation holes
> >   ext4: tell DAX the size of allocation holes
> >   dax: remove buffer_size_valid()
> >   dax: make 'wait_table' global variable static
> >   dax: consistent variable naming for DAX entries
> >   dax: coordinate locking for offsets in PMD range
> >   dax: re-enable DAX PMD support
> >   dax: remove "depends on BROKEN" from FS_DAX_PMD
> > 
> >  fs/Kconfig          |   1 -
> >  fs/dax.c            | 297 +++++++++++++++++++++++++++++-----------------
> >  fs/ext2/inode.c     |   3 +
> >  fs/ext4/inode.c     |   7 +-
> >  include/linux/dax.h |  29 ++++-
> >  mm/filemap.c        |   6 +-
> >  6 files changed, 201 insertions(+), 142 deletions(-)
> > 
> > -- 
> > 2.9.0
> 
> Ping on this series?  Any objections or comments?

Hi Ross,

I am seeing a major performance loss in the fio mmap test with this
patchset applied.  This happens with or without my patches [1] applied
on top of yours.  Without my patches, dax_pmd_fault() falls back to the
pte handler since an mmap'ed address is not 2MB-aligned.
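
To illustrate the alignment condition, here is a minimal Python sketch
of the arithmetic only; it is not the kernel code.  It assumes a 2MB
PMD size (as on x86-64) and that a PMD mapping needs the 2MB-aligned
range around the faulting address to lie inside the mapping and to
correspond to a 2MB-aligned file offset.  All function names are
hypothetical.

```python
PMD_SIZE = 2 * 1024 * 1024  # assumed PMD mapping size: 2MB (x86-64)
PMD_MASK = ~(PMD_SIZE - 1)


def is_pmd_aligned(addr: int) -> bool:
    """True if addr sits on a 2MB boundary."""
    return addr & (PMD_SIZE - 1) == 0


def pmd_align_up(addr: int) -> int:
    """Round addr up to the next 2MB boundary."""
    return (addr + PMD_SIZE - 1) & PMD_MASK


def pmd_fault_possible(vma_start: int, vma_end: int,
                       file_offset: int, fault_addr: int) -> bool:
    """Simplified model of when a fault could use a PMD mapping.

    vma_start/vma_end bound the mapping, file_offset is the file
    offset mapped at vma_start, fault_addr is the faulting address.
    """
    pmd_start = fault_addr & PMD_MASK
    # The whole 2MB range must fit inside the mapping.
    if pmd_start < vma_start or pmd_start + PMD_SIZE > vma_end:
        return False
    # The file offset backing pmd_start must be 2MB-aligned as well.
    backing_offset = file_offset + (pmd_start - vma_start)
    return backing_offset % PMD_SIZE == 0
```

Under this model, a 2GB mapping that starts at a 4KB-aligned but not
2MB-aligned address and maps the file from offset 0 never satisfies
the condition, so every fault takes the pte path.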

I have attached three test results.
 o rc4.log - 4.8.0-rc4 (base)
 o non-pmd.log - 4.8.0-rc4 + your patchset (fall back to pte)
 o pmd.log - 4.8.0-rc4 + your patchset + my patchset (use pmd maps)

My test steps are as follows.

mkfs.ext4 -O bigalloc -C 2M /dev/pmem0
mount -o dax /dev/pmem0 /mnt/pmem0
numactl --preferred block:pmem0 --cpunodebind block:pmem0 fio test.fio

"test.fio"
---
[global]
bs=4k
size=2G
directory=/mnt/pmem0
ioengine=mmap
[randrw]
rw=randrw
---
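
For reference, the access pattern this job generates can be
approximated outside fio.  The sketch below is a rough Python model,
assuming that ioengine=mmap maps the file and copies one block per
I/O, and that randrw mixes reads and writes about evenly.  The helper
names are hypothetical, and this is not a substitute for fio.

```python
import mmap
import os
import random
import tempfile

BLOCK_SIZE = 4096  # bs=4k


def make_file(path: str, size: int) -> None:
    """Create a sparse file of the given size."""
    with open(path, "wb") as f:
        f.truncate(size)


def mmap_randrw(path: str, file_size: int, n_ops: int, seed: int = 0):
    """Issue n_ops random 4k reads/writes through a shared mapping.

    Returns the (reads, writes) counts.
    """
    rng = random.Random(seed)
    n_blocks = file_size // BLOCK_SIZE
    payload = b"\xa5" * BLOCK_SIZE
    reads = writes = 0
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), file_size) as m:
            for _ in range(n_ops):
                off = rng.randrange(n_blocks) * BLOCK_SIZE
                if rng.random() < 0.5:
                    _ = m[off:off + BLOCK_SIZE]       # read: copy out of the mapping
                    reads += 1
                else:
                    m[off:off + BLOCK_SIZE] = payload  # write: copy into the mapping
                    writes += 1
    return reads, writes


def run_demo(size: int = 1 << 20, n_ops: int = 1000):
    """Run the pattern against a temporary file and clean up afterwards."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "randrw.0.0")
        make_file(path, size)
        return mmap_randrw(path, size, n_ops)
```

Pointing `path` at a file under /mnt/pmem0 would exercise the DAX
fault path in the same way, one page fault per first touch of each
page.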

Can you please take a look?
Thanks,
-Toshi

[1] https://lkml.org/lkml/2016/8/29/560




randrw: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=mmap, iodepth=1
fio-2.6
Starting 1 process
randrw: Laying out IO file(s) (1 file(s) / 2048MB)

randrw: (groupid=0, jobs=1): err= 0: pid=12656: Wed Aug 31 18:14:06 2016
  read : io=1024.7MB, bw=3076.4KB/s, iops=769, runt=341062msec
    clat (usec): min=415, max=1703, avg=509.78, stdev=37.40
     lat (usec): min=415, max=1703, avg=509.81, stdev=37.40
    clat percentiles (usec):
     |  1.00th=[  482],  5.00th=[  498], 10.00th=[  498], 20.00th=[  498],
     | 30.00th=[  502], 40.00th=[  502], 50.00th=[  502], 60.00th=[  502],
     | 70.00th=[  502], 80.00th=[  506], 90.00th=[  524], 95.00th=[  540],
     | 99.00th=[  724], 99.50th=[  732], 99.90th=[  748], 99.95th=[  860],
     | 99.99th=[  900]
    bw (KB  /s): min= 2688, max= 3552, per=100.00%, avg=3078.69, stdev=143.84
  write: io=1023.4MB, bw=3072.6KB/s, iops=768, runt=341062msec
    clat (usec): min=683, max=1955, avg=788.99, stdev=45.83
     lat (usec): min=683, max=1955, avg=789.04, stdev=45.84
    clat percentiles (usec):
     |  1.00th=[  756],  5.00th=[  772], 10.00th=[  772], 20.00th=[  772],
     | 30.00th=[  772], 40.00th=[  780], 50.00th=[  780], 60.00th=[  780],
     | 70.00th=[  780], 80.00th=[  788], 90.00th=[  812], 95.00th=[  828],
     | 99.00th=[ 1004], 99.50th=[ 1012], 99.90th=[ 1128], 99.95th=[ 1144],
     | 99.99th=[ 1208]
    bw (KB  /s): min= 2752, max= 3552, per=100.00%, avg=3074.60, stdev=96.62
    lat (usec) : 500=12.55%, 750=37.73%, 1000=48.96%
    lat (msec) : 2=0.76%
  cpu          : usr=99.96%, sys=0.01%, ctx=32870, majf=0, minf=3014
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=262309/w=261979/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: io=1024.7MB, aggrb=3076KB/s, minb=3076KB/s, maxb=3076KB/s, mint=341062msec, maxt=341062msec
  WRITE: io=1023.4MB, aggrb=3072KB/s, minb=3072KB/s, maxb=3072KB/s, mint=341062msec, maxt=341062msec

Disk stats (read/write):
  pmem0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

randrw: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=mmap, iodepth=1
fio-2.6
Starting 1 process
randrw: Laying out IO file(s) (1 file(s) / 2048MB)

randrw: (groupid=0, jobs=1): err= 0: pid=19521: Wed Aug 31 17:50:39 2016
  read : io=1024.7MB, bw=3034.5KB/s, iops=758, runt=345780msec
    clat (usec): min=492, max=1359, avg=517.20, stdev=55.87
     lat (usec): min=492, max=1359, avg=517.23, stdev=55.87
    clat percentiles (usec):
     |  1.00th=[  498],  5.00th=[  498], 10.00th=[  498], 20.00th=[  498],
     | 30.00th=[  502], 40.00th=[  502], 50.00th=[  502], 60.00th=[  502],
     | 70.00th=[  502], 80.00th=[  506], 90.00th=[  524], 95.00th=[  708],
     | 99.00th=[  740], 99.50th=[  756], 99.90th=[  900], 99.95th=[  908],
     | 99.99th=[ 1048]
    bw (KB  /s): min= 2600, max= 3448, per=100.00%, avg=3036.52, stdev=141.59
  write: io=1023.4MB, bw=3030.6KB/s, iops=757, runt=345780msec
    clat (usec): min=765, max=1788, avg=799.46, stdev=67.19
     lat (usec): min=766, max=1788, avg=799.50, stdev=67.20
    clat percentiles (usec):
     |  1.00th=[  772],  5.00th=[  772], 10.00th=[  772], 20.00th=[  772],
     | 30.00th=[  772], 40.00th=[  780], 50.00th=[  780], 60.00th=[  780],
     | 70.00th=[  780], 80.00th=[  788], 90.00th=[  820], 95.00th=[  996],
     | 99.00th=[ 1020], 99.50th=[ 1144], 99.90th=[ 1176], 99.95th=[ 1208],
     | 99.99th=[ 1320]
    bw (KB  /s): min= 2704, max= 3328, per=100.00%, avg=3032.56, stdev=93.00
    lat (usec) : 500=10.66%, 750=39.06%, 1000=48.19%
    lat (msec) : 2=2.08%
  cpu          : usr=99.96%, sys=0.00%, ctx=32513, majf=0, minf=3012
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=262309/w=261979/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: io=1024.7MB, aggrb=3034KB/s, minb=3034KB/s, maxb=3034KB/s, mint=345780msec, maxt=345780msec
  WRITE: io=1023.4MB, aggrb=3030KB/s, minb=3030KB/s, maxb=3030KB/s, mint=345780msec, maxt=345780msec

Disk stats (read/write):
  pmem0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

randrw: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=mmap, iodepth=1
fio-2.6
Starting 1 process
randrw: Laying out IO file(s) (1 file(s) / 2048MB)

randrw: (groupid=0, jobs=1): err= 0: pid=12678: Wed Aug 31 19:59:45 2016
  read : io=1024.7MB, bw=775489KB/s, iops=193872, runt=  1353msec
    clat (usec): min=1, max=297, avg= 1.67, stdev= 2.92
     lat (usec): min=1, max=297, avg= 1.70, stdev= 2.96
    clat percentiles (usec):
     |  1.00th=[    1],  5.00th=[    1], 10.00th=[    1], 20.00th=[    1],
     | 30.00th=[    1], 40.00th=[    1], 50.00th=[    2], 60.00th=[    2],
     | 70.00th=[    2], 80.00th=[    2], 90.00th=[    2], 95.00th=[    2],
     | 99.00th=[    3], 99.50th=[    4], 99.90th=[   12], 99.95th=[   12],
     | 99.99th=[  189]
    bw (KB  /s): min=736608, max=792296, per=98.58%, avg=764452.00, stdev=39377.36
  write: io=1023.4MB, bw=774513KB/s, iops=193628, runt=  1353msec
    clat (usec): min=2, max=235, avg= 2.66, stdev= 3.59
     lat (usec): min=2, max=235, avg= 2.70, stdev= 3.61
    clat percentiles (usec):
     |  1.00th=[    2],  5.00th=[    2], 10.00th=[    2], 20.00th=[    2],
     | 30.00th=[    2], 40.00th=[    2], 50.00th=[    3], 60.00th=[    3],
     | 70.00th=[    3], 80.00th=[    3], 90.00th=[    3], 95.00th=[    3],
     | 99.00th=[    4], 99.50th=[    6], 99.90th=[   13], 99.95th=[   14],
     | 99.99th=[  193]
    bw (KB  /s): min=736288, max=789440, per=98.50%, avg=762864.00, stdev=37584.14
    lat (usec) : 2=20.18%, 4=78.23%, 10=1.40%, 20=0.16%, 50=0.01%
    lat (usec) : 250=0.03%, 500=0.01%
  cpu          : usr=46.82%, sys=53.03%, ctx=135, majf=0, minf=786279
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=262309/w=261979/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: io=1024.7MB, aggrb=775488KB/s, minb=775488KB/s, maxb=775488KB/s, mint=1353msec, maxt=1353msec
  WRITE: io=1023.4MB, aggrb=774512KB/s, minb=774512KB/s, maxb=774512KB/s, mint=1353msec, maxt=1353msec

Disk stats (read/write):
  pmem0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
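
As a sanity check on the three summaries above, the reported read
bandwidth should be close to issued I/Os x block size / runtime.  The
Python sketch below recomputes it from the "issued" and "runt" fields
copied from the logs, assuming 4KB per I/O as in test.fio; the
function names are hypothetical.  The fast run comes out roughly 250
times faster than either slow run.

```python
BLOCK_KB = 4  # bs=4k from test.fio

# (issued reads, runtime in msec, reported read bw in KB/s),
# copied from the three fio summaries in the order they appear.
RUNS = [
    (262309, 341062, 3076.4),
    (262309, 345780, 3034.5),
    (262309, 1353, 775489.0),
]


def bandwidth_kb_s(issued_ios: int, runt_msec: int) -> float:
    """Bandwidth in KB/s implied by the I/O count and runtime."""
    return issued_ios * BLOCK_KB / (runt_msec / 1000.0)


def iops(issued_ios: int, runt_msec: int) -> float:
    """I/Os per second implied by the I/O count and runtime."""
    return issued_ios / (runt_msec / 1000.0)


def check_runs(tolerance: float = 1e-3):
    """For each run, True if the recomputed bw matches the reported one."""
    return [abs(bandwidth_kb_s(n, t) - bw) / bw < tolerance
            for n, t, bw in RUNS]


def slowdown(fast_bw: float, slow_bw: float) -> float:
    """How many times faster the fast run is than the slow one."""
    return fast_bw / slow_bw
```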
