Re: [RFC] what to do with IOCB_DSYNC?

On Sat, May 28, 2022 at 9:57 PM Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
>
> * Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
> >
> > * Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
> >
> > > On Mon, May 23, 2022 at 10:03:45AM -0600, Jens Axboe wrote:
> > > > clear_user()
> > > > 32        ~96MB/sec
> > > > 64        195MB/sec
> > > > 128       386MB/sec
> > > > 1k        2.7GB/sec
> > > > 4k        7.8GB/sec
> > > > 16k       14.8GB/sec
> > > >
> > > > copy_from_zero_page()
> > > > 32        ~96MB/sec
> > > > 64        193MB/sec
> > > > 128       383MB/sec
> > > > 1k        2.9GB/sec
> > > > 4k        9.8GB/sec
> > > > 16k       21.8GB/sec
> > >
> > > Just FYI, on x86, Samuel Neves proposed some nice clear_user()
> > > performance improvements that were forgotten about:
> > >
> > > https://lore.kernel.org/lkml/20210523180423.108087-1-sneves@xxxxxxxxx/
> > > https://lore.kernel.org/lkml/Yk9yBcj78mpXOOLL@xxxxxxxxx/
> > >
> > > Hoping somebody picks this up at some point...
> >
> > Those ~2x speedup numbers are indeed looking very nice:
> >
> > | After this patch, on a Skylake CPU, these are the
> > | before/after figures:
> > |
> > | $ dd if=/dev/zero of=/dev/null bs=1024k status=progress
> > | 94402248704 bytes (94 GB, 88 GiB) copied, 6 s, 15.7 GB/s
> > |
> > | $ dd if=/dev/zero of=/dev/null bs=1024k status=progress
> > | 446476320768 bytes (446 GB, 416 GiB) copied, 15 s, 29.8 GB/s
> >
> > Patch fell through the cracks & it doesn't apply anymore:
> >
> >   checking file arch/x86/lib/usercopy_64.c
> >   Hunk #2 FAILED at 17.
> >   1 out of 2 hunks FAILED
> >
> > Would be nice to re-send it.
>
> Turns out Boris just sent a competing optimization to clear_user() 3 days ago:
>
>   https://lore.kernel.org/r/YozQZMyQ0NDdD8cH@xxxxxxx
>
> Thanks,
>

[ CC Hugh ]

I hope I adapted both patches, Hugh's and Samuel's, to Linux v5.18
correctly.

My CPU is an Intel Sandy Bridge, so not a "modern CPU", and Hugh's
patch was evidently not meant for it: throughput dropped from 11.3 GB/s
to 1.9 GB/s (see the numbers below).

Samuel's patch cut the run time of Hugh's dd test case by about 15%
(97.18 s down to 82.697 s). I cannot say whether that test is a
meaningful benchmark.
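
Since dd with bs=1M only exercises large clears, a per-block-size run in the style of the table Jens quoted may be more telling. Below is a minimal sketch, assuming reads from /dev/zero are serviced by clear_user() on the kernel under test. The function name zero_read_throughput and the 16 MiB total are hypothetical choices for illustration, and interpreter overhead will dominate at small block sizes, so compare runs only against each other and not against the figures quoted above.

```python
import os
import time


def zero_read_throughput(block_size, total_bytes, path="/dev/zero"):
    """Read total_bytes from path in block_size chunks; return bytes/second."""
    buf = bytearray(block_size)          # reused, so no per-read allocation
    fd = os.open(path, os.O_RDONLY)
    try:
        done = 0
        start = time.perf_counter()
        while done < total_bytes:
            n = os.readv(fd, [buf])      # kernel fills buf in place
            if n == 0:                   # EOF; not expected for /dev/zero
                break
            done += n
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return done / elapsed if elapsed > 0 else 0.0


if __name__ == "__main__":
    # Block sizes taken from the table quoted earlier in the thread.
    for size in (32, 64, 128, 1024, 4096, 16384):
        rate = zero_read_throughput(size, 16 * 1024 * 1024)
        print(f"{size:6d}  {rate / 1e6:10.1f} MB/s")
```

Running it on each of the three kernels and comparing rows would show whether a patch helps or hurts at small sizes as well as at large ones.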

Patches and latest linux-config attached.

*** Without patch

root# cat /proc/version
Linux version 5.18.0-3-amd64-clang14-lto (sedat.dilek@xxxxxxxxx@iniza)
(dileks clang version 14.0.4 (https://github.com/llvm/llvm-project.git
29f1039a7285a5c3a9c353d054140bf2556d4c4d), LLD 14.0.4)
#3~bookworm+dileks1 SMP PREEMPT_DYNAMIC 2022-05-27

root# dd if=/dev/zero of=/dev/null bs=1M count=1M
1048576+0 records in
1048576+0 records out
1099511627776 bytes (1.1 TB, 1.0 TiB) copied, 97.18 s, 11.3 GB/s

*** With hughd patch

Patch: 0001-x86-usercopy-Use-alternatives-for-clear_user.patch
Link: https://lore.kernel.org/lkml/2f5ca5e4-e250-a41c-11fb-a7f4ebc7e1c9@xxxxxxxxxx/

root# cat /proc/version
Linux version 5.18.0-4-amd64-clang14-lto (sedat.dilek@xxxxxxxxx@iniza)
(dileks clang version 14.0.4 (https://github.com/llvm/llvm-project.git
29f1039a7285a5c3a9c35>

root# dd if=/dev/zero of=/dev/null bs=1M count=1M
1048576+0 records in
1048576+0 records out
1099511627776 bytes (1.1 TB, 1.0 TiB) copied, 588.053 s, 1.9 GB/s

root# cat /proc/version
Linux version 5.18.0-4-amd64-clang14-lto (sedat.dilek@xxxxxxxxx@iniza)
(dileks clang version 14.0.4 (https://github.com/llvm/llvm-project.git
29f1039a7285a5c3a9c353d054140bf2556d4c4d), LLD 14.0.4)
#4~bookworm+dileks1 SMP PREEMPT_DYNAMIC 2022-05-28

*** With sneves patch

Patch: 0001-x86-usercopy-speed-up-64-bit-__clear_user-with-stos-.patch
Link: https://lore.kernel.org/lkml/20210523180423.108087-1-sneves@xxxxxxxxx/

root# cat /proc/version
Linux version 5.18.0-5-amd64-clang14-lto (sedat.dilek@xxxxxxxxx@iniza)
(dileks clang version 14.0.4 (https://github.com/llvm/llvm-project.git
29f1039a7285a5c3a9c353d054140bf2556d4c4d), LLD 14.0.4)
#5~bookworm+dileks1 SMP PREEMPT_DYNAMIC 2022-05-28

root# dd if=/dev/zero of=/dev/null bs=1M count=1M
1048576+0 records in
1048576+0 records out
1099511627776 bytes (1.1 TB, 1.0 TiB) copied, 82.697 s, 13.3 GB/s
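
For anyone who wants to check the arithmetic, here is a small sketch that recomputes throughput and relative run time from the three dd runs above. The byte count and elapsed times are copied from the output. The helper names are hypothetical, and GB is taken as 10^9 bytes, as dd reports it.

```python
# Figures copied from the three dd runs above.
BYTES_COPIED = 1099511627776
ELAPSED_S = {
    "unpatched": 97.18,
    "hughd": 588.053,
    "sneves": 82.697,
}


def throughput_gb_s(seconds, nbytes=BYTES_COPIED):
    """Throughput in decimal GB/s (10^9 bytes per second)."""
    return nbytes / seconds / 1e9


def time_saved_pct(baseline_s, patched_s):
    """Percent of baseline run time saved; negative means slower."""
    return (baseline_s - patched_s) / baseline_s * 100.0


baseline = ELAPSED_S["unpatched"]
for name, secs in ELAPSED_S.items():
    print(f"{name:10s} {throughput_gb_s(secs):5.1f} GB/s  "
          f"time saved vs unpatched: {time_saved_pct(baseline, secs):+7.1f}%")
```

Measured by run time the sneves patch saves roughly 15%. Measured by throughput (11.3 to 13.3 GB/s) the gain is closer to 18%, so it matters which of the two a quoted speedup refers to.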


-dileks // 28-May-2022



