From include/linux/highmem.h:

  "kmap_atomic - Atomically map a page for temporary usage - Deprecated!"

Use memcpy_to_page() and memcpy_from_page(), which do the same job of
mapping, copying, and unmapping, except that they use the non-deprecated
kmap_local_page() and kunmap_local().

kmap_local_page() differs from kmap_atomic() as follows :-

* creates a per-thread mapping, local to the CPU and not globally visible
* can be called from any context
* allows task preemption

Performance numbers from V1 should apply as is, since there is not a
single line of change in this version; it only combines all the patches
into one :-

There is a slight performance difference with the new API on the one
arch I've tested, with two different sets :-

Set 1 (Average of 3 runs) :-
-----------------------------

* Latency (lower is better) :- ~14 higher with this patch series
* IOPS/BW (higher is better) :- ~47k higher with this patch series
* CPU Usage (lower is better) :- approximately the same

Set 2 (Average of 3 runs) :-
-----------------------------

* Latency (lower is better) :- ~9 higher with this patch series
* IOPS/BW (higher is better) :- ~23k higher with this patch series
* CPU Usage (lower is better) :- approximately the same

Below are the output of the fio verification job and the perf numbers on
brd.

In case someone shows up with a performance regression on an arch that I
don't have access to, we can decide then whether to drop this or keep
using the deprecated kernel API, but I think removing the deprecated API
is useful in the long term anyway.

-ck

v2:-
Merge all the patches into a single patch. No functional change from V1.
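For readers who have not used the helpers, the map/copy/unmap pattern
described above can be modelled in userspace roughly as follows. This is
only a sketch under stated assumptions: struct sketch_page,
sketch_map_local() and sketch_unmap_local() are hypothetical stand-ins
for struct page, kmap_local_page() and kunmap_local(), and the 4096-byte
page size is assumed. It is not the brd.c change itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size for this userspace model */

/* Hypothetical stand-in for the kernel's struct page: just a data buffer. */
struct sketch_page {
	unsigned char data[SKETCH_PAGE_SIZE];
};

/*
 * Hypothetical stand-ins for kmap_local_page()/kunmap_local(). In the
 * kernel these create and release a short-lived local mapping; here the
 * "mapping" is simply a pointer to the buffer.
 */
static void *sketch_map_local(struct sketch_page *page)
{
	return page->data;
}

static void sketch_unmap_local(const void *addr)
{
	(void)addr; /* nothing to undo in the userspace model */
}

/* Open-coded variant: the caller writes out map, copy, and unmap itself. */
static void open_coded_copy_from(void *dst, struct sketch_page *page,
				 size_t offset, size_t len)
{
	unsigned char *src = sketch_map_local(page);

	memcpy(dst, src + offset, len);
	sketch_unmap_local(src);
}

/* Helper variant, modelled on memcpy_from_page(): one call does all three. */
static void sketch_memcpy_from_page(void *to, struct sketch_page *page,
				    size_t offset, size_t len)
{
	unsigned char *from = sketch_map_local(page);

	assert(offset + len <= SKETCH_PAGE_SIZE);
	memcpy(to, from + offset, len);
	sketch_unmap_local(from);
}

/* Helper variant, modelled on memcpy_to_page(). */
static void sketch_memcpy_to_page(struct sketch_page *page, size_t offset,
				  const void *from, size_t len)
{
	unsigned char *to = sketch_map_local(page);

	assert(offset + len <= SKETCH_PAGE_SIZE);
	memcpy(to + offset, from, len);
	sketch_unmap_local(to);
}

/*
 * Write a buffer into the page, read it back through both variants, and
 * return 0 only if both reads return the bytes that were written.
 */
static int sketch_roundtrip(void)
{
	static struct sketch_page page;
	const char msg[] = "brd sector";
	char via_helper[sizeof(msg)];
	char via_open_coded[sizeof(msg)];

	sketch_memcpy_to_page(&page, 512, msg, sizeof(msg));
	sketch_memcpy_from_page(via_helper, &page, 512, sizeof(msg));
	open_coded_copy_from(via_open_coded, &page, 512, sizeof(msg));

	if (memcmp(via_helper, msg, sizeof(msg)) != 0)
		return 1;
	if (memcmp(via_open_coded, via_helper, sizeof(msg)) != 0)
		return 2;
	return 0;
}
```

The point of the helper form is that the caller can no longer forget the
unmap or get the ordering wrong, which is why the diffstat shrinks.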
Chaitanya Kulkarni (1):
  brd: use memcpy_to|from_page() in copy_to|from_brd()

 drivers/block/brd.c | 26 ++++++++------------------
 1 file changed, 8 insertions(+), 18 deletions(-)

fio verify job output:

linux-block (brd-memcpy) # git log -1
commit ea45fcc44031dc56055b194f0792fb2230caba00 (HEAD -> brd-memcpy)
Author: Chaitanya Kulkarni <kch@xxxxxxxxxx>
Date:   Sun Apr 9 15:14:01 2023 -0700

    brd: use memcpy_xxx_page() lib functions

    "kmap_atomic - Atomically map a page for temporary usage - Deprecated!"

    Use memcpy_from_page() helper that does same job of mapping and copying
    buffer that is open-coded in copy_from_brd() except the library function
    also uses non-deprecated kmap_local_page() and kunmap_local() instead of
    kmap() and kunmap() in current code.

    Use memcpy_to_page() helper that does same job of mapping and copying
    buffer that is open-coded in copy_to_brd() except the library function
    also uses non-deprecated kmap_local_page() and kunmap_local() instead of
    kmap() and kunmap() in current code.

    Signed-off-by: Chaitanya Kulkarni <kch@xxxxxxxxxx>
linux-block (brd-memcpy) # ./compile_brd.sh
+ umount /mnt/brd
umount: /mnt/brd: not mounted.
+ dmesg -c
+ modprobe -r brd
+ lsmod
+ grep brd
++ nproc
+ make -j 48 M=drivers/block modules
  CC [M]  drivers/block/floppy.o
  CC [M]  drivers/block/brd.o
  CC [M]  drivers/block/loop.o
  CC [M]  drivers/block/nbd.o
  CC [M]  drivers/block/virtio_blk.o
  CC [M]  drivers/block/xen-blkfront.o
  CC [M]  drivers/block/rbd.o
  CC [M]  drivers/block/mtip32xx/mtip32xx.o
  CC [M]  drivers/block/xen-blkback/blkback.o
  CC [M]  drivers/block/zram/zram_drv.o
  CC [M]  drivers/block/xen-blkback/xenbus.o
  CC [M]  drivers/block/null_blk/main.o
  CC [M]  drivers/block/null_blk/trace.o
  CC [M]  drivers/block/null_blk/zoned.o
  CC [M]  drivers/block/drbd/drbd_bitmap.o
  CC [M]  drivers/block/drbd/drbd_proc.o
  CC [M]  drivers/block/drbd/drbd_worker.o
  CC [M]  drivers/block/drbd/drbd_receiver.o
  CC [M]  drivers/block/drbd/drbd_req.o
  CC [M]  drivers/block/drbd/drbd_actlog.o
  CC [M]  drivers/block/drbd/drbd_main.o
  CC [M]  drivers/block/drbd/drbd_nl.o
  CC [M]  drivers/block/drbd/drbd_state.o
  CC [M]  drivers/block/drbd/drbd_nla.o
  CC [M]  drivers/block/drbd/drbd_debugfs.o
  LD [M]  drivers/block/zram/zram.o
  LD [M]  drivers/block/xen-blkback/xen-blkback.o
  LD [M]  drivers/block/null_blk/null_blk.o
  LD [M]  drivers/block/drbd/drbd.o
  MODPOST drivers/block/Module.symvers
  LD [M]  drivers/block/floppy.ko
  LD [M]  drivers/block/brd.ko
  LD [M]  drivers/block/loop.ko
  LD [M]  drivers/block/nbd.ko
  LD [M]  drivers/block/virtio_blk.ko
  LD [M]  drivers/block/xen-blkfront.ko
  LD [M]  drivers/block/xen-blkback/xen-blkback.ko
  LD [M]  drivers/block/drbd/drbd.ko
  LD [M]  drivers/block/rbd.ko
  LD [M]  drivers/block/mtip32xx/mtip32xx.ko
  LD [M]  drivers/block/zram/zram.ko
  LD [M]  drivers/block/null_blk/null_blk.ko
+ HOST=drivers/block/brd.ko
++ uname -r
+ HOST_DEST=/lib/modules/6.3.0-rc5lblk+/kernel/drivers/block/
+ cp drivers/block/brd.ko /lib/modules/6.3.0-rc5lblk+/kernel/drivers/block//
+ ls -lrth /lib/modules/6.3.0-rc5lblk+/kernel/drivers/block//brd.ko
-rw-r--r--. 1 root root 375K Apr 10 13:17 /lib/modules/6.3.0-rc5lblk+/kernel/drivers/block//brd.ko
+ dmesg -c
[81687.581471] brd: module unloaded
+ lsmod
+ grep brd
linux-block (brd-memcpy) # modprobe brd rd_size=$((70*1024*1204)) rd_nr=1; ls /dev/ram0
/dev/ram0
linux-block (brd-memcpy) # cat fio/verify.fio
[write-and-verify]
rw=randwrite
bs=4k
direct=1
ioengine=libaio
iodepth=16
norandommap
randrepeat=0
verify=crc32c
size=15G
allow_file_create=0
group_reporting
linux-block (brd-memcpy) # fio --filename= /dev/ram0
fio: option filename requires an argument
linux-block (brd-memcpy) # fio fio/verify.fio --filename=/dev/ram0
write-and-verify: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
fio-3.27
Starting 1 process
Jobs: 1 (f=0): [f(1)][100.0%][r=1058MiB/s][r=271k IOPS][eta 00m:00s]
write-and-verify: (groupid=0, jobs=1): err= 0: pid=57965: Mon Apr 10 13:16:54 2023
  read: IOPS=390k, BW=1522MiB/s (1596MB/s)(9710MiB/6381msec)
    slat (nsec): min=1152, max=70725, avg=1481.93, stdev=397.01
    clat (nsec): min=1092, max=224095, avg=38774.90, stdev=2190.45
     lat (usec): min=2, max=225, avg=40.30, stdev= 2.23
    clat percentiles (nsec):
     |  1.00th=[37120],  5.00th=[37120], 10.00th=[37632], 20.00th=[37632],
     | 30.00th=[38144], 40.00th=[38144], 50.00th=[38144], 60.00th=[38656],
     | 70.00th=[38656], 80.00th=[39168], 90.00th=[39680], 95.00th=[43264],
     | 99.00th=[47360], 99.50th=[49920], 99.90th=[52480], 99.95th=[54528],
     | 99.99th=[85504]
  write: IOPS=162k, BW=634MiB/s (665MB/s)(15.0GiB/24209msec); 0 zone resets
    slat (usec): min=2, max=744, avg= 5.54, stdev= 2.45
    clat (nsec): min=1002, max=843151, avg=92648.20, stdev=18028.23
     lat (usec): min=5, max=848, avg=98.24, stdev=19.04
    clat percentiles (usec):
     |  1.00th=[   64],  5.00th=[   72], 10.00th=[   76], 20.00th=[   79],
     | 30.00th=[   82], 40.00th=[   85], 50.00th=[   90], 60.00th=[   93],
     | 70.00th=[   97], 80.00th=[  106], 90.00th=[  120], 95.00th=[  129],
     | 99.00th=[  147], 99.50th=[  155], 99.90th=[  176], 99.95th=[  186],
     | 99.99th=[  206]
   bw (  KiB/s): min=222288, max=761392, per=98.81%, avg=641985.14, stdev=90922.65, samples=49
   iops        : min=55572, max=190348, avg=160496.33, stdev=22730.68, samples=49
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=38.57%
  lat (usec)   : 100=46.17%, 250=15.26%, 500=0.01%, 1000=0.01%
  cpu          : usr=48.58%, sys=51.34%, ctx=17, majf=0, minf=58280
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=2485856,3932160,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
   READ: bw=1522MiB/s (1596MB/s), 1522MiB/s-1522MiB/s (1596MB/s-1596MB/s), io=9710MiB (10.2GB), run=6381-6381msec
  WRITE: bw=634MiB/s (665MB/s), 634MiB/s-634MiB/s (665MB/s-665MB/s), io=15.0GiB (16.1GB), run=24209-24209msec

Disk stats (read/write):
  ram0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
linux-block (brd-memcpy) #

-- 
2.29.0
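
As a sanity check on the fio summary above, the reported bandwidth can be
recomputed from the "issued rwts" counts, the bs=4k block size, and the
run times. The sketch below is illustrative only; the helper names
io_bytes() and bw_mib_per_sec() are hypothetical and not part of fio or
the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE_BYTES 4096u /* bs=4k from fio/verify.fio */

/* Total bytes moved by the given number of 4 KiB I/Os. */
static uint64_t io_bytes(uint64_t ios)
{
	return ios * BLOCK_SIZE_BYTES;
}

/* Bandwidth in MiB/s, rounded to nearest, for bytes moved in runtime_ms. */
static uint64_t bw_mib_per_sec(uint64_t bytes, uint64_t runtime_ms)
{
	double bytes_per_sec;

	assert(runtime_ms != 0);
	bytes_per_sec = (double)bytes * 1000.0 / (double)runtime_ms;
	return (uint64_t)(bytes_per_sec / (1024.0 * 1024.0) + 0.5);
}
```

With the values from the log, bw_mib_per_sec(io_bytes(2485856), 6381)
gives 1522 and bw_mib_per_sec(io_bytes(3932160), 24209) gives 634, which
agree with the READ and WRITE lines, and io_bytes(3932160) is exactly
15 GiB, matching size=15G.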