Bcache:
=================================================
  write: IOPS=39.1k, BW=153MiB/s (160MB/s)(5120MiB/33479msec); 0 zone resets
    slat (usec): min=4, max=157364, avg=12.47, stdev=138.93
    clat (nsec): min=1168, max=474615k, avg=11808.80, stdev=927287.74
     lat (usec): min=11, max=474622, avg=24.28, stdev=937.81
    clat percentiles (nsec):
     |  1.00th=[   1256],  5.00th=[   1304], 10.00th=[   1320],
     | 20.00th=[   1400], 30.00th=[   1448], 40.00th=[   1672],
     | 50.00th=[   8640], 60.00th=[   9152], 70.00th=[   9664],
     | 80.00th=[  10048], 90.00th=[  11328], 95.00th=[  19072],
     | 99.00th=[  27776], 99.50th=[  36608], 99.90th=[ 173056],
     | 99.95th=[ 856064], 99.99th=[2039808]
   bw (  KiB/s): min=28032, max=214664, per=99.69%, avg=156122.03, stdev=51649.87, samples=66
   iops        : min= 7008, max=53666, avg=39030.53, stdev=12912.50, samples=66
  lat (usec)   : 2=41.55%, 4=4.59%, 10=32.70%, 20=16.37%, 50=4.45%
  lat (usec)   : 100=0.10%, 250=0.17%, 500=0.02%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.03%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
  lat (msec)   : 100=0.01%, 250=0.01%, 500=0.01%
  cpu          : usr=11.93%, sys=38.61%, ctx=1311384, majf=0, minf=382
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,1310718,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=153MiB/s (160MB/s), 153MiB/s-153MiB/s (160MB/s-160MB/s), io=5120MiB (5369MB), run=33479-33479msec

Disk stats (read/write):
  bcache0: ios=0/1305444, sectors=0/10443552, merge=0/0, ticks=0/21789, in_queue=21789, util=65.13%, aggrios=0/0, aggsectors=0/0, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00%
  ram0: ios=0/0, sectors=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
  pmem0: ios=0/0, sectors=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

CBD cache:
==============================================
  write: IOPS=133k, BW=520MiB/s (545MB/s)(5120MiB/9848msec); 0 zone resets
    slat (usec): min=3, max=2786, avg= 5.84, stdev=36.41
    clat (nsec): min=852, max=132404, avg=959.09, stdev=436.60
     lat (usec): min=4, max=2794, avg= 6.80, stdev=36.45
    clat percentiles (nsec):
     |  1.00th=[  884],  5.00th=[  900], 10.00th=[  908], 20.00th=[916],
     | 30.00th=[  924], 40.00th=[  924], 50.00th=[  932], 60.00th=[940],
     | 70.00th=[  948], 80.00th=[  964], 90.00th=[ 1004], 95.00th=[1064],
     | 99.00th=[ 1192], 99.50th=[ 1432], 99.90th=[ 6688], 99.95th=[7712],
     | 99.99th=[12480]
   bw (  KiB/s): min=487088, max=552928, per=99.96%, avg=532154.95, stdev=18228.92, samples=19
   iops        : min=121772, max=138232, avg=133038.84, stdev=4557.32, samples=19
  lat (nsec)   : 1000=89.09%
  lat (usec)   : 2=10.76%, 4=0.03%, 10=0.09%, 20=0.03%, 50=0.01%
  lat (usec)   : 100=0.01%, 250=0.01%
  cpu          : usr=23.93%, sys=76.03%, ctx=61, majf=0, minf=16
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,1310720,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=520MiB/s (545MB/s), 520MiB/s-520MiB/s (545MB/s-545MB/s), io=5120MiB (5369MB), run=9848-9848msec

Disk stats (read/write):
  cbd0: ios=0/1280334, sectors=0/10242672, merge=0/0, ticks=0/0, in_queue=0, util=43.07%
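
For reference, the numbers above look consistent with a single-job,
4KiB, iodepth=1 direct write workload over 5120MiB (1310720 IOs, IO
depth 1=100.0%). A job file of roughly that shape is sketched below;
the ioengine and job name are assumptions rather than the exact
options used:

  [global]
  ; ioengine is an assumption; any engine that reports slat/clat fits
  ioengine=libaio
  direct=1
  rw=write
  bs=4k
  iodepth=1
  size=5120M

  [cache-write]
  ; device node for the cbd run; /dev/bcache0 for the bcache run
  filename=/dev/cbd0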

    (5) no need of formatting for your existing disk

As a lightweight block storage caching technology, cbd cache does not
require storing metadata on the backend disk. This allows users to
easily add caching to existing disks without any formatting operations
or data migration. They can also stop using the cbd cache without
complications; the backend disk can then be used independently as a
raw disk.

    (6) backend device is crash-consistent

The writeback mechanism of cbd cache strictly follows a log-structured
approach when writing back data. Even if dirty cache data is
overwritten by new data (e.g., the old data in 0-4K is A, and new data
B overwrites 0-4K), the old data A is written back first, and then the
new data B is written back over it on the backend disk. This ensures
that the backend disk always maintains crash consistency. In the event
of a failure of the pmem device, the data on the backend disk remains
usable: crash consistency is preserved, and only the data still in the
cache is lost. This feature is particularly useful in cloud storage
for disaster recovery scenarios.
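
To make the writeback ordering concrete, here is a minimal userspace
sketch (not the actual cbd_cache.c code; the struct and function names
are made up for illustration) of replaying dirty entries strictly in
log order, so that stopping after any prefix still leaves the backend
at a valid point in its write history:

/*
 * Illustrative only: dirty entries are applied to the backend in log
 * order, never reordered, so interrupting writeback at any point
 * leaves the backend consistent with some past state.
 */
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE   4096
#define BACKEND_SIZE 8192

struct log_entry {
	size_t off;              /* offset on the backend device */
	char   data[BLOCK_SIZE]; /* payload cached on pmem */
};

static char backend[BACKEND_SIZE]; /* stands in for the backing disk */

/* Replay the first 'count' log entries in order. */
static void writeback(const struct log_entry *log, size_t count)
{
	for (size_t i = 0; i < count; i++)
		memcpy(backend + log[i].off, log[i].data, BLOCK_SIZE);
}

int main(void)
{
	struct log_entry log[2] = { { .off = 0 }, { .off = 0 } };

	memset(log[0].data, 'A', BLOCK_SIZE); /* old data A at 0-4K */
	memset(log[1].data, 'B', BLOCK_SIZE); /* new data B overwrites 0-4K */

	writeback(log, 1);     /* a crash here leaves A: still consistent */
	writeback(log + 1, 1); /* writeback completes: backend holds B */

	printf("backend[0] = %c\n", backend[0]); /* prints 'B' */
	return 0;
}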

It is important to note that this approach may lead to cache space
utilization issues when there are many overwrite operations. However,
modern file systems, such as Btrfs and F2FS, take wear leveling of the
disk into account and tend to avoid writing repeatedly to the same
area, so they do not generate large numbers of overwrites.
Additionally, modern databases, especially those using LSM engines,
rarely perform overwrite operations.

Additionally, there is an entry on the TODO list to provide a parameter
backend_consistency=false to allow users to achieve better cache space
utilization. Whether that gets implemented depends on how urgent the
requirement is.

    (7) cache space for each disk is configurable

For each backend, when enabling caching, we can specify the cache
space size for that backend. This is different from bcache, where all
backing devices dynamically share the cache space within a single
cache device. Dynamic sharing can improve overall cache utilization
through time-sharing, but it also makes cache behavior unpredictable.
In enterprise applications, it is important to have a precise
understanding of the performance of each disk, and when multiple disks
dynamically share the cache, the exact amount of cache each disk
receives becomes uncertain. cbd cache assigns a dedicated cache space
to each disk, ensuring that the cache is exclusive and not affected by
other disks, making the cache behavior more predictable.

    (8) Note that all the performance test results mentioned above
were obtained using the `memmap=20G!4G` kernel option to simulate the
`/dev/pmem0` device.
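
For anyone who wants to reproduce this without real CXL or pmem
hardware, a region of RAM can be reserved as an emulated pmem device
by adding that option to the kernel command line and rebooting; the
size and start offset are just the values used here and must fit the
machine's memory map:

  # e.g. in /etc/default/grub, then regenerate the grub config and reboot
  GRUB_CMDLINE_LINUX="... memmap=20G!4G"

After reboot the emulated device should appear as /dev/pmem0.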

Additionally, the cbd code is tested with cbd-tests by default, which
includes the xfstests suite, and it passes the xfstests test suite.

If anyone has a real CXL memory device, it would be great if you could
help with the testing. Thanks!


Dongsheng Yang (8):
  cbd: introduce cbd_transport
  cbd: introduce cbd_host
  cbd: introduce cbd_segment
  cbd: introduce cbd_channel
  cbd: introduce cbd_cache
  cbd: introduce cbd_blkdev
  cbd: introduce cbd_backend
  block: Init for CBD(CXL Block Device) module

 drivers/block/Kconfig             |    2 +
 drivers/block/Makefile            |    2 +
 drivers/block/cbd/Kconfig         |   45 +
 drivers/block/cbd/Makefile        |    3 +
 drivers/block/cbd/cbd_backend.c   |  395 +++++
 drivers/block/cbd/cbd_blkdev.c    |  433 ++++++
 drivers/block/cbd/cbd_cache.c     | 2410 +++++++++++++++++++++++++++++
 drivers/block/cbd/cbd_channel.c   |   96 ++
 drivers/block/cbd/cbd_handler.c   |  242 +++
 drivers/block/cbd/cbd_host.c      |  129 ++
 drivers/block/cbd/cbd_internal.h  | 1193 ++++++++++++++
 drivers/block/cbd/cbd_main.c      |  224 +++
 drivers/block/cbd/cbd_queue.c     |  574 +++++++
 drivers/block/cbd/cbd_segment.c   |  349 +++++
 drivers/block/cbd/cbd_transport.c |  957 ++++++++++++
 15 files changed, 7054 insertions(+)
 create mode 100644 drivers/block/cbd/Kconfig
 create mode 100644 drivers/block/cbd/Makefile
 create mode 100644 drivers/block/cbd/cbd_backend.c
 create mode 100644 drivers/block/cbd/cbd_blkdev.c
 create mode 100644 drivers/block/cbd/cbd_cache.c
 create mode 100644 drivers/block/cbd/cbd_channel.c
 create mode 100644 drivers/block/cbd/cbd_handler.c
 create mode 100644 drivers/block/cbd/cbd_host.c
 create mode 100644 drivers/block/cbd/cbd_internal.h
 create mode 100644 drivers/block/cbd/cbd_main.c
 create mode 100644 drivers/block/cbd/cbd_queue.c
 create mode 100644 drivers/block/cbd/cbd_segment.c
 create mode 100644 drivers/block/cbd/cbd_transport.c

-- 
2.34.1




