Re: RBD persistent writeback cache crash (was: performance)

On Tue, Jun 8, 2021 at 7:11 PM Wido den Hollander <wido@xxxxxxxx> wrote:
>
> Hi,
>
> So I've been doing some tests with v16.2.4 with a 2TB Samsung PM983 SSD
> mounted under /mnt/rbd-cache
>
> rbd_persistent_cache_mode = ssd
> rbd_persistent_cache_size = 2G
> rbd_persistent_cache_path = /mnt/rbd-cache
> rbd_plugins = pwl_cache
>
> I tried both XFS and EXT4 as the filesystem.
>
> This, however, causes fio or 'rbd bench' to crash:
>
> root@infra-138-b16-27:~# fio fio/rbd_rw_1.fio
> rbd_w_iodepth_1: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W)
> 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=1
> fio-3.1
> Starting 1 process
> Segmentation fault1)][13.3%][r=0KiB/s,w=14.7MiB/s][r=0,w=3768 IOPS][eta
> 00m:52s]
> root@infra-138-b16-27:~#
>
> (The IOps seem great!)
>
> My fio test is fairly simple:
>
> [global]
> ioengine=rbd
> clientname=admin
> pool=rbd
> rbdname=fio1
> invalidate=0
> bs=4k
> runtime=60
> direct=1
>
> [rbd_w_iodepth_1]
> rw=randwrite
> iodepth=1
>
> I have tried to trace it with gdb, but I didn't get any further than
> this backtrace:
>
> (gdb) bt
> #0  ContextWQ::process (ctx=0x7fffb8081480, this=0x7fffb8012470) at
> ./src/common/WorkQueue.h:556
> #1  ThreadPool::PointerWQ<Context>::_void_process (this=0x7fffb8012470,
> item=0x7fffb8081480, handle=...) at ./src/common/WorkQueue.h:341
> #2  0x00007fffec600912 in ThreadPool::worker (this=0x7fffb8012018,
> wt=<optimized out>) at ./src/common/WorkQueue.cc:117
> #3  0x00007fffec601801 in ThreadPool::WorkThread::entry (this=<optimized
> out>) at ./src/common/WorkQueue.h:395
> #4  0x00007ffff5c796db in start_thread (arg=0x7fffb17fa700) at
> pthread_create.c:463
> #5  0x00007ffff579e71f in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>
> Has anybody been able to use pwl_cache successfully?

Hi Wido,

Unfortunately, the "rbd_persistent_cache_mode = ssd" cache has shipped
rather broken.  This particular crash is most likely already fixed
in master, but a few more are outstanding.  There are a dozen
"[pwl ssd] ..." tickets in the rbd project; the fixes would be
backported to pacific once the ssd mode is stable enough.

Until then, I would stick to "rbd_persistent_cache_mode = rwl"
or avoid the pwl_cache plugin entirely.
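
As a rough sketch of the first option, the settings quoted above would
change only in the mode line.  Note the assumptions: rwl mode is meant
for persistent memory, so the cache path should sit on a DAX-capable
PMEM filesystem, and /mnt/pmem-cache below is a hypothetical mount
point, not one taken from the original setup:

    # assumption: /mnt/pmem-cache is a DAX-enabled PMEM mount (hypothetical path)
    rbd_persistent_cache_mode = rwl
    rbd_persistent_cache_size = 2G
    rbd_persistent_cache_path = /mnt/pmem-cache
    rbd_plugins = pwl_cache

For the second option, dropping pwl_cache from rbd_plugins and setting
"rbd_persistent_cache_mode = disabled" should be enough.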

Thanks,

                Ilya