> On 21 June 2022 at 13:40, Nikhil Kshirsagar <nkshirsagar@xxxxxxxxx> wrote:
>
> I figured later that you probably meant for me to change the
> SB_JOURNAL_BUCKETS to 8 in bcache-tools and not the kernel?
>
> Regards,
> Nikhil.

Hi Nikhil,

As I said in my previous offline email, you should change
SB_JOURNAL_BUCKETS to 8 or 16 in both bcache-tools and the kernel code,
and recompile both.

With the patch applied, the deadlock is very hard to reproduce (because
the patch fixes it). Instead, you can observe the free journal space at
run time and at reboot time. If there is always at least 1 journal
bucket reserved during run time, then you won't observe the journal
no-space deadlock at boot time.

But the 4.15 kernel is not robust enough for bcache (5.4+ is good and
5.10+ is better), so it is possible that you will get stuck on other
bugs during this testing.

Coly Li

>
> On Tue, 21 Jun 2022 at 11:06, Nikhil Kshirsagar <nkshirsagar@xxxxxxxxx> wrote:
>>
>> Thank you Kent,
>>
>> I've made this change, in include/uapi/linux/bcache.h and will build
>> the kernel with it to attempt to reproduce the issue, and create a new
>> bcache device. Just wondering if the note about it being divisible by
>> BITS_PER_LONG may restrict it to a minimum value of 32?
>>
>> #define SB_JOURNAL_BUCKETS 8
>> /* SB_JOURNAL_BUCKETS must be divisible by BITS_PER_LONG */
>>
>> I have a "cache" nvme disk of about 350 GB and some slow disks, each
>> approx 300 GB, which I will use to create the bcache device once the
>> kernel is installed. My bcache setup typically would look like,
>>
>> NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
>> sdb           8:16   0 279.4G  0 disk
>> └─bcache0   252:0    0 279.4G  0 disk
>> sdc           8:32   0 279.4G  0 disk
>> └─bcache2   252:256  0 279.4G  0 disk
>> sdd           8:48   0 279.4G  0 disk
>> └─bcache1   252:128  0 279.4G  0 disk
>> nvme0n1     259:0    0 372.6G  0 disk
>> ├─bcache0   252:0    0 279.4G  0 disk
>> ├─bcache1   252:128  0 279.4G  0 disk
>> └─bcache2   252:256  0 279.4G  0 disk
>>
>> Regards,
>> Nikhil.
>>
>> On Tue, 21 Jun 2022 at 10:05, Kent Overstreet <kent.overstreet@xxxxxxxxx> wrote:
>>>
>>> On Tue, Jun 21, 2022 at 09:11:10AM +0530, Nikhil Kshirsagar wrote:
>>>> Hello all,
>>>>
>>>> I am trying to reproduce the problem that
>>>> 32feee36c30ea06e38ccb8ae6e5c44c6eec790a6 fixes, but I am not sure how.
>>>> This is to verify and test its backport
>>>> (https://pastebin.com/fEYmPZqC) onto kernel 4.15 (Thanks Kent for the
>>>> help with that backport!)
>>>>
>>>> Could this be reproduced by creating a bcache device with a smaller
>>>> journal size? And if so, is there some way to pass the journal size
>>>> argument during the creation of the bcache device?
>>>
>>> Change SB_JOURNAL_BUCKETS to 8 and make a new cache, it's used in the
>>> initialization path.
>>>
>>> Bonus points would be to tweak journal reclaim so that we're slower to
>>> reclaim, to make sure the journal stays full, and then test recovery.
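
On the BITS_PER_LONG question raised in the thread, here is a minimal
userspace sketch of the arithmetic. It assumes the journal buckets are
tracked in a bitmap that is sized the way the kernel's DECLARE_BITMAP
sizes one, that is, rounded up to whole unsigned longs. The macros are
stand-ins written for this example, not the kernel headers, and the
helper names journal_bitmap_longs() and journal_bitmap_slack_bits() are
hypothetical. It does not check whether every journal code path in a
given kernel tolerates the unused bits.

```c
#include <limits.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel macros of the same names. */
#define BITS_PER_LONG    ((size_t)(CHAR_BIT * sizeof(long)))
#define BITS_TO_LONGS(n) (((size_t)(n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Number of longs needed for a bitmap with one bit per journal bucket. */
static inline size_t journal_bitmap_longs(size_t nr_buckets)
{
	return BITS_TO_LONGS(nr_buckets);
}

/*
 * Bits in the last long that do not correspond to any bucket.
 * Zero exactly when nr_buckets is divisible by BITS_PER_LONG.
 */
static inline size_t journal_bitmap_slack_bits(size_t nr_buckets)
{
	return journal_bitmap_longs(nr_buckets) * BITS_PER_LONG - nr_buckets;
}
```

Under that assumption, 8 or 16 buckets still fit in a single long with
some unused high bits, while a value divisible by BITS_PER_LONG (such
as 256) leaves no slack.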