On 2017/10/14 8:04 PM, Sverd Johnsen wrote:
> Yes. bcache-tools ships some udev rules and associated helper utils
> that are used here.
>

Hi Sverd,

Is it possible for you to test the attached patch? It is an attempt to
avoid the NULL dereference; let's see whether it works.

Thanks in advance.

Coly Li

> On 14 October 2017 at 13:42, Coly Li <i@xxxxxxx> wrote:
>> On 2017/10/14 7:14 PM, Sverd Johnsen wrote:
>>> This is on 4.13.5. Happens sometime at boot, I just reboot and it
>>> works fine. No other problems.
>>>
>>> [   40.391116] BUG: unable to handle kernel NULL pointer dereference at 00000000000006bc
>>> [   40.391663] IP: _raw_spin_lock_irqsave+0x12/0x30
>>> [   40.392152] PGD 0
>>> [   40.392153] P4D 0
>>> [   40.392658]
>>> [   40.393070] bcache: bch_journal_replay() journal replay done, 21 keys in 10 entries, seq 34810
>>> [   40.393427] bcache: register_cache() registered cache device sdc4
>>> [   40.394669] Oops: 0002 [#1] PREEMPT SMP
>>> [   40.395174] Modules linked in: tun(+) vhost tap kvm md_mod bcache [snip]
>>
>> Hi Sverd,
>>
>> A quick glance at the code suggests c->data_bucket_lock, taken in
>> bch_alloc_sectors(), is very suspicious. c->data_bucket_lock is
>> initialized in bch_open_buckets_alloc(), which is called after
>> kobject_init() when a cache set is allocated in bch_cache_set_alloc().
>>
>> Cache/cached device registration happens via a sysfs entry, therefore
>> it is possible that a registration request is written to
>> /sys/fs/bcache/register before the spin lock c->data_bucket_lock is
>> initialized, which then triggers a NULL dereference on the spin lock.
>>
>> Normally this won't happen if the command is typed by a human. Do you
>> use some script to run bcache automatically? Then I can do further
>> checks to confirm whether my guess is correct.
[Patch] bcache: initialize cached device kobjects last

Bcache cached device sysfs entries are initialized earlier than other
related kernel data structures, so it is possible that a cached device
starts to run (via a write to its sysfs entry) before its kernel
resources, e.g. the allocator thread or the data bucket spin lock, are
initialized. This kind of race triggers a kernel panic by NULL
dereference.

This patch moves the locations where the related kobjects are created,
to make sure that before a sysfs entry becomes available to user space,
all necessary kernel resources of the cached device are initialized.

Signed-off-by: Coly Li <colyli@xxxxxxx>
---
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index fc0a31b13ac4..e1c02d869e0d 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1104,12 +1104,10 @@ static int cached_dev_init(struct cached_dev *dc, unsigned block_size)
 	INIT_LIST_HEAD(&dc->list);
 	closure_init(&dc->disk.cl, NULL);
 	set_closure_fn(&dc->disk.cl, cached_dev_flush, system_wq);
-	kobject_init(&dc->disk.kobj, &bch_cached_dev_ktype);
 	INIT_WORK(&dc->detach, cached_dev_detach_finish);
 	sema_init(&dc->sb_write_mutex, 1);
 	INIT_LIST_HEAD(&dc->io_lru);
 	spin_lock_init(&dc->io_lock);
-	bch_cache_accounting_init(&dc->accounting, &dc->disk.cl);
 
 	dc->sequential_cutoff		= 4 << 20;
 
@@ -1138,6 +1136,10 @@ static int cached_dev_init(struct cached_dev *dc, unsigned block_size)
 	bch_cached_dev_request_init(dc);
 	bch_cached_dev_writeback_init(dc);
 
+	bch_cache_accounting_init(&dc->accounting, &dc->disk.cl);
+
+	kobject_init(&dc->disk.kobj, &bch_cached_dev_ktype);
+
 	return 0;
 }
 
@@ -1467,11 +1469,6 @@ struct cache_set *bch_cache_set_alloc(struct cache_sb *sb)
 	closure_set_stopped(&c->cl);
 	closure_put(&c->cl);
 
-	kobject_init(&c->kobj, &bch_cache_set_ktype);
-	kobject_init(&c->internal, &bch_cache_set_internal_ktype);
-
-	bch_cache_accounting_init(&c->accounting, &c->cl);
-
 	memcpy(c->sb.set_uuid, sb->set_uuid, 16);
 	c->sb.block_size	= sb->block_size;
 	c->sb.bucket_size	= sb->bucket_size;
@@ -1534,6 +1531,10 @@ struct cache_set *bch_cache_set_alloc(struct cache_sb *sb)
 	c->congested_write_threshold_us	= 20000;
 	c->error_limit	= 8 << IO_ERROR_SHIFT;
 
+	kobject_init(&c->kobj, &bch_cache_set_ktype);
+	kobject_init(&c->internal, &bch_cache_set_internal_ktype);
+	bch_cache_accounting_init(&c->accounting, &c->cl);
+
 	return c;
 err:
 	bch_cache_set_unregister(c);
@@ -1815,7 +1816,6 @@ static int cache_alloc(struct cache *ca)
 	struct bucket *b;
 
 	__module_get(THIS_MODULE);
-	kobject_init(&ca->kobj, &bch_cache_ktype);
 
 	bio_init(&ca->journal.bio, ca->journal.bio.bi_inline_vecs, 8);
 
@@ -1839,6 +1839,8 @@ static int cache_alloc(struct cache *ca)
 	for_each_bucket(b, ca)
 		atomic_set(&b->pin, 0);
 
+	kobject_init(&ca->kobj, &bch_cache_ktype);
+
 	return 0;
 }