2016-02-25 15:36 GMT+08:00 Eric Wheeler <bcache@xxxxxxxxxxxxxxxxxx>: > > On Thu, 25 Feb 2016, Eric Wheeler wrote: > > > On Thu, 25 Feb 2016, Eric Wheeler wrote: > > > > > [ +cc: kent ] > > > > > > On Wed, 24 Feb 2016, Marc MERLIN wrote: > > > > > > > On Wed, Feb 24, 2016 at 06:53:05AM +0000, Eric Wheeler wrote: > > > > > Be sure to cherry-pick these from linux 4.5-rc1: > > > > > git cherry-pick 2ef9ccbf~1..627ccd20 > > > > > or use one of the 4.1 or 3.18 longterm kernels. > > > > > > > > So, I added these patches to my 4.4.2 kernel, but it still crashes when > > > > seeing one cache device at boot. > > > > > > > > Crash: > > > > https://goo.gl/photos/8H1DtYjSijK4ngFv6 > > > > > > > > > > > static void read_dirty(struct cached_dev *dc) > > > > [...] > > > > while (!kthread_should_stop()) { > > > > try_to_freeze(); > > > > > > > > w = bch_keybuf_next(&dc->writeback_keys); > > > > if (!w) > > > > break; > > > > > > > > >>>>> BUG_ON(ptr_stale(dc->disk.c, &w->key, 0)); > > > > > > > > if (KEY_START(&w->key) != dc->last_read || > > > > > > Kent, any idea whats going on here? What is this BUG_ON checking? > > > > > > It looks like dirty data is being read immediately after register, > > > possibly due to a crash. > > > > The calltrace in the image [https://goo.gl/photos/8H1DtYjSijK4ngFv6] > > indicates something about kthread_parkme. Is our thread being woken > > unexpectedly by kthread_park? > > > > If so, maybe we can do a better job handling the BUG condition. > > > > What would happen if we did something like this: > > > > +if (kthread_should_park()) { > > + kthread_parkme(); > > + break; > > +} > > > > BUG_ON(ptr_stale(dc->disk.c, &w->key, 0)); > > > > > > /* and maybe this too: */ > > -BUG_ON(ptr_stale(dc->disk.c, &w->key, 0)); > > +if (ptr_stale(dc->disk.c, &w->key, 0)) > > + break; > > > > Or would `continue` be more appropriate? I don't think anything is lost > > at the point we break because the writeback thread will continue to > > iterate and retry the call to read_dirty(). It hasn't kzalloc'ed yet, so > > no cleanup necessary. > > > > Cleary there is a condition that isn't being handled gracfully enough, and > > clearly it cannot continue if the BUG condition is met---but its in a loop > > so can we safely iterate to retry instead of BUGing?? > > > > > > Kent, > > > > Do you think this patch would solve the BUG_ON condition in this case? > > > > > > ============================================== > > diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c > > index ca38362..529310a 100644 > > --- a/drivers/md/bcache/writeback.c > > +++ b/drivers/md/bcache/writeback.c > > @@ -234,7 +234,8 @@ static void read_dirty(struct cached_dev *dc) > > if (!w) > > break; > > > > - BUG_ON(ptr_stale(dc->disk.c, &w->key, 0)); > > + if (ptr_stale(dc->disk.c, &w->key, 0)) > > + goto err; > > > > if (KEY_START(&w->key) != dc->last_read || > > jiffies_to_msecs(delay) > 50) > > @@ -282,6 +283,10 @@ err: > > * freed) before refilling again > > */ > > closure_sync(&cl); > > + > > + if (kthread_should_park()) > > + kthread_parkme(); > > + > > } > > It looks like I can't call kthread_park* functions, perhaps those are > handled internally: > > WARNING: "kthread_should_park" [drivers/md/bcache//bcache.ko] undefined! > WARNING: "kthread_parkme" [drivers/md/bcache//bcache.ko] undefined! > > I think it would still work if the kthread_*park* function calls were > removed from the earlier patch based on the code flow. It should just > iterate and retry---assuming that it can retry. At least it > wouldn't BUG. > > If this happens at cache registration, is it a race between writeback > running too soon and the datastructures not being fully populated? > > Can anyone else comment here? Hi Eric, I'm not sure why you think it's caused by some kthread park. It is a BUG_ON with no doubt, since all keys in writeback_keys should be non-stale otherwise the writeback thread might be writing back some garbage to the backend device. See bch_btree_gc_finish(), the buckets pointed by writeback_keys are marked as GC_MARK_DIRTY, to prevent them be reclaimed by the allocator in the next, so theoretically the keys won't be stale. I guess there is some race between the device register path, the early stage GC and writeback. I think you won't see this BUG_ON after the whole system take off. But once it happens the bucket with wrong generation get persistence in SSD, which means you can't use the cache device any more. -zyh > > > -Eric > > > > /* Scan for dirty data */ > > ============================================== > > > > > > -Eric > > > > > > > > -Eric > > > > > > > > > > > I have to remove the partition for my system to boot. > > > > > > > > Before I destroy it, any other patches I should try? > > > > > > > > And to be fair, it's a huge pain to deal with this, there should be an > > > > easier way to just turn bcache off from the kernel command line. In this > > > > case it was really a lot of work to get back to even a booting system. > > > > > > > > You also said: > > > > > 4.1.18 has the patches, so unless there is something specific in 4.4 that > > > > > you need, I recommend 4.1. We've been running 4.1.17 with patches in > > > > > production for a while and it works great. Haven't tried vanilla 4.1.18 > > > > > yet, but I plan to soon. > > > > > > > > Sadly, I run btrfs, I can't just go to random old kernels like this. > > > > Is bcache not stable in up to date kernels? > > > > > > > > Thanks, > > > > Marc > > > > -- > > > > "A mouse is a device used to point at the xterm you want to type in" - A.S.R. > > > > Microsoft is to operating systems .... > > > > .... what McDonalds is to gourmet cooking > > > > Home page: http://marc.merlins.org/ | PGP 1024R/763BE901 > > > > -- > > > > To unsubscribe from this list: send the line "unsubscribe linux-bcache" in > > > > the body of a message to majordomo@xxxxxxxxxxxxxxx > > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-bcache" in > > > the body of a message to majordomo@xxxxxxxxxxxxxxx > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-bcache" in > > the body of a message to majordomo@xxxxxxxxxxxxxxx > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-bcache" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-bcache" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html