On Tue, 07 Feb 2017 13:35:58 +0100, "Jens-U. Mozdzen" <jmozdzen@xxxxxx> wrote:
> Hi *,
>
> we're facing an obscure problem with a fresh bcache setup:
>
> After creating an 8 TB (net) RAID5 device (hardware RAID controller),
> setting it up for bcache (using an existing cache set) and populating
> it with data, we were hit by massive dmesg reports of "[sdx] Bad
> block number requested" during writeback of dirty data - both with
> our 4.1.x kernel and with a 4.9.8 kernel.
>
> After recreating the backing store with 3 TB (net) and recreating the
> bcache setup, population went through without any noticeable errors.
>
> While the 8 TB device was populated with only the same amount of data
> (2.7 TB), block placement was probably spread across all of the 8 TB
> of available space.
>
> Another parameter catching the eye is block size: the 8 TB backing
> store was created so that a 4k block size was exposed to the OS,
> while the 3 TB backing store was created so that a 512b block size
> was reported. The cache set is on a PCI SSD with a 512b block size.
>
> So with backing:4k, cache:512b and an 8 TB backing store, bcache went
> mad during writeback ("echo 0 > writeback_running" immediately made
> the messages stop). With backing:512b, cache:512b and a 3 TB backing
> store, we had no error reports at all.
>
> On a second node, we have (had) a similar situation - backing:4k and
> cache:512b, but a 4 TB backing store. We've seen the errors there,
> too, when accessing an especially big logical volume that likely
> crossed some magic limit (a block number on the "physical volume"?).
> We still see the message there today, only much less frequently,
> since we no longer use that large volume on the bcache device. Other
> volumes are there now, probably with a few data extents at high block
> numbers, leading to the occasional error message (every few minutes)
> during writeback?
>
> Even more puzzling, we have a third node, identical to the latter one
> - except that its bcache device holds more data - and we see no such
> error (yet)...
>
> So here we are - what are we facing? Is it a size limit on the
> backing store? Or does the error result from mixing block sizes,
> plus some other trigger?
>
> If the former, where's the limit?
>
> If it is about block sizes, questions pile up: are the "dos" and
> "don'ts" documented anywhere? It's a rather common situation for us
> to run multiple backing devices on a single cache set, with both
> complete HDDs and logical volumes as backing stores. So it's very
> easy to end up with different block sizes between backing store and
> caching device, or even differing block sizes between the various
> backing stores.
>
> - using 512b for the cache and 4k for the backing device seems not
>   to work, unless the above is purely a size limit problem
>
> - 512b for the cache and 512b for the backing store seems to work
>
> - 4k for the cache and 4k for the backing store will probably work
>   as well
>
> - will 4k for the cache and 512b for the backing store work? (It
>   sounds likely, as there will be no alignment problem in the backing
>   store. OTOH, will bcache try to write 4k of data (one cache block)
>   into 512b blocks (backing store), or will it write 8 blocks then,
>   mapping the block size difference?)
>
> - if the latter works, will using both 4k and 512b backing stores in
>   parallel work when using a 4k cache?
>
> Any insight and/or help tracking down the error is most welcome!

Hmm, I think for me it refused to attach the backing device and the cache if the block sizes differ, so I think the bug is there... Once I created the backing store and the cache store in two separate steps; during attaching, it complained that the block sizes don't match and the cache set cannot be attached.
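If it helps, here's a minimal sketch of how to check for (and avoid) the mismatch up front - untested here, with /dev/sdb and /dev/nvme0n1 as placeholder device names, assuming make-bcache from bcache-tools:

  # Compare the logical block size each device reports to the OS:
  blockdev --getss /dev/sdb        # backing device - e.g. 4096 on a 4k RAID volume
  blockdev --getss /dev/nvme0n1    # cache device - e.g. 512 on the PCI SSD

  # The same information via sysfs:
  cat /sys/block/sdb/queue/logical_block_size

  # make-bcache accepts an explicit block size via --block; forcing the
  # same value on both sides should avoid the mismatch at attach time:
  make-bcache --block 4k -B /dev/sdb
  make-bcache --block 4k -C /dev/nvme0n1

Creating both superblocks in a single invocation (-B and -C together) should also leave them with a consistent block size.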
--
Regards,
Kai

Replies to list-only preferred.