On 2020/7/11 06:47, Ken Raeburn wrote:
>
> The long version is written up at
> https://bugzilla.redhat.com/show_bug.cgi?id=1783075 but the short
> version:
>
> There are devices out there which set q->limits.io_opt to small values
> like 4096 bytes, causing bcache to use that for the stripe size, but
> the device size could still be large enough that the computed stripe
> count is 2**32 or more. That value gets stuffed into a 32-bit
> (unsigned int) field, throwing away the high bits, and then that
> truncated value is range-checked and used. This can result in memory
> corruption or faults in some cases.
>
> The problem was brought up with us on Red Hat's VDO driver team by a
> bcache user on a 4.17.8 kernel, has been demonstrated in the Fedora
> 5.3.15-300.fc31 kernel, and by inspection appears to be present in
> Linus's tree as of this morning.
>
> The easy fix would be to keep the quotient in a 64-bit variable until
> it's validated, but that would simply limit the size of such devices
> as bcache backing storage (in this case, limiting VDO volumes to under
> 8 TB). Is there a way to still be able to use larger devices? Perhaps
> scale up the stripe size from io_opt to the point where the stripe
> count falls in the allowed range?
>
> Ken Raeburn
> (Red Hat VDO driver developer)
>

We cannot simply extend the bit width of nr_stripes, because the size of
the d->full_dirty_stripes allocation depends on it. For an 18 TB volume
with a 4 KB stripe_size there are 4831838208 stripes, so
d->full_dirty_stripes alone would need 4831838208 * sizeof(atomic_t),
roughly 18 GB. That is far too large for a kernel memory allocation.

Would it help if bcache-tools had an option to specify a stripe_size
that overrides q->limits.io_opt? Then you could choose a larger stripe
size, which would avoid the nr_stripes overflow.

Thanks for the report.

Coly Li