Re: [LSF/MM/BPF TOPIC] Cloud storage optimizations

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Sat, 4 Mar 2023 16:47:25 +0000

On Sat, Mar 04, 2023 at 12:08:36PM +0100, Hannes Reinecke wrote:
> We could implement a (virtual) zoned device, and expose each zone as a
> block. That gives us the required large block characteristics, and with
> a bit of luck we might be able to dial up to really large block sizes
> like the 256M sizes on current SMR drives.
> ublk might be a good starting point.

Ummmm.  Is supporting 256MB block sizes really a desired goal?  I suggest
that is far past the knee of the curve; if we can only write 256MB chunks
as a single entity, we're looking more at a filesystem redesign than we
are at making filesystems and the MM support 256MB blocks.

The current work is all going towards tracking memory in larger chunks,
so writing back, e.g., 64kB chunks of the file.  But if 256MB is where
we're going, we need to be thinking more like a RAID device and
accumulating writes into a log that we can then blast out in a single
giant write.
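
To make the "accumulate into a log" idea concrete, here is a rough
userspace sketch.  It is not kernel code and not a proposal for an
interface; struct write_log and the wlog_*() helpers are names made up
for illustration, and wlog_emit() only counts where a real
implementation would submit I/O.  The block size is a parameter so the
sketch can be tried with something much smaller than 256MB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical staging log: collects small writes until one whole
 * device block can be written as a single unit. */
struct write_log {
	unsigned char *buf;	/* staging buffer, one device block in size */
	size_t block_size;	/* device block size (assumed; e.g. 256MB) */
	size_t used;		/* bytes accumulated so far */
	unsigned long flushes;	/* whole-block writes issued */
	uint64_t padded;	/* padding bytes written by early flushes */
};

int wlog_init(struct write_log *l, size_t block_size)
{
	l->buf = malloc(block_size);
	if (!l->buf)
		return -1;
	l->block_size = block_size;
	l->used = 0;
	l->flushes = 0;
	l->padded = 0;
	return 0;
}

/* Stand-in for submitting one whole block to the device.  Any unused
 * tail is zero-filled, because the device only accepts full blocks. */
static void wlog_emit(struct write_log *l)
{
	memset(l->buf + l->used, 0, l->block_size - l->used);
	l->padded += l->block_size - l->used;
	l->flushes++;
	l->used = 0;
}

/* Append data, emitting a block each time the staging buffer fills. */
void wlog_append(struct write_log *l, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len) {
		assert(l->used < l->block_size);
		size_t room = l->block_size - l->used;
		size_t n = len < room ? len : room;

		memcpy(l->buf + l->used, p, n);
		l->used += n;
		p += n;
		len -= n;
		if (l->used == l->block_size)
			wlog_emit(l);
	}
}

/* fsync()-like: force out whatever is pending, padded to a full block. */
void wlog_sync(struct write_log *l)
{
	if (l->used)
		wlog_emit(l);
}

void wlog_destroy(struct write_log *l)
{
	free(l->buf);
	l->buf = NULL;
}
```

Note that wlog_sync() has to pad and write a whole block no matter how
little is pending; the padded counter tracks how much of each forced
write was wasted.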

fsync() and O_SYNC are going to be painful for that kind of device.
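
One way to put a number on the pain, assuming (my assumption, not a
measured result) that every sync has to write at least one whole block
regardless of how little is dirty.  sync_amplification() is a made-up
helper for the arithmetic only:

```c
#include <assert.h>
#include <stdint.h>

/* Bytes written to the device per byte of dirty data, assuming a sync
 * must write whole blocks (hypothetical model, not a measurement). */
double sync_amplification(uint64_t block_bytes, uint64_t dirty_bytes)
{
	assert(block_bytes > 0 && dirty_bytes > 0);

	/* Round the dirty data up to a whole number of blocks. */
	uint64_t blocks = (dirty_bytes + block_bytes - 1) / block_bytes;

	return (double)(blocks * block_bytes) / (double)dirty_bytes;
}
```

Under that model, sync_amplification(256ULL << 20, 4096) is 65536: a
single dirty 4kB page followed by fsync() costs a full 256MB write.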