On Mon, Feb 15, 2021 at 04:58:05PM +0000, Johannes Thumshirn wrote:
> On 11/02/2021 16:48, David Sterba wrote:
> > On Thu, Feb 11, 2021 at 03:26:04PM +0000, Johannes Thumshirn wrote:
> >> On 11/02/2021 16:21, David Sterba wrote:
> >>> On Thu, Feb 11, 2021 at 09:58:09AM +0000, Johannes Thumshirn wrote:
> >>>> On 10/02/2021 21:02, David Sterba wrote:
> >>>>>> This series implements superblock log writing. It uses two zones as a
> >>>>>> circular buffer to write updated superblocks. Once the first zone is filled
> >>>>>> up, start writing into the second zone. The first zone will be reset once
> >>>>>> both zones are filled. We can determine the position of the latest
> >>>>>> superblock by reading the write pointer information from a device.
> >>>>>
> >>>>> About that, in this patchset it's still leaving superblock at the fixed
> >>>>> zone number while we want it at a fixed location, spanning 2 zones
> >>>>> regardless of their size.
> >>>>
> >>>> We'll always need 2 zones or otherwise we won't be powercut safe.
> >>>
> >>> Yes we do, that hasn't changed.
> >>
> >> OK that I don't understand, with the log structured superblocks on a zoned
> >> filesystem, we're writing a new superblock until the 1st zone is filled.
> >> Then we advance to the second zone. As soon as we wrote a superblock to
> >> the second zone we can reset the first.
> >> If we only use one zone,
> >
> > No, that can't work and nobody suggests that.
> >
> >> we would need to write until its end, reset and
> >> start writing again from the beginning. But if a powercut happens between
> >> reset and first write after the reset, we end up with no superblock.
> >
> > What I'm saying and what we discussed on slack in December, we can't fix
> > the zone number for the 1st and 2nd copy of superblock like it is now in
> > sb_zone_number.
> >
> > The primary superblock must be there for any reference and to actually
> > let the tools learn about the incompat bits.
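To make the power-cut argument quoted above concrete, here is a minimal
Python sketch of the two-zone log. It is only a toy model, not the code
from the patchset: the names (Zone, ZONE_CAPACITY, pick_target,
write_superblock, latest_generation) and the tiny capacity of 4
superblocks per zone are made up for illustration, and a zone is modelled
as a plain list whose length plays the role of the write pointer.

```python
from dataclasses import dataclass, field

ZONE_CAPACITY = 4  # superblocks per zone; hypothetical, tiny for illustration


@dataclass
class Zone:
    # Generations of the superblocks written so far; len() acts as the write pointer.
    blocks: list = field(default_factory=list)

    @property
    def empty(self):
        return len(self.blocks) == 0

    @property
    def full(self):
        return len(self.blocks) == ZONE_CAPACITY


def latest_generation(zones):
    """Newest superblock is the one just before a write pointer; None if no zone has one."""
    tails = [z.blocks[-1] for z in zones if not z.empty]
    return max(tails) if tails else None


def pick_target(zones):
    """Return (zone index, needs_reset) for the next superblock write."""
    z0, z1 = zones
    if z0.full and z1.full:
        # Both zones full: the zone holding the older log is reset and reused.
        older = 0 if z0.blocks[-1] < z1.blocks[-1] else 1
        return older, True
    if not z0.full and (z1.empty or z1.full):
        return 0, False
    return 1, False


def write_superblock(zones, generation, crash_after_reset=False):
    """Append one superblock; optionally simulate a power cut right after the zone reset."""
    index, needs_reset = pick_target(zones)
    if needs_reset:
        zones[index].blocks.clear()  # zone reset
        if crash_after_reset:
            return  # power cut: nothing written after the reset
    zones[index].blocks.append(generation)
```

In this model a power cut between the reset and the next write still
leaves the other zone's tail as a valid superblock, which is the reason
for using two zones. With a single zone the same reset would leave
nothing behind.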
> >
> > The 1st copy is now fixed zone 16, which depends on the zone size. The
> > idea is to define the superblock offsets to start at given offsets,
> > where the ring buffer has the two consecutive zones, regardless of their
> > size.
> >
> > primary:       0
> > 1st copy:    16G
> > 2nd copy:   256G
> >
> > Due to the variability of the zones in future devices, we'll reserve a
> > space at the superblock interval, assuming the zone sizes can grow up to
> > several gigabytes. Current working number is 1G, with some safety margin
> > the reserved ranges would be (eg. for a 4G zone size):
> >
> > primary:       0 up to   8G
> > 1st copy:    16G up to  24G
> > 2nd copy:   256G up to 262G
> >
> > It is wasteful but we want to be future proof and expecting disk sizes
> > from tens of terabytes to a hundred terabytes, it's not significant
> > loss of space.
> >
> > If the zone sizes can be expected higher than 4G, the 1st copy can be
> > defined at 64G, that would leave us some margin until somebody thinks
> > that 32G zones are a great idea.
> >
>
> We've been talking about this today and our proposal would be as follows:
> Primary SB is two zones starting at LBA 0
> Secondary SB the two zones starting with the zone that contains the address 16G

For the secondary SB on a file system smaller than 16GB, what do you think
about using the last two zones (or zones #2 and #3 would do)? That way we can
ensure there are two SB copies even on such a file system.

> Third SB the two zones starting with the zone that contains the address 256G
> or not present if the disk is too small.
>
> This would make it safe until a zone size of 8GB and we'd have adjacent
> superblock log zones then.
>
> How does that sound?
>
> Byte,
> 	Johannes
>
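To check that I read the proposal correctly, here is a rough Python
sketch of the zone selection, including the small-file-system fallback I
asked about above. This is an assumption-laden illustration, not kernel
code: sb_log_zones, SB_OFFSETS and GiB are hypothetical names, zone
sizes are assumed to divide the offsets evenly, and for the fallback I
picked the "zones #2, #3" variant.

```python
GiB = 1 << 30

# Byte addresses named in the proposal: primary, secondary, third copy.
SB_OFFSETS = (0, 16 * GiB, 256 * GiB)


def sb_log_zones(copy, zone_size, device_size):
    """Return the two zone numbers used for superblock copy 0, 1 or 2.

    Returns None when that copy is not present on the device.
    """
    nr_zones = device_size // zone_size
    # The log starts with the zone that contains the fixed address.
    first = SB_OFFSETS[copy] // zone_size
    if first + 1 < nr_zones:
        return first, first + 1
    if copy == 1 and nr_zones >= 4:
        # Hypothetical fallback for devices smaller than 16G: zones #2 and #3,
        # directly after the primary log in zones #0 and #1.
        return 2, 3
    return None
```

With 8G zones the secondary log comes out as zones 2 and 3, directly
adjacent to the primary log in zones 0 and 1, which matches the "safe
until a zone size of 8GB" remark. On an 8G device with 256M zones the
secondary copy falls back to zones 2 and 3 and the third copy is absent.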