On 10/30/20 9:51 AM, Naohiro Aota wrote:
The superblock (and its copies) is the only data structure in btrfs that
has a fixed location on a device. Since we cannot overwrite data in a
sequential write required zone, we cannot place the superblock in such a
zone. One easy solution is to limit the superblock and its copies to
conventional zones. However, this method has two downsides. The first is a
reduced number of superblock copies: the second copy of the superblock is
located at 256GB, which is in a sequential write required zone on typical
devices on the market today, so the number of superblocks (primary plus
copies) is limited to two. The second downside is that we cannot support
devices that have no conventional zones at all.
To solve these two problems, we employ superblock log writing. It uses two
zones as a circular buffer to write updated superblocks. Once the first
zone is filled up, we start writing into the second zone. Then, when both
zones are filled up and before we start writing to the first zone again,
we reset the first zone.
We can determine the position of the latest superblock by reading the write
pointer information from the device. One corner case is when both zones
are full. In this situation, we read the last superblock of each zone and
compare them to determine which zone is older.
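The write pointer cases described above can be sketched roughly as follows.
This is only an illustration, not the code from this patch: struct
sb_log_zone, sb_log_next_pos() and the last_gen field are hypothetical names,
and the sketch assumes the generation of the last superblock in each zone has
already been read when both zones are full.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one log zone (not a kernel struct). */
struct sb_log_zone {
	uint64_t start;    /* first byte of the zone */
	uint64_t len;      /* zone size in bytes */
	uint64_t wp;       /* write pointer, absolute byte offset */
	uint64_t last_gen; /* generation of the last superblock in the zone;
			    * only consulted when both zones are full */
};

/*
 * Store the position of the next superblock write in *pos and return 0,
 * or return -1 if the write pointers are in a state the log should never
 * reach. The latest superblock is the one written just before *pos.
 */
static int sb_log_next_pos(const struct sb_log_zone z[2], uint64_t *pos)
{
	bool empty0 = z[0].wp == z[0].start;
	bool empty1 = z[1].wp == z[1].start;
	bool full0 = z[0].wp == z[0].start + z[0].len;
	bool full1 = z[1].wp == z[1].start + z[1].len;

	if (empty0 && empty1) {
		/* Nothing written yet: start at the first zone. */
		*pos = z[0].start;
	} else if (full0 && full1) {
		/*
		 * Corner case: the older zone is reused. The caller has to
		 * reset that zone before writing at its start.
		 */
		*pos = (z[0].last_gen > z[1].last_gen) ? z[1].start
						       : z[0].start;
	} else if (!full0 && (empty1 || full1)) {
		/* The first zone is the one currently being filled. */
		*pos = z[0].wp;
	} else if (full0) {
		/* The first zone is full: continue in the second zone. */
		*pos = z[1].wp;
	} else {
		return -1;
	}
	return 0;
}
```

Only the both-full case needs to read superblocks back from the device; every
other case is decided from the two write pointers alone.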
The following zones are reserved as the circular buffer on ZONED btrfs.
- The primary superblock: zones 0 and 1
- The first copy: zones 16 and 17
- The second copy: zone 1024 or the zone at 256GB, whichever is lower, and
the zone next to it
If these reserved zones are conventional, the superblock is written at the
fixed location at the start of the zone, without logging.
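As a rough illustration of the zone reservation listed above, the first zone
of each two-zone log could be computed as below. This is a sketch only:
sb_log_first_zone() and SB_COPY2_OFFSET are hypothetical names, not
identifiers from this patch, and zone_size is assumed to be the device's zone
size in bytes.

```c
#include <stdint.h>

/* Offset of the second superblock copy (256GB), per the description. */
#define SB_COPY2_OFFSET (256ULL << 30)

/*
 * Hypothetical helper: return the first zone of the two-zone log for a
 * superblock copy (0 = primary, 1 = first copy, 2 = second copy). The
 * second zone of the log is the returned zone number plus one.
 */
static uint32_t sb_log_first_zone(uint64_t zone_size, int copy)
{
	uint64_t zone;

	switch (copy) {
	case 0:
		return 0;
	case 1:
		return 16;
	default:
		/* Zone containing the 256GB offset, capped at zone 1024. */
		zone = SB_COPY2_OFFSET / zone_size;
		return (uint32_t)(zone < 1024 ? zone : 1024);
	}
}
```

With 256MB zones the second copy would use zones 1024 and 1025, while with
1GB zones it would use zones 256 and 257.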
Signed-off-by: Naohiro Aota <naohiro.aota@xxxxxxx>
---
fs/btrfs/block-group.c | 9 ++
fs/btrfs/disk-io.c | 41 +++++-
fs/btrfs/scrub.c | 3 +
fs/btrfs/volumes.c | 21 ++-
fs/btrfs/zoned.c | 311 +++++++++++++++++++++++++++++++++++++++++
fs/btrfs/zoned.h | 40 ++++++
6 files changed, 413 insertions(+), 12 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index c0f1d6818df7..e989c66aa764 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1723,6 +1723,7 @@ int btrfs_rmap_block(struct btrfs_fs_info *fs_info, u64 chunk_start,
 static int exclude_super_stripes(struct btrfs_block_group *cache)
 {
 	struct btrfs_fs_info *fs_info = cache->fs_info;
+	bool zoned = btrfs_is_zoned(fs_info);
 	u64 bytenr;
 	u64 *logical;
 	int stripe_len;
@@ -1744,6 +1745,14 @@ static int exclude_super_stripes(struct btrfs_block_group *cache)
 		if (ret)
 			return ret;
+		/* shouldn't have super stripes in sequential zones */
+		if (zoned && nr) {
+			btrfs_err(fs_info,
+	"Zoned btrfs's block group %llu should not have super blocks",
+				  cache->start);
+			return -EUCLEAN;
+		}
+
I'm very confused about this check, namely how you've been able to test without
it blowing up, which makes me feel like I'm missing something.
We _always_ call exclude_super_stripes(), and we're simply looking up the bytenr
for that block, which appears to not do anything special for zoned. This should
be looking up and failing whenever it looks for super stripes far enough out.
How are you not failing here every time you mount the fs? Thanks,
Josef