You did your test on an 80T file system, but that's not where someone would be using meta_bg. Meta_bg gets used for much larger file systems than that!

With meta_bg, we have 3 block group descriptors every 64 block groups. Each block group covers 128M of the file system. So that means we are going to have 3 entries in the system zone tree for every 8GB of file system space, or 393,216 entries for every PB. Given that each entry is 40 bytes, the block_validity entries will consume 15 megabytes per PB.

Now, one third of these entries overlap with the flex_bg entries (meta_bg descriptors live in the first, second, and last block group of each meta_bg group, and there are 64 block groups per meta_bg group on 4k file systems), and of course, the default flex_bg size of 16 block groups means that there are 524,288 entries per PB. So if we include all of the backup block group descriptors, a 1 PB file system will have roughly 786,432 entries. (I'm ignoring the entries for the backup superblocks, but that's only about 20 or so extra entries.)

So for a flex_bg 1 PB file system, the amount of memory for the block_validity data structure is roughly 20M, and including all of the backup descriptors on a flex_bg + meta_bg setup brings it to roughly 30M.

I agree with you that for a non-meta_bg file system, including all of the backup superblocks and block group descriptors is not going to be large. But while protecting the primary meta_bg group descriptors is worthwhile, protecting the backup meta_bg descriptors is not free, and will increase the size of the tree by 33%.

I'm also wondering whether or not the Lustre folks (who do have some file systems that are in the PB range) have run into overhead issues with block_validity.

What do folks think?

- Ted
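
P.S.  In case anyone wants to check my arithmetic, here is the back-of-the-envelope calculation as a trivial C program.  It assumes 4k blocks, 128M block groups, 64 block groups per meta_bg group, the default flex_bg size of 16, and 40 bytes per system zone entry, as above; it just reproduces the numbers in this mail, and is not the kernel's actual implementation.

#include <stdio.h>

int main(void)
{
        unsigned long long fs_bytes = 1ULL << 50;         /* 1 PB */
        unsigned long long bg_bytes = 128ULL << 20;       /* 128M per block group */
        unsigned long long groups  = fs_bytes / bg_bytes; /* 8,388,608 */

        /* meta_bg: 3 descriptor entries per 64 block groups */
        unsigned long long meta_bg = groups / 64 * 3;     /* 393,216 */
        /* flex_bg: one entry per 16 block groups */
        unsigned long long flex_bg = groups / 16;         /* 524,288 */
        /* the primary meta_bg descriptors (1 in 3) overlap flex_bg entries */
        unsigned long long overlap = meta_bg / 3;         /* 131,072 */
        unsigned long long total   = meta_bg + flex_bg - overlap;
        unsigned long long esize   = 40;                  /* bytes per entry */

        printf("meta_bg only:      %9llu entries, %2llu MB\n",
               meta_bg, meta_bg * esize >> 20);
        printf("flex_bg only:      %9llu entries, %2llu MB\n",
               flex_bg, flex_bg * esize >> 20);
        printf("flex_bg + meta_bg: %9llu entries, %2llu MB\n",
               total, total * esize >> 20);
        return 0;
}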