Thank you, Bob, for your analysis!
09.09.2015 23:22, Bob Peterson wrote:
----- Original Message -----
Hi Bob,
Is there something I can do to help solve this issue?
Best,
Vladislav
Hi Vladislav,
I took a look at your gfs2 file system metadata.
There is nothing corrupt or in error on your file system. The system
statfs file is totally correct.
The reason you cannot create any files is because there isn't a single
resource group in your file system that can satisfy a block allocation
request. The reason is: gfs2 needs to allocate multiple blocks at a time
for "worst case scenario" and none of your resource groups contain
enough blocks for the "worst case".
Is there a paper that describes that "worst case"? I did not know about
those allocation subtleties.
A big part of the problem is that your file system uses the absolute
minimum resource group size of 32MB (-r32 was used on mkfs.gfs2), and so
there are 17847 of them, with minimal-sized bitmaps. GFS2 cannot allocate
the very last several blocks of a resource group because of the calculations
used for the worst case. Because your resource groups are so small, you're
basically compounding the problem: it can't allocate blocks from a LOT
of resource groups.
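To see how small resource groups compound the problem, here is a back-of-envelope sketch. The 64-block unusable tail per rg is a made-up illustrative figure (the real number depends on gfs2's worst-case calculation), but it shows why many small rgs waste proportionally more space than a few large ones:

```python
# Back-of-envelope: space locked up when the last few blocks of every
# resource group can't satisfy a worst-case allocation request.
# The per-rg "tail" of 64 blocks is hypothetical, for illustration only.

BLOCK_SIZE = 4096  # bytes; a common gfs2 block size
MIB = 1024 ** 2

def unusable_fraction(fs_bytes, rg_bytes, tail_blocks):
    """Fraction of the fs tied up in per-rg unusable tails (illustrative)."""
    num_rgs = fs_bytes // rg_bytes
    unusable = num_rgs * tail_blocks * BLOCK_SIZE
    return unusable / fs_bytes

# Roughly this thread's filesystem: 17847 resource groups of 32MB each.
fs = 17847 * 32 * MIB

small = unusable_fraction(fs, 32 * MIB, 64)    # mkfs.gfs2 -r32
large = unusable_fraction(fs, 512 * MIB, 64)   # mkfs.gfs2 -r512

# With the same hypothetical tail, -r32 loses about 16x more space
# than -r512, simply because there are 16x as many resource groups.
```

The exact tail size varies, but the ratio is the point: the wasted fraction scales with the number of resource groups.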
Heh, another person blindly copied parameters I usually use for very
small filesystems. We used those for tests in virtualized
environments with limited space. As part of testing we tried to grow
GFS2 and found that with rather large resource groups on small enough block
devices we lose a significant amount of space, because the remaining space
is insufficient to fit one more rg. For example, with two 8MB journals,
growing a ~256+2*8 MB fs to ~300MB: with 128MB resource groups that failed,
but with 32MB ones it succeeded.
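The grow constraint described above comes down to simple arithmetic: gfs2_grow can only add whole resource groups, so the grow is a no-op when the added device space is smaller than one rg. A sketch (the 40MB of leftover space is an illustrative figure, not the exact number from our tests):

```python
MIB = 1024 ** 2

def rgs_added(extra_bytes, rg_bytes):
    """Whole resource groups that fit in newly added device space.
    gfs2_grow can only add whole rgs, so 0 means the grow gains nothing."""
    return extra_bytes // rg_bytes

extra = 40 * MIB  # illustrative leftover after enlarging the block device

rgs_added(extra, 128 * MIB)  # -> 0: too small for one 128MB rg, grow fails
rgs_added(extra, 32 * MIB)   # -> 1: one 32MB rg fits, grow succeeds
```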
Well, that just means that one size does not fit all.
Normally, your file system should have bigger resource groups, and fewer
of them. If you used a normal resource group size, like 128MB, or 256MB,
or even 2048MB, a much higher percent of the file system would be usable
because there would be fewer resource groups to cover the same area.
Does that make sense?
Sure.
Is it safe enough to just drop that '-r' parameter from mkfs command
line for production filesystems?
I suspect there will be attempts to migrate to much bigger block
devices (e.g. 1TB -> 20TB), but I'd rather not concentrate on them now...
If you do mkfs.gfs2 and specify -r512, you will be able to use much more
of the file system, and it won't get into this problem until much later.
What could be the rule of thumb for predicting such errors?
I mean, at which point (in MB or %) should we start to worry that we may
get such an error, depending on the rg size? Is there a point before which
we definitely won't get them?
Thank you very much,
Vladislav
In the past, I've actually looked into whether we can revise the
calculations used by gfs2 for worst-case block allocations. I've still
got some patches on my system for it. But even if we do it, it won't
improve a lot, and it will take a long time to trickle out to customers.
Regards,
Bob Peterson
Red Hat File Systems
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster