On Tue, May 08, 2012 at 11:25:05PM +0800, Zhu Han wrote:
> On Tue, May 8, 2012 at 1:47 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>
> > On Tue, May 08, 2012 at 01:10:55PM +0800, Zhu Han wrote:
> > > On Tue, May 8, 2012 at 12:40 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > > > On Tue, May 08, 2012 at 11:24:52AM +0800, Zhu Han wrote:
> > > > > On Tue, May 8, 2012 at 7:59 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > > > > > And so now you've triggered the speculative delayed allocation
> > > > > > beyond EOF, which is normal behaviour. Hence there are currently
> > > > > > unused blocks beyond EOF which will get removed either when the next
> > > > > > close(fd) occurs on the file or the inode is removed from the cache.
> > > > > >
> > > > >
> > > > > Close(fd) should be invoked before dd quits. But why the extra blocks
> > > > > beyond EOF are not freed?
> > > >
> > > > The removal is conditional on how many times the fd has been closed
> > > > with dirty data on the inode.
> > > >
> > > > > The only way I found to remove the extra blocks is truncate the file to its
> > > > > real size.
> > > >
> > > > If the close() didn't remove them, they will be removed when the
> > > > inode ages out of the cache. Why do you even care about them?
> > >
> > > Our distributed system depends on the real length of files to account the
> > > space usage.
> >
> > That's ..... naive. It's never been valid to assume that the file
> > size is an accurate reflection of space usage, especially as it will
> > *always* be wrong for sparse files. In the same light, you also
> > cannot assume that it is an accurate reflection for non-sparse files
> > because we can do both explicit and speculative allocation beyond
> > EOF which only du will show. Not to mention that metadata is not
> > accounted in the file length, and that can consume a significant
> > amount of space, too.
> >
> > > This behavior make the account inaccurate.
> >
> > The block usage reported by XFS is both accurate and correct. The
> > file size reported by XFS is both accurate and correct. You're
> > "account inaccuracy" is assuming that they are the same. Perhaps you
> > should be using quotas for accurate space usage accounting?
> >
> > Anyway, if you really want to stop speculative delayed allocation
> > beyond EOF, then use the allocsize mount option to control it.
> >
> >
> Thanks for help.
>
> I can control the size of pre-allocation, so no data are written beyond the
> pre-allocated block range, so no speculative allocation is triggered.
> Besides it, our system can sync the accurate space usage of mount point
> periodically.
>
> Can you give any hints about the most lightweight approach to get the
> accurate block usage of whole file system?

If you are just after the whole filesystem, then statfs(2) will give
you blocks used and free. If you are after a finer breakdown, then
quotas are probably what you want - they can be used for accounting
separately to the space limiting enforcement. Hence you get
accurate, up-to-date per user, group or project space accounting
without actually limiting space usage at all.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
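
For illustration only, here is a minimal sketch of the two measurements discussed in this thread. It is not from the original posters. It assumes Python's os.statvfs(), which wraps statvfs(3) (implemented on Linux on top of statfs(2)), and it assumes the Linux convention that st_blocks is counted in 512-byte units. The function names filesystem_usage and file_usage are hypothetical.

```python
import os


def filesystem_usage(path):
    """Return (total, used, free) in bytes for the filesystem holding path.

    Assumption: block counts from statvfs are in units of f_frsize,
    as documented for statvfs(3).
    """
    st = os.statvfs(path)
    unit = st.f_frsize
    total = st.f_blocks * unit
    free = st.f_bfree * unit      # free blocks, including root-reserved ones
    return total, total - free, free


def file_usage(path):
    """Return (apparent_size, allocated_bytes) for a single file.

    apparent_size is the file length (st_size). allocated_bytes is what
    du reports: it can be smaller for sparse files and larger when blocks
    are allocated beyond EOF. Assumption: st_blocks is in 512-byte units.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    total, used, free = filesystem_usage("/")
    print("filesystem: total=%d used=%d free=%d" % (total, used, free))
    size, allocated = file_usage(__file__)
    print("file: size=%d allocated=%d" % (size, allocated))
```

The single statvfs call covers the whole-filesystem case cheaply; comparing the two values returned by file_usage shows why file length alone is not a reliable measure of space consumed.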